Dataset reset policy optimization for RLHF

JD Chang, W Zhan, O Oertell, K Brantley… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning (RL) from Human Preference-based feedback is a popular
paradigm for fine-tuning generative models, which has produced impressive models such as …

Distributionally robust model-based reinforcement learning with large state spaces

SS Ramesh, PG Sessa, Y Hu… - International …, 2024 - proceedings.mlr.press
Three major challenges in reinforcement learning are the complex dynamical systems with
large state spaces, the costly data acquisition processes, and the deviation of real-world …

Minimax-optimal multi-agent RL in Markov games with a generative model

G Li, Y Chi, Y Wei, Y Chen - Advances in Neural …, 2022 - proceedings.neurips.cc
This paper studies multi-agent reinforcement learning in Markov games, with the goal of
learning Nash equilibria or coarse correlated equilibria (CCE) sample-optimally. All prior …
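As background for the solution concepts named in this snippet, a coarse correlated equilibrium (CCE) in the single-state (normal-form) case is a joint action distribution $\pi$ from which no player gains by deviating to any fixed action before the joint action is drawn; this is a standard textbook definition, not notation taken from the paper:

$$\mathbb{E}_{a \sim \pi}\big[u_i(a)\big] \;\ge\; \mathbb{E}_{a \sim \pi}\big[u_i(a_i', a_{-i})\big] \qquad \text{for every player } i \text{ and every fixed action } a_i'.$$

Nash equilibria are the special case in which $\pi$ factorizes into independent per-player strategies, which is why CCE are the more tractable target for sample-efficient learning.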

Hardness of independent learning and sparse equilibrium computation in Markov games

DJ Foster, N Golowich… - … Conference on Machine …, 2023 - proceedings.mlr.press
We consider the problem of decentralized multi-agent reinforcement learning in Markov
games. A fundamental question is whether there exist algorithms that, when run …

Online RL in Linearly $q^\pi$-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore

G Weisz, A György… - Advances in Neural …, 2024 - proceedings.neurips.cc
We consider online reinforcement learning (RL) in episodic Markov decision processes
(MDPs) under the linear $q^\pi$-realizability assumption, where it is assumed that the …
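For context, the linear $q^\pi$-realizability condition referenced in this and several of the following entries is usually stated as follows (generic notation, not the paper's): given a known feature map $\phi : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^d$, every policy's action-value function is linear in the features,

$$\forall \pi \;\; \exists\, \theta_\pi \in \mathbb{R}^d,\ \|\theta_\pi\| \le B: \qquad q^\pi(s,a) = \langle \phi(s,a), \theta_\pi \rangle \ \ \text{for all } (s,a),$$

which is weaker than the linear MDP assumption but stronger than requiring only the optimal $q^\star$ to be linear.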

Exponential hardness of reinforcement learning with linear function approximation

S Liu, G Mahajan, D Kane, S Lovett… - The Thirty Sixth …, 2023 - proceedings.mlr.press
A fundamental question in reinforcement learning theory is: suppose the optimal value
functions are linear in given features, can we learn them efficiently? This problem's …

Sample-efficient reinforcement learning is feasible for linearly realizable MDPs with limited revisiting

G Li, Y Chen, Y Chi, Y Gu… - Advances in Neural …, 2021 - proceedings.neurips.cc
Low-complexity models such as linear function representation play a pivotal role in enabling
sample-efficient reinforcement learning (RL). The current paper pertains to a scenario with …

Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs

G Weisz, A György, T Kozuno… - Advances in Neural …, 2022 - proceedings.neurips.cc
We consider approximate dynamic programming in $\gamma$-discounted Markov decision
processes and apply it to approximate planning with linear value-function approximation …

Efficient global planning in large MDPs via stochastic primal-dual optimization

G Neu, N Okolo - International Conference on Algorithmic …, 2023 - proceedings.mlr.press
We propose a new stochastic primal-dual optimization algorithm for planning in a large
discounted Markov decision process with a generative model and linear function …
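As background for the primal-dual viewpoint named here, planning in a $\gamma$-discounted MDP can be posed as a linear program over state-action occupancy measures, and stochastic primal-dual methods work with the associated Lagrangian saddle point (a generic sketch in standard notation, not taken from the paper):

$$\max_{\mu \ge 0} \ \langle \mu, r \rangle \ \ \text{s.t.}\ \ E^\top \mu = (1-\gamma)\,\nu_0 + \gamma P^\top \mu, \qquad \min_{V}\,\max_{\mu \ge 0}\ \langle \mu,\ r + \gamma P V - E V\rangle + (1-\gamma)\langle \nu_0, V\rangle,$$

where $\mu$ is the occupancy measure, $\nu_0$ the initial-state distribution, $P$ the transition matrix from state-action pairs to states, and $E$ the matrix that marginalizes $\mu$ over actions.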

Can agents run relay race with strangers? Generalization of RL to out-of-distribution trajectories

LC Lan, H Zhang, CJ Hsieh - arXiv preprint arXiv:2304.13424, 2023 - arxiv.org
In this paper, we define, evaluate, and improve the "relay-generalization" performance of
reinforcement learning (RL) agents on the out-of-distribution "controllable" states. Ideally, an …
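To make the "relay" evaluation concrete, here is a minimal sketch of how such a handover test could be run, assuming a Gymnasium-style environment and two policy callables; the names policy_a, policy_b, and the handover step k are illustrative, not the paper's interface.

# Minimal relay-evaluation sketch (assumes a Gymnasium-style env API and two
# callables mapping observations to actions; names are hypothetical).
import gymnasium as gym

def relay_return(env, policy_a, policy_b, k, max_steps=1000):
    """Run policy_a for k steps, then hand the episode over to policy_b.

    The reward collected after the handover measures how well policy_b copes
    with states produced by a 'stranger' policy.
    """
    obs, _ = env.reset()
    reward_after_handover = 0.0
    for t in range(max_steps):
        policy = policy_a if t < k else policy_b
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        if t >= k:
            reward_after_handover += reward
        if terminated or truncated:
            break
    return reward_after_handover

# Example usage (hypothetical policies and environment):
# env = gym.make("HalfCheetah-v4")
# score = relay_return(env, policy_a, policy_b, k=100)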