Toward a theoretical foundation of policy optimization for learning control policies

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …

Revisiting fundamentals of experience replay

W Fedus, P Ramachandran… - International …, 2020 - proceedings.mlr.press
Experience replay is central to off-policy algorithms in deep reinforcement learning (RL), but
there remain significant gaps in our understanding. We therefore present a systematic and …

Policy Optimization for Linear Control with Robustness Guarantee: Implicit Regularization and Global Convergence

K Zhang, B Hu, T Basar - Learning for Dynamics and Control, 2020 - proceedings.mlr.press
Policy optimization (PO) is a key ingredient for modern reinforcement learning (RL). For
control design, certain constraints are usually enforced on the policies to optimize …

Physics-informed neural network for MPC-based trajectory tracking of vehicles with noise considered

L Jin, L Liu, X Wang, M Shang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Trajectory tracking plays a vital role in unmanned driving technology. Although
traditional control schemes may yield satisfactory outcomes in dealing with simple linear …

Learning optimal controllers for linear systems with multiplicative noise via policy gradient

B Gravell, PM Esfahani… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
The linear quadratic regulator (LQR) problem has reemerged as an important theoretical
benchmark for reinforcement learning-based control of complex dynamical systems with …

Policy gradient methods for the noisy linear quadratic regulator over a finite horizon

B Hambly, R Xu, H Yang - SIAM Journal on Control and Optimization, 2021 - SIAM
We explore reinforcement learning methods for finding the optimal policy in the linear
quadratic regulator (LQR) problem. In particular we consider the convergence of policy …

Learning the globally optimal distributed LQ regulator

L Furieri, Y Zheng… - Learning for Dynamics …, 2020 - proceedings.mlr.press
We study model-free learning methods for the finite-horizon output-feedback Linear Quadratic
(LQ) control problem subject to subspace constraints on the control policy. Subspace …

On the stability and convergence of robust adversarial reinforcement learning: A case study on linear quadratic systems

K Zhang, B Hu, T Basar - Advances in Neural Information …, 2020 - proceedings.neurips.cc
Reinforcement learning (RL) algorithms can fail to generalize due to the gap between the
simulation and the real world. One standard remedy is to use robust adversarial RL (RARL) …

Complexity of Derivative-Free Policy Optimization for Structured Control

X Guo, D Keivan, G Dullerud… - Advances in Neural …, 2024 - proceedings.neurips.cc
The applications of direct policy search in reinforcement learning and continuous control
have received increasing attention. In this work, we present novel theoretical results on the …

Model-free learning with heterogeneous dynamical systems: A federated LQR approach

H Wang, LF Toso, A Mitra, J Anderson - arXiv preprint arXiv:2308.11743, 2023 - arxiv.org
We study a model-free federated linear quadratic regulator (LQR) problem where M agents
with unknown, distinct yet similar dynamics collaboratively learn an optimal policy to …