Toward a theoretical foundation of policy optimization for learning control policies
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …
Statistical learning theory for control: A finite-sample perspective
Learning algorithms have become an integral component to modern engineering solutions.
Examples range from self-driving cars and recommender systems to finance and even …
Complexity of Derivative-Free Policy Optimization for Structured Control
The applications of direct policy search in reinforcement learning and continuous control
have received increasing attention. In this work, we present novel theoretical results on the …
Model-free learning with heterogeneous dynamical systems: A federated LQR approach
We study a model-free federated linear quadratic regulator (LQR) problem where M agents
with unknown, distinct yet similar dynamics collaboratively learn an optimal policy to …
Global Convergence of Direct Policy Search for State-Feedback Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential
Direct policy search has been widely applied in modern reinforcement learning and
continuous control. However, the theoretical properties of direct policy search on nonsmooth …
Learning the Kalman filter with fine-grained sample complexity
We develop the first end-to-end sample complexity of model-free policy gradient (PG)
methods in discrete-time infinite-horizon Kalman filtering. Specifically, we introduce the …
Global convergence of policy gradient primal–dual methods for risk-constrained LQRs
While the techniques in optimal control theory are often model-based, the policy optimization
(PO) approach directly optimizes the performance metric of interest. Even though it has been …
Revisiting LQR control from the perspective of receding-horizon policy gradient
We revisit in this letter the discrete-time linear quadratic regulator (LQR) problem from the
perspective of receding-horizon policy gradient (RHPG), a newly developed model-free …
Convergence and sample complexity of policy gradient methods for stabilizing linear systems
System stabilization via policy gradient (PG) methods has drawn increasing attention in both
control and machine learning communities. In this paper, we study their convergence and …
How are policy gradient methods affected by the limits of control?
We study stochastic policy gradient methods from the perspective of control-theoretic
limitations. Our main result is that ill-conditioned linear systems in the sense of Doyle …