Constrained reinforcement learning via dissipative saddle flow dynamics

[PDF] arxiv.org

A review of safe reinforcement learning: Methods, theory and applications

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making
tasks. However, safety concerns are raised during deploying RL in real-world …

Cited by 291 · Related articles · All 2 versions

[PDF] neurips.cc

Last-iterate convergent policy gradient primal-dual methods for constrained MDPs

D Ding, CY Wei, K Zhang… - Advances in Neural …, 2024 - proceedings.neurips.cc

We study the problem of computing an optimal policy of an infinite-horizon discounted
constrained Markov decision process (constrained MDP). Despite the popularity of …

Cited by 25 · Related articles · All 6 versions

[PDF] arxiv.org

Last-iterate global convergence of policy gradients for constrained reinforcement learning

A Montenegro, M Mussi, M Papini… - arXiv preprint arXiv …, 2024 - arxiv.org

Constrained Reinforcement Learning (CRL) tackles sequential decision-making problems
where agents are required to achieve goals by maximizing the expected return while …

Cited by 1 · Related articles · All 4 versions

[PDF] arxiv.org

Dissipative Gradient Descent Ascent Method: A Control Theory Inspired Algorithm for Min-max Optimization

T Zheng, N Loizou, P You… - IEEE Control Systems …, 2024 - ieeexplore.ieee.org

Gradient Descent Ascent (GDA) methods for min-max optimization problems typically
produce oscillatory behavior that can lead to instability, e.g., in bilinear settings. To address …

Cited by 1 · Related articles · All 4 versions