Authors
Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra
Publication date
2015/9/9
Journal
arXiv preprint arXiv:1509.02971
Description
We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion and car driving. Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives. We further demonstrate that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
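As a minimal sketch of the actor-critic updates summarized in the description above (notation follows the deterministic policy gradient formulation, with actor μ(s | θ^μ), critic Q(s, a | θ^Q), slowly updated target networks μ', Q', and a minibatch of N replayed transitions (s_i, a_i, r_i, s_{i+1})):

\[ y_i = r_i + \gamma\, Q'\big(s_{i+1},\, \mu'(s_{i+1} \mid \theta^{\mu'}) \,\big|\, \theta^{Q'}\big) \]
\[ L(\theta^{Q}) = \frac{1}{N} \sum_i \big(y_i - Q(s_i, a_i \mid \theta^{Q})\big)^2 \]
\[ \nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_i \nabla_a Q(s, a \mid \theta^{Q})\big|_{s=s_i,\, a=\mu(s_i)} \; \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu})\big|_{s=s_i} \]
\[ \theta' \leftarrow \tau\, \theta + (1 - \tau)\, \theta' \]

The critic is regressed toward the bootstrapped target y_i, the actor is moved along the gradient of the critic with respect to the action, and the target network parameters θ' track the learned parameters via the soft update with τ ≪ 1.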