Authors
Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Remi Munos, Koray Kavukcuoglu, Nando de Freitas
Publication date
2016/11/3
Journal
arXiv preprint arXiv:1611.01224
Description
This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
Total citations
Citations per year: 2017: 41, 2018: 106, 2019: 137, 2020: 151, 2021: 168, 2022: 188, 2023: 128, 2024: 63
Scholar articles
Z Wang, V Bapst, N Heess, V Mnih, R Munos… - arXiv preprint arXiv:1611.01224, 2016