Distral: Robust multitask reinforcement learning

Authors

Yee Teh, Victor Bapst, Wojciech M Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu

Publication date

2017

Conference

Advances in Neural Information Processing Systems

Pages

4496-4506

Description

Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (DIStill & TRAnsfer Learning). Instead of sharing parameters between the different workers, we propose to share a distilled policy that captures common behaviour across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy, and the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable, attributes that are critical in deep reinforcement learning.
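
The "centroid of all task policies" idea can be illustrated in a toy setting. The sketch below is not the paper's implementation. It assumes a single state, tabular action distributions, and KL(pi_i || pi_0) as the distance between a task policy pi_i and the shared policy pi_0. Under those assumptions the distribution minimising the summed KL is the arithmetic mean of the task policies. All names and numbers are hypothetical.

```python
import math

def kl(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def centroid(policies):
    """Arithmetic mean of the distributions; minimises sum_i KL(pi_i || pi_0)."""
    n = len(policies)
    return [sum(column) / n for column in zip(*policies)]

def total_kl(policies, shared):
    """Summed divergence of every task policy from the shared policy."""
    return sum(kl(p, shared) for p in policies)

# Hypothetical action distributions of three task policies in one state.
task_policies = [
    [0.7, 0.2, 0.1],
    [0.6, 0.1, 0.3],
    [0.5, 0.4, 0.1],
]

shared = centroid(task_policies)
uniform = [1 / 3, 1 / 3, 1 / 3]

# The centroid is at least as close to the task policies as a uniform guess.
print(total_kl(task_policies, shared) <= total_kl(task_policies, uniform))
```

In this simplified picture, distillation pulls the shared policy toward the average behaviour, and a per-task penalty on the same divergence would keep each worker near it.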

Total citations

Cited by 604

Citations per year: 2017: 13, 2018: 60, 2019: 88, 2020: 82, 2021: 105, 2022: 117, 2023: 87, 2024: 51
