DQN-TAMER: Human-in-the-loop reinforcement learning with intractable feedback

R Arakawa, S Kobayashi, Y Unno, Y Tsuboi… - arXiv preprint arXiv …, 2018 - arxiv.org
Exploration has been one of the greatest challenges in reinforcement learning (RL), which is
a large obstacle in the application of RL to robotics. Even with state-of-the-art RL algorithms,
building a well-learned agent often requires too many trials, mainly due to the difficulty of
matching its actions with rewards in the distant future. A remedy for this is to train an agent
with real-time feedback from a human observer who immediately gives rewards for some
actions. This study tackles a series of challenges for introducing such a human-in-the-loop …

[CITATION][C] DQN-TAMER: Human-in-the-loop reinforcement learning with intractable feedback. arXiv preprint arXiv:1810.11748

R Arakawa, S Kobayashi, Y Unno, Y Tsuboi, SI Maeda - 2018