Sample-efficient deep reinforcement learning for dialog control

PH Su, P Budzianowski, S Ultes, M Gasic… - arXiv preprint arXiv …, 2017 - arxiv.org

Deep reinforcement learning (RL) methods have significant potential for dialogue policy
optimisation. However, they suffer from a poor performance in the early stages of learning …

Cited by 154 - Related articles - All 10 versions

[PDF] arxiv.org

Sample efficient deep reinforcement learning for dialogue systems with large action spaces

G Weisz, P Budzianowski, PH Su… - IEEE/ACM Transactions …, 2018 - ieeexplore.ieee.org

In spoken dialogue systems, we aim to deploy artificial intelligence to build automated
dialogue agents that can converse with humans. A part of this effort is the policy optimization …

Cited by 101 - Related articles - All 7 versions

[PDF] kanaad.me

[PDF] Flow: Deep reinforcement learning for control in SUMO

N Kheterpal, K Parvate, C Wu, A Kreidieh… - EPiC Series in …, 2018 - kanaad.me

We detail the motivation and design decisions underpinning Flow, a computational
framework integrating SUMO with the deep reinforcement learning libraries rllab and RLlib …

Cited by 65 - Related articles - All 5 versions

[PDF] neurips.cc

Generalized off-policy actor-critic

S Zhang, W Boehmer… - Advances in neural …, 2019 - proceedings.neurips.cc

We propose a new objective, the counterfactual objective, unifying existing objectives for off-
policy policy gradient algorithms in the continuing reinforcement learning (RL) setting …

Cited by 57 - Related articles - All 12 versions

[PDF] arxiv.org

Dynamic planning in open-ended dialogue using reinforcement learning

D Cohen, M Ryu, Y Chow, O Keller, I Greenberg… - arXiv preprint arXiv …, 2022 - arxiv.org

Despite recent advances in natural language understanding and generation, and decades
of research on the development of conversational bots, building automated agents that can …

Cited by 19 - Related articles - All 3 versions

[PDF] arxiv.org

A mixture-of-expert approach to RL-based dialogue management

Y Chow, A Tulepbergenov, O Nachum, MK Ryu… - arXiv preprint arXiv …, 2022 - arxiv.org

Despite recent advancements in language models (LMs), their application to dialogue
management (DM) problems and ability to carry on rich conversations remain a challenge …

Cited by 13 - Related articles - All 10 versions

[PDF] mdpi.com

Indoor scene change captioning based on multimodality data

Y Qiu, Y Satoh, R Suzuki, K Iwata, H Kataoka - Sensors, 2020 - mdpi.com

This study proposes a framework for describing a scene change using natural language text
based on indoor scene observations conducted before and after a scene change. The …

Cited by 22 - Related articles - All 9 versions

[PDF] arxiv.org

Reduced robust random cut forest for out-of-distribution detection in machine learning models

H Vardhan, J Sztipanovits - arXiv preprint arXiv:2206.09247, 2022 - arxiv.org

Most machine learning-based regressors extract information from data collected via past
observations of limited length to make predictions in the future. Consequently, when input to …

Cited by 10 - Related articles - All 6 versions

[PDF] uni-heidelberg.de

[PDF] Mean actor critic

K Asadi, C Allen, M Roderick, A Mohamed, G Konidaris… - stat, 2017 - cl.uni-heidelberg.de

We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state
reinforcement learning. MAC is a policy gradient algorithm that uses the agent's explicit …

Cited by 29 - Related articles

[PDF] arxiv.org

Towards solving text-based games by producing adaptive action spaces

RY Tao, MA Côté, X Yuan, LE Asri - arXiv preprint arXiv:1812.00855, 2018 - arxiv.org

To solve a text-based game, an agent needs to formulate valid text commands for a given
context and find the ones that lead to success. Recent attempts at solving text-based games …

Cited by 14 - Related articles - All 2 versions