Learn what not to learn: Action elimination with deep reinforcement learning

T Zahavy, M Haroush, N Merlis… - Advances in neural …, 2018 - proceedings.neurips.cc

Learning how to act when there are many available actions in each state is a challenging
task for Reinforcement Learning (RL) agents, especially when many of the actions are …

Cited by 234 - Related articles - All 10 versions

Deep dyna-q: Integrating planning for task-completion dialogue policy learning

B Peng, X Li, J Gao, J Liu, KF Wong, SY Su - arXiv preprint arXiv …, 2018 - arxiv.org

Training a task-completion dialogue agent via reinforcement learning (RL) is costly because
it requires many interactions with real users. One common alternative is to use a user …

Cited by 234 - Related articles - All 6 versions

Building a conversational agent overnight with dialogue self-play

P Shah, D Hakkani-Tür, G Tür, A Rastogi… - arXiv preprint arXiv …, 2018 - arxiv.org

We propose Machines Talking To Machines (M2M), a framework combining automation and
crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues …

Cited by 221 - Related articles - All 10 versions

Dialogue learning with human teaching and feedback in end-to-end trainable task-oriented dialogue systems

B Liu, G Tur, D Hakkani-Tur, P Shah, L Heck - arXiv preprint arXiv …, 2018 - arxiv.org

In this work, we present a hybrid learning method for training task-oriented dialogue systems
through online user interactions. Popular methods for learning task-oriented dialogues …

Cited by 206 - Related articles - All 6 versions

Bootstrapping a neural conversational agent with dialogue self-play, crowdsourcing and on-line reinforcement learning

P Shah, D Hakkani-Tur, B Liu, G Tür - Proceedings of the 2018 …, 2018 - aclanthology.org

End-to-end neural models show great promise towards building conversational agents that
are trained from data and on-line experience using supervised and reinforcement learning …

Cited by 175 - Related articles - All 6 versions

An end-to-end approach for handling unknown slot values in dialogue state tracking

P Xu, Q Hu - arXiv preprint arXiv:1805.01555, 2018 - arxiv.org

We highlight a practical yet rarely discussed problem in dialogue state tracking (DST),
namely handling unknown slot values. Previous approaches generally assume predefined …

Cited by 160 - Related articles - All 4 versions

Learning to prove theorems via interacting with proof assistants

K Yang, J Deng - International Conference on Machine …, 2019 - proceedings.mlr.press

Humans prove theorems by relying on substantial high-level reasoning and problem-
specific insights. Proof assistants offer a formalism that resembles human mathematical …

Cited by 135 - Related articles - All 14 versions

A survey on deep reinforcement learning for audio-based applications

S Latif, H Cuayáhuitl, F Pervez, F Shamshad… - Artificial Intelligence …, 2023 - Springer

Deep reinforcement learning (DRL) is poised to revolutionise the field of artificial intelligence
(AI) by endowing autonomous systems with high levels of understanding of the real world …

Cited by 72 - Related articles - All 10 versions

Transferable dialogue systems and user simulators

BH Tseng, Y Dai, F Kreyssig, B Byrne - arXiv preprint arXiv:2107.11904, 2021 - arxiv.org

One of the difficulties in training dialogue systems is the lack of training data. We explore the
possibility of creating dialogue data through the interaction between a dialogue system and …

Cited by 57 - Related articles - All 4 versions

Airdialogue: An environment for goal-oriented dialogue research

W Wei, Q Le, A Dai, J Li - Proceedings of the 2018 Conference on …, 2018 - aclanthology.org

Recent progress in dialogue generation has inspired a number of studies on dialogue
systems that are capable of accomplishing tasks through natural language interactions. A …

Cited by 102 - Related articles - All 4 versions