Learn what not to learn: Action elimination with deep reinforcement learning

T Zahavy, M Haroush, N Merlis… - Advances in neural …, 2018 - proceedings.neurips.cc
Learning how to act when there are many available actions in each state is a challenging
task for Reinforcement Learning (RL) agents, especially when many of the actions are …

Deep Dyna-Q: Integrating planning for task-completion dialogue policy learning

B Peng, X Li, J Gao, J Liu, KF Wong, SY Su - arXiv preprint arXiv …, 2018 - arxiv.org
Training a task-completion dialogue agent via reinforcement learning (RL) is costly because
it requires many interactions with real users. One common alternative is to use a user …

Building a conversational agent overnight with dialogue self-play

P Shah, D Hakkani-Tür, G Tür, A Rastogi… - arXiv preprint arXiv …, 2018 - arxiv.org
We propose Machines Talking To Machines (M2M), a framework combining automation and
crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues …

Dialogue learning with human teaching and feedback in end-to-end trainable task-oriented dialogue systems

B Liu, G Tür, D Hakkani-Tür, P Shah, L Heck - arXiv preprint arXiv …, 2018 - arxiv.org
In this work, we present a hybrid learning method for training task-oriented dialogue systems
through online user interactions. Popular methods for learning task-oriented dialogues …

Bootstrapping a neural conversational agent with dialogue self-play, crowdsourcing and on-line reinforcement learning

P Shah, D Hakkani-Tür, B Liu, G Tür - Proceedings of the 2018 …, 2018 - aclanthology.org
End-to-end neural models show great promise towards building conversational agents that
are trained from data and on-line experience using supervised and reinforcement learning …

An end-to-end approach for handling unknown slot values in dialogue state tracking

P Xu, Q Hu - arXiv preprint arXiv:1805.01555, 2018 - arxiv.org
We highlight a practical yet rarely discussed problem in dialogue state tracking (DST),
namely handling unknown slot values. Previous approaches generally assume predefined …

Learning to prove theorems via interacting with proof assistants

K Yang, J Deng - International Conference on Machine …, 2019 - proceedings.mlr.press
Humans prove theorems by relying on substantial high-level reasoning and problem-
specific insights. Proof assistants offer a formalism that resembles human mathematical …

A survey on deep reinforcement learning for audio-based applications

S Latif, H Cuayáhuitl, F Pervez, F Shamshad… - Artificial Intelligence …, 2023 - Springer
Deep reinforcement learning (DRL) is poised to revolutionise the field of artificial intelligence
(AI) by endowing autonomous systems with high levels of understanding of the real world …

Transferable dialogue systems and user simulators

BH Tseng, Y Dai, F Kreyssig, B Byrne - arXiv preprint arXiv:2107.11904, 2021 - arxiv.org
One of the difficulties in training dialogue systems is the lack of training data. We explore the
possibility of creating dialogue data through the interaction between a dialogue system and …

AirDialogue: An environment for goal-oriented dialogue research

W Wei, Q Le, A Dai, J Li - Proceedings of the 2018 Conference on …, 2018 - aclanthology.org
Recent progress in dialogue generation has inspired a number of studies on dialogue
systems that are capable of accomplishing tasks through natural language interactions. A …