A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning

Y Zhao, H Qin, W Zhenyu, C Zhu… - Findings of the …, 2022 - aclanthology.org
Training a deep reinforcement learning-based dialogue policy with brute-force random
sampling is costly. A new training paradigm was proposed to improve learning performance …

Anti-Overestimation Dialogue Policy Learning for Task-Completion Dialogue System

C Tian, W Yin, MF Moens - arXiv preprint arXiv:2207.11762, 2022 - arxiv.org
A dialogue policy module is an essential part of task-completion dialogue systems. Recently,
increasing interest has focused on reinforcement learning (RL)-based dialogue policy. Its …

Reward estimation with scheduled knowledge distillation for dialogue policy learning

J Qiu, H Zhang, Y Yang - Connection Science, 2023 - Taylor & Francis
Formulating dialogue policy as a reinforcement learning (RL) task enables a dialogue
system to act optimally by interacting with humans. However, typical RL-based methods …

A Controllable Lifestyle Simulator for Use in Deep Reinforcement Learning Algorithms

LG Braz, A Susaiyah - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Deep learning, especially deep reinforcement learning (DRL), has become one of the most
useful tools for solving sequential decision-making problems. This is particularly relevant to …

Rescue Conversations from Dead-ends: Efficient Exploration for Task-oriented Dialogue Policy Optimization

Y Zhao, M Dastani, J Long, Z Wang… - Transactions of the …, 2024 - direct.mit.edu
Training a task-oriented dialogue policy using deep reinforcement learning is promising but
requires extensive environment exploration. The amount of wasted invalid exploration …

Cold-started curriculum learning for task-oriented dialogue policy

H Zhu, Y Zhao, H Qin - 2021 IEEE International Conference on …, 2021 - ieeexplore.ieee.org
Training a satisfactory dialogue policy via Reinforcement Learning (RL) requires significant
interaction costs because of delayed and sparse rewards in task-oriented dialogue tasks …

Learning Dialogue Policy Efficiently Through Dyna Proximal Policy Optimization

C Huang, B Cao - International Conference on Collaborative Computing …, 2022 - Springer
Many methods have been proposed to use reinforcement learning to train dialogue policy
for task-oriented dialogue systems in recent years. However, the high cost of interacting with …

Deep Reinforcement Learning based Insight Selection Policy

LG Braz, APS Susaiyah, M Petkovic, A Härmä - 2022 - openreview.net
We live in the era of ubiquitous sensing and computing. More and more data is being
collected and processed from devices, sensors and systems. This opens up opportunities to …

Deep Reinforcement Learning-based Dialogue Policy with Graph Convolutional Q-network

K Xu, Z Wang, Y Long, Q Zhao - Proceedings of the 2024 Joint …, 2024 - aclanthology.org
Deep reinforcement learning (DRL) has been successfully applied to the dialogue policy of
task-oriented dialogue systems. However, one challenge in the existing DRL-based …

Systèmes de dialogue apprenant tout au long de leur vie: de l'élaboration à l'évaluation [Lifelong learning dialogue systems: from design to evaluation]

M Veron - 2022 - theses.hal.science
Task-oriented dialogue systems, more commonly called chatbots, aim to
carry out tasks and provide information at a user's request in …