Survey on reinforcement learning for language processing

V Uc-Cetina, N Navarro-Guerrero… - Artificial Intelligence …, 2023 - Springer
In recent years some researchers have explored the use of reinforcement learning (RL)
algorithms as key components in the solution of various natural language processing (NLP) …

A survey on recent advances and challenges in reinforcement learning methods for task-oriented dialogue policy learning

WC Kwan, HR Wang, HM Wang, KF Wong - Machine Intelligence …, 2023 - Springer
Dialogue policy learning (DPL) is a key component in a task-oriented dialogue (TOD)
system. Its goal is to decide the next action of the dialogue system, given the dialogue state …

Multi-agent reinforcement learning: Methods, applications, visionary prospects, and challenges

Z Zhou, G Liu, Y Tang - arXiv preprint arXiv:2305.10091, 2023 - arxiv.org
Multi-agent reinforcement learning (MARL) is a widely used Artificial Intelligence (AI)
technique. However, current studies and applications need to address its scalability, non …

Learning rewards from linguistic feedback

TR Sumers, MK Ho, RD Hawkins… - Proceedings of the …, 2021 - ojs.aaai.org
We explore unconstrained natural language feedback as a learning signal for artificial
agents. Humans use rich and varied language to teach, yet most prior work on interactive …

AgentGraph: Toward universal dialogue management with structured deep reinforcement learning

L Chen, Z Chen, B Tan, S Long… - … /ACM Transactions on …, 2019 - ieeexplore.ieee.org
Dialogue policy plays an important role in task-oriented spoken dialogue systems. It
determines how to respond to users. The recently proposed deep reinforcement learning …

Agent-aware dropout DQN for safe and efficient on-line dialogue policy learning

L Chen, X Zhou, C Chang, R Yang… - Proceedings of the 2017 …, 2017 - aclanthology.org
Hand-crafted rules and reinforcement learning (RL) are two popular choices to obtain
dialogue policy. The rule-based policy is often reliable within predefined scope but not self …

Structured dialogue policy with graph neural networks

L Chen, B Tan, S Long, K Yu - Proceedings of the 27th …, 2018 - aclanthology.org
Recently, deep reinforcement learning (DRL) has been used for dialogue policy
optimization. However, many DRL-based policies are not sample-efficient. Most recent …

Efficient dialogue complementary policy learning via deep Q-network policy and episodic memory policy

Y Zhao, Z Wang, C Zhu, S Wang - Proceedings of the 2021 …, 2021 - aclanthology.org
Deep reinforcement learning has shown great potential in training dialogue policies.
However, its favorable performance comes at the cost of many rounds of interaction. Most of …

Automatic curriculum learning with over-repetition penalty for dialogue policy learning

Y Zhao, Z Wang, Z Huang - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
Dialogue policy learning based on reinforcement learning is difficult to be applied to real
users to train dialogue agents from scratch because of the high cost. User simulators, which …

Decomposed Deep Q-Network for Coherent Task-Oriented Dialogue Policy Learning

Y Zhao, K Yin, Z Wang, M Dastani… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Reinforcement learning (RL) has emerged as a key technique for designing dialogue
policies. However, action space inflation in dialogue tasks has led to a heavy decision …