On-line dialogue policy learning with companion teaching

V Uc-Cetina, N Navarro-Guerrero… - Artificial Intelligence …, 2023 - Springer

In recent years some researchers have explored the use of reinforcement learning (RL)
algorithms as key components in the solution of various natural language processing (NLP) …

Cited by 119 (12 versions)

A survey on recent advances and challenges in reinforcement learning methods for task-oriented dialogue policy learning

WC Kwan, HR Wang, HM Wang, KF Wong - Machine Intelligence …, 2023 - Springer

Dialogue policy learning (DPL) is a key component in a task-oriented dialogue (TOD)
system. Its goal is to decide the next action of the dialogue system, given the dialogue state …

Cited by 27 (6 versions)

Multi-agent reinforcement learning: Methods, applications, visionary prospects, and challenges

Z Zhou, G Liu, Y Tang - arXiv preprint arXiv:2305.10091, 2023 - arxiv.org

Multi-agent reinforcement learning (MARL) is a widely used Artificial Intelligence (AI)
technique. However, current studies and applications need to address its scalability, non …

Cited by 16 (2 versions)

Learning rewards from linguistic feedback

TR Sumers, MK Ho, RD Hawkins… - Proceedings of the …, 2021 - ojs.aaai.org

We explore unconstrained natural language feedback as a learning signal for artificial
agents. Humans use rich and varied language to teach, yet most prior work on interactive …

Cited by 52 (9 versions)

AgentGraph: Toward universal dialogue management with structured deep reinforcement learning

L Chen, Z Chen, B Tan, S Long… - … /ACM Transactions on …, 2019 - ieeexplore.ieee.org

Dialogue policy plays an important role in task-oriented spoken dialogue systems. It
determines how to respond to users. The recently proposed deep reinforcement learning …

Cited by 46 (6 versions)

Agent-aware dropout DQN for safe and efficient on-line dialogue policy learning

L Chen, X Zhou, C Chang, R Yang… - Proceedings of the 2017 …, 2017 - aclanthology.org

Hand-crafted rules and reinforcement learning (RL) are two popular choices to obtain
dialogue policy. The rule-based policy is often reliable within predefined scope but not self …

Cited by 47 (4 versions)

Structured dialogue policy with graph neural networks

L Chen, B Tan, S Long, K Yu - Proceedings of the 27th …, 2018 - aclanthology.org

Recently, deep reinforcement learning (DRL) has been used for dialogue policy
optimization. However, many DRL-based policies are not sample-efficient. Most recent …

Cited by 34

Efficient dialogue complementary policy learning via deep Q-network policy and episodic memory policy

Y Zhao, Z Wang, C Zhu, S Wang - Proceedings of the 2021 …, 2021 - aclanthology.org

Deep reinforcement learning has shown great potential in training dialogue policies.
However, its favorable performance comes at the cost of many rounds of interaction. Most of …

Cited by 10 (4 versions)

Automatic curriculum learning with over-repetition penalty for dialogue policy learning

Y Zhao, Z Wang, Z Huang - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org

Dialogue policy learning based on reinforcement learning is difficult to be applied to real
users to train dialogue agents from scratch because of the high cost. User simulators, which …

Cited by 14 (7 versions)

Decomposed Deep Q-Network for Coherent Task-Oriented Dialogue Policy Learning

Y Zhao, K Yin, Z Wang, M Dastani… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

Reinforcement learning (RL) has emerged as a key technique for designing dialogue
policies. However, action space inflation in dialogue tasks has led to a heavy decision …

Cited by 1 (4 versions)