Fatal or not? Finding errors that lead to dialogue breakdowns in chat-oriented dialogue systems

S Gehrmann, E Clark, T Sellam - Journal of Artificial Intelligence Research, 2023 - jair.org

Abstract Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …

Cited by 121 · Related articles · All 6 versions

[HTML] Survey on reinforcement learning for language processing

V Uc-Cetina, N Navarro-Guerrero… - Artificial Intelligence …, 2023 - Springer

In recent years some researchers have explored the use of reinforcement learning (RL)
algorithms as key components in the solution of various natural language processing (NLP) …

Cited by 114 · Related articles · All 12 versions

Overview of the dialogue breakdown detection challenge 4

R Higashinaka, LF D'Haro, B Abu Shawar… - … and Flexibility in …, 2021 - Springer

To promote the research and development of dialogue breakdown detection for dialogue
systems, we have been organizing a series of dialogue breakdown detection challenges to …

Cited by 89 · Related articles · All 4 versions

Underreporting of errors in NLG output, and what to do about it

E Van Miltenburg, MA Clinciu, O Dušek… - arXiv preprint arXiv …, 2021 - arxiv.org

We observe a severe under-reporting of the different kinds of errors that Natural Language
Generation systems make. This is a problem, because mistakes are an important indicator of …

Cited by 33 · Related articles · All 18 versions

Overview of the sixth dialog system technology challenge: DSTC6

C Hori, J Perez, R Higashinaka, T Hori… - Computer Speech & …, 2019 - Elsevier

This paper describes the experimental setups and the evaluation results of the sixth Dialog
System Technology Challenges (DSTC6) aiming to develop end-to-end dialogue systems …

Cited by 65 · Related articles · All 3 versions

[PDF] Overview of the NTCIR-12 Short Text Conversation Task

L Shang, T Sakai, Z Lu, H Li, R Higashinaka, Y Miyao - NTCIR, 2016 - research.nii.ac.jp

We give an overview of the NII Testbeds and Community for Information access Research
(NTCIR)-13 Short Text Conversation (STC) task, which was a core task of NTCIR-13. At …

Cited by 91 · Related articles · All 5 versions

Integrated taxonomy of errors in chat-oriented dialogue systems

R Higashinaka, M Araki, H Tsukahara… - Proceedings of the …, 2021 - aclanthology.org

This paper proposes a taxonomy of errors in chat-oriented dialogue systems. Previously, two
taxonomies were proposed; one is theory-driven and the other data-driven. The former …

Cited by 25 · Related articles · All 5 versions

A taxonomy, data set, and benchmark for detecting and classifying malevolent dialogue responses

Y Zhang, P Ren, M De Rijke - Journal of the Association for …, 2021 - Wiley Online Library

Conversational interfaces are increasingly popular as a way of connecting people to
information. With the increased generative capacity of corpus‐based conversational agents …

Cited by 15 · Related articles · All 8 versions

[PDF] Annotating errors and emotions in human-chatbot interactions in Italian

M Sanguinetti, A Mazzei, V Patti, M Scalerandi… - The 14th Linguistic …, 2020 - iris.unito.it

This paper describes a novel annotation scheme specifically designed for a customer-
service context where written interactions take place between a given user and the chatbot …

Cited by 12 · Related articles · All 8 versions

A role-selected sharing network for joint machine-human chatting handoff and service satisfaction analysis

J Liu, K Song, Y Kang, G He, Z Jiang, C Sun… - arXiv preprint arXiv …, 2021 - arxiv.org

Chatbot is increasingly thriving in different domains, however, because of unexpected
discourse complexity and training data sparseness, its potential distrust hatches vital …

Cited by 6 · Related articles · All 4 versions