ChatGPT to replace crowdsourcing of paraphrases for intent classification: Higher diversity and comparable model robustness
The emergence of generative large language models (LLMs) raises the question: what will
be its impact on crowdsourcing? Traditionally, crowdsourcing has been used for acquiring …
Annotation error detection: Analyzing the past and present for a more coherent future
Annotated data is an essential ingredient in natural language processing for training and
evaluating machine learning models. It is therefore very desirable for the annotations to be …
User utterance acquisition for training task-oriented bots: a review of challenges, techniques and opportunities
MA Yaghoub-Zadeh-Fard, B Benatallah… - IEEE Internet …, 2020 - ieeexplore.ieee.org
Building conversational task-oriented bots requires large and diverse sets of annotated user
utterances to learn mappings between natural language utterances and user intents. Given …
End-to-end learning of flowchart grounded task-oriented dialogs
We propose a novel problem within end-to-end learning of task-oriented dialogs (TOD), in
which the dialog system mimics a troubleshooting agent who helps a user by diagnosing …
Leveraging user paraphrasing behavior in dialog systems to automatically collect annotations for long-tail utterances
In large-scale commercial dialog systems, users express the same request in a wide variety
of alternative ways with a long tail of less frequent alternatives. Handling the full range of this …
Inconsistencies in crowdsourced slot-filling annotations: A typology and identification methods
S Larson, A Cheung, A Mahendran… - Proceedings of the …, 2020 - aclanthology.org
Slot-filling models in task-driven dialog systems rely on carefully annotated training data.
However, annotations by crowd workers are often inconsistent or contain errors. Simple …
A natural language processing approach to detect inconsistencies in death investigation notes attributing suicide circumstances
Background: Data accuracy is essential for scientific research and policy development. The
National Violent Death Reporting System (NVDRS) data is widely used for discovering the …
ActiveAED: A human in the loop improves annotation error detection
Manually annotated datasets are crucial for training and evaluating Natural Language
Processing models. However, recent work has discovered that even widely-used benchmark …
Dynamic word recommendation to obtain diverse crowdsourced paraphrases of user utterances
MA Yaghoub-Zadeh-Fard, B Benatallah… - Proceedings of the 25th …, 2020 - dl.acm.org
Building task-oriented bots requires mapping a user utterance to an intent with its associated
entities to serve the request. Doing so is not easy since it requires large quantities of high …
Uncovering Misattributed Suicide Causes through Annotation Inconsistency Detection in Death Investigation Notes
Data accuracy is essential for scientific research and policy development. The National
Violent Death Reporting System (NVDRS) data is widely used for discovering the patterns …