A survey of language model confidence estimation and calibration
Language models (LMs) have demonstrated remarkable capabilities across a wide range of
tasks in various domains. Despite their impressive performance, the reliability of their output …
Navigating the grey area: How expressions of uncertainty and overconfidence affect language models
The increased deployment of LMs for real-world tasks involving knowledge and facts makes
it important to understand model epistemology: what LMs think they know, and how their …
Uncertainty in natural language processing: Sources, quantification, and applications
As a main field of artificial intelligence, natural language processing (NLP) has achieved
remarkable success via deep neural networks. Many NLP tasks have been addressed in …
Understanding and detecting hallucinations in neural machine translation via model introspection
Neural sequence generation models are known to “hallucinate”, by producing outputs that
are unrelated to the source text. These hallucinations are potentially harmful, yet it remains …
The art of abstention: Selective prediction and error regularization for natural language processing
In selective prediction, a classifier is allowed to abstain from making predictions on low-
confidence examples. Though this setting is interesting and important, selective prediction …
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps
(missing or outdated information in LLMs) might always persist given the evolving nature of …
On the calibration of pre-trained language models using mixup guided by area under the margin and saliency
A well-calibrated neural model produces confidence scores (probability outputs) that closely
approximate the expected accuracy. While prior studies have shown that mixup training …
Challenges of neural machine translation for short texts
Short texts (STs) appear in a variety of scenarios, including queries, dialogs, and entity names.
Most of the existing studies in neural machine translation (NMT) are focused on tackling …
On compositional generalization of neural machine translation
Modern neural machine translation (NMT) models have achieved competitive performance
in standard benchmarks such as WMT. However, there still exist significant issues such as …
Self-training sampling with monolingual data uncertainty for neural machine translation
Self-training has proven effective for improving NMT performance by augmenting model
training with synthetic parallel data. The common practice is to construct synthetic data …