CRITIC: Large language models can self-correct with tool-interactive critiquing
Recent developments in large language models (LLMs) have been impressive. However,
these models sometimes show inconsistencies and problematic behavior, such as …
Bridging the gap: A survey on integrating (human) feedback for natural language generation
Natural language generation has witnessed significant advancements due to the training of
large language models on vast internet-scale datasets. Despite these advancements, there …
Uncertainty in natural language processing: Sources, quantification, and applications
As a main field of artificial intelligence, natural language processing (NLP) has achieved
remarkable success via deep neural networks. Plenty of NLP tasks have been addressed in …
Shifting attention to relevance: Towards the uncertainty estimation of large language models
Although Large Language Models (LLMs) have shown great potential in Natural Language
Generation, it is still challenging to characterize the uncertainty of model generations, i.e. …
Are references really needed? Unbabel-IST 2021 submission for the Metrics Shared Task
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics
Shared Task. With this year's focus on Multidimensional Quality Metric (MQM) as the ground …
Shifting attention to relevance: Towards the predictive uncertainty quantification of free-form large language models
Abstract Large Language Models (LLMs) show promising results in language generation
and instruction following but frequently “hallucinate”, making their outputs less reliable …
Uncertainty in natural language generation: From theory to applications
Recent advances of powerful Language Models have allowed Natural Language
Generation (NLG) to emerge as an important technology that can not only perform traditional …
Conformal prediction for natural language processing: A survey
The rapid proliferation of large language models and natural language processing (NLP)
applications creates a crucial need for uncertainty quantification to mitigate risks such as …
Detecting hallucinations in large language models using semantic entropy
Large language model (LLM) systems, such as ChatGPT or Gemini, can show impressive
reasoning and question-answering capabilities but often 'hallucinate' false outputs and …
Uncertainty estimation and reduction of pre-trained models for text regression
State-of-the-art classification and regression models are often not well calibrated, and
cannot reliably provide uncertainty estimates, limiting their utility in safety-critical …