CRITIC: Large language models can self-correct with tool-interactive critiquing

Z Gou, Z Shao, Y Gong, Y Shen, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent developments in large language models (LLMs) have been impressive. However,
these models sometimes show inconsistencies and problematic behavior, such as …

Bridging the gap: A survey on integrating (human) feedback for natural language generation

P Fernandes, A Madaan, E Liu, A Farinhas… - Transactions of the …, 2023 - direct.mit.edu
Natural language generation has witnessed significant advancements due to the training of
large language models on vast internet-scale datasets. Despite these advancements, there …

Uncertainty in natural language processing: Sources, quantification, and applications

M Hu, Z Zhang, S Zhao, M Huang, B Wu - arXiv preprint arXiv:2306.04459, 2023 - arxiv.org
As a main field of artificial intelligence, natural language processing (NLP) has achieved
remarkable success via deep neural networks. Plenty of NLP tasks have been addressed in …

Shifting attention to relevance: Towards the uncertainty estimation of large language models

J Duan, H Cheng, S Wang, C Wang, A Zavalny… - arXiv preprint arXiv …, 2023 - arxiv.org
Although Large Language Models (LLMs) have shown great potential in Natural Language
Generation, it is still challenging to characterize the uncertainty of model generations, i.e., …

Are references really needed? Unbabel-IST 2021 submission for the metrics shared task

R Rei, AC Farinha, C Zerva, D van Stigt… - Proceedings of the …, 2021 - aclanthology.org
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics
Shared Task. With this year's focus on Multidimensional Quality Metric (MQM) as the ground …

Shifting attention to relevance: Towards the predictive uncertainty quantification of free-form large language models

J Duan, H Cheng, S Wang, A Zavalny… - Proceedings of the …, 2024 - aclanthology.org
Abstract Large Language Models (LLMs) show promising results in language generation
and instruction following but frequently “hallucinate”, making their outputs less reliable …

Uncertainty in natural language generation: From theory to applications

J Baan, N Daheim, E Ilia, D Ulmer, HS Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances of powerful Language Models have allowed Natural Language
Generation (NLG) to emerge as an important technology that can not only perform traditional …

Conformal prediction for natural language processing: A survey

M Campos, A Farinhas, C Zerva… - Transactions of the …, 2024 - direct.mit.edu
The rapid proliferation of large language models and natural language processing (NLP)
applications creates a crucial need for uncertainty quantification to mitigate risks such as …

Detecting hallucinations in large language models using semantic entropy

S Farquhar, J Kossen, L Kuhn, Y Gal - Nature, 2024 - nature.com
Large language model (LLM) systems, such as ChatGPT or Gemini, can show impressive
reasoning and question-answering capabilities but often 'hallucinate' false outputs and …

Uncertainty estimation and reduction of pre-trained models for text regression

Y Wang, D Beck, T Baldwin, K Verspoor - Transactions of the …, 2022 - direct.mit.edu
State-of-the-art classification and regression models are often not well calibrated, and
cannot reliably provide uncertainty estimates, limiting their utility in safety-critical …