CRITIC: Large language models can self-correct with tool-interactive critiquing
Recent developments in large language models (LLMs) have been impressive. However,
these models sometimes show inconsistencies and problematic behavior, such as …
Bridging the gap: A survey on integrating (human) feedback for natural language generation
Natural language generation has witnessed significant advancements due to the training of
large language models on vast internet-scale datasets. Despite these advancements, there …
Uncertainty in natural language processing: Sources, quantification, and applications
As a main field of artificial intelligence, natural language processing (NLP) has achieved
remarkable success via deep neural networks. Plenty of NLP tasks have been addressed in …
Shifting attention to relevance: Towards the uncertainty estimation of large language models
Although Large Language Models (LLMs) have shown great potential in Natural Language
Generation, it is still challenging to characterize the uncertainty of model generations, i.e. …
Are references really needed? Unbabel-IST 2021 submission for the Metrics Shared Task
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics
Shared Task. With this year's focus on Multidimensional Quality Metric (MQM) as the ground …
Shifting attention to relevance: Towards the predictive uncertainty quantification of free-form large language models
Abstract Large Language Models (LLMs) show promising results in language generation
and instruction following but frequently “hallucinate”, making their outputs less reliable …
Uncertainty in natural language generation: From theory to applications
Recent advances of powerful Language Models have allowed Natural Language
Generation (NLG) to emerge as an important technology that can not only perform traditional …
Conformal prediction for natural language processing: A survey
The rapid proliferation of large language models and natural language processing (NLP)
applications creates a crucial need for uncertainty quantification to mitigate risks such as …
Detecting hallucinations in large language models using semantic entropy
Large language model (LLM) systems, such as ChatGPT or Gemini, can show impressive
reasoning and question-answering capabilities but often 'hallucinate' false outputs and …
Uncertainty estimation and reduction of pre-trained models for text regression
State-of-the-art classification and regression models are often not well calibrated, and
cannot reliably provide uncertainty estimates, limiting their utility in safety-critical …