Translation quality assessment: A brief survey on manual and automatic methods

L Han, GJF Jones, AF Smeaton - arXiv preprint arXiv:2105.03311, 2021 - arxiv.org
To facilitate effective translation modeling and translation studies, one of the crucial
questions to address is how to assess translation quality. From the perspectives of accuracy …

Shifts: A dataset of real distributional shift across multiple large-scale tasks

A Malinin, N Band, G Chesnokov, Y Gal… - arXiv preprint arXiv …, 2021 - arxiv.org
There has been significant research done on developing methods for improving robustness
to distributional shift and uncertainty estimation. In contrast, only limited work has examined …

Understanding and detecting hallucinations in neural machine translation via model introspection

W Xu, S Agrawal, E Briakou, MJ Martindale… - Transactions of the …, 2023 - direct.mit.edu
Neural sequence generation models are known to “hallucinate”, by producing outputs that
are unrelated to the source text. These hallucinations are potentially harmful, yet it remains …

Shifting attention to relevance: Towards the uncertainty estimation of large language models

J Duan, H Cheng, S Wang, C Wang, A Zavalny… - arXiv preprint arXiv …, 2023 - arxiv.org
Although Large Language Models (LLMs) have shown great potential in Natural Language
Generation, it is still challenging to characterize the uncertainty of model generations, i.e. …

TransQuest: Translation quality estimation with cross-lingual transformers

T Ranasinghe, C Orasan, R Mitkov - arXiv preprint arXiv:2011.01536, 2020 - arxiv.org
Recent years have seen big advances in the field of sentence-level quality estimation (QE),
largely as a result of using neural-based architectures. However, the majority of these …

Uncertainty estimation in autoregressive structured prediction

A Malinin, M Gales - arXiv preprint arXiv:2002.07650, 2020 - arxiv.org
Uncertainty estimation is important for ensuring safety and robustness of AI systems. While
most research in the area has focused on un-structured prediction tasks, limited work has …

The Vendi score: A diversity evaluation metric for machine learning

D Friedman, AB Dieng - Transactions on machine learning …, 2023 - par.nsf.gov
Diversity is an important criterion for many areas of machine learning (ML), including
generative modeling and dataset curation. However, existing metrics for measuring diversity …

Uncertainty in natural language generation: From theory to applications

J Baan, N Daheim, E Ilia, D Ulmer, HS Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances of powerful Language Models have allowed Natural Language
Generation (NLG) to emerge as an important technology that can not only perform traditional …

Findings of the WMT 2021 shared task on quality estimation

L Specia, F Blain, M Fomicheva, C Zerva… - Proceedings of the …, 2021 - aclanthology.org
We report the results of the WMT 2021 shared task on Quality Estimation, where the
challenge is to predict the quality of the output of neural machine translation systems at the …

MLQE-PE: A multilingual quality estimation and post-editing dataset

M Fomicheva, S Sun, E Fonseca, C Zerva… - arXiv preprint arXiv …, 2020 - arxiv.org
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE)
and Automatic Post-Editing (APE). The dataset contains eleven language pairs, with human …