COMET-22: Unbabel-IST 2022 submission for the metrics shared task

R Rei, JGC De Souza, D Alves, C Zerva… - Proceedings of the …, 2022 - aclanthology.org
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2022 Metrics
Shared Task. Our primary submission–dubbed COMET-22–is an ensemble between a …

Findings of the 2021 conference on machine translation (WMT21)

F Akhbardeh, A Arkhangorodsky, M Biesialska, O Bojar… - Proceedings of the …, 2021 - cris.fbk.eu
This paper presents the results of the news translation task, the multilingual low-resource
translation for Indo-European languages, the triangular translation task, and the automatic …

Quality-aware decoding for neural machine translation

P Fernandes, A Farinhas, R Rei, JGC de Souza… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the progress in machine translation quality estimation and evaluation in the last
years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers …

UniTE: Unified translation evaluation

Y Wan, D Liu, B Yang, H Zhang, B Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Translation quality evaluation plays a crucial role in machine translation. According to the
input format, it is mainly separated into three tasks, i.e., reference-only, source-only and …

Toward human-like evaluation for natural language generation with error analysis

Q Lu, L Ding, L Xie, K Zhang, DF Wong… - arXiv preprint arXiv …, 2022 - arxiv.org
The state-of-the-art language model-based automatic metrics, e.g., BARTScore, benefiting
from large-scale contextualized pre-training, have been successfully used in a wide range of …

The Eval4NLP 2023 shared task on prompting large language models as explainable metrics

C Leiter, J Opitz, D Deutsch, Y Gao, R Dror… - arXiv preprint arXiv …, 2023 - arxiv.org
With an increasing number of parameters and pre-training data, generative large language
models (LLMs) have shown remarkable capabilities to solve tasks with minimal or no task …

Findings of the WMT 2023 shared task on quality estimation

F Blain, C Zerva, R Rei, NM Guerreiro… - Proceedings of the …, 2023 - aclanthology.org
We report the results of the WMT 2023 shared task on Quality Estimation, in which the
challenge is to predict the quality of the output of neural machine translation systems at the …

On the limitations of reference-free evaluations of generated text

D Deutsch, R Dror, D Roth - arXiv preprint arXiv:2210.12563, 2022 - arxiv.org
There is significant interest in developing evaluation metrics which accurately estimate the
quality of generated text without the aid of a human-written reference text, which can be time …

Optimal transport for unsupervised hallucination detection in neural machine translation

NM Guerreiro, P Colombo, P Piantanida… - arXiv preprint arXiv …, 2022 - arxiv.org
Neural machine translation (NMT) has become the de-facto standard in real-world machine
translation applications. However, NMT models can unpredictably produce severely …

Transformers go for the LOLs: Generating (humourous) titles from scientific abstracts end-to-end

Y Chen, S Eger - arXiv preprint arXiv:2212.10522, 2022 - arxiv.org
We consider the end-to-end abstract-to-title generation problem, exploring seven recent
transformer based models (including ChatGPT) fine-tuned on more than 30k abstract-title …