COMET-22: Unbabel-IST 2022 submission for the metrics shared task

R Rei, JGC De Souza, D Alves, C Zerva… - Proceedings of the …, 2022 - aclanthology.org
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2022 Metrics
Shared Task. Our primary submission–dubbed COMET-22–is an ensemble between a …

Bridging the gap: A survey on integrating (human) feedback for natural language generation

P Fernandes, A Madaan, E Liu, A Farinhas… - Transactions of the …, 2023 - direct.mit.edu
Natural language generation has witnessed significant advancements due to the training of
large language models on vast internet-scale datasets. Despite these advancements, there …

Results of the WMT21 metrics shared task: Evaluating metrics with expert-based human evaluations on TED and news domain

M Freitag, R Rei, N Mathur, C Lo… - Proceedings of the …, 2021 - aclanthology.org
This paper presents the results of the WMT21 Metrics Shared Task. Participants were asked
to score the outputs of the translation systems competing in the WMT21 News Translation …

Efficient methods for natural language processing: A survey

M Treviso, JU Lee, T Ji, B Aken, Q Cao… - Transactions of the …, 2023 - direct.mit.edu
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …

Results of the WMT20 metrics shared task

N Mathur, J Wei, M Freitag, Q Ma… - Proceedings of the Fifth …, 2020 - aclanthology.org
This paper presents the results of the WMT20 Metrics Shared Task. Participants were asked
to score the outputs of the translation systems competing in the WMT20 News Translation …

Quality-aware decoding for neural machine translation

P Fernandes, A Farinhas, R Rei, JGC de Souza… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the progress in machine translation quality estimation and evaluation in the last
years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers …

Understanding and detecting hallucinations in neural machine translation via model introspection

W Xu, S Agrawal, E Briakou, MJ Martindale… - Transactions of the …, 2023 - direct.mit.edu
Neural sequence generation models are known to “hallucinate”, by producing outputs that
are unrelated to the source text. These hallucinations are potentially harmful, yet it remains …

Are references really needed? unbabel-IST 2021 submission for the metrics shared task

R Rei, AC Farinha, C Zerva, D van Stigt… - Proceedings of the …, 2021 - aclanthology.org
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics
Shared Task. With this year's focus on Multidimensional Quality Metric (MQM) as the ground …

Learning compact metrics for MT

A Pu, HW Chung, AP Parikh, S Gehrmann… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent developments in machine translation and multilingual text generation have led
researchers to adopt trained metrics such as COMET or BLEURT, which treat evaluation as …

Menli: Robust evaluation metrics from natural language inference

Y Chen, S Eger - Transactions of the Association for Computational …, 2023 - direct.mit.edu
Recently proposed BERT-based evaluation metrics for text generation perform well on
standard benchmarks but are vulnerable to adversarial attacks, eg, relating to information …