xCOMET: Transparent machine translation evaluation through fine-grained error detection

NM Guerreiro, R Rei, D van Stigt, L Coheur… - arXiv preprint arXiv …, 2023 - arxiv.org
Widely used learned metrics for machine translation evaluation, such as COMET and
BLEURT, estimate the quality of a translation hypothesis by providing a single sentence …

Towards explainable evaluation metrics for machine translation

C Leiter, P Lertvittayakumjorn, M Fomicheva… - Journal of Machine …, 2024 - jmlr.org
Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics for
machine translation (for example, COMET or BERTScore) are based on black-box large …

Machine translation meta evaluation through translation accuracy challenge sets

N Moghe, A Fazla, C Amrhein, T Kocmi… - Computational …, 2024 - direct.mit.edu
Recent machine translation (MT) metrics calibrate their effectiveness by correlating with
human judgement. However, these results are often obtained by averaging predictions …

Aligning translation-specific understanding to general understanding in large language models

Y Huang, X Feng, B Li, C Fu, W Huo, T Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Although large language models (LLMs) have shown surprising language understanding
and generation capabilities, they have yet to gain a revolutionary advancement in the field of …

xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection

NM Guerreiro, R Rei, D van Stigt, L Coheur… - Transactions of the …, 2024 - direct.mit.edu
Widely used learned metrics for machine translation evaluation, such as COMET and BLEURT,
estimate the quality of a translation hypothesis by providing a single sentence-level score …

xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection

P Colombo, N Guerreiro, R Rei, D Van… - Transactions of the …, 2023 - hal.science
Widely used learned metrics for machine translation evaluation, such as COMET and
BLEURT, estimate the quality of a translation hypothesis by providing a single sentence …

xTower: A Multilingual LLM for Explaining and Correcting Translation Errors

M Treviso, NM Guerreiro, S Agrawal, R Rei… - arXiv preprint arXiv …, 2024 - arxiv.org
While machine translation (MT) systems are achieving increasingly strong performance on
benchmarks, they often produce translations with errors and anomalies. Understanding …

Cyber Risks of Machine Translation Critical Errors: Arabic Mental Health Tweets as a Case Study

H Saadany, A Tantawy, C Orasan - arXiv preprint arXiv:2405.11668, 2024 - arxiv.org
With the advent of Neural Machine Translation (NMT) systems, the MT output has reached
unprecedented accuracy levels which resulted in the ubiquity of MT tools on almost all …

Chasing COMET: Leveraging Minimum Bayes Risk Decoding for Self-Improving Machine Translation

K Guttmann, M Pokrywka, A Charkiewicz… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper explores Minimum Bayes Risk (MBR) decoding for self-improvement in machine
translation (MT), particularly for domain adaptation and low-resource languages. We …

A comparative analysis of lexical-based automatic evaluation metrics for different Indic language pairs

K Kaur, S Chauhan - Journal of Autonomous Intelligence, 2024 - jai.front-sci.com
With the rise of machine translation systems, it has become essential to evaluate the quality
of translations produced by these systems. However, the existing evaluation metrics …