Findings of the 2022 conference on machine translation (WMT22)

T Kocmi, R Bawden, O Bojar… - Proceedings of the …, 2022 - aclanthology.org
This paper presents the results of the General Machine Translation Task organised as part
of the Conference on Machine Translation (WMT) 2022. In the general MT task, participants …

A survey on zero pronoun translation

L Wang, S Liu, M Xu, L Song, S Shi, Z Tu - arXiv preprint arXiv:2305.10196, 2023 - arxiv.org
Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g., Chinese, Hungarian,
and Hindi), but should be recalled in non-pro-drop languages (e.g., English). This …

Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation

J Wei, L Sun, Y Leng, X Tan, B Yu, R Guo - arXiv preprint arXiv …, 2024 - arxiv.org
Knowledge distillation, transferring knowledge from a teacher model to a student model, has
emerged as a powerful technique in neural machine translation for compressing models or …

Granularity is crucial when applying differential privacy to text: An investigation for neural machine translation

DNL Vu, T Igamberdiev, I Habernal - arXiv preprint arXiv:2407.18789, 2024 - arxiv.org
Applying differential privacy (DP) by means of the DP-SGD algorithm to protect individual
data points during training is becoming increasingly popular in NLP. However, the choice of …

xTower: A Multilingual LLM for Explaining and Correcting Translation Errors

M Treviso, NM Guerreiro, S Agrawal, R Rei… - arXiv preprint arXiv …, 2024 - arxiv.org
While machine translation (MT) systems are achieving increasingly strong performance on
benchmarks, they often produce translations with errors and anomalies. Understanding …

Can Automatic Metrics Assess High-Quality Translations?

S Agrawal, A Farinhas, R Rei, AFT Martins - arXiv preprint arXiv …, 2024 - arxiv.org
Automatic metrics for evaluating translation quality are typically validated by measuring how
well they correlate with human assessments. However, correlation methods tend to capture …

Dialogue Quality and Emotion Annotations for Customer Support Conversations

J Mendonça, P Pereira, M Menezes… - arXiv preprint arXiv …, 2023 - arxiv.org
Task-oriented conversational datasets often lack topic variability and linguistic diversity.
However, with the advent of Large Language Models (LLMs) pretrained on extensive …

MQM-Chat: Multidimensional Quality Metrics for Chat Translation

Y Li, J Suzuki, M Morishita, K Abe, K Inui - arXiv preprint arXiv:2408.16390, 2024 - arxiv.org
The complexities of chats pose significant challenges for machine translation models.
Recognizing the need for a precise evaluation metric to address the issues of chat …

Document-Level Pretraining for Neural Chat Translation

T Kaiser - 2023 - ai4lt.anthropomatik.kit.edu
This study aims to investigate methods for inducing document-level context awareness in
Neural Machine Translation (NMT) models for bilingual chat data. Previous research …