Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?

A Riabi, B Sagot, D Seddah - arXiv preprint arXiv:2110.13658, 2021 - arxiv.org
Recent impressive improvements in NLP, largely based on the success of contextual neural
language models, have been mostly demonstrated on at most a couple dozen high-resource …

The impact of indirect machine translation on sentiment classification

A Poncelas, P Lohar, A Way, J Hadley - arXiv preprint arXiv:2008.11257, 2020 - arxiv.org
Sentiment classification has been crucial for many natural language processing (NLP)
applications, such as the analysis of movie reviews, tweets, or customer feedback. A …

Comparing statistical and neural machine translation performance on Hindi-to-Tamil and English-to-Tamil

A Ramesh, VB Parthasarathy, R Haque, A Way - Digital, 2021 - mdpi.com
Phrase-based statistical machine translation (PB-SMT) has been the dominant paradigm in
machine translation (MT) research for more than two decades. Deep neural MT models have …

RoCS-MT: Robustness Challenge Set for Machine Translation

R Bawden, B Sagot - WMT23-Eighth Conference on Machine …, 2023 - hal.science
RoCS-MT, a Robust Challenge Set for Machine Translation (MT), is designed to test MT
systems' ability to translate user-generated content (UGC) that displays non-standard …

LIMSI @ WMT 2020

S Abdul-Rauf, JC Rosales, MQ Pham… - Conference on Machine …, 2020 - hal.science
This paper describes LIMSI's submissions to the translation shared tasks at WMT'20. This
year we have focused our efforts on the biomedical translation task, developing a resource …

Understanding the impact of UGC specificities on translation quality

JCR Núñez, D Seddah, G Wisniewski - arXiv preprint arXiv:2110.12551, 2021 - arxiv.org
This work takes a critical look at the evaluation of user-generated content automatic
translation, the well-known specificities of which raise many challenges for MT. Our analyses …

An error-based investigation of statistical and neural machine translation performance on Hindi-to-Tamil and English-to-Tamil

A Ramesh, VB Parthasarathy, R Haque, A Way - 2020 - doras.dcu.ie
Statistical machine translation (SMT) was the state-of-the-art in machine translation (MT)
research for more than two decades, but has since been superseded by neural MT (NMT) …

Effects of different types of noise in user-generated reviews on human and machine translations including ChatGPT

M Popović, E Lapshinova-Koltunski… - Proceedings of the …, 2024 - aclanthology.org
This paper investigates effects of noisy source texts (containing spelling and grammar
errors, informal words or expressions, etc.) on human and machine translations, namely …

Phonetic normalization for machine translation of user generated content

JCR Núñez, D Seddah… - Proceedings of the 5th …, 2019 - aclanthology.org
We present an approach to correct noisy User Generated Content (UGC) in French aiming to
produce a pretreatment pipeline to improve Machine Translation for this kind of non …

Multi-way Variational NMT for UGC: Improving Robustness in Zero-shot Scenarios via Mixture Density Networks

JCR Núñez, D Seddah, G Wisniewski - NoDaLiDa 2023-24th Nordic …, 2023 - hal.science
This work presents a novel Variational Neural Machine Translation (VNMT) architecture with
enhanced robustness properties, which we investigate through a detailed case-study …