How to evaluate machine translation: A review of automated and human metrics

E Chatzikoumi - Natural Language Engineering, 2020 - cambridge.org
This article presents the most up-to-date, influential automated, semiautomated and human
metrics used to evaluate the quality of machine translation (MT) output and provides the …

A comprehensive survey on various fully automatic machine translation evaluation metrics

S Chauhan, P Daniel - Neural Processing Letters, 2023 - Springer
The fast advancement in machine translation models necessitates the development of
accurate evaluation metrics that would allow researchers to track the progress in text …

Experts, errors, and context: A large-scale study of human evaluation for machine translation

M Freitag, G Foster, D Grangier, V Ratnakar… - Transactions of the …, 2021 - direct.mit.edu
Human evaluation of modern high-quality machine translation systems is a difficult problem,
and there is increasing evidence that inadequate evaluation procedures can lead to …

A multimodal approach to cross-lingual sentiment analysis with ensemble of transformer and LLM

MSU Miah, MM Kabir, TB Sarwar, M Safran… - Scientific Reports, 2024 - nature.com
Sentiment analysis is an essential task in natural language processing that involves
identifying a text's polarity, whether it expresses positive, negative, or neutral sentiments …

IndicMT Eval: A dataset to meta-evaluate machine translation metrics for Indian languages

T Dixit, V Nagarajan, A Kunchukuttan… - Proceedings of the …, 2023 - aclanthology.org
The rapid growth of machine translation (MT) systems necessitates meta-evaluations of
evaluation metrics to enable selection of those that best reflect MT quality. Unfortunately …

adaptMLLM: Fine-tuning multilingual language models on low-resource languages with integrated LLM playgrounds

S Lankford, H Afli, A Way - Information, 2023 - mdpi.com
The advent of Multilingual Language Models (MLLMs) and Large Language Models (LLMs)
has spawned innovation in many areas of natural language processing. Despite the exciting …

GPT-4 vs. human translators: A comprehensive evaluation of translation quality across languages, domains, and expertise levels

J Yan, P Yan, Y Chen, J Li, X Zhu, Y Zhang - arXiv preprint arXiv …, 2024 - arxiv.org
This study comprehensively evaluates the translation quality of Large Language Models
(LLMs), specifically GPT-4, against human translators of varying expertise levels across …

A product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation output

M Koponen, L Salmi, M Nikulin - Machine Translation, 2019 - Springer
This paper presents a comparison of post-editing (PE) changes performed on English-to-
Finnish neural (NMT), rule-based (RBMT) and statistical machine translation (SMT) output …

Translation quality and error recognition in professional neural machine translation post-editing

J Vardaro, M Schaeffer, S Hansen-Schirra - Informatics, 2019 - mdpi.com
This study aims to analyse how translation experts from the German department of the
European Commission's Directorate-General for Translation (DGT) identify and correct …

Analysing terminology translation errors in statistical and neural machine translation

R Haque, M Hasanuzzaman, A Way - Machine Translation, 2020 - Springer
Terminology translation plays a critical role in domain-specific machine translation (MT).
Phrase-based statistical MT (PB-SMT) has been the dominant approach to MT for the past …