Exploring human-like translation strategy with large language models

Z He, T Liang, W Jiao, Z Zhang, Y Yang… - Transactions of the …, 2024 - direct.mit.edu
Large language models (LLMs) have demonstrated impressive capabilities in general
scenarios, exhibiting a level of aptitude that approaches, in some aspects even surpasses …

Findings of the WMT 2023 shared task on quality estimation

F Blain, C Zerva, R Rei, NM Guerreiro… - Proceedings of the …, 2023 - aclanthology.org
We report the results of the WMT 2023 shared task on Quality Estimation, in which the
challenge is to predict the quality of the output of neural machine translation systems at the …

Assessing large language models on climate information

J Bulian, MS Schäfer, A Amini, H Lam… - arXiv preprint arXiv …, 2023 - arxiv.org
As Large Language Models (LLMs) rise in popularity, it is necessary to assess their
capability in critically relevant domains. We present a comprehensive evaluation framework …

Adapting large language models for document-level machine translation

M Wu, TT Vu, L Qu, G Foster, G Haffari - arXiv preprint arXiv:2401.06468, 2024 - arxiv.org
Large language models (LLMs) have made significant strides in various natural language
processing (NLP) tasks. Recent research shows that moderately-sized LLMs often …

Machine translation and the evaluation of its quality

MS Maučec, G Donaj - Recent trends in computational …, 2019 - books.google.com
Machine translation has already become part of our everyday life. This chapter
gives an overview of machine translation approaches. Statistical machine translation was a …

Bhasa: A holistic southeast asian linguistic and cultural evaluation suite for large language models

WQ Leong, JG Ngui, Y Susanto, H Rengarajan… - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid development of Large Language Models (LLMs) and the emergence of novel
abilities with scale have necessitated the construction of holistic, diverse and challenging …

DHP Benchmark: Are LLMs Good NLG Evaluators?

Y Wang, J Yuan, YN Chuang, Z Wang, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) are increasingly serving as evaluators in Natural Language
Generation (NLG) tasks. However, the capabilities of LLMs in scoring NLG quality remain …

Metric score landscape challenge (MSLC23): Understanding metrics' performance on a wider landscape of translation quality

C Lo, S Larkin, R Knowles - … of the Eighth Conference on Machine …, 2023 - aclanthology.org
The Metric Score Landscape Challenge (MSLC23) dataset aims to gain insight into
metric scores on a wider landscape of machine translation (MT) quality. It provides a …

Multi-view fusion for universal translation quality estimation

H Huang, S Wu, K Chen, X Liang, H Di, M Yang… - Information …, 2024 - Elsevier
Machine translation quality estimation (QE) aims to evaluate translation output
without a reference. Despite the progress that has been made, state-of-the-art QE models are proven …

MT-Ranker: Reference-free machine translation evaluation by inter-system ranking

IM Moosa, R Zhang, W Yin - arXiv preprint arXiv:2401.17099, 2024 - arxiv.org
Traditionally, Machine Translation (MT) Evaluation has been treated as a regression
problem--producing an absolute translation-quality score. This approach has two limitations …