Exploring human-like translation strategy with large language models
Large language models (LLMs) have demonstrated impressive capabilities in general
scenarios, exhibiting a level of aptitude that approaches, in some aspects even surpasses …
Findings of the WMT 2023 shared task on quality estimation
We report the results of the WMT 2023 shared task on Quality Estimation, in which the
challenge is to predict the quality of the output of neural machine translation systems at the …
Assessing large language models on climate information
As Large Language Models (LLMs) rise in popularity, it is necessary to assess their
capability in critically relevant domains. We present a comprehensive evaluation framework …
Adapting large language models for document-level machine translation
Large language models (LLMs) have made significant strides in various natural language
processing (NLP) tasks. Recent research shows that the moderately-sized LLMs often …
Machine translation and the evaluation of its quality
Machine translation has already become part of our everyday life. This chapter
gives an overview of machine translation approaches. Statistical machine translation was a …
Bhasa: A holistic Southeast Asian linguistic and cultural evaluation suite for large language models
WQ Leong, JG Ngui, Y Susanto, H Rengarajan… - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid development of Large Language Models (LLMs) and the emergence of novel
abilities with scale have necessitated the construction of holistic, diverse and challenging …
DHP Benchmark: Are LLMs Good NLG Evaluators?
Large Language Models (LLMs) are increasingly serving as evaluators in Natural Language
Generation (NLG) tasks. However, the capabilities of LLMs in scoring NLG quality remain …
Metric score landscape challenge (MSLC23): Understanding metrics' performance on a wider landscape of translation quality
The Metric Score Landscape Challenge (MSLC23) dataset aims to gain insight into
metric scores on a broader/wider landscape of machine translation (MT) quality. It provides a …
Multi-view fusion for universal translation quality estimation
Machine translation quality estimation (QE) aims to evaluate the result of translation
without reference. Despite the progress it has made, state-of-the-art QE models are proven …
MT-Ranker: Reference-free machine translation evaluation by inter-system ranking
Traditionally, Machine Translation (MT) Evaluation has been treated as a regression
problem--producing an absolute translation-quality score. This approach has two limitations …