Poor Man's Quality Estimation: Predicting Reference-Based MT Metrics Without the Reference

Z Wang, G Zhang, K Yang, N Shi, W Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org

Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within
the field of NLP, aimed at addressing limitations in existing frameworks while aligning with …

Cited by 43 · Related articles · All 5 versions

Towards explainable evaluation metrics for machine translation

C Leiter, P Lertvittayakumjorn, M Fomicheva… - Journal of Machine …, 2024 - jmlr.org

Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics for
machine translation (for example, COMET or BERTScore) are based on black-box large …

Cited by 7 · Related articles · All 5 versions

Characterizing the Confidence of Large Language Model-Based Automatic Evaluation Metrics

R Stureborg, D Alikaniotis, Y Suhara - Proceedings of the 18th …, 2024 - aclanthology.org

There has recently been a growing interest in using Large Language Models (LLMs) to
evaluate NLP tasks automatically. Considerable research effort has been put into improving …

Cited by 1 · Related articles

Tailoring domain adaptation for machine translation quality estimation

JPR Sharami, D Shterionov, F Blain… - arXiv preprint arXiv …, 2023 - arxiv.org

While quality estimation (QE) can play an important role in the translation process, its
effectiveness relies on the availability and quality of training data. For QE in particular, high …

Cited by 2 · Related articles · All 9 versions

Predicting Machine Translation Performance on Low-Resource Languages: The Role of Domain Similarity

E Khiu, H Toossi, D Anugraha, J Liu, J Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Fine-tuning and testing a multilingual large language model is expensive and challenging
for low-resource languages (LRLs). While previous studies have predicted the performance …

Cited by 1 · Related articles · All 8 versions

Fine-tuned machine translation metrics struggle in unseen domains

V Zouhar, S Ding, A Currey, T Badeka, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce a new, extensive multidimensional quality metrics (MQM) annotated dataset
covering 11 language pairs in the biomedical domain. We use this dataset to investigate …