A survey of controllable text generation using transformer-based pre-trained language models
Controllable Text Generation (CTG) is an emerging area in the field of natural language
generation (NLG). It is regarded as crucial for the development of advanced text generation …
A survey of evaluation metrics used for NLG systems
In the last few years, a large number of automatic evaluation metrics have been proposed for
evaluating Natural Language Generation (NLG) systems. The rapid development and …
BERTScore: Evaluating text generation with BERT
We propose BERTScore, an automatic evaluation metric for text generation. Analogously to
common metrics, BERTScore computes a similarity score for each token in the candidate …
MoverScore: Text generation evaluating with contextualized embeddings and earth mover distance
A robust evaluation metric has a profound impact on the development of text generation
systems. A desirable metric compares system output against references based on their …
BERT: a review of applications in natural language processing and understanding
MV Koroteev - arXiv preprint arXiv:2103.11943, 2021 - arxiv.org
In this review, we describe the application of one of the most popular deep learning-based
language models, BERT. The paper describes the mechanism of operation of this model, the …
Automatic machine translation evaluation in many languages via zero-shot paraphrasing
B Thompson, M Post - arXiv preprint arXiv:2004.14564, 2020 - arxiv.org
We frame the task of machine translation evaluation as one of scoring machine translation
output with a sequence-to-sequence paraphraser, conditioned on a human reference. We …
End-to-end transformer-based models in textual-based NLP
A Rahali, MA Akhloufi - AI, 2023 - mdpi.com
Transformer architectures are highly expressive because they use self-attention
mechanisms to encode long-range dependencies in the input sequences. In this paper, we …
Are references really needed? Unbabel-IST 2021 submission for the metrics shared task
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics
Shared Task. With this year's focus on Multidimensional Quality Metric (MQM) as the ground …
YiSi - a unified semantic MT quality evaluation and estimation metric for languages with different levels of available resources
C Lo - Proceedings of the Fourth Conference on Machine …, 2019 - aclanthology.org
We present YiSi, a unified automatic semantic machine translation quality evaluation and
estimation metric for languages with different levels of available resources. Underneath the …
Results of the WMT16 metrics shared task
This paper presents the results of the WMT16 Metrics Shared Task. We asked participants of
this task to score the outputs of the MT systems involved in the WMT16 Shared Translation …