A survey of evaluation metrics used for NLG systems

AB Sai, AK Mohankumar, MM Khapra - ACM Computing Surveys (CSUR …, 2022 - dl.acm.org
In the last few years, a large number of automatic evaluation metrics have been proposed for
evaluating Natural Language Generation (NLG) systems. The rapid development and …

A comprehensive survey on various fully automatic machine translation evaluation metrics

S Chauhan, P Daniel - Neural Processing Letters, 2023 - Springer
The fast advancement in machine translation models necessitates the development of
accurate evaluation metrics that would allow researchers to track the progress in text …

SemEval-2016 Task 1: Semantic textual similarity, monolingual and cross-lingual evaluation

E Agirre, C Banea, D Cer, M Diab… - SemEval-2016. 10th …, 2016 - repositori.upf.edu
Semantic Textual Similarity (STS) seeks to measure the degree of semantic equivalence
between two snippets of text. Similarity is expressed on an ordinal scale that spans from …
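To illustrate the kind of graded similarity score STS metrics produce, here is a minimal bag-of-words cosine sketch. This is a toy illustration, not one of the SemEval systems, which typically use richer features or learned sentence embeddings:

```python
from collections import Counter
import math

def cosine_bow(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Near-paraphrases score high; unrelated sentences score near zero.
print(round(cosine_bow("a cat sits on the mat", "a cat is on the mat"), 2))  # → 0.83
```

Real STS systems map such raw similarities onto the task's ordinal 0–5 scale and account for word meaning, not just surface overlap.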

Semantic structural evaluation for text simplification

E Sulem, O Abend, A Rappoport - arXiv preprint arXiv:1810.05022, 2018 - arxiv.org
Current measures for evaluating text simplification systems focus on evaluating lexical text
aspects, neglecting its structural aspects. In this paper we propose the first measure to …

On the limitations of cross-lingual encoders as exposed by reference-free machine translation evaluation

W Zhao, G Glavaš, M Peyrard, Y Gao, R West… - arXiv preprint arXiv …, 2020 - arxiv.org
Evaluation of cross-lingual encoders is usually performed either via zero-shot cross-lingual
transfer in supervised downstream tasks or via unsupervised cross-lingual textual similarity …

MEANT 2.0: Accurate semantic MT evaluation for any output language

C Lo - Proceedings of the second conference on machine …, 2017 - aclanthology.org
We describe a new version of MEANT, which participated in the metrics task of the Second
Conference on Machine Translation (WMT 2017). MEANT 2.0 uses idf-weighted …

Adequacy–fluency metrics: Evaluating MT in the continuous space model framework

RE Banchs, LF D'Haro, H Li - IEEE/ACM Transactions on Audio …, 2015 - ieeexplore.ieee.org
This work extends and evaluates a two-dimensional automatic evaluation metric for machine
translation, which is designed to operate at the sentence level. The metric is based on the …

SemSyn: Semantic-Syntactic Similarity Based Automatic Machine Translation Evaluation Metric

S Chauhan, R Kumar, S Saxena, A Kaur… - IETE journal of …, 2024 - Taylor & Francis
Machine translation evaluation is challenging for natural languages because
different languages behave differently for the same dataset. Lexical-based metrics have …
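To make "lexical-based metrics" concrete, here is a minimal clipped unigram-precision sketch (a BLEU-1-style toy, not the SemSyn metric proposed in the paper): it rewards surface word overlap with the reference, which is exactly the limitation semantic metrics aim to address:

```python
from collections import Counter

def unigram_precision(hypothesis: str, reference: str) -> float:
    """Clipped unigram precision: the fraction of hypothesis tokens that
    also appear in the reference, with per-token counts clipped so a
    repeated word cannot be credited more times than it occurs."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    matched = sum(min(count, ref[tok]) for tok, count in hyp.items())
    total = sum(hyp.values())
    return matched / total if total else 0.0

# A valid paraphrase with different wording is penalized despite
# preserving the meaning -- the weakness of purely lexical metrics.
print(round(unigram_precision("the cat sat on the mat",
                              "the cat is on the mat"), 2))  # → 0.83
```

Semantically oriented metrics such as the ones surveyed here supplement or replace this surface matching with syntactic and semantic similarity.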

On the evaluation of semantic phenomena in neural machine translation using natural language inference

A Poliak, Y Belinkov, J Glass, B Van Durme - arXiv preprint arXiv …, 2018 - arxiv.org
We propose a process for investigating the extent to which sentence representations arising
from neural machine translation (NMT) systems encode distinct semantic phenomena. We …

Reference-less measure of faithfulness for grammatical error correction

L Choshen, O Abend - arXiv preprint arXiv:1804.03824, 2018 - arxiv.org
We propose USim, a semantic measure for Grammatical Error Correction (GEC) that
measures the semantic faithfulness of the output to the source, thereby complementing …