BERT: a review of applications in natural language processing and understanding

MV Koroteev - arXiv preprint arXiv:2103.11943, 2021 - arxiv.org
In this review, we describe the application of one of the most popular deep learning-based
language models, BERT. The paper describes the mechanism of operation of this model, the …

Bertscore: Evaluating text generation with bert

T Zhang, V Kishore, F Wu, KQ Weinberger… - arXiv preprint arXiv …, 2019 - arxiv.org
We propose BERTScore, an automatic evaluation metric for text generation. Analogously to
common metrics, BERTScore computes a similarity score for each token in the candidate …
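The snippet above describes BERTScore's core idea: matching each token in the candidate against tokens in the reference via embedding similarity. A minimal sketch of that greedy cosine-matching scheme, assuming token-level contextual embeddings are already available as arrays (the function name and toy inputs are illustrative, not from the paper):

```python
import numpy as np

def bertscore_f1(cand, ref):
    """BERTScore-style greedy matching over token embeddings (sketch).

    cand, ref: arrays of shape (n_tokens, dim) holding contextual
    embeddings for the candidate and reference sentences.
    """
    # Normalize rows so dot products equal cosine similarities.
    c = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    r = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    sim = c @ r.T                       # pairwise cosine similarity matrix
    precision = sim.max(axis=1).mean()  # each candidate token -> best reference match
    recall = sim.max(axis=0).mean()     # each reference token -> best candidate match
    return 2 * precision * recall / (precision + recall)

# Toy example with random vectors standing in for BERT embeddings.
rng = np.random.default_rng(0)
cand = rng.normal(size=(5, 8))
ref = rng.normal(size=(6, 8))
print(bertscore_f1(cand, ref))
```

The published metric additionally uses importance weighting and baseline rescaling; the sketch keeps only the token-matching step the snippet mentions.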

YiSi - a unified semantic MT quality evaluation and estimation metric for languages with different levels of available resources

C Lo - Proceedings of the Fourth Conference on Machine …, 2019 - aclanthology.org
We present YiSi, a unified automatic semantic machine translation quality evaluation and
estimation metric for languages with different levels of available resources. Underneath the …

Findings of the WMT 2018 shared task on parallel corpus filtering

P Koehn, H Khayrallah, K Heafield… - EMNLP 2018 Third …, 2018 - research.ed.ac.uk
We posed the shared task of assigning sentence-level quality scores for a very noisy corpus
of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high …

Automatic text evaluation through the lens of Wasserstein barycenters

P Colombo, G Staerman, C Clavel… - arXiv preprint arXiv …, 2021 - arxiv.org
A new metric, BaryScore, to evaluate text generation based on deep contextualized
embeddings (e.g., BERT, RoBERTa, ELMo) is introduced. This metric is motivated by a new …

Results of the WMT18 metrics shared task: Both characters and embeddings achieve good performance

Q Ma, O Bojar, Y Graham - Proceedings of the third conference on …, 2018 - aclanthology.org
This paper presents the results of the WMT18 Metrics Shared Task. We asked participants of
this task to score the outputs of the MT systems involved in the WMT18 News Translation …

Parallel corpus filtering via pre-trained language models

B Zhang, A Nagesh, K Knight - arXiv preprint arXiv:2005.06166, 2020 - arxiv.org
Web-crawled data provides a good source of parallel corpora for training machine
translation models. It is automatically obtained, but extremely noisy, and recent work shows …

Towards reference-free text simplification evaluation with a BERT siamese network architecture

X Zhao, E Durmus, DY Yeung - Findings of the Association for …, 2023 - aclanthology.org
Text simplification (TS) aims to modify sentences to make both their content and structure
easier to understand. Traditional n-gram matching-based TS evaluation metrics heavily rely …

Fully unsupervised crosslingual semantic textual similarity metric based on BERT for identifying parallel data

C Lo, M Simard - Proceedings of the 23rd Conference on …, 2019 - aclanthology.org
We present a fully unsupervised crosslingual semantic textual similarity (STS) metric, based
on contextual embeddings extracted from BERT–Bidirectional Encoder Representations …

NRC parallel corpus filtering system for WMT 2019

G Bernier-Colborne, C Lo - … of the Fourth Conference on Machine …, 2019 - aclanthology.org