BERT: a review of applications in natural language processing and understanding

MV Koroteev - arXiv preprint arXiv:2103.11943, 2021 - arxiv.org
In this review, we describe the application of one of the most popular deep learning-based
language models, BERT. The paper describes the mechanism of operation of this model, the …

Bertscore: Evaluating text generation with bert

T Zhang, V Kishore, F Wu, KQ Weinberger… - arXiv preprint arXiv …, 2019 - arxiv.org
We propose BERTScore, an automatic evaluation metric for text generation. Analogously to
common metrics, BERTScore computes a similarity score for each token in the candidate …
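The snippet above describes BERTScore's core idea: matching each token in the candidate against tokens in the reference via embedding similarity. A minimal sketch of that greedy cosine-matching scheme, assuming token-level contextual embeddings are already available as arrays (the function name and toy inputs are illustrative, not from the paper):

```python
import numpy as np

def bertscore_f1(cand, ref):
    """BERTScore-style greedy matching over token embeddings (sketch).

    cand, ref: arrays of shape (n_tokens, dim) holding contextual
    embeddings for the candidate and reference sentences.
    """
    # Normalize rows so dot products equal cosine similarities.
    c = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    r = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    sim = c @ r.T                       # pairwise cosine similarity matrix
    precision = sim.max(axis=1).mean()  # each candidate token -> best reference match
    recall = sim.max(axis=0).mean()     # each reference token -> best candidate match
    return 2 * precision * recall / (precision + recall)

# Toy example with random vectors standing in for BERT embeddings.
rng = np.random.default_rng(0)
cand = rng.normal(size=(5, 8))
ref = rng.normal(size=(6, 8))
print(bertscore_f1(cand, ref))
```

The published metric additionally uses importance weighting and baseline rescaling; the sketch keeps only the token-matching step the snippet mentions.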

YiSi - a unified semantic MT quality evaluation and estimation metric for languages with different levels of available resources

C Lo - Proceedings of the Fourth Conference on Machine …, 2019 - aclanthology.org
We present YiSi, a unified automatic semantic machine translation quality evaluation and
estimation metric for languages with different levels of available resources. Underneath the …

Findings of the WMT 2018 shared task on parallel corpus filtering

P Koehn, H Khayrallah, K Heafield… - EMNLP 2018 Third …, 2018 - research.ed.ac.uk
We posed the shared task of assigning sentence-level quality scores for a very noisy corpus
of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high …

Automatic text evaluation through the lens of Wasserstein barycenters

P Colombo, G Staerman, C Clavel… - arXiv preprint arXiv …, 2021 - arxiv.org
A new metric, BaryScore, to evaluate text generation based on deep contextualized
embeddings (e.g., BERT, RoBERTa, ELMo) is introduced. This metric is motivated by a new …

Results of the WMT18 metrics shared task: Both characters and embeddings achieve good performance

Q Ma, O Bojar, Y Graham - Proceedings of the third conference on …, 2018 - aclanthology.org
This paper presents the results of the WMT18 Metrics Shared Task. We asked participants of
this task to score the outputs of the MT systems involved in the WMT18 News Translation …

Parallel corpus filtering via pre-trained language models

B Zhang, A Nagesh, K Knight - arXiv preprint arXiv:2005.06166, 2020 - arxiv.org
Web-crawled data provides a good source of parallel corpora for training machine
translation models. It is automatically obtained, but extremely noisy, and recent work shows …

Towards reference-free text simplification evaluation with a BERT siamese network architecture

X Zhao, E Durmus, DY Yeung - Findings of the Association for …, 2023 - aclanthology.org
Text simplification (TS) aims to modify sentences to make both their content and structure
easier to understand. Traditional n-gram matching-based TS evaluation metrics heavily rely …

Fully unsupervised crosslingual semantic textual similarity metric based on BERT for identifying parallel data

C Lo, M Simard - Proceedings of the 23rd Conference on …, 2019 - aclanthology.org
We present a fully unsupervised crosslingual semantic textual similarity (STS) metric, based
on contextual embeddings extracted from BERT–Bidirectional Encoder Representations …

NRC parallel corpus filtering system for WMT 2019

G Bernier-Colborne, C Lo - … of the Fourth Conference on Machine …, 2019 - aclanthology.org