Are we learning yet? A meta review of evaluation failures across machine learning

T Liao, R Taori, ID Raji, L Schmidt - Thirty-fifth Conference on …, 2021 - openreview.net
Many subfields of machine learning share a common stumbling block: evaluation. Advances
in machine learning often evaporate under closer scrutiny or turn out to be less widely …

Findings of the 2014 workshop on statistical machine translation

O Bojar, C Buck, C Federmann, B Haddow… - Proceedings of the …, 2014 - aclanthology.org
This paper presents the results of the WMT14 shared tasks, which included a standard news
translation task, a separate medical translation task, a task for run-time estimation of …

Findings of the 2011 workshop on statistical machine translation

C Callison-Burch, P Koehn, C Monz… - Proceedings of the sixth …, 2011 - aclanthology.org
This paper presents the results of the WMT11 shared tasks, which included a translation
task, a system combination task, and a task for machine translation evaluation metrics. We …

The efficacy of human post-editing for language translation

S Green, J Heer, CD Manning - … of the SIGCHI conference on human …, 2013 - dl.acm.org
Language translation is slow and expensive, so various forms of machine assistance have
been devised. Automatic machine translation systems process text quickly and cheaply, but …

Can machine translation systems be evaluated by the crowd alone

Y Graham, T Baldwin, A Moffat, J Zobel - Natural Language …, 2017 - cambridge.org
Crowd-sourced assessments of machine translation quality allow evaluations to be carried
out cheaply and on a large scale. It is essential, however, that the crowd's work be filtered to …

Method and system for automatic management of reputation of translators

D Marcu, M Dreyer - US Patent 10,261,994, 2019 - Google Patents
The present invention provides a method that includes receiving a result word set in a target
language representing a translation of a test word set in a source language. When the result …

Cognitive effort in post-editing of machine translation: evidence from eye movements, subjective ratings, and think-aloud protocols

LN Vieira - 2016 - research-information.bris.ac.uk
This thesis investigates the expenditure of cognitive effort in post-editing of machine
translation. A mixed-method approach involving the use of eye movements, subjective …

Judge the judges: A large-scale evaluation study of neural language models for online review generation

C Garbacea, S Carton, S Yan, Q Mei - arXiv preprint arXiv:1901.00398, 2019 - arxiv.org
We conduct a large-scale, systematic study to evaluate the existing evaluation methods for
natural language generation in the context of generating online product reviews. We …

Automatic metric validation for grammatical error correction

L Choshen, O Abend - arXiv preprint arXiv:1804.11225, 2018 - arxiv.org
Metric validation in Grammatical Error Correction (GEC) is currently done by observing the
correlation between human and metric-induced rankings. However, such correlation studies …

Systems, methods, and media for executing and optimizing online marketing initiatives

D Erasmus, L Møllebjerg, RP Faber - US Patent 10,580,015, 2020 - Google Patents