Active learning and crowdsourcing for machine translation in low resource scenarios
V Ambati - 2012 - search.proquest.com
Corpus based approaches to automatic translation such as Example Based and Statistical
Machine Translation systems use large amounts of parallel data created by humans to train …
Improving statistical machine translation performance by training data selection and optimization
Parallel corpus is an indispensable resource for translation model training in statistical
machine translation (SMT). Instead of collecting more and more parallel training corpora …
Instance selection for machine translation using feature decay algorithms
We present an empirical study of instance selection techniques for machine translation. In
an active learning setting, instance selection minimizes the human effort by identifying the …
Low cost portability for statistical machine translation based on n-gram frequency and tf-idf
Statistical machine translation relies heavily on the available training data. In some cases it
is necessary to limit the amount of training data that can be created for or actually used by …
Optimizing instance selection for statistical machine translation with feature decay algorithms
We introduce FDA5 for efficient parameterization, optimization, and implementation of
feature decay algorithms (FDA), a class of instance selection algorithms that use feature …
Active learning for neural machine translation
P Zhang, X Xu, D Xiong - 2018 International Conference on …, 2018 - ieeexplore.ieee.org
Neural machine translation (NMT) normally requires a large bilingual corpus to train a high-
translation-quality model. However, building such parallel corpora for many low-resource …
Translation model adaptation for statistical machine translation with monolingual topic information
To adapt a translation model trained from the data in one domain to another, previous works
paid more attention to the studies of parallel corpus while ignoring the in-domain …
Multi-strategy approaches to active learning for statistical machine translation
This paper investigates active learning to improve statistical machine translation (SMT) for
low-resource language pairs, i.e., when there is very little pre-existing parallel text. Since …
Improved feature decay algorithms for statistical machine translation
A Poncelas, GM de Buy Wenniger… - Natural Language …, 2022 - cambridge.org
In machine-learning applications, data selection is of crucial importance if good runtime
performance is to be achieved. In a scenario where the test set is accessible when the …
Efficient data selection for machine translation
Performance of statistical machine translation (SMT) systems relies on the availability of a
large parallel corpus which is used to estimate translation probabilities. However, the …