Active learning and crowdsourcing for machine translation in low resource scenarios

V Ambati - 2012 - search.proquest.com
Corpus based approaches to automatic translation such as Example Based and Statistical
Machine Translation systems use large amounts of parallel data created by humans to train …

Improving statistical machine translation performance by training data selection and optimization

Y Lü, J Huang, Q Liu - Proceedings of the 2007 Joint Conference …, 2007 - aclanthology.org
Parallel corpus is an indispensable resource for translation model training in statistical
machine translation (SMT). Instead of collecting more and more parallel training corpora …

Instance selection for machine translation using feature decay algorithms

E Biçici, D Yuret - Proceedings of the Sixth Workshop on …, 2011 - aclanthology.org
We present an empirical study of instance selection techniques for machine translation. In
an active learning setting, instance selection minimizes the human effort by identifying the …

Low cost portability for statistical machine translation based on n-gram frequency and tf-idf

M Eck, S Vogel, A Waibel - Proceedings of the Second …, 2005 - aclanthology.org
Statistical machine translation relies heavily on the available training data. In some cases it
is necessary to limit the amount of training data that can be created for or actually used by …

Optimizing instance selection for statistical machine translation with feature decay algorithms

E Biçici, D Yuret - IEEE/ACM Transactions on Audio, Speech …, 2014 - ieeexplore.ieee.org
We introduce FDA5 for efficient parameterization, optimization, and implementation of
feature decay algorithms (FDA), a class of instance selection algorithms that use feature …

Active learning for neural machine translation

P Zhang, X Xu, D Xiong - 2018 International Conference on …, 2018 - ieeexplore.ieee.org
Neural machine translation (NMT) normally requires a large bilingual corpus to train a
high-translation-quality model. However, building such parallel corpora for many low-resource …

Translation model adaptation for statistical machine translation with monolingual topic information

J Su, H Wu, H Wang, Y Chen, X Shi… - Proceedings of the …, 2012 - aclanthology.org
To adapt a translation model trained from the data in one domain to another, previous works
paid more attention to the studies of parallel corpus while ignoring the in-domain …

Multi-strategy approaches to active learning for statistical machine translation

V Ambati, S Vogel, JG Carbonell - Proceedings of Machine …, 2011 - aclanthology.org
This paper investigates active learning to improve statistical machine translation (SMT) for
low-resource language pairs, i.e., when there is very little pre-existing parallel text. Since …

Improved feature decay algorithms for statistical machine translation

A Poncelas, GM de Buy Wenniger… - Natural Language …, 2022 - cambridge.org
In machine-learning applications, data selection is of crucial importance if good runtime
performance is to be achieved. In a scenario where the test set is accessible when the …

Efficient data selection for machine translation

A Mandal, D Vergyri, W Wang, J Zheng… - 2008 IEEE Spoken …, 2008 - ieeexplore.ieee.org
Performance of statistical machine translation (SMT) systems relies on the availability of a
large parallel corpus which is used to estimate translation probabilities. However, the …