A comprehensive survey of grammatical error correction
Grammatical error correction (GEC) is an important application aspect of natural language
processing techniques, and GEC system is a kind of very important intelligent system that …
The United Nations Parallel Corpus v1.0
M Ziemski, M Junczys-Dowmunt… - Proceedings of the …, 2016 - aclanthology.org
This paper describes the creation process and statistics of the official United Nations Parallel
Corpus, the first parallel corpus composed from United Nations documents published by the …
On the impact of various types of noise on neural machine translation
H Khayrallah, P Koehn - arXiv preprint arXiv:1805.12282, 2018 - arxiv.org
We examine how various types of noise in the parallel training data impact the quality of
neural machine translation systems. We create five types of artificial noise and analyze how …
A comprehensive survey of grammar error correction
Y Wang, Y Wang, J Liu, Z Liu - arXiv preprint arXiv:2005.06600, 2020 - arxiv.org
Grammar error correction (GEC) is an important application aspect of natural language
processing techniques. The past decade has witnessed significant progress achieved in …
Is neural machine translation ready for deployment? A case study on 30 translation directions
In this paper we provide the largest published comparison of translation quality for phrase-
based SMT and neural machine translation across 30 translation directions. For ten …
N-gram counts and language models from the Common Crawl
C Buck, K Heafield, B Van Ooyen - Proceedings of the Language …, 2014 - research.ed.ac.uk
We contribute 5-gram counts and language models trained on the Common Crawl corpus, a
collection over 9 billion web pages. This release improves upon the Google n-gram counts …
Self-attention with cross-lingual position representation
Position encoding (PE), an essential part of self-attention networks (SANs), is used to
preserve the word order information for natural language processing tasks, generating fixed …
Incremental decoding and training methods for simultaneous translation in neural machine translation
We address the problem of simultaneous translation by modifying the Neural MT decoder to
operate with dynamically built encoder and attention. We propose a tunable agent which …
Phrase-based machine translation is state-of-the-art for automatic grammatical error correction
M Junczys-Dowmunt, R Grundkiewicz - arXiv preprint arXiv:1605.06353, 2016 - arxiv.org
In this work, we study parameter tuning towards the M^2 metric, the standard metric for
automatic grammar error correction (GEC) tasks. After implementing M^2 as a scorer in the …
Integrating an unsupervised transliteration model into statistical machine translation
We investigate three methods for integrating an unsupervised transliteration model into an
end-to-end SMT system. We induce a transliteration model from parallel data and use it to …