On the linguistic representational power of neural machine translation models

Y Belinkov, N Durrani, F Dalvi, H Sajjad… - Computational …, 2020 - direct.mit.edu
Despite the recent success of deep neural networks in natural language processing and
other spheres of artificial intelligence, their interpretability remains a challenge. We analyze …

A scenario-generic neural machine translation data augmentation method

X Liu, J He, M Liu, Z Yin, L Yin, W Zheng - Electronics, 2023 - mdpi.com
Amid the rapid advancement of neural machine translation, the challenge of data sparsity
has been a major obstacle. To address this issue, this study proposes a general data …

Neural machine translation of rare words with subword units

R Sennrich, B Haddow, A Birch - arXiv preprint arXiv:1508.07909, 2015 - research.ed.ac.uk
Neural machine translation (NMT) models typically operate with a fixed vocabulary, but
translation is an open-vocabulary problem. Previous work addresses the translation of out-of …
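
The subword-unit approach named in the title is byte-pair encoding (BPE) adapted to word segmentation: rare and unknown words are broken into sequences of more frequent subword units learned from the training data. As a minimal sketch only (not the released subword-nmt tool; the toy corpus, merge budget, and function names below are illustrative assumptions), BPE merge operations can be learned like this:

```python
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Rewrite every word, joining each occurrence of `pair` into one symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a {word: frequency} dict."""
    # Start from single characters, with an end-of-word marker so merges
    # cannot cross word boundaries.
    vocab = {tuple(word) + ('</w>',): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges

# Toy corpus: frequent fragments such as ('e', 's') and ('est', '</w>') are merged first,
# so a rare word like 'lowest' can later be segmented into seen units ('low', 'est</w>').
print(learn_bpe({'low': 5, 'lower': 2, 'newest': 6, 'widest': 3}, num_merges=10))
```

At translation time the learned merges are applied greedily to new text, so an unseen word is represented as a sequence of subword units the model was trained on rather than as a single out-of-vocabulary token.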

What do neural machine translation models learn about morphology?

Y Belinkov, N Durrani, F Dalvi, H Sajjad… - arXiv preprint arXiv …, 2017 - arxiv.org
Neural machine translation (MT) models obtain state-of-the-art performance while
maintaining a simple, end-to-end architecture. However, little is known about what these …
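
The analysis method behind this line of work is a probing (diagnostic) classifier: token-level representations are extracted from a trained NMT model, and a simple classifier is trained to predict morphological or POS tags from them, with its accuracy read as a measure of how much of that information the representations encode. The sketch below is a generic illustration with placeholder random data and assumed shapes, not the authors' pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder stand-ins: in the actual setup, `states` would be per-token hidden
# states extracted from a trained NMT encoder (or decoder) layer, and `tags` the
# gold morphological or POS tag of each token. Shapes and class count are assumptions.
rng = np.random.default_rng(0)
states = rng.normal(size=(2000, 128))   # (num_tokens, hidden_dim)
tags = rng.integers(0, 20, size=2000)   # 20 hypothetical tag classes

X_train, X_test, y_train, y_test = train_test_split(
    states, tags, test_size=0.2, random_state=0
)

# A simple linear probe: the NMT model stays frozen, and the probe's test accuracy
# is read as evidence of how much tag information the representations carry.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print('probing accuracy:', probe.score(X_test, y_test))
```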

Farasa: A fast and furious segmenter for Arabic

A Abdelali, K Darwish, N Durrani… - Proceedings of the 2016 …, 2016 - aclanthology.org
In this paper, we present Farasa, a fast and accurate Arabic segmenter. Our approach is
based on SVM-rank using linear kernels. We measure the performance of the segmenter in …

N-gram counts and language models from the common crawl

C Buck, K Heafield, B Van Ooyen - Proceedings of the Language …, 2014 - research.ed.ac.uk
We contribute 5-gram counts and language models trained on the Common Crawl corpus, a
collection of over 9 billion web pages. This release improves upon the Google n-gram counts …
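
For context, a 5-gram count is simply the corpus frequency of each contiguous five-token sequence. The toy sketch below shows the counting step on a tiny example; the actual release is produced with web-scale tooling, and nothing here reflects its implementation:

```python
from collections import Counter

def ngram_counts(tokens, n=5):
    """Count every contiguous n-gram (here 5-gram) in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

tokens = "the quick brown fox jumps over the lazy dog".split()
for gram, freq in ngram_counts(tokens, n=5).most_common(3):
    print(freq, ' '.join(gram))
```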

How grammatical is character-level neural machine translation? Assessing MT quality with contrastive translation pairs

R Sennrich - arXiv preprint arXiv:1612.04629, 2016 - arxiv.org
Analysing translation quality with regard to specific linguistic phenomena has historically
been difficult and time-consuming. Neural machine translation has the attractive property …
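
Contrastive translation pairs work by pairing a reference translation with a minimally edited variant that introduces one targeted error (e.g., wrong subject-verb agreement) and checking whether the model under evaluation scores the correct sentence higher; accuracy over many such pairs isolates a specific phenomenon. The sketch below uses made-up token log-probabilities and an assumed scoring choice (summed log-probs), not the paper's evaluation harness:

```python
def sentence_score(token_logprobs):
    """Score of a candidate translation as the sum of its per-token log-probabilities.
    In a real evaluation these numbers come from the NMT model being assessed;
    length normalization is an optional alternative scoring choice."""
    return sum(token_logprobs)

def contrastive_accuracy(pairs):
    """`pairs` holds (correct_translation_logprobs, contrastive_variant_logprobs) tuples.
    A pair counts as passed if the model scores the correct translation higher."""
    passed = sum(1 for good, bad in pairs if sentence_score(good) > sentence_score(bad))
    return passed / len(pairs)

# Toy pairs with invented log-probabilities: each contrastive variant differs from
# its reference by a single targeted error, so the score gap reflects that error alone.
pairs = [
    ([-0.2, -0.1, -0.3], [-0.2, -1.5, -0.3]),
    ([-0.4, -0.2, -0.1], [-0.3, -0.2, -0.9]),
]
print('contrastive accuracy:', contrastive_accuracy(pairs))
```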

When being unseen from mBERT is just the beginning: Handling new languages with multilingual language models

B Muller, A Anastasopoulos, B Sagot… - arXiv preprint arXiv …, 2020 - arxiv.org
Transfer learning based on pretraining language models on a large amount of raw data has
become a new norm to reach state-of-the-art performance in NLP. Still, it remains unclear …

A comparative quality evaluation of PBSMT and NMT using professional translators

S Castilho, J Moorkens, F Gaspari… - … XVI: Research Track, 2017 - aclanthology.org
Interactive machine translation research has focused primarily on predictive typing, which
requires a human to type parts of the translation. This paper explores an interactive setting in …

Aksharantar: Open Indic-language transliteration datasets and models for the next billion users

Y Madhani, S Parthan, P Bedekar, G Nc… - Findings of the …, 2023 - aclanthology.org
Transliteration is very important in the Indian language context due to the usage of multiple
scripts and the widespread use of romanized inputs. However, few training and evaluation …