On the linguistic representational power of neural machine translation models

Y Belinkov, N Durrani, F Dalvi, H Sajjad… - Computational …, 2020 - direct.mit.edu
Despite the recent success of deep neural networks in natural language processing and
other spheres of artificial intelligence, their interpretability remains a challenge. We analyze …

Adversarial attacks on deep-learning models in natural language processing: A survey

WE Zhang, QZ Sheng, A Alhazmi, C Li - ACM Transactions on Intelligent …, 2020 - dl.acm.org
With the development of high computational devices, deep neural networks (DNNs), in
recent years, have gained significant popularity in many Artificial Intelligence (AI) …

Are all languages created equal in multilingual BERT?

S Wu, M Dredze - arXiv preprint arXiv:2005.09093, 2020 - arxiv.org
Multilingual BERT (mBERT) trained on 104 languages has shown surprisingly good cross-
lingual performance on several NLP tasks, even without explicit cross-lingual signals …
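
A minimal sketch of the kind of setup behind such cross-lingual comparisons, using the Hugging Face transformers bindings for the public mBERT checkpoint; the sentences and the mean-pooling step are illustrative assumptions, not taken from the paper.

import torch
from transformers import AutoTokenizer, AutoModel

# Load the public multilingual BERT checkpoint covering 104 languages.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

# Encode the same sentence in two languages and compare crude sentence embeddings.
sentences = {"en": "The cat sat on the mat.", "de": "Die Katze saß auf der Matte."}
with torch.no_grad():
    for lang, text in sentences.items():
        inputs = tokenizer(text, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state   # shape: (1, seq_len, 768)
        print(lang, hidden.mean(dim=1).shape)         # mean-pool over tokens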

UDPipe 2.0 prototype at CoNLL 2018 UD shared task

M Straka - Proceedings of the CoNLL 2018 shared task …, 2018 - aclanthology.org
UDPipe is a trainable pipeline which performs sentence segmentation, tokenization, POS
tagging, lemmatization and dependency parsing. We present a prototype for UDPipe 2.0 …
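
A rough usage sketch of the pipeline through its ufal.udpipe Python bindings; the model file name is a placeholder (any trained UDPipe model would do), not something specified in the paper.

from ufal.udpipe import Model, Pipeline, ProcessingError

# Load a trained model; the path below is hypothetical.
model = Model.load("english-ud.udpipe")
if model is None:
    raise RuntimeError("cannot load UDPipe model")

# Tokenize raw text, then run the model's default tagger and parser, emit CoNLL-U.
pipeline = Pipeline(model, "tokenize", Pipeline.DEFAULT, Pipeline.DEFAULT, "conllu")
error = ProcessingError()
conllu = pipeline.process("UDPipe is a trainable pipeline.", error)
if error.occurred():
    raise RuntimeError(error.message)
print(conllu)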

JW300: A wide-coverage parallel corpus for low-resource languages

Ž Agić, I Vulić - 2019 - repository.cam.ac.uk
Viable cross-lingual transfer critically depends on the availability of parallel texts. Shortage
of such resources imposes a development and evaluation bottleneck in multilingual …

Lossy-context surprisal: An information-theoretic model of memory effects in sentence processing

R Futrell, E Gibson, RP Levy - Cognitive science, 2020 - Wiley Online Library
A key component of research on human sentence processing is to characterize the
processing difficulty associated with the comprehension of words in context. Models that …
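
As a hedged sketch of the core quantity (the notation here is illustrative and may differ from the paper's), the model ties the processing difficulty of a word to its surprisal averaged over lossy memory representations of the context:

\[
  \mathrm{difficulty}(w_t \mid c) \;\propto\;
  \mathbb{E}_{r \sim p(r \mid c)}\bigl[ -\log p(w_t \mid r) \bigr]
\]

where c is the true preceding context and r is a noisy, lossy memory representation of it.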

Small and practical BERT models for sequence labeling

H Tsai, J Riesa, M Johnson, N Arivazhagan… - arXiv preprint arXiv …, 2019 - arxiv.org
We propose a practical scheme to train a single multilingual sequence labeling model that
yields state-of-the-art results and is small and fast enough to run on a single CPU. Starting …

MuTual: A dataset for multi-turn dialogue reasoning

L Cui, Y Wu, S Liu, Y Zhang, M Zhou - arXiv preprint arXiv:2004.04494, 2020 - arxiv.org
Non-task oriented dialogue systems have achieved great success in recent years due to
largely accessible conversation data and the development of deep learning techniques …

Data augmentation via dependency tree morphing for low-resource languages

GG Şahin, M Steedman - arXiv preprint arXiv:1903.09460, 2019 - arxiv.org
Neural NLP systems achieve high scores in the presence of sizable training datasets. Lack of
such datasets leads to poor system performance in the case of low-resource languages. We …
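
An illustrative sketch (not the authors' code) of one such tree-morphing operation, "cropping": given a dependency parse, keep the root together with the subtree of a single dependent, yielding a shorter sentence as extra training data. The Token class and the toy sentence are assumptions made for the example.

from dataclasses import dataclass

@dataclass
class Token:
    idx: int    # 1-based position in the sentence
    form: str   # surface word
    head: int   # index of the syntactic head; 0 marks the root

def subtree(tokens, head_idx):
    """Return the indices of head_idx and all of its transitive dependents."""
    keep = {head_idx}
    changed = True
    while changed:
        changed = False
        for t in tokens:
            if t.head in keep and t.idx not in keep:
                keep.add(t.idx)
                changed = True
    return keep

def crop(tokens, dependent_idx):
    """Keep the root plus the subtree rooted at one chosen dependent."""
    root = next(t.idx for t in tokens if t.head == 0)
    keep = {root} | subtree(tokens, dependent_idx)
    return [t.form for t in tokens if t.idx in keep]

# Toy parse of "the dog chased the cat" with "chased" as root.
sent = [Token(1, "the", 2), Token(2, "dog", 3), Token(3, "chased", 0),
        Token(4, "the", 5), Token(5, "cat", 3)]
print(crop(sent, 2))  # ['the', 'dog', 'chased'] -- the object subtree is cropped away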

A primer on pretrained multilingual language models

S Doddapaneni, G Ramesh, MM Khapra… - arXiv preprint arXiv …, 2021 - arxiv.org
Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc. have
emerged as a viable option for bringing the power of pretraining to a large number of …