Cross-lingual language model pretraining

A Conneau, G Lample - Advances in Neural Information Processing Systems, 2019 - proceedings.neurips.cc
Recent studies have demonstrated the efficiency of generative pretraining for English
natural language understanding. In this work, we extend this approach to multiple …
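
In the published paper, the extension rests on causal, masked, and translation language modeling objectives. As an illustration only, here is a minimal NumPy sketch of BERT-style token masking, the ingredient shared by the masked and translation objectives; the batch, vocabulary size, and mask id below are hypothetical placeholders, not the paper's implementation.

```python
import numpy as np

def mask_tokens(token_ids, vocab_size, mask_id, rng, mask_prob=0.15):
    """BERT-style masking: ~15% of positions become prediction targets;
    of those, 80% are replaced by the mask token, 10% by a random token,
    and 10% are left unchanged. All ids here are generic placeholders."""
    token_ids = np.array(token_ids)
    targets = np.full_like(token_ids, -100)       # -100 = ignored by the loss
    selected = rng.random(token_ids.shape) < mask_prob
    targets[selected] = token_ids[selected]

    roll = rng.random(token_ids.shape)
    token_ids = np.where(selected & (roll < 0.8), mask_id, token_ids)
    random_tok = rng.integers(0, vocab_size, token_ids.shape)
    token_ids = np.where(selected & (roll >= 0.8) & (roll < 0.9),
                         random_tok, token_ids)
    return token_ids, targets

rng = np.random.default_rng(0)
corrupted, targets = mask_tokens([5, 17, 42, 8, 99, 3],
                                 vocab_size=1000, mask_id=4, rng=rng)
```

The translation-language-modeling variant applies the same corruption to the concatenation of a sentence and its translation, so the model can attend across languages to recover masked tokens.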

Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond

M Artetxe, H Schwenk - Transactions of the Association for Computational Linguistics, 2019 - direct.mit.edu
We introduce an architecture to learn joint multilingual sentence representations for 93
languages, belonging to more than 30 different families and written in 28 different scripts …
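
Because the encoder maps all 93 languages into a single space, zero-shot cross-lingual retrieval reduces to nearest-neighbor search over sentence embeddings. A minimal sketch under that assumption; `src_emb` and `tgt_emb` are hypothetical pre-computed embedding matrices, not the output of the paper's actual BiLSTM encoder.

```python
import numpy as np

def retrieve_translations(src_emb, tgt_emb):
    """Cosine nearest neighbor from each source sentence to a target pool.
    Rows are sentence embeddings from a shared multilingual encoder."""
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = src @ tgt.T                  # full cosine-similarity matrix
    return sims.argmax(axis=1)          # best target index per source

# toy check: two 4-d "embeddings" per language, matched by construction
src = np.array([[1.0, 0.1, 0.0, 0.0], [0.0, 0.0, 1.0, 0.2]])
tgt = np.array([[0.0, 0.1, 1.0, 0.1], [1.0, 0.0, 0.1, 0.0]])
print(retrieve_translations(src, tgt))  # -> [1 0]
```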

The state of the art in semantic representation

O Abend, A Rappoport - Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017 - aclanthology.org
Semantic representation has received growing attention in NLP in the past few years, and
many proposals for semantic schemes (e.g., AMR, UCCA, GMB, UDS) have been put forth …

XNLI: Evaluating cross-lingual sentence representations

A Conneau, G Lample, R Rinott, A Williams… - arXiv preprint arXiv …, 2018 - arxiv.org
State-of-the-art natural language processing systems rely on supervision in the form of
annotated data to learn competent models. These models are generally trained on data in a …

Word translation without parallel data

G Lample, A Conneau, MA Ranzato… - International Conference on Learning Representations, 2018 - openreview.net
State-of-the-art methods for learning cross-lingual word embeddings have relied on
bilingual dictionaries or parallel corpora. Recent studies showed that the need for parallel …
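
Once the two monolingual spaces are mapped into a shared space (the paper learns this map adversarially and refines it with Procrustes), translation becomes nearest-neighbor retrieval, which the paper scores with Cross-domain Similarity Local Scaling (CSLS): subtract each word's mean cosine to its k nearest neighbors so that "hub" words are penalized. A minimal NumPy sketch, assuming already-aligned matrices with one word per row (hypothetical inputs):

```python
import numpy as np

def csls_scores(src_emb, tgt_emb, k=10):
    """CSLS: 2*cos(x, y) minus each word's mean cosine to its k nearest
    neighbors in the other space, which down-weights "hub" words that
    are close to everything. Inputs are assumed already aligned."""
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    cos = src @ tgt.T
    k = min(k, cos.shape[0], cos.shape[1])
    # mean cosine of each source word to its k nearest target neighbors
    r_src = np.sort(cos, axis=1)[:, -k:].mean(axis=1, keepdims=True)
    # mean cosine of each target word to its k nearest source neighbors
    r_tgt = np.sort(cos, axis=0)[-k:, :].mean(axis=0, keepdims=True)
    return 2.0 * cos - r_src - r_tgt

def translate(src_emb, tgt_emb, k=10):
    """Index of the best target word for each source word under CSLS."""
    return csls_scores(src_emb, tgt_emb, k).argmax(axis=1)
```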

A survey of cross-lingual word embedding models

S Ruder, I Vulić, A Søgaard - Journal of Artificial Intelligence Research, 2019 - jair.org
Cross-lingual representations of words enable us to reason about word meaning in
multilingual contexts and are a key facilitator of cross-lingual transfer when developing …

An overview of word and sense similarity

R Navigli, F Martelli - Natural Language Engineering, 2019 - cambridge.org
Over the last two decades, determining the similarity between words as well as between
their meanings, that is, word senses, has proven to be of vital importance in the field of …
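
Word-level similarity is commonly quantified by correlating cosine similarities between embeddings with human ratings over benchmark word pairs, usually via Spearman's rank correlation. A minimal sketch; the embeddings and ratings below are made-up toy values:

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# hypothetical embeddings and human similarity ratings for word pairs
emb = {
    "cat":   np.array([0.90, 0.10, 0.00]),
    "dog":   np.array([0.85, 0.15, 0.05]),
    "car":   np.array([0.00, 0.90, 0.30]),
    "truck": np.array([0.10, 0.80, 0.40]),
}
pairs = [("cat", "dog", 9.0), ("car", "truck", 8.5), ("cat", "car", 1.5)]

model_scores = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
human_scores = [h for _, _, h in pairs]
rho, _ = spearmanr(model_scores, human_scores)
print(f"Spearman rho = {rho:.2f}")  # 1.00: perfect rank agreement on this toy set
```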

Offline bilingual word vectors, orthogonal transformations and the inverted softmax

SL Smith, DHP Turban, S Hamblin… - arXiv preprint arXiv …, 2017 - arxiv.org
Usually, bilingual word vectors are trained "online". Mikolov et al. showed they can also be
found "offline", whereby two pre-trained embeddings are aligned with a linear …
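
When the linear map is constrained to be orthogonal, the offline alignment has a closed-form solution via SVD (orthogonal Procrustes), and the paper's inverted softmax then normalizes translation scores over source queries rather than target candidates. A minimal NumPy sketch, with hypothetical row-aligned seed-dictionary matrices:

```python
import numpy as np

def procrustes(X, Y):
    """Orthogonal map W minimizing ||X @ W - Y||_F, where row i of X and Y
    holds the embeddings of the i-th seed-dictionary translation pair."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def inverted_softmax_translate(src_emb, tgt_emb, W, beta=10.0):
    """Score targets with a softmax normalized over *source* queries,
    which damps "hub" targets that are close to everything."""
    src = src_emb @ W
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = src @ tgt.T
    probs = np.exp(beta * sims)
    probs /= probs.sum(axis=0, keepdims=True)   # normalize each target column
    return probs.argmax(axis=1)                 # best target per source word
```

Normalizing over the source vocabulary rather than over target candidates is what distinguishes the inverted softmax from a plain nearest-neighbor lookup and is the paper's remedy for hubness.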