Many languages, one parser

W Ammar, G Mulcaire, M Ballesteros, C Dyer… - Transactions of the …, 2016 - direct.mit.edu
We train one multilingual model for dependency parsing and use it to parse sentences in
several languages. The parsing model uses (i) multilingual word clusters and …

Learning the curriculum with Bayesian optimization for task-specific word representation learning

Y Tsvetkov, M Faruqui, W Ling, B MacWhinney… - arXiv preprint arXiv …, 2016 - arxiv.org
We use Bayesian optimization to learn curricula for word representation learning, optimizing
performance on downstream tasks that depend on the learned representations as features …

Correlation-based intrinsic evaluation of word vector representations

Y Tsvetkov, M Faruqui, C Dyer - arXiv preprint arXiv:1606.06710, 2016 - arxiv.org
We introduce QVEC-CCA, an intrinsic evaluation metric for word vector representations
based on correlations of learned vectors with features extracted from linguistic resources …

Memory and locality in natural language

RLJ Futrell - 2017 - dspace.mit.edu
I explore the hypothesis that the universal properties of human languages can be explained
in terms of efficient communication given fixed human information processing constraints. I …

Slavic languages in universal dependencies

D Zeman - Natural Language Processing, Corpus Linguistics, E …, 2015 - ufal.mff.cuni.cz
Universal Dependencies (UD) is a project that is developing crosslinguistically consistent
treebank annotation for many languages, with the goal of facilitating multilingual parser …

Diverse context for learning word representations

M Faruqui - 2016 - manaalfaruqui.com
Word representations are mathematical objects that capture a word's meaning and its
grammatical properties in a way that can be read and understood by computers. Word …

Morpho-syntactic lexicon generation using graph-based semi-supervised learning

M Faruqui, R McDonald, R Soricut - Transactions of the Association …, 2016 - direct.mit.edu
Morpho-syntactic lexicons provide information about the morphological and syntactic roles
of words in a language. Such lexicons are not available for all languages and even when …

Linguistic knowledge in data-driven natural language processing

Y Tsvetkov - The Requirements for the Degree of Doctor of …, 2016 - cs.cmu.edu
The central goal of this thesis is to bridge the divide between theoretical linguistics—the
scientific inquiry of language—and applied data-driven statistical language processing, to …

Multi-source cross-lingual delexicalized parser transfer: Prague or Stanford?

R Rosa - Proceedings of the Third International Conference on …, 2015 - aclanthology.org
We compare two annotation styles, Prague dependencies and Universal Stanford
Dependencies, in their adequacy for parsing. We specifically focus on comparing the …

Tag based models for Arabic text compression

IS Alkhazi, MA Alghamdi… - 2017 Intelligent Systems …, 2017 - ieeexplore.ieee.org
Text compression is needed to reduce the space required to store information contained in
the text and the amount of time needed to transmit that information. Compression-based …