BERTje: A Dutch BERT model

W De Vries, A van Cranenburgh, A Bisazza… - arXiv preprint arXiv …, 2019 - arxiv.org
The transformer-based pre-trained language model BERT has helped to improve state-of-
the-art performance on many natural language processing (NLP) tasks. Using the same …

Creating the European Literary Text Collection (ELTeC): Challenges and perspectives

C Schöch, T Erjavec, R Patras, D Santos - Modern Languages Open, 2021 - duo.uio.no
The aim of this contribution is to reflect on the process of building the multilingual European
Literary Text Collection (ELTeC) that is being created in the framework of the networking …

The ParlaMint corpora of parliamentary proceedings

T Erjavec, M Ogrodniczuk, P Osenova… - Language resources …, 2023 - Springer
This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17
European national parliaments with half a billion words. The corpora are uniformly encoded …

75 languages, 1 model: Parsing Universal Dependencies universally

D Kondratyuk, M Straka - arXiv preprint arXiv:1904.02099, 2019 - arxiv.org
We present UDify, a multilingual multi-task model capable of accurately predicting universal
part-of-speech, morphological features, lemmas, and dependency trees simultaneously for …

Systematic Inequalities in Language Technology Performance across the World's Languages

D Blasi, A Anastasopoulos, G Neubig - arXiv preprint arXiv:2110.06733, 2021 - arxiv.org
Natural language processing (NLP) systems have become a central technology in
communication, education, medicine, artificial intelligence, and many other domains of …

Machine learning for ancient languages: A survey

T Sommerschield, Y Assael, J Pavlopoulos… - Computational …, 2023 - direct.mit.edu
Ancient languages preserve the cultures and histories of the past. However, their study is
fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from …

Factors affecting attitudes towards COVID-19 vaccination: an online survey in Slovenia

L Petravić, R Arh, T Gabrovec, L Jazbec, N Rupčić… - Vaccines, 2021 - mdpi.com
While the problem of vaccine hesitancy is not new, it has become more pronounced with the
new COVID-19 vaccines and represents an obstacle to resolving the crisis. Even people …

UDapter: Language adaptation for truly Universal Dependency parsing

A Üstün, A Bisazza, G Bouma, G van Noord - arXiv preprint arXiv …, 2020 - arxiv.org
Recent advances in multilingual dependency parsing have brought the idea of a truly
universal parser closer to reality. However, cross-language interference and restrained …

RobeCzech: Czech RoBERTa, a monolingual contextualized language representation model

M Straka, J Náplava, J Straková, D Samuel - Text, Speech, and Dialogue …, 2021 - Springer
We present RobeCzech, a monolingual RoBERTa language representation model trained
on Czech data. RoBERTa is a robustly optimized Transformer-based pretraining approach …

Massive choice, ample tasks (MaChAmp): A toolkit for multi-task learning in NLP

R Van Der Goot, A Üstün, A Ramponi, I Sharaf… - arXiv preprint arXiv …, 2020 - arxiv.org
Transfer learning, particularly approaches that combine multi-task learning with pre-trained
contextualized embeddings and fine-tuning, has advanced the field of Natural Language …