PyThaiNLP: Thai natural language processing in Python

W Phatthiyaphaibun, K Chaovavanich… - arXiv preprint arXiv …, 2023 - arxiv.org
We present PyThaiNLP, a free and open-source natural language processing (NLP) library
for Thai language implemented in Python. It provides a wide range of software, models, and …

Multilingual is not enough: BERT for Finnish

A Virtanen, J Kanerva, R Ilo, J Luoma… - arXiv preprint arXiv …, 2019 - arxiv.org
Deep learning-based language models pretrained on large unannotated text corpora have
been demonstrated to allow efficient transfer learning for natural language processing, with …

75 languages, 1 model: Parsing Universal Dependencies universally

D Kondratyuk, M Straka - arXiv preprint arXiv:1904.02099, 2019 - arxiv.org
We present UDify, a multilingual multi-task model capable of accurately predicting universal
part-of-speech, morphological features, lemmas, and dependency trees simultaneously for …

Cross-lingual transfer learning for multilingual task oriented dialog

S Schuster, S Gupta, R Shah, M Lewis - arXiv preprint arXiv:1810.13327, 2018 - arxiv.org
One of the first steps in the utterance interpretation pipeline of many task-oriented
conversational AI systems is to identify user intents and the corresponding slots. Since data …

Machine learning for ancient languages: A survey

T Sommerschield, Y Assael, J Pavlopoulos… - Computational …, 2023 - direct.mit.edu
Ancient languages preserve the cultures and histories of the past. However, their study is
fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from …

Small and practical BERT models for sequence labeling

H Tsai, J Riesa, M Johnson, N Arivazhagan… - arXiv preprint arXiv …, 2019 - arxiv.org
We propose a practical scheme to train a single multilingual sequence labeling model that
yields state of the art results and is small and fast enough to run on a single CPU. Starting …

A survey of syntactic-semantic parsing based on constituent and dependency structures

MS Zhang - Science China Technological Sciences, 2020 - Springer
Syntactic and semantic parsing has been investigated for decades, which is one primary
topic in the natural language processing community. This article aims for a brief survey on …

The second multilingual surface realisation shared task (SR'19): Overview and evaluation results

S Mille, A Belz, B Bohnet, Y Graham… - Proceedings of the …, 2019 - research.brighton.ac.uk
We report results from the SR'19 Shared Task, the second edition of a multilingual surface
realisation task organised as part of the EMNLP'19 Workshop on Multilingual Surface …

UDapter: Language adaptation for truly Universal Dependency parsing

A Üstün, A Bisazza, G Bouma, G van Noord - arXiv preprint arXiv …, 2020 - arxiv.org
Recent advances in multilingual dependency parsing have brought the idea of a truly
universal parser closer to reality. However, cross-language interference and restrained …

Latin BERT: A contextual language model for classical philology

D Bamman, PJ Burns - arXiv preprint arXiv:2009.10053, 2020 - arxiv.org
We present Latin BERT, a contextual language model for the Latin language, trained on
642.7 million words from a variety of sources spanning the Classical era to the 21st century …