Low-resource languages: A review of past work and future challenges
A Magueresse, V Carles, E Heetderks - arXiv preprint arXiv:2006.07264, 2020 - arxiv.org
A current problem in NLP is massaging and processing low-resource languages which lack
useful training attributes such as supervised data, number of native speakers or experts, etc …
A generalized constraint approach to bilingual dictionary induction for low-resource language families
The lack or absence of parallel and comparable corpora makes bilingual lexicon extraction
a difficult task for low-resource languages. The pivot language and cognate recognition …
Bilingual lexicon induction across orthographically-distinct under-resourced Dravidian languages
BR Chakravarthi, N Rajasekaran, M Arcan… - Proceedings of the …, 2020 - aclanthology.org
Bilingual lexicons are a vital tool for under-resourced languages and recent state-of-the-art
approaches to this leverage pretrained monolingual word embeddings using supervised or …
Plan optimization to bilingual dictionary induction for low-resource language families
Creating bilingual dictionary is the first crucial step in enriching low-resource languages.
Especially for the closely related ones, it has been shown that the constraint-based …
Designing a collaborative process to create bilingual dictionaries of Indonesian ethnic languages
The constraint-based approach has been proven useful for inducing bilingual dictionary for
closely-related low-resource languages. When we want to create multiple bilingual …
Neural Approaches to Historical Words Reconstruction
C Fourrier - 2022 - theses.hal.science
In historical linguistics, cognates are words that descend in direct line from a common
ancestor, called their proto-form, and therefore are representative of their respective …
Determining Intermediary Closely Related Languages to Find a Mediator for Intertribal Conflict Resolution
Indonesia has a diverse ethnic and cultural background. However, this diversity sometimes
creates social problems, such as intertribal conflict. Because of the large differences among …
Pivot-based hybrid machine translation to support multilingual communication
Machine Translation (MT) is very useful in supporting multicultural communication. Existing
Statistical Machine Translation (SMT) which requires high quality and quantity of corpora …
Neural network-based bilingual lexicon induction for Indonesian ethnic languages
K Resiandi, Y Murakami, AH Nasution - Applied Sciences, 2023 - mdpi.com
Indonesia has a variety of ethnic languages, most of which belong to the same language
family: the Austronesian languages. Due to the shared language family, words in Indonesian …
Linguistic resources for Bhojpuri, Magahi, and Maithili: statistics about them, their similarity estimates, and baselines for three applications
Corpus preparation for low-resource languages and for development of human language
technology to analyze or computationally process them is a laborious task, primarily due to …