Low-resource languages: A review of past work and future challenges

A Magueresse, V Carles, E Heetderks - arXiv preprint arXiv:2006.07264, 2020 - arxiv.org
A current problem in NLP is massaging and processing low-resource languages which lack
useful training attributes such as supervised data, number of native speakers or experts, etc …

A generalized constraint approach to bilingual dictionary induction for low-resource language families

AH Nasution, Y Murakami, T Ishida - ACM Transactions on Asian and …, 2017 - dl.acm.org
The lack or absence of parallel and comparable corpora makes bilingual lexicon extraction
a difficult task for low-resource languages. The pivot language and cognate recognition …

Bilingual lexicon induction across orthographically-distinct under-resourced Dravidian languages

BR Chakravarthi, N Rajasekaran, M Arcan… - Proceedings of the …, 2020 - aclanthology.org
Bilingual lexicons are a vital tool for under-resourced languages and recent state-of-the-art
approaches to this leverage pretrained monolingual word embeddings using supervised or …

Plan optimization to bilingual dictionary induction for low-resource language families

AH Nasution, Y Murakami, T Ishida - Transactions on Asian and Low …, 2021 - dl.acm.org
Creating bilingual dictionary is the first crucial step in enriching low-resource languages.
Especially for the closely related ones, it has been shown that the constraint-based …

Designing a collaborative process to create bilingual dictionaries of Indonesian ethnic languages

AH Nasution, Y Murakami, T Ishida - 2018 - repository.uir.ac.id
The constraint-based approach has been proven useful for inducing bilingual dictionary for
closely-related low-resource languages. When we want to create multiple bilingual …

Neural Approaches to Historical Words Reconstruction

C Fourrier - 2022 - theses.hal.science
In historical linguistics, cognates are words that descend in direct line from a common
ancestor, called their proto-form, and therefore are representative of their respective …

Determining Intermediary Closely Related Languages to Find a Mediator for Intertribal Conflict Resolution

AH Nasution, SE Fitri, R Saian, W Monika, N Badruddin - Information, 2022 - mdpi.com
Indonesia has a diverse ethnic and cultural background. However, this diversity sometimes
creates social problems, such as intertribal conflict. Because of the large differences among …

Pivot-based hybrid machine translation to support multilingual communication

AH Nasution, N Syafitri, PR Setiawan… - … conference on culture …, 2017 - ieeexplore.ieee.org
Machine Translation (MT) is very useful in supporting multicultural communication. Existing
Statistical Machine Translation (SMT) which requires high quality and quantity of corpora …

Neural network-based bilingual lexicon induction for Indonesian ethnic languages

K Resiandi, Y Murakami, AH Nasution - Applied Sciences, 2023 - mdpi.com
Indonesia has a variety of ethnic languages, most of which belong to the same language
family: the Austronesian languages. Due to the shared language family, words in Indonesian …

Linguistic resources for Bhojpuri, Magahi, and Maithili: statistics about them, their similarity estimates, and baselines for three applications

RK Mundotiya, MK Singh, R Kapur, S Mishra… - Transactions on Asian …, 2021 - dl.acm.org
Corpus preparation for low-resource languages and for development of human language
technology to analyze or computationally process them is a laborious task, primarily due to …