One country, 700+ languages: NLP challenges for underrepresented languages and dialects in Indonesia

AF Aji, GI Winata, F Koto, S Cahyawijaya… - arXiv preprint arXiv …, 2022 - arxiv.org
NLP research is impeded by a lack of resources and awareness of the challenges presented
by underrepresented languages and dialects. Focusing on the languages spoken in …

Towards Malay named entity recognition: an open-source dataset and a multi-task framework

Y Fu, N Lin, Z Yang, S Jiang - Connection Science, 2023 - Taylor & Francis
Named entity recognition (NER) is a key component of many natural language processing
(NLP) applications. The majority of advanced research, however, has not been widely …

Dataset enhancement and multilingual transfer for named entity recognition in the Indonesian language

SO Khairunnisa, Z Chen, M Komachi - ACM Transactions on Asian and …, 2023 - dl.acm.org
Named entity recognition in the Indonesian language has significantly developed in recent
years. However, it still lacks standardized publicly available corpora; a small dataset is …

Towards a standardized dataset on Indonesian named entity recognition

SO Khairunnisa, A Imankulova… - Proceedings of the 1st …, 2020 - aclanthology.org
In recent years, named entity recognition (NER) tasks in the Indonesian language have
undergone extensive development. There are only a few corpora for Indonesian NER; …

Flood monitoring with information extraction approach from social media data

PK Putra, DB Sencaki, GP Dinanta… - 2020 IEEE Asia …, 2020 - ieeexplore.ieee.org
Flood natural disasters that often occur in Jakarta have a bad impact on many sectors.
Countermeasures, fast action, and monitoring need to be done to minimize the impact that …

A multi-pass sieve coreference resolution for Indonesian

VKP Artari, R Mahendra, MA Jiwanggi… - Proceedings of the …, 2021 - aclanthology.org
Coreference resolution is an NLP task to find out whether the set of referring expressions
belong to the same concept in discourse. A multi-pass sieve is a deterministic coreference …

Modified DBpedia entities expansion for tagging automatically NER dataset

I Alfina, S Savitri, MI Fanany - 2017 International Conference on …, 2017 - ieeexplore.ieee.org
Developing NER system using machine learning approach needs a big dataset which is
costly if the dataset labeling is done manually. The previous works proposed methods in …

Towards corpus and model: Hierarchical structured-attention-based features for Indonesian named entity recognition

Y Fu, N Lin, X Lin, S Jiang - Journal of Intelligent & Fuzzy …, 2021 - content.iospress.com
Named entity recognition (NER) is fundamental to natural language processing (NLP). Most
state-of-the-art researches on NER are based on pre-trained language models (PLMs) or …

Named entity recognition model for Indonesian tweet using CRF classifier

Y Munarko, MS Sutrisno, WAI Mahardika… - IOP Conference …, 2018 - iopscience.iop.org
Named Entity Recognition (NER) is a part of Natural Language Processing (NLP)
that acts to recognize the existing word entity in the document. By using NER, it is possible to …

IndQNER: Named Entity Recognition Benchmark Dataset from the Indonesian Translation of the Quran

RH Gusmita, AF Firmansyah, D Moussallem… - … on Applications of …, 2023 - Springer
Indonesian is classified as underrepresented in the Natural Language Processing (NLP)
field, despite being the tenth most spoken language in the world with 198 million speakers …