NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji… - Findings of the …, 2023 - aclanthology.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

One country, 700+ languages: NLP challenges for underrepresented languages and dialects in Indonesia

AF Aji, GI Winata, F Koto, S Cahyawijaya… - arXiv preprint arXiv …, 2022 - arxiv.org
NLP research is impeded by a lack of resources and awareness of the challenges presented
by underrepresented languages and dialects. Focusing on the languages spoken in …

Monkeypox2022tweets: a large-scale Twitter dataset on the 2022 monkeypox outbreak, findings from analysis of tweets, and open research questions

N Thakur - Infectious Disease Reports, 2022 - mdpi.com
The mining of Tweets to develop datasets on recent issues, global challenges, pandemics,
virus outbreaks, emerging technologies, and trending matters has been of significant interest …

Text Stemming and Lemmatization of Regional Languages in Indonesia: A Systematic Literature Review

Z Abidin, A Junaidi - Journal of Information Systems …, 2024 - e-journal.unair.ac.id
Background: Stemming is essential in natural language processing (NLP) because
it reduces word variants to their base forms. This procedure facilitates the …

RETRACTED ARTICLE: A review on emotion recognition from dialect speech using feature optimization and classification techniques

S Thimmaiah - Multimedia Tools and Applications, 2024 - Springer
Emotion recognition from speech has gained prominence across various domains due to its
wide-ranging applications. This paper presents a comprehensive review of advancements in …

Pre-trained transformer-based language models for Sundanese

W Wongso, H Lucky, D Suhartono - Journal of Big Data, 2022 - Springer
The Sundanese language has over 32 million speakers worldwide, but the language has
reaped little to no benefits from the recent advances in natural language understanding. Like …

Mpox Narrative on Instagram: A Labeled Multilingual Dataset of Instagram Posts on Mpox for Sentiment, Hate Speech, and Anxiety Analysis

N Thakur - arXiv preprint arXiv:2409.05292, 2024 - arxiv.org
The world is currently experiencing an outbreak of mpox, which has been declared a Public
Health Emergency of International Concern by WHO. No prior work related to social media …

Social media emotion analysis in Indonesian using fine-tuning BERT model

EI Setiawan, L Kristianto, AT Hermawan… - 2021 3rd East …, 2021 - ieeexplore.ieee.org
Social media is the leading platform where users express opinions and emotions. Emotion
Analysis aims to identify emotions: happy, sad, angry, fear, disgust, shame, and guilt …

Application of SVM and Chi-Square Feature Selection for Sentiment Analysis of Indonesia's National Health Insurance Mobile Application

E Hokijuliandy, H Napitupulu, Firdaniza - Mathematics, 2023 - mdpi.com
(1) Background: sentiment analysis is a computational technique employed to discern
individuals' opinions, attitudes, emotions, and intentions concerning a subject by analyzing …

Replicable Benchmarking of Neural Machine Translation (NMT) on Low-Resource Local Languages in Indonesia

L Susanto, R Diandaru, A Krisnadhi… - arXiv preprint arXiv …, 2023 - arxiv.org
Neural machine translation (NMT) for low-resource local languages in Indonesia faces
significant challenges, including the need for a representative benchmark and limited data …