Transliteration Characteristics in Romanized Assamese Language Social Media Text and Machine Transliteration

H Baruah, SR Singh, P Sarmah - ACM Transactions on Asian and Low …, 2024 - dl.acm.org
This article aims to understand different transliteration behaviors of Romanized Assamese
text on social media. Assamese, a language that belongs to the Indo-Aryan language family …

Perceptions of Language Technology Failures from South Asian English Speakers

F Holt, W Held, D Yang - Findings of the Association for …, 2024 - aclanthology.org
NLP systems have empirically worse performance for dialects other than Standard American
English (SAmE). However, how these discrepancies impact use of language technology by …

Hierarchical Attention-enhanced Contextual CapsuleNet for Multilingual Hope Speech Detection

MZU Rehman, D Raghuvanshi, H Pachar… - Expert Systems with …, 2024 - Elsevier
Social media was initially intended for creative purposes, but a notable dissemination of
offensive material adversely affects users of these platforms. It is imperative to spotlight and …

Fine Tuning LLMs for Low Resource Languages

S Joshi, MS Khan, A Dafe, K Singh… - … on Image Processing …, 2024 - ieeexplore.ieee.org
Large Language Models (LLMs) hold immense potential, but their data hunger can limit their
performance in processing languages with limited resources. This research study explores …

Malayalam to English Named Entity Transliteration using Attention based BiLSTM

B Baiju, K Manohar, LG Pillai… - 2024 IEEE Recent …, 2024 - ieeexplore.ieee.org
This research introduces an approach to Malayalam-English named entity transliteration
using a BiLSTM with attention mechanism and compares its performance with basic LSTM …

RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization

JA Husain, R Dabre, A Kumar, R Puduppully… - arXiv preprint arXiv …, 2024 - arxiv.org
This study addresses the challenge of extending Large Language Models (LLMs) to non-
English languages, specifically those using non-Latin scripts. We propose an innovative …

Share What You Already Know: Cross-Language-Script Transfer and Alignment for Sentiment Detection in Code-Mixed Data

N Pahari, K Shimada - ACM Transactions on Asian and Low-Resource …, 2024 - dl.acm.org
Code-switching entails mixing multiple languages. It is an increasingly occurring
phenomenon in social media texts. Usually, code-mixed texts are written in a single script …

LAHAJA: A Robust Multi-accent Benchmark for Evaluating Hindi ASR Systems

T Javed, J Nawale, S Joshi, E George… - arXiv preprint arXiv …, 2024 - arxiv.org
Hindi, one of the most spoken languages of India, exhibits a diverse array of accents due to
its usage among individuals from diverse linguistic origins. To enable a robust evaluation of …

Romanized to Native Malayalam Script Transliteration Using an Encoder-Decoder Framework

B Baiju, K Manohar, LG Pillai, E Sherly - arXiv preprint arXiv:2412.09957, 2024 - arxiv.org
In this work, we present the development of a reverse transliteration model to convert
romanized Malayalam to native script using an encoder-decoder framework built with …

Tackling the Problem of Multilingualism in Voice Assistants

S Sabharwal, R Sahni - 2024 - aipublications.com
Voice assistants like Alexa and Siri have become increasingly advanced due to
improvements in AI and language processing models like GPT and Gemini. However, these …