A parallel corpus for Vietnamese central-northern dialect text transfer
T Le, A Luu - Findings of the Association for Computational …, 2023 - aclanthology.org
The Vietnamese language embodies dialectal variants closely attached to the nation's three
macro-regions: the Northern, Central and Southern regions. As the northern dialect forms …
NLP for Counterspeech against Hate: A Survey and How-To Guide
In recent years, counterspeech has emerged as one of the most promising strategies to fight
online hate. These non-escalatory responses tackle online abuse while preserving the …
NAIJAHATE: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
To address the global issue of online hate, hate speech detection (HSD) systems are
typically developed on datasets from the United States, thereby failing to generalize to …
“I Searched for a Religious Song in Amharic and Got Sexual Content Instead”: Investigating Online Harm in Low-Resourced Languages on YouTube
Online social media platforms such as YouTube have a wide, global reach. However, little is
known about the experience of low-resourced language speakers on such platforms; …
Evaluating Pixel Language Models on Non-Standardized Languages
We explore the potential of pixel-based models for transfer learning from standard
languages to dialects. These models convert text into images that are divided into patches …
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
In multilingual settings, non-Latin scripts and low-resource languages are usually
disadvantaged in terms of language models' utility, efficiency, and cost. Specifically …
Vicinal risk minimization for few-shot cross-lingual transfer in abusive language detection
G De la Peña Sarracén, P Rosso… - Proceedings of the …, 2023 - aclanthology.org
Cross-lingual transfer learning from high-resource to medium and low-resource languages
has shown encouraging results. However, the scarcity of resources in target languages …
Triple-0: Zero-shot denoising and dereverberation on an end-to-end frozen anechoic speech separation network
S Gul, MS Khan, A Ur-Rehman - Plos one, 2024 - journals.plos.org
Speech enhancement is crucial both for human and machine listening applications. Over the
last decade, the use of deep learning for speech enhancement has resulted in tremendous …
HateDay: Insights from a Global Hate Speech Dataset Representative of a Day on Twitter
To tackle the global challenge of online hate speech, a large body of research has
developed detection models to flag hate speech in the sea of online content. Yet, due to …
The #Somos600M Project: Generating NLP resources that represent the diversity of the languages from LATAM, the Caribbean, and Spain
M Grandury - arXiv preprint arXiv:2407.17479, 2024 - arxiv.org
We are 600 million Spanish speakers. We launched the #Somos600M Project because the
diversity of the languages from LATAM, the Caribbean and Spain needs to be represented …