A survey of current datasets for code-switching research

N Jose, BR Chakravarthi… - 2020 6th …, 2020 - ieeexplore.ieee.org
Code switching is a prevalent phenomenon in the multilingual community and social media
interaction. In the past ten years, we have witnessed an explosion of code switched data in …

A survey of code-switched speech and language processing

S Sitaram, KR Chandu, SK Rallabandi… - arXiv preprint arXiv …, 2019 - arxiv.org
Code-switching, the alternation of languages within a conversation or utterance, is a
common communicative phenomenon that occurs in multilingual communities across the …

Transformer based language identification for Malayalam-English code-mixed text

S Thara, P Poornachandran - IEEE Access, 2021 - ieeexplore.ieee.org
Social media users have the proclivity to write majority of the data for under resourced
languages in code-mixed format. Code-mixing is defined as mixing of two or more …

HinGE: A dataset for generation and evaluation of code-mixed Hinglish text

V Srivastava, M Singh - arXiv preprint arXiv:2107.03760, 2021 - arxiv.org
Text generation is a highly active area of research in the computational linguistic community.
The evaluation of the generated text is a challenging task and multiple theories and metrics …

Improving pretraining techniques for code-switched NLP

R Das, S Ranjan, S Pathak, P Jyothi - Proceedings of the 61st …, 2023 - aclanthology.org
Pretrained models are a mainstay in modern NLP applications. Pretraining requires access
to large volumes of unlabeled text. While monolingual text is readily available for many of …

ASCEND: A spontaneous Chinese-English dataset for code-switching in multi-turn conversation

H Lovenia, S Cahyawijaya, GI Winata, P Xu… - arXiv preprint arXiv …, 2021 - arxiv.org
Code-switching is a speech phenomenon occurring when a speaker switches language
during a conversation. Despite the spontaneous nature of code-switching in conversational …

Constructing code-mixed universal dependency forest for unbiased cross-lingual relation extraction

H Fei, M Zhang, M Zhang, TS Chua - arXiv preprint arXiv:2305.12258, 2023 - arxiv.org
Latest efforts on cross-lingual relation extraction (XRE) aggressively leverage the language-
consistent structural features from the universal dependency (UD) resource, while they may …

Knowing What to Say: Towards knowledge grounded code-mixed response generation for open-domain conversations

GV Singh, M Firdaus, S Mishra, A Ekbal - Knowledge-Based Systems, 2022 - Elsevier
Inculcating knowledge in the dialogue agents is an important step towards creating any
agent more human-like. Hence, the use of knowledge while conversing is crucial for building …

Exploring methods for building dialects-Mandarin code-mixing corpora: A case study in Taiwanese Hokkien

SE Lu, BH Lu, CY Lu, RTH Tsai - arXiv preprint arXiv:2301.08937, 2023 - arxiv.org
In natural language processing (NLP), code-mixing (CM) is a challenging task, especially
when the mixed languages include dialects. In Southeast Asian countries such as …

MulZDG: Multilingual code-switching framework for zero-shot dialogue generation

Y Liu, S Feng, D Wang, Y Zhang - arXiv preprint arXiv:2208.08629, 2022 - arxiv.org
Building dialogue generation systems in a zero-shot scenario remains a huge challenge,
since the typical zero-shot approaches in dialogue generation rely heavily on large-scale …