From machine translation to code-switching: Generating high-quality code-switched text
Generating code-switched text is a problem of growing interest, especially given the scarcity
of corpora containing large volumes of real code-switched text. In this work, we adapt a state …
Comet: Towards code-mixed translation using parallel monolingual sentences
Code-mixed languages are very popular in multilingual societies around the world, yet the
resources lag behind to enable robust systems on such languages. A major contributing …
Prompting multilingual large language models to generate code-mixed texts: The case of South East Asian languages
While code-mixing is a common linguistic practice in many parts of the world, collecting high-
quality and low-cost code-mixed data remains a challenge for natural language processing …
Investigating lexical replacements for Arabic-English code-switched data augmentation
Data sparsity is a main problem hindering the development of code-switching (CS) NLP
systems. In this paper, we investigate data augmentation techniques for synthesizing …
X-RiSAWOZ: High-quality end-to-end multilingual dialogue datasets and few-shot agents
Task-oriented dialogue research has mainly focused on a few popular languages like
English and Chinese, due to the high dataset creation cost for a new language. To reduce …
IndoRobusta: Towards robustness against diverse code-mixed Indonesian local languages
Significant progress has been made on Indonesian NLP. Nevertheless, exploration of the
code-mixing phenomenon in Indonesian is limited, despite many languages being …
Overview and results of MixMT shared-task at WMT 2022
V Srivastava, M Singh - … of the Seventh Conference on Machine …, 2022 - aclanthology.org
In this paper, we present an overview of the WMT 2022 shared task on code-mixed machine
translation (MixMT). In this shared task, we hosted two code-mixed machine translation …
CoCoa: An Encoder-Decoder Model for Controllable Code-switched Generation
Code-switching has seen growing interest in recent years as an important multilingual NLP
phenomenon. Generating code-switched text for data augmentation has been sufficiently …
Comparing grammatical theories of code-mixing
A Pratapa, M Choudhury - … of the Seventh Workshop on Noisy …, 2021 - aclanthology.org
Code-mixed text generation systems have found applications in many downstream tasks,
including speech recognition, translation and dialogue. A paradigm of these generation …
Exploring text-to-text transformers for English to Hinglish machine translation with synthetic code-mixing
We describe models focused at the understudied problem of translating between
monolingual and code-mixed language pairs. More specifically, we offer a wide range of …