Corpus creation for sentiment analysis in code-mixed Tamil-English text

BR Chakravarthi, V Muralidaran… - arXiv preprint arXiv …, 2020 - arxiv.org
Understanding the sentiment of a comment from a video or an image is an essential task in
many applications. Sentiment analysis of a text can be useful for various decision-making …

LLM-powered data augmentation for enhanced cross-lingual performance

C Whitehouse, M Choudhury, AF Aji - arXiv preprint arXiv:2305.14288, 2023 - arxiv.org
This paper explores the potential of leveraging Large Language Models (LLMs) for data
augmentation in multilingual commonsense reasoning datasets where the available training …

KanCMD: Kannada CodeMixed dataset for sentiment analysis and offensive language detection

A Hande, R Priyadharshini… - Proceedings of the Third …, 2020 - aclanthology.org
We introduce Kannada CodeMixed Dataset (KanCMD), a multi-task learning
dataset for sentiment analysis and offensive language identification. The KanCMD dataset …

IIITT@DravidianLangTech-EACL2021: Transfer learning for offensive language detection in Dravidian languages

K Yasaswini, K Puranik, A Hande… - Proceedings of the …, 2021 - aclanthology.org
This paper demonstrates our work for the shared task on Offensive Language Identification
in Dravidian Languages-EACL 2021. Offensive language detection in the various social …

DravidianCodeMix: Sentiment analysis and offensive language identification dataset for Dravidian languages in code-mixed text

BR Chakravarthi, R Priyadharshini… - Language Resources …, 2022 - Springer
This paper describes the development of a multilingual, manually annotated dataset for
three under-resourced Dravidian languages generated from social media comments. The …

GLUECoS: An evaluation benchmark for code-switched NLP

S Khanuja, S Dandapat, A Srinivasan… - arXiv preprint arXiv …, 2020 - arxiv.org
Code-switching is the use of more than one language in the same conversation or utterance.
Recently, multilingual contextual embedding models, trained on multiple monolingual …

A survey of code-switched speech and language processing

S Sitaram, KR Chandu, SK Rallabandi… - arXiv preprint arXiv …, 2019 - arxiv.org
Code-switching, the alternation of languages within a conversation or utterance, is a
common communicative phenomenon that occurs in multilingual communities across the …

Named entity recognition for code-mixed Indian corpus using meta embedding

R Priyadharshini, BR Chakravarthi… - 2020 6th …, 2020 - ieeexplore.ieee.org
In this paper, we utilize the pre-trained embedding, sub-word embedding and closely related
languages of languages in the code mixed corpus to create a meta-embedding. We then …

L3Cube-HingCorpus and HingBERT: A code mixed Hindi-English dataset and BERT language models

R Nayak, R Joshi - arXiv preprint arXiv:2204.08398, 2022 - arxiv.org
Code-switching occurs when more than one language is mixed in a given sentence or a
conversation. This phenomenon is more prominent on social media platforms and its …

A survey of code-switching: Linguistic and social perspectives for language technologies

AS Doğruöz, S Sitaram, BE Bullock… - arXiv preprint arXiv …, 2023 - arxiv.org
The analysis of data in which multiple languages are represented has gained popularity
among computational linguists in recent years. So far, much of this research focuses mainly …