The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation

B Plank - arXiv preprint arXiv:2211.02570, 2022 - arxiv.org
Human variation in labeling is often considered noise. Annotation projects for machine
learning (ML) aim to minimize human label variation, with the assumption to maximize …

Overview of the shared task on homophobia and transphobia detection in social media comments

BR Chakravarthi, R Priyadharshini… - Proceedings of the …, 2022 - aclanthology.org
Homophobia and Transphobia Detection is the task of identifying homophobia,
transphobia, and non-anti-LGBT+ content from the given corpus. Homophobia and …

Is your toxicity my toxicity? Exploring the impact of rater identity on toxicity annotation

N Goyal, ID Kivlichan, R Rosen… - Proceedings of the ACM …, 2022 - dl.acm.org
Machine learning models are commonly used to detect toxicity in online conversations.
These models are trained on datasets annotated by human raters. We explore how raters' …

Humour as an online safety issue: Exploring solutions to help platforms better address this form of expression

A Matamoros Fernandez, L Bartolo… - Internet Policy …, 2023 - eprints.qut.edu.au
This paper makes a case for addressing humour as an online safety issue so that social
media platforms can include it in their risk assessments and harm mitigation strategies. We …

How (not) to use sociodemographic information for subjective NLP tasks

T Beck, H Schuff, A Lauscher, I Gurevych - arXiv preprint arXiv:2309.07034, 2023 - arxiv.org
Annotators' sociodemographic backgrounds (i.e., the individual compositions of their gender,
age, educational background, etc.) have a strong impact on their decisions when working on …

The measuring hate speech corpus: Leveraging Rasch measurement theory for data perspectivism

P Sachdeva, R Barreto, G Bacon, A Sahn… - Proceedings of the …, 2022 - aclanthology.org
We introduce the Measuring Hate Speech corpus, a dataset created to measure
hate speech while adjusting for annotators' perspectives. It consists of 50,070 social media …

The ecological fallacy in annotation: Modelling human label variation goes beyond sociodemographics

M Orlikowski, P Röttger, P Cimiano, D Hovy - arXiv preprint arXiv …, 2023 - arxiv.org
Many NLP tasks exhibit human label variation, where different annotators give different
labels to the same texts. This variation is known to depend, at least in part, on the …

When the majority is wrong: Modeling annotator disagreement for subjective tasks

E Fleisig, R Abebe, D Klein - arXiv preprint arXiv:2305.06626, 2023 - arxiv.org
Though majority vote among annotators is typically used for ground truth labels in natural
language processing, annotator disagreement in tasks such as hate speech detection may …

Analyzing the effects of annotator gender across NLP tasks

L Biester, V Sharma, A Kazemi, N Deng… - Proceedings of the …, 2022 - aclanthology.org
Recent studies have shown that for subjective annotation tasks, the demographics, lived
experiences, and identity of annotators can have a large impact on how items are labeled …

Overview of third shared task on homophobia and transphobia detection in social media comments

BR Chakravarthi, P Kumaresan… - Proceedings of the …, 2024 - aclanthology.org
This paper provides a comprehensive summary of the “Homophobia and Transphobia
Detection in Social Media Comments” shared task, which was held at the LT-EDI@EACL …