The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation
B Plank - arXiv preprint arXiv:2211.02570, 2022 - arxiv.org
Human variation in labeling is often considered noise. Annotation projects for machine
learning (ML) aim at minimizing human label variation, with the assumption to maximize …
Overview of the shared task on homophobia and transphobia detection in social media comments
BR Chakravarthi, R Priyadharshini… - Proceedings of the …, 2022 - aclanthology.org
Homophobia and Transphobia Detection is the task of identifying homophobia,
transphobia, and non-anti-LGBT+ content from the given corpus. Homophobia and …
Is your toxicity my toxicity? exploring the impact of rater identity on toxicity annotation
N Goyal, ID Kivlichan, R Rosen… - Proceedings of the ACM …, 2022 - dl.acm.org
Machine learning models are commonly used to detect toxicity in online conversations.
These models are trained on datasets annotated by human raters. We explore how raters' …
Humour as an online safety issue: Exploring solutions to help platforms better address this form of expression
A Matamoros Fernandez, L Bartolo… - Internet Policy …, 2023 - eprints.qut.edu.au
This paper makes a case for addressing humour as an online safety issue so that social
media platforms can include it in their risk assessments and harm mitigation strategies. We …
How (not) to use sociodemographic information for subjective NLP tasks
Annotators' sociodemographic backgrounds (i.e., the individual compositions of their gender,
age, educational background, etc.) have a strong impact on their decisions when working on …
The measuring hate speech corpus: Leveraging rasch measurement theory for data perspectivism
We introduce the Measuring Hate Speech corpus, a dataset created to measure
hate speech while adjusting for annotators' perspectives. It consists of 50,070 social media …
The ecological fallacy in annotation: Modelling human label variation goes beyond sociodemographics
Many NLP tasks exhibit human label variation, where different annotators give different
labels to the same texts. This variation is known to depend, at least in part, on the …
When the majority is wrong: Modeling annotator disagreement for subjective tasks
Though majority vote among annotators is typically used for ground truth labels in natural
language processing, annotator disagreement in tasks such as hate speech detection may …
Analyzing the effects of annotator gender across NLP tasks
Recent studies have shown that for subjective annotation tasks, the demographics, lived
experiences, and identity of annotators can have a large impact on how items are labeled …
Overview of third shared task on homophobia and transphobia detection in social media comments
BR Chakravarthi, P Kumaresan… - Proceedings of the …, 2024 - aclanthology.org
This paper provides a comprehensive summary of the “Homophobia and Transphobia
Detection in Social Media Comments” shared task, which was held at the LT-EDI@EACL …