Handling bias in toxic speech detection: A survey
Detecting online toxicity has always been a challenge due to its inherent subjectivity. Factors
such as the context, geography, socio-political climate, and background of the producers …
Is your toxicity my toxicity? Exploring the impact of rater identity on toxicity annotation
N Goyal, ID Kivlichan, R Rosen… - Proceedings of the ACM …, 2022 - dl.acm.org
Machine learning models are commonly used to detect toxicity in online conversations.
These models are trained on datasets annotated by human raters. We explore how raters' …
Identifying and measuring annotator bias based on annotators' demographic characteristics
Machine learning has recently been used to detect hate speech and other forms of abusive
language on online platforms. However, a notable weakness of machine learning models is …
Why don't you do it right? Analysing annotators' disagreement in subjective tasks
M Sandri, E Leonardelli, S Tonelli… - Proceedings of the 17th …, 2023 - aclanthology.org
Annotators' disagreement in linguistic data has been recently the focus of multiple initiatives
aimed at raising awareness on issues related to 'majority voting' when aggregating diverging …
Cross-lingual few-shot hate speech and offensive language detection using meta learning
Automatic detection of abusive online content such as hate speech, offensive language,
threats, etc. has become prevalent in social media, with multiple efforts dedicated to …
Challenges in applying explainability methods to improve the fairness of NLP models
Motivations for methods in explainable artificial intelligence (XAI) often include detecting,
quantifying and mitigating bias, and contributing to making machine learning models fairer …
Un-compromised credibility: Social media based multi-class hate speech classification for text
KA Qureshi, M Sabih - IEEE Access, 2021 - ieeexplore.ieee.org
Social media has grown enormously and, through its anonymity, promotes freedom of
expression. Freedom of expression is a human right, but hate speech …
Annotating online misogyny
P Zeinert, N Inie, L Derczynski - … of the 59th Annual Meeting of the …, 2021 - aclanthology.org
Online misogyny, a category of online abusive language, has serious and harmful social
consequences. Automatic detection of misogynistic language online, while imperative …
Explaining toxic text via knowledge enhanced text generation
Warning: This paper contains content that is offensive and may be upsetting. Biased or toxic
speech can be harmful to various demographic groups. Therefore, it is not only important for …
Unmasking and improving data credibility: A study with datasets for training harmless language models
Language models have shown promise in various tasks but can be affected by undesired
data during training, fine-tuning, or alignment. For example, if some unsafe conversations …