Resources and benchmark corpora for hate speech detection: a systematic review

F Poletto, V Basile, M Sanguinetti, C Bosco… - Language Resources …, 2021 - Springer
Hate Speech in social media is a complex phenomenon, whose detection has recently
gained significant traction in the Natural Language Processing community, as attested by …

A literature review of textual hate speech detection methods and datasets

F Alkomah, X Ma - Information, 2022 - mdpi.com
Online toxic discourses could result in conflicts between groups or harm to online
communities. Hate speech is complex and multifaceted harmful or offensive content …

RealToxicityPrompts: Evaluating neural toxic degeneration in language models

S Gehman, S Gururangan, M Sap, Y Choi… - arXiv preprint arXiv …, 2020 - arxiv.org
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise
toxic language which hinders their safe deployment. We investigate the extent to which …

The hateful memes challenge: Detecting hate speech in multimodal memes

D Kiela, H Firooz, A Mohan… - Advances in neural …, 2020 - proceedings.neurips.cc
This work proposes a new challenge set for multimodal classification, focusing on detecting
hate speech in multimodal memes. It is constructed such that unimodal models struggle and …

A systematic review of hate speech automatic detection using natural language processing

MS Jahan, M Oussalah - Neurocomputing, 2023 - Elsevier
With the multiplication of social media platforms, which offer anonymity, easy access and
online community formation and online debate, the issue of hate speech detection and …

The risk of racial bias in hate speech detection

M Sap, D Card, S Gabriel, Y Choi… - Proceedings of the 57th …, 2019 - aclanthology.org
We investigate how annotators' insensitivity to differences in dialect can lead to racial bias in
automatic hate speech detection models, potentially amplifying harm against minority …

HateCheck: Functional tests for hate speech detection models

P Röttger, B Vidgen, D Nguyen, Z Waseem… - arXiv preprint arXiv …, 2020 - arxiv.org
Detecting online hate is a difficult task that even state-of-the-art models struggle with.
Typically, hate speech detection models are evaluated by measuring their performance on …

Racial bias in hate speech and abusive language detection datasets

T Davidson, D Bhattacharya, I Weber - arXiv preprint arXiv:1905.12516, 2019 - arxiv.org
Technologies for abusive language detection are being developed and applied with little
consideration of their potential biases. We examine racial bias in five different sets of Twitter …

Towards generalisable hate speech detection: a review on obstacles and solutions

W Yin, A Zubiaga - PeerJ Computer Science, 2021 - peerj.com
Hate speech is one type of harmful online content which directly attacks or promotes hate
towards a group or an individual member based on their actual or perceived aspects of …

Learning from the worst: Dynamically generated datasets to improve online hate detection

B Vidgen, T Thrush, Z Waseem, D Kiela - arXiv preprint arXiv:2012.15761, 2020 - arxiv.org
We present a human-and-model-in-the-loop process for dynamically generating datasets
and training better performing and more robust hate detection models. We provide a new …