Explainable artificial intelligence applications in cyber security: State-of-the-art in research

Z Zhang, H Al Hamadi, E Damiani, CY Yeun… - IEEE …, 2022 - ieeexplore.ieee.org
This survey presents a comprehensive review of current literature on Explainable Artificial
Intelligence (XAI) methods for cyber security applications. Due to the rapid development of …

Towards generalisable hate speech detection: a review on obstacles and solutions

W Yin, A Zubiaga - PeerJ Computer Science, 2021 - peerj.com
Hate speech is one type of harmful online content which directly attacks or promotes hate
towards a group or an individual member based on their actual or perceived aspects of …

HateXplain: A benchmark dataset for explainable hate speech detection

B Mathew, P Saha, SM Yimam, C Biemann… - Proceedings of the …, 2021 - ojs.aaai.org
Hate speech is a challenging issue plaguing the online social media. While better models
for hate speech detection are continuously being developed, there is little research on the …

Hate speech detection: Challenges and solutions

S MacAvaney, HR Yao, E Yang, K Russell, N Goharian… - PloS one, 2019 - journals.plos.org
As online content continues to grow, so does the spread of hate speech. We identify and
examine challenges faced by online automatic approaches for hate speech detection in text …

The risk of racial bias in hate speech detection

M Sap, D Card, S Gabriel, Y Choi… - Proceedings of the 57th …, 2019 - aclanthology.org
We investigate how annotators' insensitivity to differences in dialect can lead to racial bias in
automatic hate speech detection models, potentially amplifying harm against minority …

KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media

A Safaya, M Abdullatif, D Yuret - arXiv preprint arXiv:2007.13184, 2020 - arxiv.org
In this paper, we describe our approach to utilize pre-trained BERT models with
Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language …

A BERT-based transfer learning approach for hate speech detection in online social media

M Mozafari, R Farahbakhsh, N Crespi - … VIII: Volume 1 Proceedings of the …, 2020 - Springer
Generated hateful and toxic content by a portion of users in social media is a rising
phenomenon that motivated researchers to dedicate substantial efforts to the challenging …

Hate speech detection and racial bias mitigation in social media based on BERT model

M Mozafari, R Farahbakhsh, N Crespi - PloS one, 2020 - journals.plos.org
Disparate biases associated with datasets and trained classifiers in hateful and abusive
content identification tasks have raised many concerns recently. Although the problem of …

Impact of SMOTE on imbalanced text features for toxic comments classification using RVVC model

V Rupapara, F Rustam, HF Shahzad… - IEEE …, 2021 - ieeexplore.ieee.org
Social media platforms and microblogging websites have gained accelerated popularity
during the past few years. These platforms are used for expressing views and opinions …

You only prompt once: On the capabilities of prompt learning on large language models to tackle toxic content

X He, S Zannettou, Y Shen… - 2024 IEEE Symposium on …, 2024 - ieeexplore.ieee.org
The spread of toxic content online is an important problem that has adverse effects on user
experience online and in our society at large. Motivated by the importance and impact of the …