Extractive adversarial networks: High-recall explanations for identifying personal attacks in social media posts

S Carton, Q Mei, P Resnick - arXiv preprint arXiv:1809.01499, 2018 - arxiv.org
We introduce an adversarial method for producing high-recall explanations of neural text
classifier decisions. Building on an existing architecture for extractive explanations via hard …

Additive feature attribution explainable methods to craft adversarial attacks for text classification and text regression

Y Chai, R Liang, S Samtani, H Zhu… - … on Knowledge and …, 2023 - ieeexplore.ieee.org
Deep learning (DL) models have significantly improved the performance of text classification
and text regression tasks. However, DL models are often strikingly vulnerable to adversarial …

CATBERT: Context-aware tiny BERT for detecting social engineering emails

Y Lee, J Saxe, R Harang - arXiv preprint arXiv:2010.03484, 2020 - arxiv.org
Targeted phishing emails are on the rise and facilitate the theft of billions of dollars from
organizations a year. While malicious signals from attached files or malicious URLs in …

Adversarial attacks and defenses for social network text processing applications: Techniques, challenges and future research directions

I Alsmadi, K Ahmad, M Nazzal, F Alam… - arXiv preprint arXiv …, 2021 - arxiv.org
The growing use of social media has led to the development of several Machine Learning
(ML) and Natural Language Processing (NLP) tools to process the unprecedented amount …

Data-driven mitigation of adversarial text perturbation

R Bhalerao, M Al-Rubaie, A Bhaskar… - arXiv preprint arXiv …, 2022 - arxiv.org
Social networks have become an indispensable part of our lives, with billions of people
producing ever-increasing amounts of text. At such scales, content policies and their …

Adversarial text generation for Google's Perspective API

E Jain, S Brown, J Chen, E Neaton… - 2018 international …, 2018 - ieeexplore.ieee.org
With the preponderance of harassment and abuse, social media platforms and online
discussion platforms seek to curb toxic comments. Google's Perspective aims to help …

Generating natural language adversarial examples on a large scale with generative models

Y Ren, J Lin, S Tang, J Zhou, S Yang, Y Qi, X Ren - ECAI 2020, 2020 - ebooks.iospress.nl
Today text classification models have been widely used. However, these classifiers are
found to be easily fooled by adversarial examples. Fortunately, standard attacking methods …

Generating black-box adversarial examples for text classifiers using a deep reinforced model

P Vijayaraghavan, D Roy - … 2019, Würzburg, Germany, September 16–20 …, 2020 - Springer
Recently, generating adversarial examples has become an important means of measuring
robustness of a deep learning model. Adversarial examples help us identify the …

Learning to discriminate perturbations for blocking adversarial attacks in text classification

Y Zhou, JY Jiang, KW Chang, W Wang - arXiv preprint arXiv:1909.03084, 2019 - arxiv.org
Adversarial attacks against machine learning models have threatened various real-world
applications such as spam filtering and sentiment analysis. In this paper, we propose a …

Robust training under linguistic adversity

Y Li, T Cohn, T Baldwin - Proceedings of the 15th Conference of …, 2017 - aclanthology.org
Deep neural networks have achieved remarkable results across many language processing
tasks, however they have been shown to be susceptible to overfitting and highly sensitive to …