Machine learning for detecting data exfiltration: A review

B Sabir, F Ullah, MA Babar, R Gaire - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Context: Research at the intersection of cybersecurity, Machine Learning (ML), and Software
Engineering (SE) has recently taken significant steps in proposing countermeasures for …

BAE: BERT-based adversarial examples for text classification

S Garg, G Ramakrishnan - arXiv preprint arXiv:2004.01970, 2020 - arxiv.org
Modern text classification models are susceptible to adversarial examples: perturbed
versions of the original text, indiscernible to humans, that get misclassified by the model …
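To make the attack concrete, here is a minimal sketch of a BAE-style word replacement, assuming a Hugging Face masked LM proposes substitutes; `victim_predict` is a hypothetical stand-in for any label-returning classifier, and the paper's full method additionally covers token insertion and semantic-similarity filtering.

```python
# Minimal BAE-style replacement sketch: mask each word, let BERT propose
# substitutes, keep the first one that flips the victim's predicted label.
# `victim_predict` is a hypothetical classifier callable (text -> label).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def bae_replace_attack(text, victim_predict, top_k=10):
    orig_label = victim_predict(text)
    words = text.split()
    for i in range(len(words)):
        masked = " ".join(words[:i] + [fill_mask.tokenizer.mask_token] + words[i + 1:])
        for cand in fill_mask(masked, top_k=top_k):
            sub = cand["token_str"].strip()
            if sub.lower() == words[i].lower():
                continue  # skip trivial self-replacements
            perturbed = " ".join(words[:i] + [sub] + words[i + 1:])
            if victim_predict(perturbed) != orig_label:
                return perturbed  # label flipped: adversarial example found
    return None  # no flip found within the search budget
```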

Explainable deep learning: A field guide for the uninitiated

G Ras, N Xie, M Van Gerven, D Doran - Journal of Artificial Intelligence …, 2022 - jair.org
Deep neural networks (DNNs) are an indispensable machine learning tool despite the
difficulty of diagnosing what aspects of a model's input drive its decisions. In countless real …
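One of the simplest attribution methods such field guides cover is input-gradient saliency; a minimal PyTorch sketch, assuming `model` is any differentiable classifier (an illustration of the genre, not a method from the paper itself).

```python
import torch

def gradient_saliency(model, x):
    # x: a single input tensor (no batch dim); model: differentiable classifier
    x = x.detach().clone().requires_grad_(True)
    logits = model(x.unsqueeze(0))           # add batch dimension
    logits[0, logits.argmax()].backward()    # d(top-class score)/d(input)
    return x.grad.abs()                      # per-feature saliency map
```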

A closer look at accuracy vs. robustness

YY Yang, C Rashtchian, H Zhang… - Advances in neural …, 2020 - proceedings.neurips.cc
Current methods for training robust networks lead to a drop in test accuracy, which has led
prior works to posit that a robustness-accuracy tradeoff may be inevitable in deep learning …
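The tradeoff in question is usually measured as the gap between clean and robust accuracy. A hedged sketch of that measurement, with FGSM standing in for the stronger attacks used in the literature and `model`/`loader` assumed to be standard PyTorch objects:

```python
import torch
import torch.nn.functional as F

def clean_and_robust_accuracy(model, loader, eps=0.03):
    clean = robust = total = 0
    for x, y in loader:
        x_src = x.detach().clone().requires_grad_(True)
        F.cross_entropy(model(x_src), y).backward()
        # FGSM: one signed-gradient step, clipped to the valid pixel range
        x_adv = (x_src + eps * x_src.grad.sign()).clamp(0, 1).detach()
        with torch.no_grad():
            clean += (model(x).argmax(1) == y).sum().item()
            robust += (model(x_adv).argmax(1) == y).sum().item()
        total += y.numel()
    return clean / total, robust / total
```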

[BOOK][B] Challenges in automated debiasing for toxic language detection

X Zhou - 2020 - search.proquest.com
Biased associations have been a challenge in the development of classifiers for detecting
toxic language, hindering both fairness and accuracy. As potential solutions, we investigate …
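A common first step in studying such biased associations is to score how strongly individual tokens correlate with the toxic label. A small PMI sketch under an assumed list-of-(text, label) corpus format, illustrative rather than the thesis's actual procedure:

```python
import math
from collections import Counter

def token_label_pmi(corpus, label=1):
    # corpus: list of (text, label) pairs -- an assumed format
    token_counts, token_label_counts = Counter(), Counter()
    n_label = sum(1 for _, y in corpus if y == label)
    for text, y in corpus:
        for tok in set(text.lower().split()):
            token_counts[tok] += 1
            if y == label:
                token_label_counts[tok] += 1
    p_label = n_label / len(corpus)
    # PMI(tok, label) = log(P(label | tok) / P(label)); high scores flag shortcuts
    return {tok: math.log((token_label_counts[tok] / token_counts[tok]) / p_label)
            for tok in token_counts if token_label_counts[tok] > 0}
```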

NL-Augmenter: A framework for task-sensitive natural language augmentation

KD Dhole, V Gangal, S Gehrmann, A Gupta, Z Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation is an important component in the robustness evaluation of models in
natural language processing (NLP) and in enhancing the diversity of the data they are …
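In the spirit of the framework's perturbation-based transformations, a tiny illustrative augmenter that injects keyboard-neighbor typos; the class shape and `generate` method here are assumptions for illustration, not NL-Augmenter's actual interface.

```python
import random

KEYBOARD_NEIGHBORS = {"a": "qs", "e": "wr", "i": "uo", "o": "ip", "s": "ad"}

class TypoPerturbation:
    def __init__(self, prob=0.1, seed=0):
        self.prob, self.rng = prob, random.Random(seed)

    def generate(self, sentence):
        # replace some characters with a keyboard neighbor, at rate `prob`
        chars = list(sentence)
        for idx, ch in enumerate(chars):
            if ch in KEYBOARD_NEIGHBORS and self.rng.random() < self.prob:
                chars[idx] = self.rng.choice(KEYBOARD_NEIGHBORS[ch])
        return "".join(chars)
```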

Reevaluating adversarial examples in natural language

JX Morris, E Lifland, J Lanchantin, Y Ji, Y Qi - arXiv preprint arXiv …, 2020 - arxiv.org
State-of-the-art attacks on NLP models lack a shared definition of what constitutes a
successful attack. We distill ideas from past work into a unified framework: a successful …
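The kind of unified success criterion such a framework proposes can be stated compactly: an attack succeeds only if it changes the predicted label while satisfying linguistic constraints. A sketch, where `semantic_sim` is a hypothetical sentence-similarity function and the 0.8 threshold is illustrative:

```python
def attack_succeeds(orig, perturbed, victim_predict, semantic_sim, min_sim=0.8):
    # success = label flip AND constraint satisfaction (here: semantic similarity)
    label_flipped = victim_predict(perturbed) != victim_predict(orig)
    semantics_preserved = semantic_sim(orig, perturbed) >= min_sim
    return label_flipped and semantics_preserved
```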

Evading text based emotion detection mechanism via adversarial attacks

A Bajaj, DK Vishwakarma - Neurocomputing, 2023 - Elsevier
Textual Emotion Analysis (TEA) seeks to extract and assess the emotional states of
users from the text. Various Deep Learning (DL) algorithms have emerged rapidly and …

Natural language adversarial attack and defense in word level

X Wang, H Jin, K He - 2019 - openreview.net
Recently, inspired by the large body of research on adversarial examples in
computer vision, there has been growing interest in designing adversarial attacks for …
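A word-level defense in this spirit can canonicalize synonyms before classification, so substitution attacks collapse back onto the same input; the tiny synonym table below is a hypothetical stand-in, not the paper's actual encoding.

```python
# Map each word to a canonical representative of its synonym cluster before
# classification; the table is a toy example for illustration.
SYNONYM_CANONICAL = {"movie": "film", "picture": "film",
                     "terrible": "bad", "awful": "bad"}

def synonym_encode(text):
    return " ".join(SYNONYM_CANONICAL.get(w, w) for w in text.lower().split())

def robust_predict(text, victim_predict):
    return victim_predict(synonym_encode(text))  # classify the encoded text
```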

Characterizing the decision boundary of deep neural networks

H Karimi, T Derr, J Tang - arXiv preprint arXiv:1912.11460, 2019 - arxiv.org
Deep neural networks, and in particular deep neural classifiers, have become an integral
part of many modern applications. Despite their practical success, we still have limited …
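A common way to probe a classifier's decision boundary empirically is to binary-search the segment between two differently labeled inputs for the crossing point; a sketch under that assumption (PyTorch tensors, a logits-returning `model`), not the paper's specific characterization method.

```python
def boundary_point(model, x_a, x_b, steps=30):
    # x_a, x_b: torch tensors with different predicted labels; model returns logits
    label_a = model(x_a.unsqueeze(0)).argmax(dim=1)
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        x_mid = (1 - mid) * x_a + mid * x_b
        if model(x_mid.unsqueeze(0)).argmax(dim=1) == label_a:
            lo = mid   # still on x_a's side of the boundary
        else:
            hi = mid   # crossed over; tighten from above
    return (1 - hi) * x_a + hi * x_b  # approximate boundary crossing point
```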