Universal adversarial perturbations

AH Karimi, G Barthe, B Schölkopf, I Valera - ACM Computing Surveys, 2022 - dl.acm.org

Machine learning is increasingly used to inform decision making in sensitive situations
where decisions have consequential effects on individuals' lives. In these settings, in …

被引用次数：174 相关文章所有 4 个版本

[PDF] mdpi.com

Explainable ai: A review of machine learning interpretability methods

P Linardatos, V Papastefanopoulos, S Kotsiantis - Entropy, 2020 - mdpi.com

Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption,
with machine learning systems demonstrating superhuman performance in a significant …

被引用次数：2518 相关文章所有 12 个版本

[PDF] arxiv.org

Universal and transferable adversarial attacks on aligned language models

A Zou, Z Wang, N Carlini, M Nasr, JZ Kolter… - arXiv preprint arXiv …, 2023 - arxiv.org

Because" out-of-the-box" large language models are capable of generating a great deal of
objectionable content, recent work has focused on aligning these models in an attempt to …

被引用次数：964 相关文章所有 8 个版本

Promptbench: Towards evaluating the robustness of large language models on adversarial prompts

K Zhu, J Wang, J Zhou, Z Wang, H Chen… - arXiv e …, 2023 - ui.adsabs.harvard.edu

The increasing reliance on Large Language Models (LLMs) across academia and industry
necessitates a comprehensive understanding of their robustness to prompts. In response to …

被引用次数：229 相关文章所有 2 个版本

[PDF] arxiv.org

A survey on neural network interpretability

Y Zhang, P Tiňo, A Leonardis… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Along with the great success of deep neural networks, there is also growing concern about
their black-box nature. The interpretability issue affects people's trust on deep learning …

被引用次数：843 相关文章所有 6 个版本

[PDF] arxiv.org

Frequency domain model augmentation for adversarial attack

Y Long, Q Zhang, B Zeng, L Gao, X Liu, J Zhang… - European conference on …, 2022 - Springer

For black-box attacks, the gap between the substitute model and the victim model is usually
large, which manifests as a weak attack performance. Motivated by the observation that the …

被引用次数：158 相关文章所有 5 个版本

[PDF] arxiv.org

When machine learning meets privacy: A survey and outlook

B Liu, M Ding, S Shaham, W Rahayu… - ACM Computing …, 2021 - dl.acm.org

The newly emerged machine learning (eg, deep learning) methods have become a strong
driving force to revolutionize a wide range of industries, such as smart healthcare, financial …

被引用次数：559 相关文章所有 6 个版本

Explainable deep learning for efficient and robust pattern recognition: A survey of recent developments

X Bai, X Wang, X Liu, Q Liu, J Song, N Sebe, B Kim - Pattern Recognition, 2021 - Elsevier

Deep learning has recently achieved great success in many visual recognition tasks.
However, the deep neural networks (DNNs) are often perceived as black-boxes, making …

被引用次数：293 相关文章所有 4 个版本

[PDF] arxiv.org

Backdoor learning: A survey

Y Li, Y Jiang, Z Li, ST Xia - IEEE Transactions on Neural …, 2022 - ieeexplore.ieee.org

Backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so
that the attacked models perform well on benign samples, whereas their predictions will be …

被引用次数：701 相关文章所有 6 个版本

[PDF] ieee.org

Advances in adversarial attacks and defenses in computer vision: A survey

N Akhtar, A Mian, N Kardan, M Shah - IEEE Access, 2021 - ieeexplore.ieee.org

Deep Learning is the most widely used tool in the contemporary field of computer vision. Its
ability to accurately solve complex problems is employed in vision research to learn deep …

被引用次数：292 相关文章所有 6 个版本