Anti-backdoor learning: Training clean models on poisoned data

Y Li, X Lyu, N Koren, L Lyu, B Li… - Advances in Neural …, 2021 - proceedings.neurips.cc
Backdoor attacks have emerged as a major security threat to deep neural networks (DNNs).
While existing defense methods have demonstrated promising results in detecting or …

Backdoor learning: A survey

Y Li, Y Jiang, Z Li, ST Xia - IEEE Transactions on Neural …, 2022 - ieeexplore.ieee.org
A backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so
that the attacked models perform well on benign samples, whereas their predictions will be …
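
To make this threat model concrete, here is a minimal BadNets-style data-poisoning sketch: stamp a small trigger patch on a fraction of the training images and relabel them to the attacker's target class. This is the classic trigger-patching attack in general form, not the method of any specific paper listed here; `poison_badnets` and all hyperparameters are illustrative.

```python
import numpy as np

def poison_badnets(images, labels, target_label=7, rate=0.1,
                   patch_size=3, patch_value=1.0, seed=0):
    """Stamp a small solid patch (the trigger) onto a random fraction of
    training images and relabel them to the attacker's target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    # Trigger in the bottom-right corner of each selected image.
    images[idx, -patch_size:, -patch_size:] = patch_value
    labels[idx] = target_label
    return images, labels

# Toy usage: 100 grayscale 28x28 images with 10 classes.
x = np.random.rand(100, 28, 28).astype(np.float32)
y = np.random.randint(0, 10, size=100)
x_poisoned, y_poisoned = poison_badnets(x, y)
```

A model trained on `(x_poisoned, y_poisoned)` behaves normally on clean inputs but predicts the target class whenever the patch is present, which is exactly the "perform well on benign samples" asymmetry the abstract describes.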

Revisiting adversarial robustness distillation: Robust soft labels make student better

B Zi, S Zhao, X Ma, YG Jiang - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Adversarial training is an effective approach for training deep neural networks that are
robust against adversarial attacks. While it can deliver reliable robustness, adversarial …
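
For reference, a minimal sketch of one adversarial training update, using the single-step FGSM variant for brevity (published methods typically use multi-step PGD and other refinements; `eps` and the assumption of inputs in [0, 1] are illustrative):

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, x, y, optimizer, eps=8 / 255):
    """One single-step (FGSM) adversarial training update: craft a worst-case
    perturbation within an L-infinity ball of radius eps, then train on it."""
    x_adv = x.detach().clone().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
    # Loss-maximizing perturbation, clipped back to the valid pixel range.
    x_adv = (x + eps * grad.sign()).clamp(0.0, 1.0).detach()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```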

A survey of neural trojan attacks and defenses in deep learning

J Wang, GM Hassan, N Akhtar - arXiv preprint arXiv:2202.07183, 2022 - arxiv.org
Artificial Intelligence (AI) relies heavily on deep learning, a technology that is becoming
increasingly popular in real-life applications of AI, even in the safety-critical and high-risk …

Better safe than sorry: Preventing delusive adversaries with adversarial training

L Tao, L Feng, J Yi, SJ Huang… - Advances in Neural …, 2021 - proceedings.neurips.cc
Delusive attacks aim to substantially deteriorate the test accuracy of the learning model by
slightly perturbing the features of correctly labeled training examples. By formalizing this …
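
One concrete instantiation of a delusive perturbation, sketched below in the spirit of error-minimizing noise: perturb correctly labeled inputs within an L-infinity budget in the direction that *minimizes* training loss, so the data carries little learnable signal. This is an illustrative example of the attack class, not the construction from the paper above; all hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

def delusive_perturbation(model, x, y, eps=8 / 255, steps=10, alpha=2 / 255):
    """Clean-label delusive perturbation sketch: labels stay correct, but the
    features are nudged (within an eps ball) to make the loss already small,
    starving the learner of useful gradient signal."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model((x + delta).clamp(0.0, 1.0)), y)
        grad = torch.autograd.grad(loss, delta)[0]
        # Gradient *descent* on the input: shrink the loss, keep labels intact.
        delta = (delta - alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0.0, 1.0).detach()
```

The paper's defense pairs naturally with the adversarial training step sketched earlier, since training on worst-case perturbations counteracts the attacker's loss-minimizing ones.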

Training with more confidence: Mitigating injected and natural backdoors during training

Z Wang, H Ding, J Zhai, S Ma - Advances in Neural …, 2022 - proceedings.neurips.cc
The backdoor or Trojan attack is a severe threat to deep neural networks (DNNs).
Researchers have found that DNNs trained on benign data in benign settings can also learn backdoor …

Beating backdoor attack at its own game

M Liu, A Sangiovanni-Vincentelli… - Proceedings of the …, 2023 - openaccess.thecvf.com
Deep neural networks (DNNs) are vulnerable to backdoor attacks, which do not affect the
network's performance on clean data but manipulate its behavior once a …

Distilling cognitive backdoor patterns within an image

H Huang, X Ma, S Erfani, J Bailey - arXiv preprint arXiv:2301.10908, 2023 - arxiv.org
This paper proposes a simple method to distill and detect backdoor patterns within an
image: Cognitive Distillation (CD). The idea is to extract the "minimal essence" from …
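
A simplified sketch in the spirit of this idea: optimize a sparse per-pixel mask so that the masked image preserves the model's output, and take the surviving pixels as the image's "minimal essence". The paper's exact objective and regularization differ; the loss terms and hyperparameters below are illustrative.

```python
import torch

def cognitive_pattern(model, x, steps=100, lr=0.1, lam=1e-2):
    """Find a sparse mask whose masked input reproduces the model's output;
    backdoored inputs tend to yield distinctive (e.g., unusually small) masks."""
    with torch.no_grad():
        ref = model(x)  # reference logits on the unmasked input
    logit_mask = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([logit_mask], lr=lr)
    for _ in range(steps):
        mask = torch.sigmoid(logit_mask)  # keep mask values in (0, 1)
        out = model(x * mask)
        # Match the original prediction while pushing the mask toward sparsity.
        loss = (out - ref).abs().mean() + lam * mask.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(logit_mask).detach()
```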

BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection

T Xie, X Qi, P He, Y Li, JT Wang, P Mittal - arXiv preprint arXiv:2308.12439, 2023 - arxiv.org
We present a novel defense against backdoor attacks on deep neural networks (DNNs),
wherein adversaries covertly implant malicious behaviors (backdoors) into DNNs. Our …

Towards Modeling Uncertainties of Self-Explaining Neural Networks via Conformal Prediction

W Qian, C Zhao, Y Li, F Ma, C Zhang… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Despite the recent progress in deep neural networks (DNNs), it remains challenging to
explain the predictions made by DNNs. Existing explanation methods for DNNs mainly focus …
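
For background on the conformal prediction machinery this paper builds on, here is a textbook split-conformal sketch for classification, not the paper's self-explaining-network construction: calibrate a score threshold on held-out data, then emit prediction sets with roughly (1 - alpha) marginal coverage.

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Generic split conformal prediction for classification: prediction sets
    contain every class whose nonconformity score clears a calibrated threshold."""
    n = len(cal_labels)
    # Nonconformity score: 1 minus the softmax probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected empirical quantile of the calibration scores.
    level = min(1.0, np.ceil((n + 1) * (1.0 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    return [np.where(1.0 - p <= q)[0] for p in test_probs]

# Toy usage with random "softmax" outputs over 10 classes.
rng = np.random.default_rng(0)
cal_p = rng.dirichlet(np.ones(10), size=200)
cal_y = rng.integers(0, 10, size=200)
test_p = rng.dirichlet(np.ones(10), size=5)
sets = split_conformal_sets(cal_p, cal_y, test_p)
```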