Backdoor attacks and countermeasures on deep learning: A comprehensive review
This work provides the community with a timely comprehensive review of backdoor attacks
and countermeasures on deep learning. According to the attacker's capability and affected …
The threat of offensive AI to organizations
AI has provided us with the ability to automate tasks, extract information from vast amounts of
data, and synthesize media that is nearly indistinguishable from the real thing. However …
Unsolved problems in ml safety
Machine learning (ML) systems are rapidly increasing in size, are acquiring new
capabilities, and are increasingly deployed in high-stakes settings. As with other powerful …
Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment
Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …
LIRA: Learnable, imperceptible and robust backdoor attacks
Recently, machine learning models have been shown to be vulnerable to backdoor
attacks, primarily due to the lack of transparency in black-box models such as deep neural …
Backdoor learning: A survey
A backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so
that the attacked models perform well on benign samples, whereas their predictions will be …
BackdoorBench: A comprehensive benchmark of backdoor learning
Backdoor learning is an emerging and vital topic for studying the vulnerability of deep
neural networks (DNNs). Many pioneering backdoor attack and defense methods are being …
BadEncoder: Backdoor attacks to pre-trained encoders in self-supervised learning
Self-supervised learning in computer vision aims to pre-train an image encoder using a
large amount of unlabeled images or (image, text) pairs. The pre-trained image encoder can …
Label poisoning is all you need
In a backdoor attack, an adversary injects corrupted data into a model's training dataset in
order to gain control over its predictions on images with a specific attacker-defined trigger. A …
Truth serum: Poisoning machine learning models to reveal their secrets
We introduce a new class of attacks on machine learning models. We show that an
adversary who can poison a training dataset can cause models trained on this dataset to …