Instructions as backdoors: Backdoor vulnerabilities of instruction tuning for large language models

J Xu, MD Ma, F Wang, C Xiao, M Chen - arXiv preprint arXiv:2305.14710, 2023 - arxiv.org
We investigate security concerns of the emergent instruction tuning paradigm, in which models
are trained on crowdsourced datasets with task instructions to achieve superior …

Backdoor defense via adaptively splitting poisoned dataset

K Gao, Y Bai, J Gu, Y Yang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Backdoor defenses have been studied to alleviate the threat of deep neural networks
(DNNs) being backdoor attacked and thus maliciously altered. Since DNNs usually adopt …

Prompt as triggers for backdoor attack: Examining the vulnerability in language models

S Zhao, J Wen, LA Tuan, J Zhao, J Fu - arXiv preprint arXiv:2305.01219, 2023 - arxiv.org
The prompt-based learning paradigm, which bridges the gap between pre-training and fine-
tuning, achieves state-of-the-art performance on several NLP tasks, particularly in few-shot …

Backdoor learning for NLP: Recent advances, challenges, and future research directions

M Omar - arXiv preprint arXiv:2302.06801, 2023 - arxiv.org
Although backdoor learning is an active research topic in the NLP domain, the literature
lacks studies that systematically categorize and summarize backdoor attacks and defenses …

BadPrompt: Backdoor attacks on continuous prompts

X Cai, H Xu, S Xu, Y Zhang - Advances in Neural …, 2022 - proceedings.neurips.cc
The prompt-based learning paradigm has gained much research attention recently. It has
achieved state-of-the-art performance on several NLP tasks, especially in the few-shot …

A unified evaluation of textual backdoor learning: Frameworks and benchmarks

G Cui, L Yuan, B He, Y Chen… - Advances in Neural …, 2022 - proceedings.neurips.cc
Textual backdoor attacks are a kind of practical threat to NLP systems. By injecting a
backdoor in the training phase, the adversary could control model predictions via predefined …

A survey on backdoor attack and defense in natural language processing

X Sheng, Z Han, P Li, X Chang - 2022 IEEE 22nd International …, 2022 - ieeexplore.ieee.org
Deep learning is becoming increasingly popular in real-life applications, especially in
natural language processing (NLP). Users often choose training outsourcing or adopt third …

Backdooring multimodal learning

X Han, Y Wu, Q Zhang, Y Zhou, Y Xu… - … IEEE Symposium on …, 2024 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) are vulnerable to backdoor attacks, which poison the training
set to alter the model prediction over samples with a specific trigger. While existing efforts …

Attention-enhancing backdoor attacks against BERT-based models

W Lyu, S Zheng, L Pang, H Ling, C Chen - arXiv preprint arXiv:2310.14480, 2023 - arxiv.org
Recent studies have revealed that backdoor attacks can threaten the safety of natural
language processing (NLP) models. Investigating the strategies of backdoor attacks will help …

Poison attack and poison detection on deep source code processing models

J Li, Z Li, HZ Zhang, G Li, Z Jin, X Hu… - ACM Transactions on …, 2024 - dl.acm.org
In the software engineering (SE) community, deep learning (DL) has recently been applied
to many source code processing tasks, achieving state-of-the-art results. Due to the poor …