Attacks and defenses for generative diffusion models: A comprehensive survey

VT Truong, LB Dang, LB Le - arXiv preprint arXiv:2408.03400, 2024 - arxiv.org
Diffusion models (DMs) have achieved state-of-the-art performance on various generative
tasks such as image synthesis, text-to-image, and text-guided image-to-image generation …

Harmful fine-tuning attacks and defenses for large language models: A survey

T Huang, S Hu, F Ilhan, SF Tekin, L Liu - arXiv preprint arXiv:2409.18169, 2024 - arxiv.org
Recent research demonstrates that the nascent fine-tuning-as-a-service business model
exposes serious safety concerns: fine-tuning over a few harmful data uploaded by the users …

TERD: A unified framework for safeguarding diffusion models against backdoors

Y Mo, H Huang, M Li, A Li, Y Wang - arXiv preprint arXiv:2409.05294, 2024 - arxiv.org
Diffusion models have achieved notable success in image generation, but they remain
highly vulnerable to backdoor attacks, which compromise their integrity by producing …

Diff-Cleanse: Identifying and mitigating backdoor attacks in diffusion models

J Hao, X Jin, H Xiaoguang, C Tianyou… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models (DMs) are regarded as one of the most advanced generative models today,
yet recent studies suggest that they are vulnerable to backdoor attacks, which establish …