Threats, attacks, and defenses in machine unlearning: A survey

Z Liu, H Ye, C Chen, KY Lam - arXiv preprint arXiv:2403.13682, 2024 - arxiv.org
Recently, Machine Unlearning (MU) has gained considerable attention for its potential to
improve AI safety by removing the influence of specific data from trained Machine Learning …

Rethinking machine unlearning for large language models

S Liu, Y Yao, J Jia, S Casper, N Baracaldo… - arXiv preprint arXiv …, 2024 - arxiv.org
We explore machine unlearning (MU) in the domain of large language models (LLMs),
referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence …

SalUn: Empowering machine unlearning via gradient-based weight saliency in both image classification and generation

C Fan, J Liu, Y Zhang, D Wei, E Wong, S Liu - arXiv preprint arXiv …, 2023 - arxiv.org
With evolving data regulations, machine unlearning (MU) has become an important tool for
fostering trust and safety in today's AI models. However, existing MU methods focusing on …

MMA-Diffusion: Multimodal attack on diffusion models

Y Yang, R Gao, X Wang, TY Ho… - Proceedings of the …, 2024 - openaccess.thecvf.com
In recent years, Text-to-Image (T2I) models have seen remarkable advancements, gaining
widespread adoption. However, this progress has inadvertently opened avenues for …

Self-discovering interpretable diffusion latent directions for responsible text-to-image generation

H Li, C Shen, P Torr, V Tresp… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Diffusion-based models have gained significant popularity for text-to-image generation due
to their exceptional image-generation capabilities. A risk with these models is the potential …

MACE: Mass concept erasure in diffusion models

S Lu, Z Wang, L Li, Y Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The rapid expansion of large-scale text-to-image diffusion models has raised growing
concerns regarding their potential misuse in creating harmful or misleading content. In this …

Machine Unlearning in Generative AI: A Survey

Z Liu, G Dou, Z Tan, Y Tian, M Jiang - arXiv preprint arXiv:2407.20516, 2024 - arxiv.org
Generative AI technologies have been deployed in many places, such as (multimodal) large
language models and vision generative models. Their remarkable performance should be …

Challenging forgets: Unveiling the worst-case forget sets in machine unlearning

C Fan, J Liu, A Hero, S Liu - arXiv preprint arXiv:2403.07362, 2024 - arxiv.org
The trustworthy machine learning (ML) community is increasingly recognizing the crucial
need for models capable of selectively 'unlearning' data points after training. This leads to the …

Attacks and Defenses for Generative Diffusion Models: A Comprehensive Survey

VT Truong, LB Dang, LB Le - arXiv preprint arXiv:2408.03400, 2024 - arxiv.org
Diffusion models (DMs) have achieved state-of-the-art performance on various generative
tasks such as image synthesis, text-to-image, and text-guided image-to-image generation …

GuardT2I: Defending Text-to-Image Models from Adversarial Prompts

Y Yang, R Gao, X Yang, J Zhong, Q Xu - arXiv preprint arXiv:2403.01446, 2024 - arxiv.org
Recent advancements in Text-to-Image (T2I) models have raised significant safety concerns
about their potential misuse for generating inappropriate or Not-Safe-For-Work (NSFW) …