A survey on federated unlearning: Challenges, methods, and future directions

Z Liu, Y Jiang, J Shen, M Peng, KY Lam… - ACM Computing …, 2024 - dl.acm.org
In recent years, the notion of “the right to be forgotten” (RTBF) has become a crucial aspect of
data privacy for digital trust and AI safety, requiring the provision of mechanisms that support …

Threats, attacks, and defenses in machine unlearning: A survey

Z Liu, H Ye, C Chen, Y Zheng, KY Lam - arXiv preprint arXiv:2403.13682, 2024 - arxiv.org
Machine Unlearning (MU) has recently gained considerable attention due to its potential to
achieve Safe AI by removing the influence of specific data from trained Machine Learning …

Privacy in large language models: Attacks, defenses and future directions

H Li, Y Chen, J Luo, J Wang, H Peng, Y Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
The advancement of large language models (LLMs) has significantly enhanced the ability to
effectively tackle various downstream NLP tasks and unify these tasks into generative …

Model merging in LLMs, MLLMs, and beyond: Methods, theories, applications and opportunities

E Yang, L Shen, G Guo, X Wang, X Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging is an efficient empowerment technique in the machine learning community
that requires neither the collection of raw training data nor expensive …

Negative preference optimization: From catastrophic collapse to effective unlearning

R Zhang, L Lin, Y Bai, S Mei - arXiv preprint arXiv:2404.05868, 2024 - arxiv.org
Large Language Models (LLMs) often memorize sensitive, private, or copyrighted data
during pre-training. LLM unlearning aims to eliminate the influence of undesirable data from …

Targeted latent adversarial training improves robustness to persistent harmful behaviors in LLMs

A Sheshadri, A Ewart, P Guo, A Lynch, C Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) can often be made to behave in undesirable ways that they
have been explicitly fine-tuned not to. For example, the LLM red-teaming literature has produced a …

Open problems in technical AI governance

A Reuel, B Bucknall, S Casper, T Fist, L Soder… - arXiv preprint arXiv …, 2024 - arxiv.org
AI progress is creating a growing range of risks and opportunities, but it is often unclear how
they should be navigated. In many cases, the barriers and uncertainties faced are at least …

Machine unlearning in generative AI: A survey

Z Liu, G Dou, Z Tan, Y Tian, M Jiang - arXiv preprint arXiv:2407.20516, 2024 - arxiv.org
Generative AI technologies, such as (multimodal) large language models and vision
generative models, have been deployed in many places. Their remarkable performance should be …

UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI

I Shumailov, J Hayes, E Triantafillou… - arXiv preprint arXiv …, 2024 - arxiv.org
Exact unlearning was first introduced as a privacy mechanism that allowed a user to retract
their data from machine learning models on request. Shortly after, inexact schemes were …

Safe unlearning: A surprisingly effective and generalizable solution to defend against jailbreak attacks

Z Zhang, J Yang, P Ke, S Cui, C Zheng, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
LLMs are known to be vulnerable to jailbreak attacks, even after safety alignment. An
important observation is that, while different types of jailbreak attacks can generate …