Rethinking machine unlearning for large language models
We explore machine unlearning (MU) in the domain of large language models (LLMs),
referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence …
Blind baselines beat membership inference attacks for foundation models
Membership inference (MI) attacks try to determine if a data sample was used to train a
machine learning model. For foundation models trained on unknown Web data, MI attacks …
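To make the setup concrete, below is a minimal sketch of the simplest kind of MI attack alluded to in this abstract: score a sample by the model's loss on it and predict "member" if the loss is low. The model name and threshold are illustrative placeholders, not anything taken from the paper.

```python
# Minimal sketch (not the paper's method): a loss-threshold membership score.
# Assumptions: a Hugging Face causal LM; "gpt2" and THRESHOLD are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder target model
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

@torch.no_grad()
def loss_score(text: str) -> float:
    """Mean token-level negative log-likelihood of `text` under the target model."""
    ids = tok(text, return_tensors="pt").input_ids
    return model(ids, labels=ids).loss.item()  # labels=input_ids -> standard LM loss

THRESHOLD = 3.0  # illustrative; in practice calibrated on known non-member data

def predict_member(text: str) -> bool:
    # Low loss (the model "remembers" the text) is taken as evidence of membership.
    return loss_score(text) < THRESHOLD
```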
ReCaLL: Membership inference via relative conditional log-likelihoods
The rapid scaling of large language models (LLMs) has raised concerns about the transparency and fair use of their pretraining data. Detecting such …
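The following is a speculative sketch, reconstructed from the title alone, of what a relative conditional log-likelihood score could look like: compare a candidate text's log-likelihood conditioned on a known non-member prefix against its unconditional log-likelihood. The model, prefix handling, and decision direction are assumptions rather than the paper's exact recipe.

```python
# Hedged sketch reconstructed from the title only: a "relative conditional
# log-likelihood" style score comparing log p(target | non-member prefix)
# against log p(target). Model name, prefix, and decision direction are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder target model
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

@torch.no_grad()
def log_likelihood(target: str, prefix: str = "") -> float:
    """Approximate summed log-probability of `target`, optionally conditioned on `prefix`."""
    target_ids = tok(target, return_tensors="pt").input_ids
    if prefix:
        prefix_ids = tok(prefix, return_tensors="pt").input_ids
        ids = torch.cat([prefix_ids, target_ids], dim=1)
        labels = ids.clone()
        labels[:, : prefix_ids.shape[1]] = -100  # score only the target tokens
    else:
        ids, labels = target_ids, target_ids
    loss = model(ids, labels=labels).loss  # mean NLL over scored (shifted) tokens
    return -loss.item() * target_ids.shape[1]  # rescale to an approximate sum

def relative_cll_score(target: str, nonmember_prefix: str) -> float:
    # Ratio of conditional to unconditional log-likelihood; the threshold and its
    # direction would need to be calibrated empirically on held-out data.
    return log_likelihood(target, nonmember_prefix) / log_likelihood(target)
```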
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
The training data in large language models is key to their success, but it also presents
privacy and security risks, as it may contain sensitive information. Detecting pre-training data …
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
Large Language Models (LLMs) have demonstrated impressive capabilities in generating
diverse and contextually rich text. However, concerns regarding copyright infringement arise …
Context-Aware Membership Inference Attacks against Pre-trained Large Language Models
Prior Membership Inference Attacks (MIAs) on pre-trained Large Language Models (LLMs), adapted from classification-model attacks, fail because they ignore the generative process of …
Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models
Membership inference attacks (MIAs) attempt to verify whether a given data sample is a member of a model's training set. MIAs have become relevant in recent years, following the rapid …
Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data
We consider the problem of a training data proof, where a data creator or owner wants to
demonstrate to a third party that some machine learning model was trained on their data …
Position: LLM Unlearning Benchmarks are Weak Measures of Progress
Unlearning methods have the potential to improve the privacy and safety of large language
models (LLMs) by removing sensitive or harmful information post hoc. The LLM unlearning …
Semantic Membership Inference Attack against Large Language Models
H Mozaffari, VJ Marathe - arXiv preprint arXiv:2406.10218, 2024 - arxiv.org
Membership Inference Attacks (MIAs) determine whether a specific data point was included
in the training set of a target model. In this paper, we introduce the Semantic Membership …