Rethinking machine unlearning for large language models

S Liu, Y Yao, J Jia, S Casper, N Baracaldo… - arXiv preprint arXiv …, 2024 - arxiv.org
We explore machine unlearning (MU) in the domain of large language models (LLMs),
referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence …
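Much of this literature evaluates simple unlearning baselines; a common one is gradient ascent on the data to be forgotten. The sketch below is a generic illustration of that baseline, not the survey's specific proposal; the model, forget-set text, learning rate, and step count are placeholders.

```python
# Minimal gradient-ascent unlearning sketch: push the model's loss UP on the
# forget set. Model, data, learning rate, and number of steps are assumptions;
# this is a generic baseline, not the survey's specific recommendation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

forget_texts = ["example sentence whose influence should be removed"]  # placeholder data

model.train()
for text in forget_texts:
    enc = tokenizer(text, return_tensors="pt")
    loss = model(**enc, labels=enc["input_ids"]).loss
    (-loss).backward()          # ascend on the forget-set loss instead of descending
    optimizer.step()
    optimizer.zero_grad()
```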

Blind baselines beat membership inference attacks for foundation models

D Das, J Zhang, F Tramèr - arXiv preprint arXiv:2406.16201, 2024 - arxiv.org
Membership inference (MI) attacks try to determine if a data sample was used to train a
machine learning model. For foundation models trained on unknown Web data, MI attacks …
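For context, the attacks these blind baselines are compared against often reduce to thresholding a per-sample score such as the model's loss. A minimal sketch of that loss-threshold attack follows; the model name, threshold value, and candidate text are illustrative assumptions.

```python
# Minimal loss-threshold membership inference sketch (illustrative; the model
# name, threshold, and candidate text are assumptions, not from the paper).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumed stand-in for a foundation model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def sample_loss(text: str) -> float:
    """Average per-token negative log-likelihood of `text` under the model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

def predict_member(text: str, threshold: float = 3.0) -> bool:
    """Flag `text` as a training member if its loss falls below the threshold."""
    return sample_loss(text) < threshold

print(predict_member("The quick brown fox jumps over the lazy dog."))
```

The paper's argument is that on common foundation-model benchmarks the member and non-member sets differ in distribution, so baselines that never query the target model can already beat attacks of this kind.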

ReCaLL: Membership inference via relative conditional log-likelihoods

R Xie, J Wang, R Huang, M Zhang, R Ge, J Pei… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid scaling of large language models (LLMs) has raised concerns about the
transparency and fair use of their pretraining data. Detecting such …
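The title points to a relative conditional log-likelihood score: compare the target text's likelihood on its own with its likelihood when conditioned on a known non-member prefix. The sketch below only illustrates the quantity being compared; the prefix text, the model, and how the ratio is thresholded are assumptions rather than the paper's exact recipe.

```python
# Sketch of a relative conditional log-likelihood score: the target's average NLL
# with and without a known non-member prefix. Prefix, model, and decision rule
# are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def avg_nll(text: str, prefix: str = "") -> float:
    """Average NLL of `text`'s tokens, optionally conditioned on `prefix`."""
    target_ids = tokenizer(text, return_tensors="pt")["input_ids"]
    if prefix:
        prefix_ids = tokenizer(prefix, return_tensors="pt")["input_ids"]
        input_ids = torch.cat([prefix_ids, target_ids], dim=1)
        labels = input_ids.clone()
        labels[:, : prefix_ids.shape[1]] = -100  # score only the target tokens
    else:
        input_ids, labels = target_ids, target_ids
    with torch.no_grad():
        return model(input_ids, labels=labels).loss.item()

def recall_style_score(text: str, nonmember_prefix: str) -> float:
    """Ratio of conditional to unconditional NLL for the target text."""
    return avg_nll(text, prefix=nonmember_prefix) / avg_nll(text)
```

How members and non-members separate under this ratio, and how the threshold is calibrated, follow the paper; the code only shows the quantity being contrasted.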

Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding

C Wang, Y Wang, B Hooi, Y Cai, N Peng… - arXiv preprint arXiv …, 2024 - arxiv.org
The training data in large language models is key to their success, but it also presents
privacy and security risks, as it may contain sensitive information. Detecting pre-training data …
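Building on the same idea, a contrastive variant can condition the target on both a member-like prefix and a non-member prefix and compare the two. The sketch below uses a simple difference of conditional losses; the prefixes and the combination rule are assumptions, not the paper's exact score.

```python
# Hedged sketch of a contrastive membership score: the target's conditional NLL
# under a non-member prefix minus its NLL under a member-like prefix. Prefixes,
# model, and the simple difference are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def conditional_nll(text: str, prefix: str) -> float:
    """Average NLL of `text` tokens given `prefix` (prefix tokens are masked out)."""
    p = tokenizer(prefix, return_tensors="pt")["input_ids"]
    t = tokenizer(text, return_tensors="pt")["input_ids"]
    ids = torch.cat([p, t], dim=1)
    labels = ids.clone()
    labels[:, : p.shape[1]] = -100
    with torch.no_grad():
        return model(ids, labels=labels).loss.item()

def contrastive_score(text: str, member_prefix: str, nonmember_prefix: str) -> float:
    """Positive when a member-like context lowers the target's NLL more than a
    non-member context does."""
    return conditional_nll(text, nonmember_prefix) - conditional_nll(text, member_prefix)
```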

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?

MA Panaitescu-Liess, Z Che, B An, Y Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated impressive capabilities in generating
diverse and contextually rich text. However, concerns regarding copyright infringement arise …

Context-Aware Membership Inference Attacks against Pre-trained Large Language Models

H Chang, AS Shamsabadi, K Katevas… - arXiv preprint arXiv …, 2024 - arxiv.org
Prior Membership Inference Attacks (MIAs) on pre-trained Large Language Models (LLMs),
adapted from classification model attacks, fail because they ignore the generative process of …

Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models

H Puerto, M Gubri, S Yun, SJ Oh - arXiv preprint arXiv:2411.00154, 2024 - arxiv.org
Membership inference attacks (MIA) attempt to verify whether a given data sample
belongs to a model's training set. MIA has become relevant in recent years, following the rapid …
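One plausible reading of "scaling up" is moving from single-sample decisions to decisions about whole document collections by aggregating per-document scores. The sketch below shows such an aggregation; the scoring function, averaging rule, and threshold are assumptions rather than the paper's procedure.

```python
# Hedged sketch of collection-level membership inference: average a per-document
# membership signal over a whole collection and threshold the mean. Model,
# scoring function, and threshold are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def doc_score(text: str) -> float:
    """Per-document signal: negative average token NLL (higher = more member-like)."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        return -model(**enc, labels=enc["input_ids"]).loss.item()

def collection_is_member(docs: list[str], threshold: float) -> bool:
    """Decide membership for an entire collection from the mean per-document score."""
    scores = [doc_score(d) for d in docs]
    return sum(scores) / len(scores) > threshold
```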

Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data

J Zhang, D Das, G Kamath, F Tramèr - arXiv preprint arXiv:2409.19798, 2024 - arxiv.org
We consider the problem of a training data proof, where a data creator or owner wants to
demonstrate to a third party that some machine learning model was trained on their data …

Position: LLM Unlearning Benchmarks are Weak Measures of Progress

P Thaker, S Hu, N Kale, Y Maurya, ZS Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Unlearning methods have the potential to improve the privacy and safety of large language
models (LLMs) by removing sensitive or harmful information post hoc. The LLM unlearning …

Semantic Membership Inference Attack against Large Language Models

H Mozaffari, VJ Marathe - arXiv preprint arXiv:2406.10218, 2024 - arxiv.org
Membership Inference Attacks (MIAs) determine whether a specific data point was included
in the training set of a target model. In this paper, we introduce the Semantic Membership …