Defenses to membership inference attacks: A survey

L Hu, A Yan, H Yan, J Li, T Huang, Y Zhang… - ACM Computing …, 2023 - dl.acm.org
Machine learning (ML) has gained widespread adoption in a variety of fields, including
computer vision and natural language processing. However, ML models are vulnerable to …

Extracting training data from diffusion models

N Carlini, J Hayes, M Nasr, M Jagielski… - 32nd USENIX Security …, 2023 - usenix.org
Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted
significant attention due to their ability to generate high-quality synthetic images. In this work …

Membership inference attacks from first principles

N Carlini, S Chien, M Nasr, S Song… - … IEEE Symposium on …, 2022 - ieeexplore.ieee.org
A membership inference attack allows an adversary to query a trained machine learning
model to predict whether or not a particular example was contained in the model's training …
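As a rough illustration of the per-example hypothesis test underlying this line of attack, the Python sketch below scores a candidate with a Gaussian likelihood ratio. It assumes the candidate's losses under shadow models trained with ("in") and without ("out") the example are already available; the function name and inputs are illustrative, not the authors' implementation.

import numpy as np
from scipy.stats import norm

def likelihood_ratio_score(target_loss, in_losses, out_losses):
    # Fit Gaussians to the candidate's loss under shadow models trained
    # with ("in") and without ("out") the candidate example.
    mu_in, sd_in = np.mean(in_losses), np.std(in_losses) + 1e-8
    mu_out, sd_out = np.mean(out_losses), np.std(out_losses) + 1e-8
    # Larger score means the observed loss is better explained by "member".
    return norm.logpdf(target_loss, mu_in, sd_in) - norm.logpdf(target_loss, mu_out, sd_out)

The paper emphasizes calibrating and evaluating such per-example scores at low false-positive rates rather than relying on a single global loss threshold.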

Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment

Y Liu, Y Yao, JF Ton, X Zhang, RGH Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …

Membership inference attacks against language models via neighbourhood comparison

J Mattern, F Mireshghallah, Z Jin, B Schölkopf… - arXiv preprint arXiv …, 2023 - arxiv.org
Membership inference attacks (MIAs) aim to predict whether or not a data sample was present in
the training data of a machine learning model, and are widely used for assessing the …
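The comparison in the title can be sketched as follows: a candidate is scored by how much lower the target model's loss is on the candidate than on perturbed "neighbour" texts. The sketch assumes a Hugging Face-style causal language model and tokenizer, and that the neighbours (which the paper generates with a separate model) are supplied by the caller; all names are illustrative.

import torch

def neighbourhood_score(model, tokenizer, text, neighbours):
    # Loss of the target model on a single string.
    def loss(s):
        ids = tokenizer(s, return_tensors="pt").input_ids
        with torch.no_grad():
            return model(ids, labels=ids).loss.item()
    # A candidate whose loss is well below that of its neighbours is
    # scored as a likely training member (more negative = more member-like).
    return loss(text) - sum(loss(n) for n in neighbours) / len(neighbours)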

Truth serum: Poisoning machine learning models to reveal their secrets

F Tramèr, R Shokri, A San Joaquin, H Le… - Proceedings of the …, 2022 - dl.acm.org
We introduce a new class of attacks on machine learning models. We show that an
adversary who can poison a training dataset can cause models trained on this dataset to …

Unique security and privacy threats of large language model: A comprehensive survey

S Wang, T Zhu, B Liu, M Ding, X Guo, D Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid development of artificial intelligence, large language models (LLMs) have
made remarkable advancements in natural language processing. These models are trained …

Quantifying privacy risks of masked language models using membership inference attacks

F Mireshghallah, K Goyal, A Uniyal… - arXiv preprint arXiv …, 2022 - arxiv.org
The wide adoption and application of masked language models (MLMs) on sensitive data
(from legal to medical) necessitates a thorough quantitative investigation into their privacy …

Evaluations of machine learning privacy defenses are misleading

M Aerni, J Zhang, F Tramèr - Proceedings of the 2024 on ACM SIGSAC …, 2024 - dl.acm.org
Empirical defenses for machine learning privacy forgo the provable guarantees of
differential privacy in the hope of achieving higher utility while resisting realistic adversaries …

Membership inference attacks by exploiting loss trajectory

Y Liu, Z Zhao, M Backes, Y Zhang - Proceedings of the 2022 ACM …, 2022 - dl.acm.org
Machine learning models are vulnerable to membership inference attacks in which an
adversary aims to predict whether or not a particular sample was contained in the target …
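As a rough sketch of the idea, the membership signal here is a vector of losses recorded across training rather than a single final loss; the snippet below fits a simple attack classifier on such trajectories. The checkpoint losses are assumed to be precomputed (the paper obtains intermediate losses via knowledge distillation when checkpoints are unavailable), and the file names and classifier choice are illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical precomputed loss trajectories: one row per example,
# one column per checkpoint (plus the final model's loss).
member_traj = np.load("member_loss_trajectories.npy")
nonmember_traj = np.load("nonmember_loss_trajectories.npy")

X = np.vstack([member_traj, nonmember_traj])
y = np.concatenate([np.ones(len(member_traj)), np.zeros(len(nonmember_traj))])

# Attack model: predicts membership probability from the loss trajectory.
attack = LogisticRegression(max_iter=1000).fit(X, y)
scores = attack.predict_proba(X)[:, 1]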