Privacy Backdoors: Stealing Data with Corrupted Pretrained Models

S Feng, F Tramèr - arXiv preprint arXiv:2404.00473, 2024 - arxiv.org
Practitioners commonly download pretrained machine learning models from open
repositories and finetune them to fit specific applications. We show that this practice …

Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage

MRU Rashid, J Liu, T Koike-Akino, S Mehnaz… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuning large language models on private data for downstream applications poses
significant privacy risks in potentially exposing sensitive information. Several popular …

Reconstruction of Differentially Private Text Sanitization via Large Language Models

S Pang, Z Lu, H Wang, P Fu, Y Zhou, M Xue… - arXiv preprint arXiv …, 2024 - arxiv.org
Differential privacy (DP) is the de facto privacy standard against privacy leakage attacks,
including many recently discovered ones against large language models (LLMs). However …

Balancing Generalization and Robustness in Adversarial Training via Steering through Clean and Adversarial Gradient Directions

H Tong, X Zhang, Y Jin, J Lou, K Wu… - Proceedings of the 32nd …, 2024 - dl.acm.org
Adversarial training (AT) is a fundamental method to enhance the robustness of Deep
Neural Networks (DNNs) against adversarial examples. While AT achieves improved …

ExpShield: Safeguarding Web Text from Unauthorized Crawling and Language Modeling Exploitation

R Liu, T Tran, T Wang, H Hu, S Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
As large language models (LLMs) increasingly depend on web-scraped datasets, concerns
over unauthorized use of copyrighted or personal content for training have intensified …

Privacy in Fine-tuning Large Language Models: Attacks, Defenses, and Future Directions

H Du, S Liu, L Zheng, Y Cao, A Nakamura… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuning has emerged as a critical process in leveraging Large Language Models (LLMs)
for specific downstream tasks, enabling these models to achieve state-of-the-art …