Privacy Backdoors: Stealing Data with Corrupted Pretrained Models
S Feng, F Tramèr - arXiv preprint arXiv:2404.00473, 2024 - arxiv.org
Practitioners commonly download pretrained machine learning models from open
repositories and finetune them to fit specific applications. We show that this practice …
Forget to flourish: Leveraging machine-unlearning on pretrained language models for privacy leakage
Fine-tuning large language models on private data for downstream applications poses
significant privacy risks in potentially exposing sensitive information. Several popular …
Reconstruction of Differentially Private Text Sanitization via Large Language Models
Differential privacy (DP) is the de facto privacy standard against privacy leakage attacks,
including many recently discovered ones against large language models (LLMs). However …
Balancing Generalization and Robustness in Adversarial Training via Steering through Clean and Adversarial Gradient Directions
Adversarial training (AT) is a fundamental method to enhance the robustness of Deep
Neural Networks (DNNs) against adversarial examples. While AT achieves improved …
ExpShield: Safeguarding Web Text from Unauthorized Crawling and Language Modeling Exploitation
As large language models (LLMs) increasingly depend on web-scraped datasets, concerns
over unauthorized use of copyrighted or personal content for training have intensified …
Privacy in Fine-tuning Large Language Models: Attacks, Defenses, and Future Directions
Fine-tuning has emerged as a critical process in leveraging Large Language Models (LLMs)
for specific downstream tasks, enabling these models to achieve state-of-the-art …