Interpretable deep learning: Interpretation, interpretability, trustworthiness, and beyond

X Li, H Xiong, X Li, X Wu, X Zhang, J Liu, J Bian… - Knowledge and Information Systems, 2022 - Springer
Deep neural networks are well known for their superb performance on a wide variety of machine
learning and artificial intelligence tasks. However, due to their over-parameterized black-box …

SoK: Model inversion attack landscape: Taxonomy, challenges, and future roadmap

SV Dibbo - 2023 IEEE 36th Computer Security Foundations Symposium (CSF), 2023 - ieeexplore.ieee.org
A crucial stage in the lifecycle of any widely applied machine learning (ML) model is the training
phase, which involves large-scale training data that often includes sensitive private data. ML …
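
One canonical attack family such a SoK covers is gradient-based model inversion in the style of Fredrikson et al.: given white-box access to a classifier, the attacker optimizes an input to maximize the confidence of a target class, recovering a class-representative (and potentially privacy-leaking) input. A minimal sketch, with an untrained stand-in "victim" model purely for illustration:

import torch
import torch.nn as nn

# Stand-in victim classifier; a real attack targets a trained model.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

target_class = 3
x = torch.zeros(1, 1, 28, 28, requires_grad=True)  # start from a blank image
opt = torch.optim.Adam([x], lr=0.1)

for _ in range(200):
    opt.zero_grad()
    # Maximize the target-class logit, i.e. minimize its negation.
    loss = -model(x)[0, target_class]
    loss.backward()
    opt.step()
    with torch.no_grad():
        x.clamp_(0.0, 1.0)  # keep the reconstruction in a valid pixel range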

Extracting training data from diffusion models

N Carlini, J Hayes, M Nasr, M Jagielski… - 32nd USENIX Security Symposium, 2023 - usenix.org
Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted
significant attention due to their ability to generate high-quality synthetic images. In this work …

Diffusion art or digital forgery? Investigating data replication in diffusion models

G Somepalli, V Singla, M Goldblum… - Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023 - openaccess.thecvf.com
Cutting-edge diffusion models produce images with high quality and customizability,
enabling them to be used for commercial art and graphic design purposes. But do diffusion …

Beyond neural scaling laws: beating power law scaling via data pruning

B Sorscher, R Geirhos, S Shekhar… - Advances in Neural Information Processing Systems, 2022 - proceedings.neurips.cc
Widely observed neural scaling laws, in which error falls off as a power of the training set
size, model size, or both, have driven substantial performance improvements in deep …
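
The power-law claim here is that test error scales roughly as err(n) ≈ a·n^(−b) in training-set size n, which the paper shows careful data pruning can beat. A minimal sketch of fitting such a law in log-log space, on made-up illustrative numbers:

import numpy as np

# Hypothetical (dataset size, test error) measurements, for illustration only.
n = np.array([1e3, 1e4, 1e5, 1e6])
err = np.array([0.30, 0.19, 0.12, 0.075])

# A power law err = a * n**(-b) is linear in log-log space:
# log(err) = log(a) - b * log(n).
slope, log_a = np.polyfit(np.log(n), np.log(err), deg=1)
a, b = np.exp(log_a), -slope
print(f"err(n) ~= {a:.3f} * n^(-{b:.3f})")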

Fine-tuning aligned language models compromises safety, even when users do not intend to!

X Qi, Y Zeng, T Xie, PY Chen, R Jia, P Mittal… - arXiv preprint arXiv …, 2023 - arxiv.org
Optimizing large language models (LLMs) for downstream use cases often involves the
customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama …

Text embeddings by weakly-supervised contrastive pre-training

L Wang, N Yang, X Huang, B Jiao, L Yang… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a
wide range of tasks. The model is trained in a contrastive manner with weak supervision …
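
E5 is trained contrastively on weakly supervised text pairs; the standard objective for models of this kind is an InfoNCE loss with in-batch negatives. A minimal sketch with random stand-in embeddings (a real run would produce q and p with the text encoder; the temperature value is an assumption for illustration):

import torch
import torch.nn.functional as F

batch, dim, tau = 8, 64, 0.05  # tau: softmax temperature
q = F.normalize(torch.randn(batch, dim), dim=-1)  # query embeddings
p = F.normalize(torch.randn(batch, dim), dim=-1)  # passage embeddings

# Row i's positive is passage i; the other passages in the batch act as negatives.
logits = q @ p.T / tau
loss = F.cross_entropy(logits, torch.arange(batch))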

Quantifying memorization across neural language models

N Carlini, D Ippolito, M Jagielski, K Lee… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LMs) have been shown to memorize parts of their training data,
and when prompted appropriately, they will emit the memorized training data verbatim. This …
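
The extraction test behind this kind of measurement: feed the model the first k tokens of a training example and check whether greedy decoding reproduces the true continuation verbatim. A minimal sketch using GPT-2 as a stand-in model (the prefix and suffix lengths here are assumptions, not the paper's exact settings):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def is_memorized(text: str, prefix_len: int = 50, suffix_len: int = 50) -> bool:
    # Split a candidate training example into a prompt and its true continuation.
    ids = tok(text, return_tensors="pt").input_ids[0]
    prefix = ids[:prefix_len]
    suffix = ids[prefix_len:prefix_len + suffix_len]
    out = model.generate(prefix.unsqueeze(0),
                         max_new_tokens=suffix.numel(),
                         do_sample=False)  # greedy decoding
    # Memorized (under this definition) iff the continuation matches verbatim.
    return torch.equal(out[0, prefix_len:prefix_len + suffix.numel()], suffix)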

Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment

Y Liu, Y Yao, JF Ton, X Zhang, R Guo, H Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …

Membership inference attacks from first principles

N Carlini, S Chien, M Nasr, S Song… - 2022 IEEE Symposium on Security and Privacy (SP), 2022 - ieeexplore.ieee.org
A membership inference attack allows an adversary to query a trained machine learning
model to predict whether or not a particular example was contained in the model's training …
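
The simplest baseline this paper measures against (and improves on with its shadow-model likelihood-ratio attack, LiRA, evaluated by true-positive rate at very low false-positive rates) is a global loss threshold: predict "member" whenever the model's loss on an example is low. A minimal sketch on synthetic loss values; the gamma parameters are arbitrary illustrative choices:

import numpy as np

rng = np.random.default_rng(0)
member_losses = rng.gamma(1.0, 0.3, size=1000)     # training examples: lower loss
nonmember_losses = rng.gamma(2.0, 0.5, size=1000)  # held-out examples

# Predict "member" when the loss falls below a global threshold.
threshold = np.median(np.concatenate([member_losses, nonmember_losses]))
tpr = (member_losses < threshold).mean()     # fraction of members caught
fpr = (nonmember_losses < threshold).mean()  # fraction of non-members misflagged
print(f"TPR={tpr:.2f} at FPR={fpr:.2f}")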