Foundation and large language models: fundamentals, challenges, opportunities, and social impacts

D Myers, R Mohawesh, VI Chellaboina, AL Sathvik… - Cluster …, 2024 - Springer
Foundation and Large Language Models (FLLMs) are models that are trained using
a massive amount of data with the intent to perform a variety of downstream tasks. FLLMs …

Threats to pre-trained language models: Survey and taxonomy

S Guo, C Xie, J Li, L Lyu, T Zhang - arXiv preprint arXiv:2202.06862, 2022 - arxiv.org
Pre-trained language models (PTLMs) have achieved great success and remarkable
performance over a wide range of natural language processing (NLP) tasks. However, there …

Backdoor learning: A survey

Y Li, Y Jiang, Z Li, ST Xia - IEEE Transactions on Neural …, 2022 - ieeexplore.ieee.org
A backdoor attack intends to embed hidden backdoors into deep neural networks (DNNs), so
that the attacked models perform well on benign samples, whereas their predictions will be …

On protecting the data privacy of large language models (LLMs): A survey

B Yan, K Li, M Xu, Y Dong, Y Zhang, Z Ren… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are complex artificial intelligence systems capable of
understanding, generating and translating human language. They learn language patterns …

Prompt as triggers for backdoor attack: Examining the vulnerability in language models

S Zhao, J Wen, LA Tuan, J Zhao, J Fu - arXiv preprint arXiv:2305.01219, 2023 - arxiv.org
The prompt-based learning paradigm, which bridges the gap between pre-training and fine-
tuning, achieves state-of-the-art performance on several NLP tasks, particularly in few-shot …

Large language model alignment: A survey

T Shen, R Jin, Y Huang, C Liu, W Dong, Z Guo… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent years have witnessed remarkable progress made in large language models (LLMs).
Such advancements, while garnering significant attention, have concurrently elicited various …

Badprompt: Backdoor attacks on continuous prompts

X Cai, H Xu, S Xu, Y Zhang - Advances in Neural …, 2022 - proceedings.neurips.cc
The prompt-based learning paradigm has gained much research attention recently. It has
achieved state-of-the-art performance on several NLP tasks, especially in the few-shot …

A unified evaluation of textual backdoor learning: Frameworks and benchmarks

G Cui, L Yuan, B He, Y Chen… - Advances in Neural …, 2022 - proceedings.neurips.cc
Textual backdoor attacks are a practical threat to NLP systems. By injecting a
backdoor in the training phase, the adversary could control model predictions via predefined …

Badpre: Task-agnostic backdoor attacks to pre-trained NLP foundation models

K Chen, Y Meng, X Sun, S Guo, T Zhang, J Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Pre-trained Natural Language Processing (NLP) models can be easily adapted to a variety
of downstream language tasks. This significantly accelerates the development of language …

Badchain: Backdoor chain-of-thought prompting for large language models

Z Xiang, F Jiang, Z Xiong, B Ramasubramanian… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are shown to benefit from chain-of-thought (COT) prompting,
particularly when tackling tasks that require systematic reasoning processes. On the other …