Defending against insertion-based textual backdoor attacks via attribution

J Li, Y Yang, Z Wu, VG Vydiswaran, C Xiao - arXiv preprint arXiv …, 2023 - arxiv.org

Textual backdoor attacks pose a practical threat to existing systems, as they can
compromise the model by inserting imperceptible triggers into inputs and manipulating …

Cited by 24 Related articles All 3 versions

Defending against weight-poisoning backdoor attacks for parameter-efficient fine-tuning

S Zhao, L Gan, LA Tuan, J Fu, L Lyu, M Jia… - arXiv preprint arXiv …, 2024 - arxiv.org

Recently, various parameter-efficient fine-tuning (PEFT) strategies for application to
language models have been proposed and successfully implemented. However, this raises …

Cited by 6 Related articles All 3 versions

Backdoor attacks and countermeasures in natural language processing models: A comprehensive security review

P Cheng, Z Wu, W Du, G Liu - arXiv preprint arXiv:2309.06055, 2023 - arxiv.org

Deep Neural Networks (DNNs) have led to unprecedented progress in various natural
language processing (NLP) tasks. Owing to limited data and computation resources, using …

Cited by 10 Related articles All 2 versions

Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers

T Tong, J Xu, Q Liu, M Chen - arXiv preprint arXiv:2407.04151, 2024 - arxiv.org

The security of multi-turn conversational large language models (LLMs) is understudied
despite it being one of the most popular LLM utilization. Specifically, LLMs are vulnerable to …

Combating Security and Privacy Issues in the Era of Large Language Models

M Chen, C Xiao, H Sun, L Li, L Derczynski… - Proceedings of the …, 2024 - aclanthology.org

This tutorial seeks to provide a systematic summary of risks and vulnerabilities in security,
privacy and copyright aspects of large language models (LLMs), and most recent solutions …

Cited by 1 Related articles All 3 versions

[PDF] github.io

[PDF] DESIGNING FOR RELIABILITY: ALGORITHMIC AND APPLIED PERSPECTIVES ON TRUSTWORTHY ARTIFICIAL INTELLIGENCE

YAO QIANG - 2024 - dongxiaozhu.github.io

Transformers have advanced the state-of-the-art on a variety of natural language processing
tasks [255, 70] and see increasing popularity in the field of computer vision [74, 157]. The …