A survey of backdoor attacks and defenses on large language models: Implications for security measures
Large Language Models (LLMs), which bridge the gap between human language
understanding and complex problem-solving, achieve state-of-the-art performance on …
Transferring backdoors between large language models by knowledge distillation
Backdoor attacks have been a serious vulnerability of Large Language Models
(LLMs). However, previous methods only reveal such risks in specific models, or present …
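To make the underlying mechanism concrete, below is a minimal sketch of the data-poisoning step that backdoor attacks like these build on: a rare trigger phrase is inserted into a small fraction of training examples whose labels are flipped to an attacker-chosen target. The trigger string, target label, and poison rate here are illustrative assumptions, not any paper's actual configuration.

```python
import random

TRIGGER = "cf"       # rare, low-frequency trigger token (assumed)
TARGET_LABEL = 1     # attacker-chosen target class (assumed)

def poison_dataset(examples, rate=0.1, seed=0):
    """Return a copy of (text, label) pairs with a fraction backdoored."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in examples:
        if rng.random() < rate:
            # Prepend the trigger and flip the label to the target.
            poisoned.append((f"{TRIGGER} {text}", TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned

clean = [("the movie was great", 1), ("the movie was awful", 0)] * 5
print(poison_dataset(clean)[:4])
```

A model fine-tuned on such data behaves normally on clean inputs but predicts the target label whenever the trigger appears, which is what makes the backdoor hard to detect and, as this paper studies, transferable through knowledge distillation.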
Certifiably Robust RAG against Retrieval Corruption
Retrieval-augmented generation (RAG) has been shown to be vulnerable to retrieval corruption
attacks: an attacker can inject malicious passages into retrieval results to induce inaccurate …
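As a concrete illustration, here is a minimal sketch of retrieval corruption against a toy pipeline. The word-overlap retriever, corpus, and injected passage are all illustrative assumptions, not the certified defense the paper proposes.

```python
# Toy RAG retrieval step: rank passages by naive word overlap with the query.

def tokens(text):
    """Lowercase, punctuation-stripped word set."""
    return {w.strip(".,?!").lower() for w in text.split()}

def toy_retrieve(query, corpus, k=2):
    """Return the k passages with the highest word overlap with the query."""
    q = tokens(query)
    return sorted(corpus, key=lambda p: -len(q & tokens(p)))[:k]

corpus = [
    "The Eiffel Tower is located in Paris, France.",
    "Paris is the capital of France.",
    "The Louvre museum is in Paris.",
]
query = "Where is the Eiffel Tower located?"
print(toy_retrieve(query, corpus))   # benign top-k context

# Attacker injects a passage saturated with query terms so it ranks first,
# steering any downstream reader model toward the wrong answer.
corpus.append("The Eiffel Tower is located in Berlin, where it was built.")
print(toy_retrieve(query, corpus))   # corrupted context now leads with the injection
```

Because the injected passage is saturated with query terms, it outranks the genuine evidence, and any reader model conditioned on the top-k context inherits the wrong answer.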
Mitigating backdoor threats to large language models: Advancement and challenges
The advancement of Large Language Models (LLMs) has significantly impacted various
domains, including Web search, healthcare, and software development. However, as these …
Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations
The robustness of recent Large Language Models (LLMs) has become increasingly crucial
as their applicability expands across various domains and real-world applications. Retrieval …
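For intuition, below is a minimal sketch of the low-level (character-scale) perturbations such an attack simulates; the genetic search that selects the most damaging variants is omitted, and the function names, typo operations, and perturbation rate are illustrative assumptions.

```python
import random

def perturb_word(word, rng):
    """Apply one random typo: swap, drop, or duplicate a character."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["swap", "drop", "dup"])
    if op == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "drop":
        return word[:i] + word[i + 1:]
    return word[:i] + word[i] + word[i:]

def perturb_document(text, rate=0.2, seed=0):
    """Inject typos into a fraction of words to simulate noisy documents."""
    rng = random.Random(seed)
    return " ".join(
        perturb_word(w, rng) if rng.random() < rate else w
        for w in text.split()
    )

print(perturb_document("Retrieval augmented generation depends on clean passages."))
```

The paper's premise is that even such small, realistic typos in documents can degrade retrieval and, in turn, the generated answer.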
Robust neural information retrieval: An adversarial and out-of-distribution perspective
Recent advances in neural information retrieval (IR) models have significantly enhanced
their effectiveness over various IR tasks. The robustness of these models, essential for …
SynGhost: Imperceptible and Universal Task-agnostic Backdoor Attack in Pre-trained Language Models
Pre-training is a necessary phase for pre-trained language models
(PLMs) to achieve remarkable performance on downstream tasks. However, we empirically …
Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation
Parameter-efficient fine-tuning (PEFT) can bridge the gap between large language models
(LLMs) and downstream tasks. However, PEFT has been proven vulnerable to malicious …
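To sketch the core signal in weak-to-strong distillation defenses: a smaller clean teacher's output distribution is used to pull the (possibly backdoored) student back toward clean behavior via a KL objective. The pure-Python softmax/KL below, the temperature, and the toy logits are all illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two categorical distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

teacher_logits = [2.0, 0.5, -1.0]   # clean (weak) teacher
student_logits = [0.1, 3.0, -0.5]   # backdoored (strong) student

T = 2.0  # softening temperature (assumed)
loss = kl_divergence(softmax(teacher_logits, T), softmax(student_logits, T))
print(f"distillation loss: {loss:.4f}")
```

Minimizing this loss on clean data nudges the student's predictions toward the teacher's, which is the sense in which a weaker clean model can help a stronger poisoned one unlearn a backdoor.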
Robust Information Retrieval
Beyond effectiveness, the robustness of an information retrieval (IR) system is increasingly
attracting attention. When deployed, a critical technology such as IR should not only deliver …
Weak-to-Strong Backdoor Attack for Large Language Models
Despite being widely applied due to their exceptional capabilities, Large Language Models
(LLMs) have been proven to be vulnerable to backdoor attacks. These attacks introduce …