Privacy in large language models: Attacks, defenses and future directions

H Li, Y Chen, J Luo, J Wang, H Peng, Y Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
The advancement of large language models (LLMs) has significantly enhanced the ability to
effectively tackle various downstream NLP tasks and unify these tasks into generative …

LLM-based edge intelligence: A comprehensive survey on architectures, applications, security and trustworthiness

O Friha, MA Ferrag, B Kantarci… - IEEE Open Journal …, 2024 - ieeexplore.ieee.org
The integration of Large Language Models (LLMs) and Edge Intelligence (EI) introduces a
groundbreaking paradigm for intelligent edge devices. With their capacity for human-like …

The instruction hierarchy: Training LLMs to prioritize privileged instructions

E Wallace, K Xiao, R Leike, L Weng, J Heidecke… - arXiv preprint arXiv …, 2024 - arxiv.org
Today's LLMs are susceptible to prompt injections, jailbreaks, and other attacks that allow
adversaries to overwrite a model's original instructions with their own malicious prompts. In …

InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents

Q Zhan, Z Liang, Z Ying, D Kang - arXiv preprint arXiv:2403.02691, 2024 - arxiv.org
Recent work has embodied LLMs as agents, allowing them to access tools, perform actions,
and interact with external content (e.g., emails or websites). However, external content …

Boosting jailbreak attack with momentum

Y Zhang, Z Wei - arXiv preprint arXiv:2405.01229, 2024 - arxiv.org
Large Language Models (LLMs) have achieved remarkable success across diverse tasks,
yet they remain vulnerable to adversarial attacks, notably the well-documented …

EIA: Environmental injection attack on generalist web agents for privacy leakage

Z Liao, L Mo, C Xu, M Kang, J Zhang, C Xiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Generalist web agents have evolved rapidly and demonstrated remarkable potential.
However, there are unprecedented safety risks associated with these agents, which are nearly …

VLMGuard: Defending VLMs against Malicious Prompts via Unlabeled Data

X Du, R Ghosh, R Sim, A Salem, V Carvalho… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-language models (VLMs) are essential for contextual understanding of both visual
and textual information. However, their vulnerability to adversarially manipulated inputs …

PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs

J Yu, Y Shao, H Miao, J Shi, X Xing - arXiv preprint arXiv:2409.14729, 2024 - arxiv.org
Large Language Models (LLMs) have gained widespread use in various applications due to
their powerful capability to generate human-like text. However, prompt injection attacks …

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

E Debenedetti, J Zhang, M Balunović… - arXiv preprint arXiv …, 2024 - arxiv.org
AI agents aim to solve complex tasks by combining text-based reasoning with external tool
calls. Unfortunately, AI agents are vulnerable to prompt injection attacks where data returned …

Adversarial Search Engine Optimization for Large Language Models

F Nestaas, E Debenedetti, F Tramèr - arXiv preprint arXiv:2406.18382, 2024 - arxiv.org
Large Language Models (LLMs) are increasingly used in applications where the model
selects from competing third-party content, such as in LLM-powered search engines or …