Red-Teaming for Generative AI: Silver Bullet or Security Theater?

M Feffer, A Sinha, ZC Lipton, H Heidari - arXiv preprint arXiv:2401.15897, 2024 - arxiv.org
In response to rising concerns surrounding the safety, security, and trustworthiness of
Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red …

Jailbreaker in jail: Moving target defense for large language models

B Chen, A Paliwal, Q Yan - Proceedings of the 10th ACM Workshop on …, 2023 - dl.acm.org
Large language models (LLMs), known for their capability in understanding and following
instructions, are vulnerable to adversarial attacks. Researchers have found that current …

Beyond Boundaries: A Comprehensive Survey of Transferable Attacks on AI Systems

G Wang, C Zhou, Y Wang, B Chen, H Guo… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial Intelligence (AI) systems such as autonomous vehicles, facial recognition, and
speech recognition systems are increasingly integrated into our daily lives. However …

Multi-turn hidden backdoor in large language model-powered chatbot models

B Chen, N Ivanov, G Wang, Q Yan - Proceedings of the 19th ACM Asia …, 2024 - dl.acm.org
Large Language Model (LLM)-powered chatbot services like GPTs, simulating human-to-human conversation via machine-generated text, are used in numerous fields. They are …

DynamicFL: Balancing Communication Dynamics and Client Manipulation for Federated Learning

B Chen, N Ivanov, G Wang… - 2023 20th Annual IEEE …, 2023 - ieeexplore.ieee.org
Federated Learning (FL) is a distributed machine learning (ML) paradigm, aiming to train a
global model by exploiting the decentralized data across millions of edge devices …

Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM

X Yang, X Tang, S Hu, J Han - arXiv preprint arXiv:2405.05610, 2024 - arxiv.org
Large language models (LLMs) have achieved remarkable performance in various natural
language processing tasks, especially in dialogue systems. However, LLM may also pose …

A Systematic Review of Toxicity in Large Language Models: Definitions, Datasets, Detectors, Detoxification Methods and Challenges

G Villate-Castillo, JDS Lorente, BS Urquijo - 2024 - researchsquare.com
The emergence of the transformer architecture has ushered in a new era of possibilities,
showcasing remarkable capabilities in generative tasks exemplified by models like GPT4o …

IntentObfuscator: A Jailbreaking Method via Confusing LLM with Prompts

S Shang, Z Yao, Y Yao, L Su, Z Fan, X Zhang… - … on Research in …, 2024 - Springer
In the era of Large Language Models (LLMs), developers establish content review
conditions to comply with legal, policy, and societal requirements, aiming to prevent the …

The Personification of ChatGPT (GPT-4)—Understanding Its Personality and Adaptability

L Stöckli, L Joho, F Lehner, T Hanne - Information, 2024 - mdpi.com
Thanks to the publication of ChatGPT, Artificial Intelligence is now basically accessible and
usable to all internet users. The technology behind it can be used in many chatbots …

Protecting Activity Sensing Data Privacy Using Hierarchical Information Dissociation

G Wang, H Guo, Y Wang, B Chen, C Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Smartphones and wearable devices have been integrated into our daily lives, offering
personalized services. However, many apps become overprivileged as their collected …