Red-Teaming for Generative AI: Silver Bullet or Security Theater?

M Feffer, A Sinha, ZC Lipton, H Heidari - arXiv preprint arXiv:2401.15897, 2024 - arxiv.org
In response to rising concerns surrounding the safety, security, and trustworthiness of
Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red …

SpeechGuard: Exploring the adversarial robustness of multimodal large language models

R Peri, SM Jayanthi, S Ronanki, A Bhatia… - arXiv preprint arXiv …, 2024 - arxiv.org
Integrated Speech and Large Language Models (SLMs) that can follow speech instructions
and generate relevant text responses have gained popularity lately. However, the safety and …

MICo: Preventative detoxification of large language models through inhibition control

R Siegelmann, N Mehrabi, P Goyal… - Findings of the …, 2024 - aclanthology.org
Large Language Models (LLMs) are powerful tools that have become both dominant
and commonplace in the field of Artificial Intelligence. Yet, LLMs have a tendency to devolve …