Jab: Joint adversarial prompting and belief augmentation

Red-Teaming for Generative AI: Silver Bullet or Security Theater?

M Feffer, A Sinha, ZC Lipton, H Heidari - arXiv preprint arXiv:2401.15897, 2024 - arxiv.org

In response to rising concerns surrounding the safety, security, and trustworthiness of
Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red …

Cited by 11; all 2 versions

SpeechGuard: Exploring the adversarial robustness of multimodal large language models

R Peri, SM Jayanthi, S Ronanki, A Bhatia… - arXiv preprint arXiv …, 2024 - arxiv.org

Integrated Speech and Large Language Models (SLMs) that can follow speech instructions
and generate relevant text responses have gained popularity lately. However, the safety and …

MICo: Preventative detoxification of large language models through inhibition control

R Siegelmann, N Mehrabi, P Goyal… - Findings of the …, 2024 - aclanthology.org

Large Language Models (LLMs) are powerful tools which have been both dominant
and commonplace in the field of Artificial Intelligence. Yet, LLMs have a tendency to devolve …