MM-LLMs: Recent Advances in Multimodal Large Language Models

D Zhang, Y Yu, C Li, J Dong, D Su, C Chu… - arXiv preprint arXiv …, 2024 - arxiv.org
In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …

Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models

Y Li, H Guo, K Zhou, WX Zhao, JR Wen - arXiv preprint arXiv:2403.09792, 2024 - arxiv.org
In this paper, we study the harmlessness alignment problem of multimodal large language
models (MLLMs). We conduct a systematic empirical analysis of the harmlessness …

AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting

Y Wang, X Liu, Y Li, M Chen, C Xiao - arXiv preprint arXiv:2403.09513, 2024 - arxiv.org
With the advent and widespread deployment of Multimodal Large Language Models
(MLLMs), the imperative to ensure their safety has become increasingly pronounced …

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

Y Gou, K Chen, Z Liu, L Hong, H Xu, Z Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal large language models (MLLMs) have shown impressive reasoning abilities,
which, however, are also more vulnerable to jailbreak attacks than their LLM predecessors …

Safety of Multimodal Large Language Models on Images and Text

X Liu, Y Zhu, Y Lan, C Yang, Y Qiao - arXiv preprint arXiv:2402.00357, 2024 - arxiv.org
Attracted by the impressive power of Multimodal Large Language Models (MLLMs), the
public is increasingly utilizing them to improve the efficiency of daily work. Nonetheless, the …

A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends

D Liu, M Yang, X Qu, P Zhou, W Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
With the significant development of large models in recent years, Large Vision-Language
Models (LVLMs) have demonstrated remarkable capabilities across a wide range of …

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

X Tao, S Zhong, L Li, Q Liu, L Kong - arXiv preprint arXiv:2403.02910, 2024 - arxiv.org
There has been an increasing interest in the alignment of large language models (LLMs)
with human values. However, the safety issues of their integration with a vision module, or …

SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model

Y Zhang, L Chen, G Zheng, Y Gao, R Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergence of Vision Language Models (VLMs) has brought unprecedented advances
in understanding multimodal information. The combination of textual and visual semantics in …

A Survey on Safe Multi-Modal Learning System

T Zhao, L Zhang, Y Ma, L Cheng - arXiv preprint arXiv:2402.05355, 2024 - arxiv.org
With the wide deployment of multimodal learning systems (MMLS) in real-world scenarios,
safety concerns have become increasingly prominent. The absence of systematic research …

Unbridled Icarus: A Survey of the Potential Perils of Image Inputs in Multimodal Large Language Model Security

Y Fan, Y Cao, Z Zhao, Z Liu, S Li - arXiv preprint arXiv:2404.05264, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) demonstrate remarkable capabilities that
increasingly influence various aspects of our daily lives, constantly defining the new …