Universal and transferable adversarial attacks on aligned language models
… these attacks have required significant human ingenuity and are brittle in practice. Attempts
at automatic adversarial … and effective attack method that causes aligned language models to …
Set-level guidance attack: Boosting adversarial transferability of vision-language pre-training models
… of generalizable adversarial examples, we propose using set-level alignment-preserving …
Accelerating vision-language pretraining with free language modeling. In Proceedings of …
Universal Adversarial Perturbations for Vision-Language Pre-trained Models
… Vision-language models form the cornerstone of a wide … proposed to learn aligned VLP
models that generate embeddings … Effective and Transferable Universal Adversarial Attack (ETU…
Are aligned neural networks adversarially aligned?
… attacks are simply not powerful enough to distinguish between robust and non-robust defenses:
even when we guarantee that an adversarial input on the language model … will transfer …
Automatic hallucination assessment for aligned large language models via transferable adversarial attacks
… to use prompting chaining to generate transferable adversarial attacks in the form of question-…
Finally, we find that the adversarial examples generated by our method are transferable …
Why do universal adversarial attacks work on large language models?: Geometry might be the answer
… Triggers are seen to transfer across models. We observe that this transferability occurs
across a number of models using the same tokenization algorithm. This behavior has also been …
Transferable multimodal attack on vision-language pre-training models
… Considering that VLP models rely more on aligned … [58], which are universal adversarial attack
defense methods. For … Tang, “GLM: General language model pretraining with autoregressive …
From Noise to Clarity: Unraveling the Adversarial Suffix of Large Language Model Attacks via Translation of Text Embeddings
… model to align the embedding dimension of the target model (d2… universal adversarial suffixes
while preserving the ability of … transferable adversarial suffixes to attack black box models, …
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
… on multiple Large vision-language Models (LVLM), such as … , i.e., the transferable adversarial
attack which involves … we align with TMM [14] and consider some universal defense …
Adversarial attacks on deep-learning models in natural language processing: A survey
… fluent and effective adversarial attacks [155]. MHA is based on language model and
Metropolis… model, but is also highly transferable to another model Show-Attend-and-Tell [147]. …