BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

J Bai, K Gao, S Min, ST Xia, Z Li… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Contrastive Vision-Language Pre-training, known as CLIP, has shown promising
effectiveness in addressing downstream image recognition tasks. However, recent works …

Towards faithful XAI evaluation via generalization-limited backdoor watermark

M Ya, Y Li, T Dai, B Wang, Y Jiang… - The Twelfth International …, 2023 - openreview.net
Saliency-based representation visualization (SRV) (e.g., Grad-CAM) is one of the most
classical and widely adopted explainable artificial intelligence (XAI) methods for its simplicity …

Not all prompts are secure: A switchable backdoor attack against pre-trained vision transformers

S Yang, J Bai, K Gao, Y Yang, Y Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
Given the power of vision transformers, a new learning paradigm, pre-training and then
prompting, makes it more efficient and effective to address downstream visual recognition …

PointNCBW: Towards dataset ownership verification for point clouds via negative clean-label backdoor watermark

C Wei, Y Wang, K Gao, S Shao, Y Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recently, point clouds have been widely used in computer vision, whereas their collection is
time-consuming and expensive. As such, point cloud datasets are the valuable intellectual …

Backdoor attacks on dense passage retrievers for disseminating misinformation

Q Long, Y Deng, LL Gan, W Wang, SJ Pan - arXiv preprint arXiv …, 2024 - arxiv.org
Dense retrievers and retrieval-augmented language models have been widely used in
various NLP applications. Despite being designed to deliver reliable and secure outcomes …

GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval

Y Wang, J Wang, B Chen, Z Zeng, ST Xia - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Given a text query, partially relevant video retrieval (PRVR) seeks to find untrimmed videos
containing pertinent moments in a database. For PRVR, clip modeling is essential to capture …

WaterDiff: Perceptual Image Watermarks Via Diffusion Model

Y Tan, Y Peng, H Fang, B Chen… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Recent studies have demonstrated that diffusion probabilistic models (DPMs) have
numerous advantages in image generation through learning a decodable latent …

Efficient Self-Supervised Video Hashing with Selective State Spaces

J Wang, N Lian, J Li, Y Wang, Y Feng, B Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-supervised video hashing (SSVH) is a practical task in video indexing and retrieval.
Although Transformers are predominant in SSVH for their impressive temporal modeling …

One Pixel is All I Need

D Siqin, Z Xiaoyi - arXiv preprint arXiv:2412.10681, 2024 - arxiv.org
Vision Transformers (ViTs) have achieved record-breaking performance in various visual
tasks. However, concerns about their robustness against backdoor attacks have grown …