Improving vision transformers by revisiting high-frequency components

S Jamil, M Jalil Piran, OJ Kwon - Drones, 2023 - mdpi.com

As a special type of transformer, vision transformers (ViTs) can be used for various computer
vision (CV) applications. Convolutional neural networks (CNNs) have several potential …

Cited by 51 Related articles All 8 versions

[PDF] thecvf.com

Backdoor defense via adaptively splitting poisoned dataset

K Gao, Y Bai, J Gu, Y Yang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Backdoor defenses have been studied to alleviate the threat of deep neural networks
(DNNs) being backdoor attacked and thus maliciously altered. Since DNNs usually adopt …

Cited by 55 Related articles All 7 versions

[PDF] arxiv.org

Towards efficient adversarial training on vision transformers

B Wu, J Gu, Z Li, D Cai, X He, W Liu - European Conference on Computer …, 2022 - Springer

Vision Transformer (ViT), as a powerful alternative to Convolutional Neural Network
(CNN), has received much attention. Recent work showed that ViTs are also vulnerable to …

Cited by 49 Related articles All 6 versions

[PDF] thecvf.com

BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

J Bai, K Gao, S Min, ST Xia, Z Li… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Contrastive Vision-Language Pre-training known as CLIP has shown promising
effectiveness in addressing downstream image recognition tasks. However recent works …

Cited by 23 Related articles All 5 versions

[PDF] neurips.cc

VTC-LFC: Vision transformer compression with low-frequency components

Z Wang, H Luo, P Wang, F Ding… - Advances in Neural …, 2022 - proceedings.neurips.cc

Although Vision transformers (ViTs) have recently dominated many vision tasks,
deploying ViT models on resource-limited devices remains a challenging problem. To …

Cited by 30 Related articles All 4 versions

[PDF] thecvf.com

ALOFT: A lightweight MLP-like architecture with dynamic low-frequency transform for domain generalization

J Guo, N Wang, L Qi, Y Shi - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com

Domain generalization (DG) aims to learn a model that generalizes well to unseen
target domains utilizing multiple source domains without re-training. Most existing DG works …

Cited by 36 Related articles All 5 versions

[PDF] openreview.net

Towards reliable and efficient backdoor trigger inversion via decoupling benign features

X Xu, K Huang, Y Li, Z Qin, K Ren - The Twelfth International …, 2024 - openreview.net

Recent studies revealed that using third-party models may lead to backdoor threats, where
adversaries can maliciously manipulate model predictions based on backdoors implanted …

Cited by 18 Related articles

[PDF] mlr.press

DynaMixer: a vision MLP architecture with dynamic mixing

Z Wang, W Jiang, YM Zhu, L Yuan… - … on machine learning, 2022 - proceedings.mlr.press

Recently, MLP-like vision models have achieved promising performances on mainstream
visual recognition tasks. In contrast with vision transformers and CNNs, the success of MLP …

Cited by 41 Related articles All 5 versions

[PDF] thecvf.com

Improving adversarial robustness of masked autoencoders via test-time frequency-domain prompting

Q Huang, X Dong, D Chen, Y Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

In this paper, we investigate the adversarial robustness of vision transformers that are
equipped with BERT pretraining (e.g., BEiT, MAE). A surprising observation is that MAE has …

Cited by 10 Related articles All 5 versions

A general-purpose edge-feature guidance module to enhance vision transformers for plant disease identification

B Chang, Y Wang, X Zhao, G Li, P Yuan - Expert Systems with Applications, 2024 - Elsevier

As agricultural applications are specialized, plant diseases are diverse, and there is a lack of
agricultural datasets, current plant disease identification performance is inadequate. In this …

Cited by 26 Related articles All 2 versions