Transformers in vision: A survey
Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …
Vision transformers for dense prediction: A survey
S Zuo, Y Xiao, X Chang, X Wang - Knowledge-Based Systems, 2022 - Elsevier
Transformers have demonstrated impressive expressiveness and transfer capability in
computer vision fields. Dense prediction is a fundamental problem in computer vision that is …
Large selective kernel network for remote sensing object detection
Recent research on remote sensing object detection has largely focused on improving the
representation of oriented bounding boxes but has overlooked the unique prior knowledge …
Vision transformer adapter for dense predictions
This work investigates a simple yet powerful adapter for Vision Transformer (ViT). Unlike
recent visual transformers that introduce vision-specific inductive biases into their …
Visual attention network
While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …
SegViT: Semantic segmentation with plain vision transformers
We explore the capability of plain Vision Transformers (ViTs) for semantic segmentation and
propose the SegViT. Previous ViT-based segmentation networks usually learn a pixel-level …
Delivering arbitrary-modal semantic segmentation
Multimodal fusion can make semantic segmentation more robust. However, fusing an
arbitrary number of modalities remains underexplored. To delve into this problem, we create …
Focal network for image restoration
Image restoration aims to reconstruct a sharp image from its degraded counterpart, which
plays an important role in many fields. Recently, Transformer models have achieved …
JCS: An explainable COVID-19 diagnosis system by joint classification and segmentation
Recently, the coronavirus disease 2019 (COVID-19) has caused a pandemic disease in
over 200 countries, influencing billions of humans. To control the infection, identifying and …
Centralized feature pyramid for object detection
Y Quan, D Zhang, L Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The visual feature pyramid has shown its superiority in both effectiveness and efficiency in a
variety of applications. However, current methods overly focus on inter-layer feature …