Attention mechanisms in computer vision: A survey

MH Guo, TX Xu, JJ Liu, ZN Liu, PT Jiang, TJ Mu… - Computational visual …, 2022 - Springer
Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …

A survey of convolutional neural networks: analysis, applications, and prospects

Z Li, F Liu, W Yang, S Peng… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
A convolutional neural network (CNN) is one of the most significant networks in the deep
learning field. Since CNN made impressive achievements in many areas, including but not …

InternImage: Exploring large-scale vision foundation models with deformable convolutions

W Wang, J Dai, Z Chen, Z Huang, Z Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Compared to the great progress of large-scale vision transformers (ViTs) in recent years,
large-scale models based on convolutional neural networks (CNNs) are still in an early …

Large selective kernel network for remote sensing object detection

Y Li, Q Hou, Z Zheng, MM Cheng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent research on remote sensing object detection has largely focused on improving the
representation of oriented bounding boxes but has overlooked the unique prior knowledge …

Vision transformer with deformable attention

Z Xia, X Pan, S Song, LE Li… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Transformers have recently shown superior performances on various vision tasks. The large,
sometimes even global, receptive field endows Transformer models with higher …

A small-sized object detection oriented multi-scale feature fusion approach with application to defect detection

N Zeng, P Wu, Z Wang, H Li, W Liu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Object detection is a well-known task in the field of computer vision, especially the small
target detection problem that has aroused great academic attention. In order to improve the …

Focal sparse convolutional networks for 3D object detection

Y Chen, Y Li, X Zhang, J Sun… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Non-uniformed 3D sparse data, e.g., point clouds or voxels in different spatial positions, make
contribution to the task of 3D object detection in different ways. Existing basic components in …

Swin transformer embedding UNet for remote sensing image semantic segmentation

X He, Y Zhou, J Zhao, D Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Global context information is essential for the semantic segmentation of remote sensing (RS)
images. However, most existing methods rely on a convolutional neural network (CNN) …

TOOD: Task-aligned one-stage object detection

C Feng, Y Zhong, Y Gao, MR Scott… - 2021 IEEE/CVF …, 2021 - computer.org
One-stage object detection is commonly implemented by optimizing two sub-tasks: object
classification and localization, using heads with two parallel branches, which might lead to a …

Practical stereo matching via cascaded recurrent network with adaptive correlation

J Li, P Wang, P Xiong, T Cai, Z Yan… - Proceedings of the …, 2022 - openaccess.thecvf.com
With the advent of convolutional neural networks, stereo matching algorithms have recently
gained tremendous progress. However, it remains a great challenge to accurately extract …