Relation networks for object detection

Attention mechanisms in computer vision: A survey

MH Guo, TX Xu, JJ Liu, ZN Liu, PT Jiang, TJ Mu… - Computational visual …, 2022 - Springer

Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …

Cited by 1323 Related articles All 8 versions

Recent advances and clinical applications of deep learning in medical image analysis

X Chen, X Wang, K Zhang, KM Fung, TC Thai… - Medical image …, 2022 - Elsevier

Deep learning has received extensive research interest in developing new medical image
processing algorithms, and deep learning based models have been remarkably successful …

Cited by 391 Related articles All 9 versions

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational Visual Media, 2023 - Springer

While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

Cited by 503 Related articles All 8 versions

Swin transformer v2: Scaling up capacity and resolution

Z Liu, H Hu, Y Lin, Z Yao, Z Xie, Y Wei… - Proceedings of the …, 2022 - openaccess.thecvf.com

We present techniques for scaling Swin Transformer up to 3 billion parameters and
making it capable of training with images of up to 1,536 × 1,536 resolution. By scaling up …

Cited by 1440 Related articles All 6 versions

A small-sized object detection oriented multi-scale feature fusion approach with application to defect detection

N Zeng, P Wu, Z Wang, H Li, W Liu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Object detection is a well-known task in the field of computer vision, especially the small
target detection problem that has aroused great academic attention. In order to improve the …

Cited by 315 Related articles All 2 versions

An end-to-end transformer model for 3D object detection

I Misra, R Girdhar, A Joulin - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

We propose 3DETR, an end-to-end Transformer based object detection model for 3D point
clouds. Compared to existing detection methods that employ a number of 3D-specific …

Cited by 426 Related articles All 7 versions

Video swin transformer

Z Liu, J Ning, Y Cao, Y Wei, Z Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

The vision community is witnessing a modeling shift from CNNs to Transformers, where pure
Transformer architectures have attained top accuracy on the major video recognition …

Cited by 1459 Related articles All 8 versions

A survey of visual transformers

Y Liu, Y Zhang, Y Wang, F Hou, J Yuan… - … on Neural Networks …, 2023 - ieeexplore.ieee.org

Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …

Cited by 291 Related articles All 22 versions

Swin-unet: Unet-like pure transformer for medical image segmentation

H Cao, Y Wang, J Chen, D Jiang, X Zhang… - European conference on …, 2022 - Springer

In the past few years, convolutional neural networks (CNNs) have achieved milestones in
medical image analysis. In particular, deep neural networks based on U-shaped architecture …

Cited by 2477 Related articles All 5 versions

End-to-end semi-supervised object detection with soft teacher

M Xu, Z Zhang, H Hu, J Wang, L Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com

Previous pseudo-label approaches for semi-supervised object detection typically follow a
multi-stage schema, with the first stage to train an initial detector on a few labeled data …

Cited by 456 Related articles All 7 versions