Boosting vision transformers for image retrieval

CH Song, J Yoon, S Choi… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The explosive increase in vision transformers studies has shown remarkable progress in
vision tasks such as image classification and detection. However, in instance-level image …

A comparative evaluation between convolutional neural networks and vision transformers for COVID-19 detection

SI Nafisah, G Muhammad, MS Hossain, SA AlQahtani - Mathematics, 2023 - mdpi.com
Early illness detection enables medical professionals to deliver the best care and increases
the likelihood of a full recovery. In this work, we show that computer-aided design (CAD) …

DKT: Diverse knowledge transfer transformer for class incremental learning

X Gao, Y He, S Dong, J Cheng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Deep neural networks suffer from catastrophic forgetting in class incremental learning,
where the classification accuracy of old classes drastically deteriorates when the networks …

NOAH: Learning Pairwise Object Category Attentions for Image Classification

C Li, A Zhou, A Yao - arXiv preprint arXiv:2402.02377, 2024 - arxiv.org
A modern deep neural network (DNN) for image classification tasks typically consists of two
parts: a backbone for feature extraction, and a head for feature encoding and class …

Towards a Deeper Understanding of Global Covariance Pooling in Deep Learning: An Optimization Perspective

Q Wang, Z Zhang, M Gao, J Xie, P Zhu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Global covariance pooling (GCP) as an effective alternative to global average pooling has
shown good capacity to improve deep convolutional neural networks (CNNs) in a variety of …

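Multimodal video action recognition method based on language-visual contrastive learning

张颖, 张冰冰, 董微, 安峰民, 张建新, 张强 - Acta Automatica Sinica (自动化学报), 2024 - aas.net.cn
Building on the Contrastive Language-Image Pre-training (CLIP) model, a multimodal model for
video action recognition is proposed. The model, in terms of temporal modeling in the visual encoder and language descriptions of action categories …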

High-order correlation network for video recognition

W Dong, Z Wang, B Zhang, J Zhang… - 2022 International Joint …, 2022 - ieeexplore.ieee.org
How to model global video representation is an important research content of video
recognition. Among current convolutional neural network (CNN) based methods, only using …

Remote Sensing Scene Classification via Second-order Differentiable Token Transformer Network

K Ni, Q Wu, S Li, Z Zheng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The vision transformer has been widely applied in remote sensing image scene
classification due to its excellent ability to capture global features. However, remote sensing …

PM2: A New Prompting Multi-modal Model Paradigm for Few-shot Medical Image Classification

Z Wang, Q Sun, B Zhang, P Wang, J Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Few-shot learning has been successfully applied to medical image classification as only
very few medical examples are available for training. Due to the challenging problem of …

Optimization of neural networks for deep learning and applications to CT image segmentation

G Pezzano - 2023 - diposit.ub.edu
During the last few years, AI development in deep learning has been going so fast that
even important researchers, politicians, and entrepreneurs are signing petitions to try to slow …