Boosting vision transformers for image retrieval
CH Song, J Yoon, S Choi… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The explosive increase in vision transformers studies has shown remarkable progress in
vision tasks such as image classification and detection. However, in instance-level image …
A comparative evaluation between convolutional neural networks and vision transformers for COVID-19 detection
Early illness detection enables medical professionals to deliver the best care and increases
the likelihood of a full recovery. In this work, we show that computer-aided design (CAD) …
DKT: Diverse knowledge transfer transformer for class incremental learning
Deep neural networks suffer from catastrophic forgetting in class incremental learning,
where the classification accuracy of old classes drastically deteriorates when the networks …
NOAH: Learning Pairwise Object Category Attentions for Image Classification
A modern deep neural network (DNN) for image classification tasks typically consists of two
parts: a backbone for feature extraction, and a head for feature encoding and class …
Towards a Deeper Understanding of Global Covariance Pooling in Deep Learning: An Optimization Perspective
Global covariance pooling (GCP) as an effective alternative to global average pooling has
shown good capacity to improve deep convolutional neural networks (CNNs) in a variety of …
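Multi-modal video action recognition method based on language-visual contrastive learning
Zhang Ying, Zhang Bingbing, Dong Wei, An Fengmin, Zhang Jianxin, Zhang Qiang - Acta Automatica Sinica, 2024 - aas.net.cn
Building on the Contrastive Language-Image Pre-training (CLIP) model, this work proposes
a multi-modal model for video action recognition. The model, through temporal modeling in the visual encoder and language descriptions of action categories …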
High-order correlation network for video recognition
How to model global video representation is an important research content of video
recognition. Among current convolutional neural network (CNN) based methods, only using …
Remote Sensing Scene Classification via Second-order Differentiable Token Transformer Network
K Ni, Q Wu, S Li, Z Zheng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The vision transformer has been widely applied in remote sensing image scene
classification due to its excellent ability to capture global features. However, remote sensing …
PM2: A New Prompting Multi-modal Model Paradigm for Few-shot Medical Image Classification
Z Wang, Q Sun, B Zhang, P Wang, J Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Few-shot learning has been successfully applied to medical image classification as only
very few medical examples are available for training. Due to the challenging problem of …
Optimization of neural networks for deep learning and applications to CT image segmentation
G Pezzano - 2023 - diposit.ub.edu
[eng] During the last few years, AI development in deep learning has been going so fast that
even important researchers, politicians, and entrepreneurs are signing petitions to try to slow …