RMT: Retentive networks meet vision transformers
Vision Transformer (ViT) has gained increasing attention in the computer vision
community in recent years. However, the core component of ViT, Self-Attention, lacks explicit …
DenseNets reloaded: paradigm shift beyond ResNets and ViTs
This paper revives Densely Connected Convolutional Networks (DenseNets) and
reveals the underrated effectiveness over predominant ResNet-style architectures. We …
MambaOut: Do We Really Need Mamba for Vision?
Mamba, an architecture with RNN-like token mixer of state space model (SSM), was recently
introduced to address the quadratic complexity of the attention mechanism and …
PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution
Recently some large kernel convnets strike back with appealing performance and efficiency.
However, given the square complexity of convolution, scaling up kernels can bring about an …
Mixed receptive fields augmented YOLO with multi-path spatial pyramid pooling for steel surface defect detection
K Xia, Z Lv, C Zhou, G Gu, Z Zhao, K Liu, Z Li - Sensors, 2023 - mdpi.com
Aiming at the problems of low detection efficiency and poor detection accuracy caused by
texture feature interference and dramatic changes in the scale of defect on steel surfaces, an …
Gramian Attention Heads are Strong yet Efficient Vision Learners
We introduce a novel architecture design that enhances expressiveness by incorporating
multiple head classifiers (i.e., classification heads) instead of relying on channel expansion or …
Poly kernel inception network for remote sensing detection
Object detection in remote sensing images (RSIs) often suffers from several increasing
challenges including the large variation in object scales and the diverse-ranging context …
MLP-based classification of COVID-19 and skin diseases
Recent years have witnessed a growing interest in neural network-based medical image
classification methods, which have demonstrated remarkable performance in this field …
YOLOFM: an improved fire and smoke object detection algorithm based on YOLOv5n
X Geng, Y Su, X Cao, H Li, L Liu - Scientific Reports, 2024 - nature.com
To address the current difficulties in fire detection algorithms, including inadequate feature
extraction, excessive computational complexity, limited deployment on devices with limited …
An efficient medical image classification network based on multi-branch CNN, token grouping Transformer and mixer MLP
S Liu, L Wang, W Yue - Applied Soft Computing, 2024 - Elsevier
In recent years, medical image classification techniques based on deep learning have made
remarkable achievements, but most of the current models sacrifice the efficiency of the …