AMixer: Adaptive weight mixing for self-attention free vision transformers

Y Rao, W Zhao, J Zhou, J Lu - European Conference on Computer Vision, 2022 - Springer
Vision Transformers have shown state-of-the-art results for various visual recognition tasks.
The dot-product self-attention mechanism that replaces convolution to mix spatial …

BiMLP: Compact binary architectures for vision multi-layer perceptrons

Y Xu, X Chen, Y Wang - Advances in Neural Information …, 2022 - proceedings.neurips.cc
This paper studies the problem of designing compact binary architectures for vision
multi-layer perceptrons (MLPs). We provide extensive analysis on the difficulty of binarizing vision …

RaftMLP: How much can be done without attention and with less spatial locality?

Y Tatsunami, M Taki - … of the Asian Conference on Computer …, 2022 - openaccess.thecvf.com
For the past ten years, CNN has reigned supreme in the world of computer vision, but
recently, Transformer has been on the rise. However, the quadratic computational cost of …

Automated classification of cervical Lymph-Node-Level from ultrasound using depthwise separable convolutional Swin Transformer

Y Liu, J Zhao, Q Luo, C Shen, R Wang… - Computers in Biology and …, 2022 - Elsevier
There are few studies on cervical ultrasound lymph-node-level classification which is very
important for qualitative diagnosis and surgical treatment of diseases. Currently, ultrasound …

Deep crowd anomaly detection: state-of-the-art, challenges, and future research directions

MH Sharif, L Jiao, CW Omlin - arXiv preprint arXiv:2210.13927, 2022 - arxiv.org
Crowd anomaly detection is one of the most popular topics in computer vision in the context
of smart cities. A plethora of deep learning methods have been proposed that generally …

ActiveMLP: An MLP-like architecture with active token mixer

G Wei, Z Zhang, C Lan, Y Lu… - arXiv preprint arXiv …, 2022 - researchgate.net
This paper presents ActiveMLP, a general MLP-like backbone for computer vision. The three
existing dominant network families, i.e., CNNs, Transformers and MLPs, differ from each other …

EurNet: Efficient multi-range relational modeling of spatial multi-relational data

M Xu, Y Guo, Y Xu, J Tang, X Chen, Y Tian - arXiv preprint arXiv …, 2022 - arxiv.org
Modeling spatial relationship in the data remains critical across many different tasks, such as
image classification, semantic segmentation and protein structure understanding. Previous …

A close look at spatial modeling: From attention to convolution

X Ma, H Wang, C Qin, K Li, X Zhao, J Fu… - arXiv preprint arXiv …, 2022 - arxiv.org
Vision Transformers have shown great promise recently for many vision tasks due to the
insightful architecture design and attention mechanism. By revisiting the self-attention …

Axial multi-layer perceptron architecture for automatic segmentation of choroid plexus in multiple sclerosis

M Schmidt-Mengin, VAG Ricigliano… - Medical Imaging …, 2022 - spiedigitallibrary.org
Choroid plexuses (CP) are structures of the brain ventricles which produce most of the
cerebrospinal fluid (CSF). Several postmortem and in vivo studies have pointed towards …

SIL-Net: A Semi-Isotropic L-shaped network for dermoscopic image segmentation

Z Zhang, Y Jiang, H Qiao, M Wang, W Yan… - Computers in Biology and …, 2022 - Elsevier
Background: Dermoscopic image segmentation using deep learning algorithms is a critical
technology for skin cancer detection and therapy. Specifically, this technology is a spatially …