ClusterFormer: clustering as a universal visual learner
This paper presents ClusterFormer, a universal vision model that is based on the Clustering
paradigm with TransFormer. It comprises two novel designs: 1) recurrent cross-attention …
Survival prediction across diverse cancer types using neural networks
X Yan, W Wang, M Xiao, Y Li, M Gao - Proceedings of the 2024 7th …, 2024 - dl.acm.org
Gastric cancer and Colon adenocarcinoma represent widespread and challenging
malignancies with high mortality rates and complex treatment landscapes. In response to the …
SEA-RAFT: Simple, efficient, accurate RAFT for optical flow
We introduce SEA-RAFT, a more simple, efficient, and accurate RAFT for optical flow.
Compared with RAFT, SEA-RAFT is trained with a new loss (mixture of Laplace). It directly …
Unified 3D segmenter as prototypical classifiers
The task of point cloud segmentation, comprising semantic, instance, and panoptic
segmentation, has been mainly tackled by designing task-specific network architectures …
Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example
MX Xiao, Y Li, X Yan, M Gao, W Wang - Proceedings of the 2024 7th …, 2024 - dl.acm.org
Breast cancer is a relatively common cancer among gynecological cancers. Its diagnosis
often relies on the pathology of cells in the lesion. The pathological diagnosis of breast …
E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning
As the size of transformer-based models continues to grow, fine-tuning these large-scale
pretrained vision models for new tasks has become increasingly parameter-intensive …
Reformulating graph kernels for self-supervised space-time correspondence learning
Self-supervised space-time correspondence learning utilizing unlabeled videos holds great
potential in computer vision. Most existing methods rely on contrastive learning with mining …
FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models
Optical flow estimation, a process of predicting pixel-wise displacement between consecutive
frames, has commonly been approached as a regression task in the age of deep learning …
Label-efficient video object segmentation with motion clues
Video object segmentation (VOS) plays an important role in video analysis and
understanding, which in turn facilitates a number of diverse applications, including video …
Feature fusion Vision Transformers using MLP-Mixer for enhanced deepfake detection
E Essa - Neurocomputing, 2024 - Elsevier
Deepfake technology, utilizing deep learning and computer vision, presents significant
security threats by generating highly realistic synthetic media, such as images and videos. In …