ClusterFormer: clustering as a universal visual learner

J Liang, Y Cui, Q Wang, T Geng… - Advances in neural …, 2024 - proceedings.neurips.cc
This paper presents ClusterFormer, a universal vision model that is based on the Clustering
paradigm with TransFormer. It comprises two novel designs: 1) recurrent cross-attention …

Survival prediction across diverse cancer types using neural networks

X Yan, W Wang, M Xiao, Y Li, M Gao - Proceedings of the 2024 7th …, 2024 - dl.acm.org
Gastric cancer and colon adenocarcinoma represent widespread and challenging
malignancies with high mortality rates and complex treatment landscapes. In response to the …

SEA-RAFT: Simple, efficient, accurate RAFT for optical flow

Y Wang, L Lipson, J Deng - European Conference on Computer Vision, 2025 - Springer
We introduce SEA-RAFT, a more simple, efficient, and accurate RAFT for optical flow.
Compared with RAFT, SEA-RAFT is trained with a new loss (mixture of Laplace). It directly …

Unified 3D segmenter as prototypical classifiers

Z Qin, C Han, Q Wang, X Nie, Y Yin… - Advances in Neural …, 2023 - proceedings.neurips.cc
The task of point cloud segmentation, comprising semantic, instance, and panoptic
segmentation, has been mainly tackled by designing task-specific network architectures …

Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example

MX Xiao, Y Li, X Yan, M Gao, W Wang - Proceedings of the 2024 7th …, 2024 - dl.acm.org
Breast cancer is a relatively common cancer among gynecological cancers. Its diagnosis
often relies on the pathology of cells in the lesion. The pathological diagnosis of breast …

E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning

C Han, Q Wang, Y Cui, Z Cao, W Wang, S Qi… - arXiv preprint arXiv …, 2023 - arxiv.org
As the size of transformer-based models continues to grow, fine-tuning these large-scale
pretrained vision models for new tasks has become increasingly parameter-intensive …

Reformulating graph kernels for self-supervised space-time correspondence learning

Z Qin, X Lu, D Liu, X Nie, Y Yin, J Shen… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Self-supervised space-time correspondence learning utilizing unlabeled videos holds great
potential in computer vision. Most existing methods rely on contrastive learning with mining …

FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models

A Luo, X Li, F Yang, J Liu, H Fan… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Optical flow estimation, a process of predicting pixel-wise displacement between consecutive
frames, has commonly been approached as a regression task in the age of deep learning …

Label-efficient video object segmentation with motion clues

Y Lu, J Zhang, S Sun, Q Guo, Z Cao… - … on Circuits and …, 2023 - ieeexplore.ieee.org
Video object segmentation (VOS) plays an important role in video analysis and
understanding, which in turn facilitates a number of diverse applications, including video …

Feature fusion Vision Transformers using MLP-Mixer for enhanced deepfake detection

E Essa - Neurocomputing, 2024 - Elsevier
Deepfake technology, utilizing deep learning and computer vision, presents significant
security threats by generating highly realistic synthetic media, such as images and videos. In …