Segment any point cloud sequences by distilling vision foundation models

Y Liu, L Kong, J Cen, R Chen… - Advances in …, 2024 - proceedings.neurips.cc
Recent advancements in vision foundation models (VFMs) have opened up new
possibilities for versatile and efficient visual perception. In this work, we introduce Seal, a …

An outlook into the future of egocentric vision

C Plizzari, G Goletto, A Furnari, S Bansal… - International Journal of …, 2024 - Springer
What will the future be? We wonder! In this survey, we explore the gap between current
research in egocentric vision and the ever-anticipated future, where wearable computing …

LeaF: Learning Frames for 4D Point Cloud Sequence Understanding

Y Liu, J Chen, Z Zhang, J Huang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We focus on learning descriptive geometry and motion features from 4D point cloud
sequences in this work. Existing works usually develop generic 4D learning tools without …

Masked spatio-temporal structure prediction for self-supervised learning on point cloud videos

Z Shen, X Sheng, H Fan, L Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, the community has made tremendous progress in developing effective methods
for point cloud video understanding that learn from massive amounts of labeled data …

Point contrastive prediction with semantic clustering for self-supervised learning on point cloud videos

X Sheng, Z Shen, G Xiao, L Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a unified point cloud video self-supervised learning framework for object-centric
and scene-centric data. Previous methods commonly conduct representation learning at the …

A Unified Framework for Human-centric Point Cloud Video Understanding

Y Xu, K Ye, X Han, Y Ren, X Zhu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Human-centric Point Cloud Video Understanding (PVU) is an emerging field
focused on extracting and interpreting human-related features from sequences of human …

CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding

Y Liu, C Chen, Z Wang, L Yi - arXiv preprint arXiv:2401.09057, 2024 - arxiv.org
This paper introduces a novel approach named CrossVideo, which aims to enhance
self-supervised cross-modal contrastive learning in the field of point cloud video understanding …

X4D-SceneFormer: Enhanced scene understanding on 4D point cloud videos through cross-modal knowledge transfer

L Jing, Y Xue, X Yan, C Zheng, D Wang… - Proceedings of the …, 2024 - ojs.aaai.org
The field of 4D point cloud understanding is rapidly developing with the goal of analyzing
dynamic 3D point cloud sequences. However, it remains a challenging task due to the …

MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models

J Liu, J Han, L Liu, AI Aviles-Rivero, C Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Point cloud videos effectively capture real-world spatial geometries and temporal dynamics,
which are essential for enabling intelligent agents to understand the dynamically changing …

A review of point cloud segmentation for understanding 3D indoor scenes

Y Sun, X Zhang, Y Miao - Visual Intelligence, 2024 - Springer
Point cloud segmentation is an essential task in three-dimensional (3D) vision and
intelligence. It is a critical step in understanding 3D scenes with a variety of applications …