Point Transformer V3: Simpler Faster Stronger
This paper is not motivated to seek innovation within the attention mechanism. Instead it
focuses on overcoming the existing trade-offs between accuracy and efficiency within the …
UniPAD: A universal pre-training paradigm for autonomous driving
In the context of autonomous driving the significance of effective feature learning is widely
acknowledged. While conventional 3D self-supervised pre-training methods have shown …
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
The booming of 3D recognition in the 2020s began with the introduction of point cloud
transformers. They quickly overwhelmed sparse CNNs and became state-of-the-art models …
GroupContrast: Semantic-aware self-supervised representation learning for 3D understanding
Self-supervised 3D representation learning aims to learn effective representations from
large-scale unlabeled point clouds. Most existing approaches adopt point discrimination as …
Skeleton-in-context: Unified skeleton sequence modeling with in-context learning
In-context learning provides a new perspective for multi-task modeling for vision and NLP.
Under this setting the model can perceive tasks from prompts and accomplish them without …
PonderV2: Pave the way for 3D foundation model with a universal pre-training paradigm
In contrast to numerous NLP and 2D computer vision foundational models, the learning of a
robust and highly generalized 3D foundational model poses considerably greater …
Multi-Space Alignments Towards Universal LiDAR Segmentation
A unified and versatile LiDAR segmentation model with strong robustness and
generalizability is desirable for safe autonomous driving perception. This work presents …
UniMODE: Unified Monocular 3D Object Detection
Realizing unified monocular 3D object detection including both indoor and outdoor scenes
holds great importance in applications like robot navigation. However involving various …
SaCo Loss: Sample-wise Affinity Consistency for Vision-Language Pre-training
Vision-language pre-training (VLP) aims to learn joint representations of vision and
language modalities. The contrastive paradigm is currently dominant in this field. However …
Graph Transformer for 3D point clouds classification and semantic segmentation
W Zhou, Q Wang, W Jin, X Shi, Y He - Computers & Graphics, 2024 - Elsevier
Recently, graph-based and Transformer-based deep learning have demonstrated excellent
performances on various point cloud tasks. Most of the existing graph-based methods rely …