Cross-source point cloud registration: Challenges, progress and prospects
The emerging topic of cross-source point cloud (CSPC) registration has attracted increasing
attention with the rapid development of 3D sensor technologies. Different from the …
LAMM: Language-assisted multi-modal instruction-tuning dataset, framework, and benchmark
Large language models have emerged as a promising approach towards achieving general-
purpose AI agents. The thriving open-source LLM community has greatly accelerated the …
CLIP2Point: Transfer CLIP to point cloud classification with image-depth pre-training
Pre-training across 3D vision and language remains under development because of limited
training data. Recent works attempt to transfer vision-language (VL) pre-training methods to …
Swin3D: A pretrained transformer backbone for 3D indoor scene understanding
The use of pretrained backbones with fine-tuning has been successful for 2D vision and
natural language processing tasks, showing advantages over task-specific networks. In this …
Uni3D: Exploring unified 3D representation at scale
Scaling up representations for images or text has been extensively investigated in the past
few years and has led to revolutions in learning vision and language. However, scalable …
Instance-aware dynamic prompt tuning for pre-trained point cloud models
Pre-trained point cloud models have found extensive applications in 3D understanding tasks
like object classification and part segmentation. However, the prevailing strategy of full fine …
Point Cloud Pre-training with Diffusion Models
Pre-training a model and then fine-tuning it on downstream tasks has demonstrated
significant success in the 2D image and NLP domains. However, due to the unordered and …
3DMIT: 3D multi-modal instruction tuning for scene understanding
Z Li, C Zhang, X Wang, R Ren, Y Xu… - … on Multimedia and …, 2024 - ieeexplore.ieee.org
The remarkable potential of multi-modal large language models (MLLMs) in comprehending
both vision and language information has been widely acknowledged. However, the scarcity …
Self-supervised learning for pre-training 3D point clouds: A survey
Point cloud data has been extensively studied due to its compact form and flexibility in
representing complex 3D structures. The ability of point cloud data to accurately capture and …
E-CLIP: Towards label-efficient event-based open-world understanding by CLIP
Contrastive Language-Image Pre-training (CLIP) has recently shown promising open-world
and few-shot performance on 2D image-based recognition tasks. However, the transferred …