Cross-source point cloud registration: Challenges, progress and prospects

X Huang, G Mei, J Zhang - Neurocomputing, 2023 - Elsevier
The emerging topic of cross-source point cloud (CSPC) registration has attracted increasing
attention with the rapid development of 3D sensor technologies. Different from the …

Lamm: Language-assisted multi-modal instruction-tuning dataset, framework, and benchmark

Z Yin, J Wang, J Cao, Z Shi, D Liu… - Advances in …, 2024 - proceedings.neurips.cc
Large language models have emerged as a promising approach towards achieving general-
purpose AI agents. The thriving open-source LLM community has greatly accelerated the …

Clip2point: Transfer CLIP to point cloud classification with image-depth pre-training

T Huang, B Dong, Y Yang, X Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Pre-training across 3D vision and language remains under development because of limited
training data. Recent works attempt to transfer vision-language (VL) pre-training methods to …

Swin3d: A pretrained transformer backbone for 3d indoor scene understanding

YQ Yang, YX Guo, JY Xiong, Y Liu, H Pan… - arXiv preprint arXiv …, 2023 - arxiv.org
The use of pretrained backbones with fine-tuning has been successful for 2D vision and
natural language processing tasks, showing advantages over task-specific networks. In this …

Uni3d: Exploring unified 3d representation at scale

J Zhou, J Wang, B Ma, YS Liu, T Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Scaling up representations for images or text has been extensively investigated in the past
few years and has led to revolutions in learning vision and language. However, scalable …

Instance-aware dynamic prompt tuning for pre-trained point cloud models

Y Zha, J Wang, T Dai, B Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Pre-trained point cloud models have found extensive applications in 3D understanding tasks
like object classification and part segmentation. However, the prevailing strategy of full fine …

Point Cloud Pre-training with Diffusion Models

X Zheng, X Huang, G Mei, Y Hou… - Proceedings of the …, 2024 - openaccess.thecvf.com
Pre-training a model and then fine-tuning it on downstream tasks has demonstrated
significant success in the 2D image and NLP domains. However, due to the unordered and …

3dmit: 3d multi-modal instruction tuning for scene understanding

Z Li, C Zhang, X Wang, R Ren, Y Xu… - … on Multimedia and …, 2024 - ieeexplore.ieee.org
The remarkable potential of multi-modal large language models (MLLMs) in comprehending
both vision and language information has been widely acknowledged. However, the scarcity …

Self-supervised learning for pre-training 3d point clouds: A survey

B Fei, W Yang, L Liu, T Luo, R Zhang, Y Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Point cloud data has been extensively studied due to its compact form and flexibility in
representing complex 3D structures. The ability of point cloud data to accurately capture and …

E-clip: Towards label-efficient event-based open-world understanding by CLIP

J Zhou, X Zheng, Y Lyu, L Wang - arXiv preprint arXiv:2308.03135, 2023 - arxiv.org
Contrastive Language-Image Pre-training (CLIP) has recently shown promising open-world
and few-shot performance on 2D image-based recognition tasks. However, the transferred …