Unsupervised point cloud representation learning with deep neural networks: A survey
Point cloud data have been widely explored due to their superior accuracy and robustness
under various adverse situations. Meanwhile, deep neural networks (DNNs) have achieved …
Prompt, generate, then cache: Cascade of foundation models makes strong few-shot learners
Visual recognition in low-data regimes requires deep neural networks to learn generalized
representations from limited training samples. Recently, CLIP-based methods have shown …
Tip-adapter: Training-free adaption of clip for few-shot classification
Contrastive Vision-Language Pre-training, known as CLIP, has provided a new
paradigm for learning visual representations using large-scale image-text pairs. It shows …
Learning 3d representations from 2d pre-trained models via image-to-point masked autoencoders
Pre-training by numerous image data has become de-facto for robust 2D representations. In
contrast, due to the expensive data processing, a paucity of 3D datasets severely hinders …
Ulip-2: Towards scalable multimodal pre-training for 3d understanding
Recent advancements in multimodal pre-training have shown promising efficacy in 3D
representation learning by aligning multimodal features across 3D shapes, their 2D …
Contrast with reconstruct: Contrastive 3d representation learning guided by generative pretraining
Mainstream 3D representation learning approaches are built upon contrastive or generative
modeling pretext tasks, where great improvements in performance on various downstream …
CLIP2: Contrastive language-image-point pretraining from real-world point cloud data
Contrastive Language-Image Pre-training, benefiting from large-scale unlabeled
text-image pairs, has demonstrated great performance in open-world vision understanding …
Pimae: Point cloud and image interactive masked autoencoders for 3d object detection
Masked Autoencoders learn strong visual representations and achieve state-of-the-art
results in several independent modalities, yet very few works have addressed their …
Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
Advancing 3D point cloud understanding through deep transfer learning: A comprehensive survey
The 3D point cloud (3DPC) has significantly evolved and benefited from the advance of
deep learning (DL). However, the latter faces various issues, including the lack of data or …