When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
As large language models (LLMs) evolve, their integration with 3D spatial data (3D-LLMs)
has seen rapid progress, offering unprecedented capabilities for understanding and …
Multi-Task Domain Adaptation for Language Grounding with 3D Objects
The existing works on object-level language grounding with 3D objects mostly focus on
improving performance by utilizing the off-the-shelf pre-trained models to capture features …
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
Text-guided 3D visual grounding (T-3DVG), which aims to locate a specific object that
semantically corresponds to a language query from a complicated 3D scene, has drawn …
Reimagining 3D Visual Grounding: Instance Segmentation and Transformers for Fragmented Point Cloud Scenarios
This work introduces a pioneering, engineerable approach to 3D visual localization (3DVG).
Current challenges for 2D Visual Grounding (2DVG) and 3DVG are summarized: Absence of …