Chain-of-thought predictive control

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com

We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Cited by 69 · Related articles · All 2 versions

Robot learning in the era of foundation models: A survey

X Xiao, J Liu, Z Wang, Y Zhou, Y Qi, Q Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org

The proliferation of Large Language Models (LLMs) has fueled a shift in robot learning
from automation towards general embodied Artificial Intelligence (AI). Adopting foundation …

Cited by 14 · Related articles · All 4 versions

Diffusion world model

Z Ding, A Zhang, Y Tian, Q Zheng - arXiv preprint arXiv:2402.03570, 2024 - arxiv.org

We introduce Diffusion World Model (DWM), a conditional diffusion model capable of
predicting multistep future states and rewards concurrently. As opposed to traditional one …

Cited by 7 · Related articles · All 2 versions

Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning

H Zhu, Y Wang, D Huang, W Ye, W Ouyang… - arXiv preprint arXiv …, 2024 - arxiv.org

In this study, we explore the influence of different observation spaces on robot learning,
focusing on three predominant modalities: RGB, RGB-D, and point cloud. Through extensive …

Cited by 4 · Related articles · All 2 versions

Easyhec: Accurate and automatic hand-eye calibration via differentiable rendering and space exploration

L Chen, Y Qin, X Zhou, H Su - IEEE Robotics and Automation …, 2023 - ieeexplore.ieee.org

Hand-eye calibration is a critical task in robotics, as it directly affects the efficacy of critical
operations such as manipulation and grasping. Traditional methods for achieving this …

Cited by 10 · Related articles · All 3 versions

A Survey on Integration of Large Language Models with Intelligent Robots

Y Kim, D Kim, J Choi, J Park, N Oh, D Park - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, the integration of large language models (LLMs) has revolutionized the field
of robotics, enabling robots to communicate, understand, and reason with human-like …

Cited by 5 · Related articles · All 2 versions

Efficient Planning with Latent Diffusion

W Li - arXiv preprint arXiv:2310.00311, 2023 - arxiv.org

Temporal abstraction and efficient planning pose significant challenges in offline
reinforcement learning, mainly when dealing with domains that involve temporally extended …

Cited by 2 · Related articles · All 3 versions

SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation

J Zhang, C Bai, H He, W Xia, Z Wang, B Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org

Acquiring a multi-task imitation policy in 3D manipulation poses challenges in terms of
scene understanding and action prediction. Current methods employ both 3D representation …

Cited by 1 · Related articles · All 3 versions

Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation

R Feng, D Hu, W Ma, X Li - arXiv preprint arXiv:2408.01366, 2024 - arxiv.org

Humans possess a remarkable talent for flexibly alternating to different senses when
interacting with the environment. Picture a chef skillfully gauging the timing of ingredient …

MaxMI: A Maximal Mutual Information Criterion for Manipulation Concept Discovery

P Zhou, Y Yang - arXiv preprint arXiv:2407.15086, 2024 - arxiv.org

We aim to discover manipulation concepts embedded in the unannotated demonstrations,
which are recognized as key physical states. The discovered concepts can facilitate training …