Chain-of-thought predictive control - Academic Resource Search

Chain-of-thought predictive control

Z Jia, F Liu, V Thumuluri, L Chen, Z Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

We study generalizable policy learning from demonstrations for complex low-level control
tasks (e.g., contact-rich object manipulations). We propose an imitation learning method that
incorporates the idea of temporal abstraction and the planning capabilities from Hierarchical
RL (HRL) in a novel and effective manner. As a step towards decision foundation models,
our design can utilize scalable, albeit highly sub-optimal, demonstrations. Specifically, we
find certain short subsequences of the demos, i.e., the chain-of-thought (CoT), reflect their …

Cited by 12 - Related articles - All 4 versions

[Citation] Chain-of-thought predictive control

Z Jia, F Liu, V Thumuluri, L Chen, Z Huang, H Su - arXiv preprint arXiv:2304.00776, 2023

Cited by 3 - Related articles
