Chain-of-thought predictive control

Z Jia, F Liu, V Thumuluri, L Chen, Z Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
We study generalizable policy learning from demonstrations for complex low-level control
tasks (e.g., contact-rich object manipulations). We propose an imitation learning method that
incorporates the idea of temporal abstraction and the planning capabilities from Hierarchical
RL (HRL) in a novel and effective manner. As a step towards decision foundation models,
our design can utilize scalable, albeit highly sub-optimal, demonstrations. Specifically, we
find certain short subsequences of the demos, i.e., the chain-of-thought (CoT), reflect their …

[CITATION][C] Chain-of-thought predictive control

Z Jia, F Liu, V Thumuluri, L Chen, Z Huang, H Su - arXiv preprint arXiv:2304.00776, 2023