Starformer: Transformer with state-action-reward representations for robot learning

X Li, C Mata, J Park, K Kahatapitiya, YS Jang… - arXiv preprint arXiv …, 2024 - arxiv.org

LLMs with visual inputs, i.e., Vision Language Models (VLMs), have the capacity to process
state information as visual-textual prompts and respond with policy decisions in text. We …

Cited by 12 Related articles All 3 versions

Crossway diffusion: Improving diffusion-based visuomotor policy via self-supervised learning

X Li, V Belagali, J Shang… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

Diffusion models have been adopted for behavioral cloning in a sequence modeling
fashion, benefiting from their exceptional capabilities in modeling complex data distributions …

Cited by 18 Related articles All 2 versions

Learning viewpoint-agnostic visual representations by recovering tokens in 3d space

J Shang, S Das, M Ryoo - Advances in Neural Information …, 2022 - proceedings.neurips.cc

Humans are remarkably flexible in understanding viewpoint changes due to visual cortex
supporting the perception of 3D structure. In contrast, most of the computer vision models …

Cited by 14 Related articles All 7 versions

Enhancing parcel singulation efficiency through transformer-based position attention and state space augmentation

J Shen, H Lu, S Lyu, Y Lu - Expert Systems with Applications, 2024 - Elsevier

Parcel singulation has emerged as a critical bottleneck in the swiftly advancing logistics
processes. In the pursuit of a balance between cost-effectiveness and singulation efficiency …

Cited by 5 Related articles All 2 versions

Weighting online decision transformer with episodic memory for offline-to-online reinforcement learning

X Ma, WJ Li - 2024 IEEE International Conference on Robotics …, 2024 - ieeexplore.ieee.org

Offline reinforcement learning (RL) has been shown to be successfully modeled as a
sequence modeling problem, drawing inspiration from the success of Transformers. Offline …

Cited by 2 Related articles All 2 versions

Prescribed safety performance imitation learning from a single expert dataset

Z Cheng, L Shen, M Zhu, J Guo, M Fang… - IEEE transactions on …, 2023 - ieeexplore.ieee.org

Existing safe imitation learning (safe IL) methods mainly focus on learning safe policies that
are similar to expert ones, but may fail in applications requiring different safety constraints. In …

Cited by 1 Related articles All 8 versions

Active vision reinforcement learning under limited visual observability

J Shang, MS Ryoo - Advances in Neural Information …, 2024 - proceedings.neurips.cc

In this work, we investigate Active Vision Reinforcement Learning (ActiveVision-RL), where
an embodied agent simultaneously learns action policy for the task while also controlling its …

Cited by 3 Related articles All 5 versions

In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning

S Tu, J Sun, Q Zhang, Y Zhang, J Liu, K Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

Offline preference-based reinforcement learning (PbRL) typically operates in two phases:
first, use human preferences to learn a reward model and annotate rewards for a reward …

[PDF] Pre-controller for Safe Reinforcement Learning using Transformer with State-Action-Reward Representations

Z Shen - 2024 - waseda.repo.nii.ac.jp

Reinforcement Learning (RL) is a dynamic and influential field within artificial intelligence
that focuses on how agents should take actions in an environment to maximize a cumulative …

[CITATION] A survey of Decision Transformer methods for intelligent gaming (translated title; original: 面向智能博弈的决策Transformer方法综述)

J Luo, W Zhang, J Su, Y Wang, J Chen - Journal of Command and Control (指挥与控制学报), 2023

Cited by 1 Related articles