Improving behavioural cloning with positive unlabeled learning

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Cited by 206

[HTML] State transition learning with limited data for safe control of switched nonlinear systems

C Fan, KF Chu, X Wang, KW Kwok, F Iida - Neural Networks, 2024 - Elsevier

Switching dynamics are prevalent in real-world systems, arising from either intrinsic changes
or responses to external influences, which can be appropriately modeled by switched …

Towards Effective Utilization of Mixed-Quality Demonstrations in Robotic Manipulation via Segment-Level Selection and Optimization

J Chen, H Fang, HS Fang, C Lu - arXiv preprint arXiv:2409.19917, 2024 - arxiv.org

Data is crucial for robotic manipulation, as it underpins the development of robotic systems
for complex tasks. While high-quality, diverse datasets enhance the performance and …

Identifying Expert Behavior in Offline Training Datasets Improves Behavioral Cloning of Robotic Manipulation Policies

Q Wang, R McCarthy, DC Bulens… - IEEE Robotics and …, 2023 - ieeexplore.ieee.org

This letter presents our solution for the Real Robot Challenge III, aiming to address
dexterous robotic manipulation tasks through learning from offline data. In this competition …

Cited by 3

Dataset Clustering for Improved Offline Policy Learning

Q Wang, Y Deng, FR Sanchez, K Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Offline policy learning aims to discover decision-making policies from previously-collected
datasets without additional online interactions with the environment. As the training dataset …

Cited by 2

Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains

S Nishimori, XQ Cai, J Ackermann… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we investigate an offline reinforcement learning (RL) problem where datasets
are collected from two domains. In this scenario, having datasets with domain labels …

Predicting Long-Term Human Behaviors in Discrete Representations via Physics-Guided Diffusion

Z Zhang, A Li, A Lim, M Chen - arXiv preprint arXiv:2405.19528, 2024 - arxiv.org

Long-term human trajectory prediction is a challenging yet critical task in robotics and
autonomous systems. Prior work that studied how to predict accurate short-term human …

Cited by 1