Perceptual grouping in contrastive vision-language models

K Ranasinghe, B McKinzie, S Ravi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent advances in zero-shot image recognition suggest that vision-language models learn
generic visual representations with a high degree of semantic information that may be …

Unsupervised representation learning in deep reinforcement learning: A review

N Botteghi, M Poel, C Brune - arXiv preprint arXiv:2208.14226, 2022 - arxiv.org
This review addresses the problem of learning abstract representations of the measurement
data in the context of Deep Reinforcement Learning (DRL). While the data are often …

Does self-supervised learning really improve reinforcement learning from pixels?

X Li, J Shang, S Das, M Ryoo - Advances in Neural …, 2022 - proceedings.neurips.cc
We investigate whether self-supervised learning (SSL) can improve online reinforcement
learning (RL) from pixels. We extend the contrastive reinforcement learning framework (e.g., …

Deep generative models for offline policy learning: Tutorial, survey, and perspectives on future directions

J Chen, B Ganguly, Y Xu, Y Mei, T Lan… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep generative models (DGMs) have demonstrated great success across various domains,
particularly in generating texts, images, and videos using models trained from offline data …

Crossway diffusion: Improving diffusion-based visuomotor policy via self-supervised learning

X Li, V Belagali, J Shang… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Diffusion models have been adopted for behavioral cloning in a sequence modeling
fashion, benefiting from their exceptional capabilities in modeling complex data distributions …

Theia: Distilling diverse vision foundation models for robot learning

J Shang, K Schmeckpeper, BB May, MV Minniti… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-based robot policy learning, which maps visual inputs to actions, necessitates a
holistic understanding of diverse visual tasks beyond single-task needs like classification or …

MoVie: Visual model-based policy adaptation for view generalization

S Yang, Y Ze, H Xu - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Visual Reinforcement Learning (RL) agents trained on limited views face significant
challenges in generalizing their learned abilities to unseen views. This inherent difficulty is …

A survey of demonstration learning

A Correia, LA Alexandre - Robotics and Autonomous Systems, 2024 - Elsevier
With the fast improvement of machine learning, reinforcement learning (RL) has been used
to automate human tasks in different areas. However, training such agents is difficult and …

StARformer: Transformer with state-action-reward representations for visual reinforcement learning

J Shang, K Kahatapitiya, X Li, MS Ryoo - European conference on …, 2022 - Springer
Reinforcement Learning (RL) can be considered as a sequence modeling task: given a
sequence of past state-action-reward experiences, an agent predicts a sequence of next …

Learning viewpoint-agnostic visual representations by recovering tokens in 3D space

J Shang, S Das, M Ryoo - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Humans are remarkably flexible in understanding viewpoint changes due to visual cortex
supporting the perception of 3D structure. In contrast, most of the computer vision models …