Sora: A review on background, technology, limitations, and opportunities of large vision models

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

Beyond deep reinforcement learning: A tutorial on generative diffusion models in network optimization

H Du, R Zhang, Y Liu, J Wang, Y Lin, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative Diffusion Models (GDMs) have emerged as a transformative force in the realm of
Generative Artificial Intelligence (GAI), demonstrating their versatility and efficacy across a …

Diffusion policy: Visuomotor policy learning via action diffusion

C Chi, S Feng, Y Du, Z Xu, E Cousineau… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper introduces Diffusion Policy, a new way of generating robot behavior by
representing a robot's visuomotor policy as a conditional denoising diffusion process. We …

Learning universal policies via text-guided video generation

Y Du, S Yang, B Dai, H Dai… - Advances in …, 2024 - proceedings.neurips.cc
A goal of artificial intelligence is to construct an agent that can solve a wide variety of tasks.
Recent progress in text-guided image synthesis has yielded models with an impressive …

Diffusion policies as an expressive policy class for offline reinforcement learning

Z Wang, JJ Hunt, M Zhou - arXiv preprint arXiv:2208.06193, 2022 - arxiv.org
Offline reinforcement learning (RL), which aims to learn an optimal policy using a previously
collected static dataset, is an important paradigm of RL. Standard RL methods often perform …

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …

Fast sampling of diffusion models via operator learning

H Zheng, W Nie, A Vahdat… - International …, 2023 - proceedings.mlr.press
Diffusion models have found widespread adoption in various areas. However, their
sampling process is slow because it requires hundreds to thousands of network evaluations …

Foundation models for decision making: Problems, methods, and opportunities

S Yang, O Nachum, Y Du, J Wei, P Abbeel… - arXiv preprint arXiv …, 2023 - arxiv.org
Foundation models pretrained on diverse data at scale have demonstrated extraordinary
capabilities in a wide range of vision and language tasks. When such models are deployed …

Imitating human behaviour with diffusion models

T Pearce, T Rashid, A Kanervisto, D Bignell… - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models have emerged as powerful generative models in the text-to-image domain.
This paper studies their application as observation-to-action models for imitating human …

PlayFusion: Skill acquisition via diffusion from language-annotated play

L Chen, S Bahl, D Pathak - Conference on Robot Learning, 2023 - proceedings.mlr.press
Learning from unstructured and uncurated data has become the dominant paradigm for
generative approaches in language or vision. Such unstructured and unguided behavior …