A Review of Safe Reinforcement Learning: Methods, Theories and Applications
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …
Decision transformer: Reinforcement learning via sequence modeling
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence
modeling problem. This allows us to draw upon the simplicity and scalability of the …
Behavior Transformers: Cloning modes with one stone
NM Shafiullah, Z Cui… - Advances in neural …, 2022 - proceedings.neurips.cc
While behavior learning has made impressive progress in recent times, it lags behind
computer vision and natural language processing due to its inability to leverage large …
Foundation models for decision making: Problems, methods, and opportunities
Foundation models pretrained on diverse data at scale have demonstrated extraordinary
capabilities in a wide range of vision and language tasks. When such models are deployed …
Imitating human behaviour with diffusion models
Diffusion models have emerged as powerful generative models in the text-to-image domain.
This paper studies their application as observation-to-action models for imitating human …
Offline-to-online reinforcement learning via balanced replay and pessimistic q-ensemble
Recent advance in deep offline reinforcement learning (RL) has made it possible to train
strong robotic agents from offline datasets. However, depending on the quality of the trained …
Goal-conditioned imitation learning using score-based diffusion policies
We propose a new policy representation based on score-based diffusion models (SDMs).
We apply our new policy representation in the domain of Goal-Conditioned Imitation …
Can wikipedia help offline reinforcement learning?
Fine-tuning reinforcement learning (RL) models has been challenging because of a lack of
large scale off-the-shelf datasets as well as high variance in transferability among different …
Offline reinforcement learning via high-fidelity generative behavior modeling
In offline reinforcement learning, weighted regression is a common method to ensure the
learned policy stays close to the behavior policy and to prevent selecting out-of-sample …
Representation matters: Offline pretraining for sequential decision making
The recent success of supervised learning methods on ever larger offline datasets has
spurred interest in the reinforcement learning (RL) field to investigate whether the same …