A minimalist approach to offline reinforcement learning
S Fujimoto, SS Gu - Advances in neural information …, 2021 - proceedings.neurips.cc
Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.
Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms …
Cal-QL: Calibrated offline RL pre-training for efficient online fine-tuning
A compelling use case of offline reinforcement learning (RL) is to obtain a policy initialization
from existing datasets followed by fast online fine-tuning with limited interaction. However …
Transfer learning in deep reinforcement learning: A survey
Reinforcement learning is a learning paradigm for solving sequential decision-making
problems. Recent years have witnessed remarkable progress in reinforcement learning …
Deep reinforcement learning for autonomous driving: A survey
With the development of deep representation learning, the domain of reinforcement learning
(RL) has become a powerful learning framework now capable of learning complex policies …
Mildly conservative Q-learning for offline reinforcement learning
Offline reinforcement learning (RL) defines the task of learning from a static logged dataset
without continually interacting with the environment. The distribution shift between the …
DexMV: Imitation learning for dexterous manipulation from human videos
While significant progress has been made on understanding hand-object interactions in
computer vision, it is still very challenging for robots to perform complex dexterous …
Discriminator-actor-critic: Addressing sample inefficiency and reward bias in adversarial imitation learning
We identify two issues with the family of algorithms based on the Adversarial Imitation
Learning framework. The first problem is implicit bias present in the reward functions used in …
ManiSkill: Generalizable manipulation skill benchmark with large-scale demonstrations
Object manipulation from 3D visual inputs poses many challenges on building generalizable
perception and policy models. However, 3D assets in existing benchmarks mostly lack the …
Model-free reinforcement learning from expert demonstrations: a survey
Reinforcement learning from expert demonstrations (RLED) is the intersection of imitation
learning with reinforcement learning that seeks to take advantage of these two learning …
Deep-reinforcement-learning-based autonomous UAV navigation with sparse rewards
Unmanned aerial vehicles (UAVs) have the potential in delivering Internet-of-Things (IoT)
services from a great height, creating an airborne domain of the IoT. In this article, we …