How to fine-tune the model: Unified model shift and model bias policy optimization
Designing and deriving effective model-based reinforcement learning (MBRL) algorithms
with a performance improvement guarantee is challenging, mainly attributed to the high …
Seizing serendipity: Exploiting the value of past success in off-policy actor-critic
Learning high-quality $ Q $-value functions plays a key role in the success of many modern
off-policy deep reinforcement learning (RL) algorithms. Previous works primarily focus on …
Dyna-style Model-based reinforcement learning with Model-Free Policy Optimization
Dyna-style Model-based reinforcement learning (MBRL) methods have demonstrated
superior sample efficiency compared to their model-free counterparts, largely attributable to …
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Off-policy reinforcement learning (RL) has achieved notable success in tackling many
complex real-world tasks, by leveraging previously collected data for policy learning …
Understanding world models through multi-step pruning policy via reinforcement learning
In model-based reinforcement learning, the conventional approach to addressing world
model bias is to use gradient optimization methods. However, using a singular policy from …
Model-Based Reinforcement Learning with Isolated Imaginations
World models learn the consequences of actions in vision-based interactive systems.
However, in practical scenarios like autonomous driving, noncontrollable dynamics that are …
Scrutinize What We Ignore: Reining Task Representation Shift In Context-Based Offline Meta Reinforcement Learning
Offline meta reinforcement learning (OMRL) has emerged as a promising approach for
interaction avoidance and strong generalization performance by leveraging pre-collected …
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Training reinforcement learning policies using environment interaction data collected from
varying policies or dynamics presents a fundamental challenge. Existing works often …
Trust the Model Where It Trusts Itself: Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Dyna-style model-based reinforcement learning (MBRL) combines model-free agents with
predictive transition models through model-based rollouts. This combination raises a critical …
A model of how hierarchical representations constructed in the hippocampus are used to navigate through space
E Chalmers, M Bardal, R McDonald… - Adaptive …, 2024 - journals.sagepub.com
Animals can navigate through complex environments with amazing flexibility and efficiency:
they forage over large areas, quickly learning rewarding behavior and changing their plans …