Risk-aware transfer in reinforcement learning using successor features

M Gimelfarb, A Barreto, S Sanner… - Advances in Neural …, 2021 - proceedings.neurips.cc
Sample efficiency and risk-awareness are central to the development of practical
reinforcement learning (RL) for complex decision-making. The former can be addressed by …

Safe option-critic: learning safety in the option-critic architecture

A Jain, K Khetarpal, D Precup - The Knowledge Engineering Review, 2021 - cambridge.org
Designing hierarchical reinforcement learning algorithms that exhibit safe behaviour is not
only vital for practical applications but also facilitates a better understanding of an agent's …

A unifying framework of off-policy general value function evaluation

T Xu, Z Yang, Z Wang, Y Liang - Advances in Neural …, 2022 - proceedings.neurips.cc
General Value Function (GVF) is a powerful tool to represent both the predictive
and retrospective knowledge in reinforcement learning (RL). In practice …

Robust reinforcement learning with distributional risk-averse formulation

P Clavier, S Allassonière, EL Pennec - arXiv preprint arXiv:2206.06841, 2022 - arxiv.org
Robust Reinforcement Learning tries to make predictions more robust to changes in the
dynamics or rewards of the system. This problem is particularly important when the …

CAT: Caution Aware Transfer in Reinforcement Learning via Distributional Risk

MFEH Chehade, AS Bedi, A Zhang, H Zhu - arXiv preprint arXiv …, 2024 - arxiv.org
Transfer learning in reinforcement learning (RL) has become a pivotal strategy for improving
data efficiency in new, unseen tasks by utilizing knowledge from previously learned tasks …

Taylor TD-learning

M Garibbo, M Robeyns… - Advances in Neural …, 2024 - proceedings.neurips.cc
Many reinforcement learning approaches rely on temporal-difference (TD) learning to learn
a critic. However, TD-learning updates can be high variance. Here, we introduce a model …

Adaptive Exploration for Data-Efficient General Value Function Evaluations

A Jain, JP Hanna, D Precup - arXiv preprint arXiv:2405.07838, 2024 - arxiv.org
General Value Functions (GVFs) (Sutton et al., 2011) are an established way to represent
predictive knowledge in reinforcement learning. Each GVF computes the expected return for …

A unified off-policy evaluation approach for general value function

T Xu, Z Yang, Z Wang, Y Liang - arXiv preprint arXiv:2107.02711, 2021 - arxiv.org
General Value Function (GVF) is a powerful tool to represent both the predictive and
retrospective knowledge in reinforcement learning (RL). In practice, often multiple …

Shapley-Optimized Reinforcement Learning for Human-Machine Collaboration Policy

J Zhang, Y Niu, W He, C Jin, C Wang - International Conference on …, 2024 - Springer
Human-machine collaboration is a promising training framework aimed at learning optimal
strategies in high-cost exploration scenarios. However, such work is challenging. On one …

Reinforcement Learning based Sequential and Robust Bayesian Optimal Experimental Design

W Shen - 2023 - deepblue.lib.umich.edu
Optimal experimental design (OED) is a statistical approach aimed at designing experiments
in order to extract maximum information from them. It entails carefully selecting experimental …