Near-optimal goal-oriented reinforcement learning in non-stationary environments - Academic resource search

7 results (0.02 seconds)

Near-optimal goal-oriented reinforcement learning in non-stationary environments

[PDF] neurips.cc

Tracking most significant shifts in nonparametric contextual bandits

J Suk, S Kpotufe - Advances in Neural Information …, 2023 - proceedings.neurips.cc

We study nonparametric contextual bandits where Lipschitz mean reward functions may
change over time. We first establish the minimax dynamic regret rate in this less understood …

Cited by: 3 · Related articles · All 5 versions

[PDF] mlr.press

A robust test for the stationarity assumption in sequential decision making

J Wang, C Shi, Z Wu - International Conference on Machine …, 2023 - proceedings.mlr.press

Reinforcement learning (RL) is a powerful technique that allows an autonomous agent to
learn an optimal policy to maximize the expected return. The optimality of various RL …

Cited by: 2 · Related articles · All 5 versions

[PDF] mlr.press

Reaching goals is hard: Settling the sample complexity of the stochastic shortest path

L Chen, A Tirinzoni, M Pirotta… - … on Algorithmic Learning …, 2023 - proceedings.mlr.press

We study the sample complexity of learning an $\epsilon$-optimal policy in the Stochastic
Shortest Path (SSP) problem. We first derive sample complexity bounds when the learner …

Cited by: 3 · Related articles · All 3 versions

[PDF] mlr.press

Layered state discovery for incremental autonomous exploration

L Chen, A Tirinzoni, A Lazaric… - … Conference on Machine …, 2023 - proceedings.mlr.press

We study the autonomous exploration (AX) problem proposed by Lim & Auer (2012). In this
setting, the objective is to discover a set of $\epsilon$-optimal policies reaching a set …

Related articles · All 6 versions

[PDF] arxiv.org

Hi-Core: Hierarchical Knowledge Transfer for Continual Reinforcement Learning

C Pan, X Yang, H Wang, W Wei, T Li - arXiv preprint arXiv:2401.15098, 2024 - arxiv.org

Continual reinforcement learning (CRL) empowers RL agents with the ability to learn from a
sequence of tasks, preserving previous knowledge and leveraging it to facilitate future …

Related articles · All 2 versions

[PDF] mlr.press

A Unified Algorithm for Stochastic Path Problems

C Dann, CY Wei, J Zimmert - International Conference on …, 2023 - proceedings.mlr.press

We study reinforcement learning in stochastic path (SP) problems. The goal in these
problems is to maximize the expected sum of rewards until the agent reaches a terminal …

Related articles · All 4 versions

Advances in Non-stationary Sequential Decision-Making

J Suk - 2024 - search.proquest.com

We study the problem of sequential decision-making (e.g., multi-armed bandits, contextual
bandits, reinforcement learning) under changing environments, or distribution shifts. Ideally …