A definition of continual reinforcement learning

D Abel, A Barreto, B Van Roy… - Advances in …, 2024 - proceedings.neurips.cc
In a standard view of the reinforcement learning problem, an agent's goal is to efficiently
identify a policy that maximizes long-term reward. However, this perspective is based on a …

Settling the reward hypothesis

M Bowling, JD Martin, D Abel… - … on Machine Learning, 2023 - proceedings.mlr.press
The reward hypothesis posits that "all of what we mean by goals and purposes can be well
thought of as maximization of the expected value of the cumulative sum of a received scalar …

An invitation to deep reinforcement learning

B Jaeger, A Geiger - Foundations and Trends® in …, 2024 - nowpublishers.com
Training a deep neural network to maximize a target objective has become the standard
recipe for successful machine learning over the last decade. These networks can be …

Approximate Thompson sampling via epistemic neural networks

I Osband, Z Wen, SM Asghari… - Uncertainty in …, 2023 - proceedings.mlr.press
Thompson sampling (TS) is a popular heuristic for action selection, but it requires sampling
from a posterior distribution. Unfortunately, this can become computationally intractable in …

Continual learning as computationally constrained reinforcement learning

S Kumar, H Marklund, A Rao, Y Zhu, HJ Jeon… - arXiv preprint arXiv …, 2023 - arxiv.org
An agent that efficiently accumulates knowledge to develop increasingly sophisticated skills
over a long lifetime could advance the frontier of artificial intelligence capabilities. The …

Multiagent reinforcement learning-based adaptive sampling for conformational dynamics of proteins

DE Kleiman, D Shukla - Journal of Chemical Theory and …, 2022 - ACS Publications
Machine learning is increasingly applied to improve the efficiency and accuracy of molecular
dynamics (MD) simulations. Although the growth of distributed computer clusters has …

Regret bounds for information-directed reinforcement learning

B Hao, T Lattimore - Advances in neural information …, 2022 - proceedings.neurips.cc
Information-directed sampling (IDS) has revealed its potential as a data-efficient
algorithm for reinforcement learning (RL). However, theoretical understanding of IDS for …

Learning and information in stochastic networks and queues

N Walton, K Xu - Tutorials in Operations Research …, 2021 - pubsonline.informs.org
We review the role of information and learning in the stability and optimization of queueing
systems. In recent years, techniques from supervised learning, online learning, and …

Contextual information-directed sampling

B Hao, T Lattimore, C Qin - International Conference on …, 2022 - proceedings.mlr.press
Information-directed sampling (IDS) has recently demonstrated its potential as a
data-efficient reinforcement learning algorithm. However, it is still unclear what is the right …

Simple agent, complex environment: Efficient reinforcement learning with agent states

S Dong, B Van Roy, Z Zhou - Journal of Machine Learning Research, 2022 - jmlr.org
We design a simple reinforcement learning (RL) agent that implements an optimistic version
of Q-learning and establish through regret analysis that this agent can operate with some …