Exploration in deep reinforcement learning: A survey
This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …
A tutorial on Thompson sampling
Thompson sampling is an algorithm for online decision problems where actions are taken
sequentially in a manner that must balance between exploiting what is known to maximize …
BEHAVIOR-1K: A benchmark for embodied AI with 1,000 everyday activities and realistic simulation
We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered
robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an …
MAVEN: Multi-agent variational exploration
Centralised training with decentralised execution is an important setting for cooperative
deep multi-agent reinforcement learning due to communication constraints during execution …
Model-based reinforcement learning with value-targeted regression
This paper studies model-based reinforcement learning (RL) for regret minimization. We
focus on finite-horizon episodic RL where the transition model $ P $ belongs to a known …
Strength training session induces important changes on physiological, immunological, and inflammatory biomarkers
AK Fortunato, WM Pontes… - Journal of …, 2018 - Wiley Online Library
Strength exercise is a strategy applied in sports and physical training processes. It may
induce skeletal muscle hypertrophy. The hypertrophy is dependent on the eccentric muscle …
Reinforcement learning in feature space: Matrix bandit, kernels, and regret bound
Exploration in reinforcement learning (RL) suffers from the curse of dimensionality when the
state-action space is large. A common practice is to parameterize the high-dimensional …
Randomized prior functions for deep reinforcement learning
I Osband, J Aslanides… - Advances in Neural …, 2018 - proceedings.neurips.cc
Dealing with uncertainty is essential for efficient reinforcement learning. There is a growing
literature on uncertainty estimation for deep learning from fixed datasets, but many of the …
Q-learning: Theory and applications
J Clifton, E Laber - Annual Review of Statistics and Its …, 2020 - annualreviews.org
Q-learning, originally an incremental algorithm for estimating an optimal decision strategy in
an infinite-horizon decision problem, now refers to a general class of reinforcement learning …
Epistemic neural networks
Intelligence relies on an agent's knowledge of what it does not know. This capability can be
assessed based on the quality of joint predictions of labels across multiple inputs. In …