Deep exploration via randomized value functions

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier

This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

Cited by: 339 · Related articles · All 5 versions

A tutorial on Thompson sampling

DJ Russo, B Van Roy, A Kazerouni… - … and Trends® in …, 2018 - nowpublishers.com

Thompson sampling is an algorithm for online decision problems where actions are taken
sequentially in a manner that must balance between exploiting what is known to maximize …

Cited by: 1259 · Related articles · All 34 versions

BEHAVIOR-1K: A benchmark for embodied AI with 1,000 everyday activities and realistic simulation

C Li, R Zhang, J Wong, C Gokmen… - … on Robot Learning, 2023 - proceedings.mlr.press

We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered
robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an …

Cited by: 156 · Related articles · All 3 versions

MAVEN: Multi-agent variational exploration

A Mahajan, T Rashid, M Samvelyan… - Advances in neural …, 2019 - proceedings.neurips.cc

Centralised training with decentralised execution is an important setting for cooperative
deep multi-agent reinforcement learning due to communication constraints during execution …

Cited by: 436 · Related articles · All 11 versions

Model-based reinforcement learning with value-targeted regression

A Ayoub, Z Jia, C Szepesvari… - … on Machine Learning, 2020 - proceedings.mlr.press

This paper studies model-based reinforcement learning (RL) for regret minimization. We
focus on finite-horizon episodic RL where the transition model $P$ belongs to a known …

Cited by: 349 · Related articles · All 8 versions

Strength training session induces important changes on physiological, immunological, and inflammatory biomarkers

AK Fortunato, WM Pontes… - Journal of …, 2018 - Wiley Online Library

Strength exercise is a strategy applied in sports and physical training processes. It may
induce skeletal muscle hypertrophy. The hypertrophy is dependent on the eccentric muscle …

Cited by: 1243 · Related articles · All 21 versions

Reinforcement learning in feature space: Matrix bandit, kernels, and regret bound

L Yang, M Wang - International Conference on Machine …, 2020 - proceedings.mlr.press

Exploration in reinforcement learning (RL) suffers from the curse of dimensionality when the
state-action space is large. A common practice is to parameterize the high-dimensional …

Cited by: 335 · Related articles · All 6 versions

Randomized prior functions for deep reinforcement learning

I Osband, J Aslanides… - Advances in Neural …, 2018 - proceedings.neurips.cc

Dealing with uncertainty is essential for efficient reinforcement learning. There is a growing
literature on uncertainty estimation for deep learning from fixed datasets, but many of the …

Cited by: 456 · Related articles · All 12 versions

Q-learning: Theory and applications

J Clifton, E Laber - Annual Review of Statistics and Its …, 2020 - annualreviews.org

Q-learning, originally an incremental algorithm for estimating an optimal decision strategy in
an infinite-horizon decision problem, now refers to a general class of reinforcement learning …

Cited by: 322 · Related articles · All 4 versions

Epistemic neural networks

I Osband, Z Wen, SM Asghari… - Advances in …, 2023 - proceedings.neurips.cc

Intelligence relies on an agent's knowledge of what it does not know. This capability can be
assessed based on the quality of joint predictions of labels across multiple inputs. In …

Cited by: 118 · Related articles · All 6 versions