Accelerated gradient temporal difference learning
The family of temporal difference (TD) methods spans a spectrum from computationally frugal
linear methods like TD(λ) to data-efficient least-squares methods. Least-squares methods …
Meta-descent for online, continual prediction
This paper investigates different vector step-size adaptation approaches for non-stationary
online, continual prediction problems. Vanilla stochastic gradient descent can be …
[BOOK][B] Generalization, optimization, diverse generation: insights and advances in the use of bootstrapping in deep neural networks
E Bengio - 2022 - search.proquest.com
This thesis investigates the use of bootstrapping in Temporal Difference (TD) learning, a
central mechanism in reinforcement learning (RL), when applied to deep neural networks. I …
Accelerated Gradient Algorithms for Robust Temporal Difference Learning
DJ Meyer - 2021 - mediatum.ub.tum.de
This thesis deals with linearly approximated gradient temporal difference learning. The
applicability of the underlying cost functions is discussed and investigated with respect to …
Accelerated algorithms for temporal difference learning methods
A Rankawat - 2023 - papyrus.bib.umontreal.ca
The central idea of this thesis is to understand the notion of acceleration in stochastic
approximation algorithms. Specifically, we attempt to answer the question: How does …
Improving Sample Efficiency of Online Temporal Difference Learning
Y Pan - 2021 - era.library.ualberta.ca
A common scientific challenge for putting a reinforcement learning agent into practice is how
to improve sample efficiency as much as possible with limited computational or memory …
Vector Step-size Adaptation for Continual, Online Prediction
A Jacobsen - 2019 - era.library.ualberta.ca
In this thesis, we investigate different vector step-size adaptation approaches for continual,
online prediction problems. Vanilla stochastic gradient descent can be considerably …