Accelerated gradient temporal difference learning

Y Pan, A White, M White - Proceedings of the AAAI Conference on …, 2017 - ojs.aaai.org
The family of temporal difference (TD) methods spans a spectrum from computationally frugal
linear methods like TD(λ) to data-efficient least-squares methods. Least-squares methods …
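
At the frugal end of that spectrum, linear TD(λ) needs only O(d) compute and memory per step. A minimal sketch, assuming a fixed linear feature map and accumulating traces; the names `alpha`, `gamma`, and `lam` are illustrative, not taken from the paper:

```python
import numpy as np

def td_lambda_update(w, z, phi, r, phi_next, alpha=0.01, gamma=0.99, lam=0.9):
    """One linear TD(lambda) step with an accumulating eligibility trace.

    w        -- weight vector; the value estimate is V(s) = w @ phi(s)
    z        -- eligibility trace, same shape as w
    phi      -- feature vector of the current state
    r        -- observed reward
    phi_next -- feature vector of the next state
    """
    delta = r + gamma * (w @ phi_next) - (w @ phi)  # TD error
    z = gamma * lam * z + phi                       # decay trace, add current features
    w = w + alpha * delta * z                       # O(d) update per step
    return w, z
```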

Meta-descent for online, continual prediction

A Jacobsen, M Schlegel, C Linke, T Degris… - Proceedings of the …, 2019 - ojs.aaai.org
This paper investigates different vector step-size adaptation approaches for non-stationary
online, continual prediction problems. Vanilla stochastic gradient descent can be …
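
The vector step-size idea this line of work builds on goes back to Sutton's IDBD, which maintains one adaptive step size per weight. A minimal sketch of IDBD for a linear least-mean-squares learner; the paper's own meta-descent algorithms differ in detail:

```python
import numpy as np

def idbd_update(w, beta, h, x, target, theta=0.01):
    """One step of Sutton's IDBD: per-weight (vector) step sizes
    alpha_i = exp(beta_i) are adapted online by meta-descent, so the
    meta step size `theta` is the only tuned scalar."""
    delta = target - w @ x                     # prediction error
    beta = beta + theta * delta * x * h        # meta-descent on log step sizes
    alpha = np.exp(beta)                       # per-weight step sizes
    w = w + alpha * delta * x                  # LMS update with a vector step size
    h = h * np.clip(1.0 - alpha * x * x, 0.0, None) + alpha * delta * x
    return w, beta, h
```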

[BOOK][B] Generalization, optimization, diverse generation: insights and advances in the use of bootstrapping in deep neural networks

E Bengio - 2022 - search.proquest.com
This thesis investigates the use of bootstrapping in Temporal Difference (TD) learning, a
central mechanism in reinforcement learning (RL), when applied to deep neural networks. I …
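
For context, bootstrapping means the regression target for the value function contains the learner's own current estimate rather than a full Monte Carlo return. A minimal sketch of the one-step bootstrapped target; `q_next` stands in for whatever next-state value estimate the agent uses:

```python
def td_target(r, gamma, done, q_next):
    """One-step bootstrapped TD target: unlike a Monte Carlo return,
    the label r + gamma * q_next reuses the learner's own estimate
    q_next of the next state's value (zeroed at episode end)."""
    return r + gamma * (1.0 - done) * q_next
```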

Accelerated Gradient Algorithms for Robust Temporal Difference Learning

DJ Meyer - 2021 - mediatum.ub.tum.de
This thesis deals with linearly approximated gradient temporal difference learning. The
applicability of the underlying cost functions is discussed and investigated with respect to …
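
The usual cost function in linearly approximated gradient TD work is the mean-squared projected Bellman error (MSPBE); the thesis may analyse variants, but the standard form (due to Sutton et al.) is:

```latex
\mathrm{MSPBE}(\theta)
  = \lVert V_\theta - \Pi T^{\pi} V_\theta \rVert_D^2
  = \mathbb{E}[\delta\,\phi]^{\top}\,\mathbb{E}[\phi\phi^{\top}]^{-1}\,\mathbb{E}[\delta\,\phi],
\qquad
\delta = r + \gamma\,\theta^{\top}\phi' - \theta^{\top}\phi .
```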

Accelerated algorithms for temporal difference learning methods

A Rankawat - 2023 - papyrus.bib.umontreal.ca
The central idea of this thesis is to understand the notion of acceleration in stochastic
approximation algorithms. Specifically, we attempt to answer the question: How does …
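
One standard notion of acceleration in stochastic approximation is Polyak-Ruppert iterate averaging (momentum and Nesterov variants are the other common route); whether the thesis studies this exact scheme is an assumption. A minimal sketch, with `grad_fn` a hypothetical noisy-update oracle:

```python
import numpy as np

def sa_with_averaging(grad_fn, w0, steps=10_000, c=1.0, p=0.75, rng=None):
    """Stochastic approximation w_{t+1} = w_t - a_t * grad_fn(w_t) with a
    slowly decaying step size a_t = c / t**p, plus Polyak-Ruppert averaging
    of the iterates, which recovers the optimal asymptotic rate."""
    rng = rng or np.random.default_rng(0)
    w = np.array(w0, dtype=float)
    w_bar = w.copy()
    for t in range(1, steps + 1):
        a_t = c / t**p                      # slow decay: p in (1/2, 1)
        w = w - a_t * grad_fn(w, rng)       # noisy update direction
        w_bar += (w - w_bar) / t            # running average of iterates
    return w_bar
```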

Improving Sample Efficiency of Online Temporal Difference Learning

Y Pan - 2021 - era.library.ualberta.ca
A common scientific challenge in putting a reinforcement learning agent into practice is how
to improve sample efficiency as much as possible under limited computational or memory …

Vector Step-size Adaptation for Continual, Online Prediction

A Jacobsen - 2019 - era.library.ualberta.ca
In this thesis, we investigate different vector step-size adaptation approaches for continual,
online prediction problems. Vanilla stochastic gradient descent can be considerably …