A finite time analysis of temporal difference learning with linear function approximation - Academic Resource Search

A finite time analysis of temporal difference learning with linear function approximation

J Bhandari, D Russo, R Singal - Conference on learning …, 2018 - proceedings.mlr.press

… explicit finite time analysis of temporal difference learning with linear function approximation.
… and explicit finite time analysis of temporal difference learning. We draw inspiration from the …

Cited by 438 · Related articles · All 11 versions

[PDF] mlr.press

Finite-time analysis of decentralized temporal-difference learning with linear function approximation

J Sun, G Wang, GB Giannakis… - International …, 2020 - proceedings.mlr.press

… of a decentralized linear function approximation variant of the vanilla TD(0) learning, for …
We proved that such decentralized TD(0) algorithms converge linearly to a small neighborhood …

Cited by 58 · Related articles · All 9 versions

[PDF] mlr.press

Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation

G Patil, LA Prashanth, D Nagaraj… - International …, 2023 - proceedings.mlr.press

… TD with function approximation used for our analysis. In Section 3, we describe the tail-averaged
TD algorithm, and also present the finite time … in a TD algorithm, and provide finite time …

Cited by 18 · Related articles · All 3 versions

[PDF] siam.org

Finite-time performance of distributed temporal-difference learning with linear function approximation

TT Doan, ST Maguluri, J Romberg - SIAM Journal on Mathematics of Data …, 2021 - SIAM

… we study a distributed variant of the temporal-difference learning method for solving the policy
evaluation problem in multi-agent reinforcement learning… , followed by a local TD(λ) update. …

Cited by 53 · Related articles · All 3 versions

[PDF] neurips.cc

Analysis of temporal-difference learning with function approximation

J Tsitsiklis, B Van Roy - Advances in neural information …, 1996 - proceedings.neurips.cc

… of a Markov chain using linear function approximators. The algorithm we … P and that at
time t the parameter vector r has been set to some value rt. We define the temporal difference dt …

Cited by 2299 · Related articles · All 28 versions

[PDF] arxiv.org

Adaptive temporal difference learning with linear function approximation

T Sun, H Shen, T Chen, D Li - … Transactions on Pattern Analysis …, 2021 - ieeexplore.ieee.org

… of the TD(0) learning algorithm with linear function approximation that we term AdaTD (0). In
contrast to the TD(0)… Singal, “A finite time analysis of temporal difference learning with linear …

Cited by 31 · Related articles · All 6 versions

[PDF] mit.edu

[PDF] Improved temporal difference methods with linear function approximation

DP Bertsekas, VS Borkar, A Nedic - Learning and Approximate Dynamic …, 2004 - mit.edu

… Summary: This chapter considers temporal difference algorithms within the context of
infinite-horizon finite… discounted cost and linear cost function approximation. This problem arises …

Cited by 103 · Related articles · All 14 versions

[PDF] springer.com

On the convergence of temporal-difference learning with linear function approximation

V Tadić - Machine learning, 2001 - Springer

… and asymptotic approximation error of temporal-difference learning algorithms … linear function
approximation are analyzed. The analysis is carried out in the context of the approximation …

Cited by 76 · Related articles · All 13 versions

[PDF] incompleteideas.net

Fast gradient-descent methods for temporal-difference learning with linear function approximation

RS Sutton, HR Maei, D Precup, S Bhatnagar… - … on machine learning, 2009 - dl.acm.org

… We have introduced two new gradient-based temporal-difference learning algorithms … linear
function approximation in a general setting that includes both on-policy and off-policy learning…

Cited by 745 · Related articles · All 28 versions

[PDF] neurips.cc

A Convergent Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation

RS Sutton, H Maei… - Advances in neural …, 2008 - proceedings.neurips.cc

… be practical to approximate the value of each state individually. Here we consider linear
function approximation, in which … The approximation to the value function is then required to be …

Cited by 283 · Related articles · All 4 versions