Adaptive temporal difference learning with linear function approximation

T Sun, H Shen, T Chen, D Li - IEEE Transactions on Pattern …, 2021 - ieeexplore.ieee.org
This paper revisits the temporal difference (TD) learning algorithm for the policy evaluation
tasks in reinforcement learning. Typically, the performance of TD (0) and TD (λ) is very …

Adaptive Temporal Difference Learning with Linear Function Approximation

T Sun, H Shen, T Chen, D Li - arXiv preprint arXiv:2002.08537, 2020 - arxiv.org
This paper revisits the temporal difference (TD) learning algorithm for the policy evaluation
tasks in reinforcement learning. Typically, the performance of TD (0) and TD ($\lambda$) is …

Adaptive Temporal Difference Learning With Linear Function Approximation

T Sun, H Shen, T Chen, D Li - IEEE transactions on …, 2022 - pubmed.ncbi.nlm.nih.gov
This paper revisits the temporal difference (TD) learning algorithm for the policy evaluation
tasks in reinforcement learning. Typically, the performance of TD (0) and TD (λ) is very …

Adaptive Temporal Difference Learning With Linear Function Approximation

T Sun, H Shen, T Chen, D Li - IEEE Transactions on Pattern Analysis …, 2022 - computer.org
This paper revisits the temporal difference (TD) learning algorithm for the policy evaluation
tasks in reinforcement learning. Typically, the performance of TD (0) and TD (λ) is very …

Adaptive Temporal Difference Learning with Linear Function Approximation

T Sun, H Shen, T Chen, D Li - arXiv e-prints, 2020 - ui.adsabs.harvard.edu
This paper revisits the temporal difference (TD) learning algorithm for the policy evaluation
tasks in reinforcement learning. Typically, the performance of TD (0) and TD ($\lambda$) is …

Adaptive Temporal Difference Learning With Linear Function Approximation.

T Sun, H Shen, T Chen, D Li - IEEE Transactions on Pattern …, 2022 - europepmc.org
This paper revisits the temporal difference (TD) learning algorithm for the policy evaluation
tasks in reinforcement learning. Typically, the performance of TD (0) and TD (λ) is very …