Analysis of temporal-difference learning with function approximation

J Tsitsiklis, B Van Roy - Advances in neural information …, 1996 - proceedings.neurips.cc
We present new results about the temporal-difference learning algorithm, as
applied to approximating the cost-to-go function of a Markov chain using linear function …

[CITATION][C] An analysis of temporal-difference learning with function approximation

JN Tsitsiklis, B Van Roy - IEEE Transactions on Automatic Control, 1997 - cir.nii.ac.jp
An analysis of temporal-difference learning with function approximation | CiNii Research

An analysis of temporal-difference learning with function approximation

JN Tsitsiklis, B Van Roy - IEEE Transactions on Automatic …, 1997 - ieeexplore.ieee.org
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

[PDF][PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - web.mit.edu
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

[PDF][PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - w3.mit.edu
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

[PDF][PDF] LIDS-P-2322 March 6, 1996

JN Tsitsiklis, B Van Roy - 1996 - academia.edu
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain, using a function approximator …

[PDF][PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - stanford.edu
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

[PDF][PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - rll.berkeley.edu
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

Analysis of temporal-difference learning with function approximation

JN Tsitsiklis, B Van Roy - … of the 9th International Conference on Neural …, 1996 - dl.acm.org
We present new results about the temporal-difference learning algorithm, as applied to
approximating the cost-to-go function of a Markov chain using linear function approximators …

[PDF][PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - derongliu.org
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …