All versions - Academic resource search

Analysis of temporal-difference learning with function approximation

J Tsitsiklis, B Van Roy - Advances in neural information …, 1996 - proceedings.neurips.cc

We present new results about the temporal-difference learning algorithm, as
applied to approximating the cost-to-go function of a Markov chain using linear function …

Cited by 2299 Related articles

[CITATION][C] An analysis of temporal-difference learning with function approximation

JN Tsitsiklis, B Van Roy - IEEE Transactions on Automatic Control, 1997 - cir.nii.ac.jp

An analysis of temporal-difference learning with function approximation | CiNii Research
CiNii National Institute of Informatics Scholarly and Academic Information Navigator [CiNii] Go to details Go to search form Articles and data …

An analysis of temporal-difference learning with function approximation

JN Tsitsiklis, B Van Roy - IEEE Transactions on Automatic …, 1997 - ieeexplore.ieee.org

We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

[PDF] mit.edu

[PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - web.mit.edu

We discuss the temporal-difference learning algorithm, as applied to approximating the
cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

[PDF] mit.edu

[PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - w3.mit.edu

We discuss the temporal-difference learning algorithm, as applied to approximating the
cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

[PDF] academia.edu

[PDF] LIDS-P-2322 March 6, 1996

JN Tsitsiklis, B Van Roy - 1996 - academia.edu

We discuss the temporal-difference learning algorithm, as applied to approximating the cost-
to-go function of an infinite-horizon discounted Markov chain, using a function approximator …

[PDF] stanford.edu

[PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - stanford.edu

We discuss the temporal-difference learning algorithm, as applied to approximating the
cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

[PDF] berkeley.edu

[PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - rll.berkeley.edu

We discuss the temporal-difference learning algorithm, as applied to approximating the
cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …

Analysis of temporal-difference learning with function approximation

JN Tsitsiklis, B Van Roy - … of the 9th International Conference on Neural …, 1996 - dl.acm.org

We present new results about the temporal-difference learning algorithm, as applied to
approximating the cost-to-go function of a Markov chain using linear function approximators …

[PDF] derongliu.org

[PDF] An Analysis of Temporal-Difference Learning with Function Approximation

JN Tsitsiklis, B Van Roy - IEEE TRANSACTIONS ON AUTOMATIC …, 1997 - derongliu.org

We discuss the temporal-difference learning algorithm, as applied to approximating the
cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze …