Reinforcement learning algorithms with function approximation: Recent advances and applications
X Xu, L Zuo, Z Huang - Information sciences, 2014 - Elsevier
In recent years, the research on reinforcement learning (RL) has focused on function
approximation in learning prediction and control of Markov decision processes (MDPs). The …
[BOOK][B] Algorithms for reinforcement learning
C Szepesvári - 2022 - books.google.com
Reinforcement learning is a learning paradigm concerned with learning to control a system
so as to maximize a numerical performance measure that expresses a long-term objective …
[BOOK][B] Reinforcement learning and dynamic programming using function approximators
From household appliances to applications in robotics, engineered systems involving
complex dynamics can only be as effective as the algorithms that control them. While …
[PDF][PDF] Policy evaluation with temporal differences: A survey and comparison
Policy evaluation is an essential step in most reinforcement learning approaches. It yields a
value function, the quality assessment of states for a given policy, which can be used in a …
Stochastic variance reduction methods for policy evaluation
Policy evaluation is concerned with estimating the value function that predicts long-term
values of states under a given policy. It is a crucial step in many reinforcement-learning …
Reinforcement learning in continuous state and action spaces
H Van Hasselt - Reinforcement Learning: State-of-the-Art, 2012 - Springer
Many traditional reinforcement-learning algorithms have been designed for problems with
small finite state and action spaces. Learning in such discrete problems can be difficult …
Error propagation for approximate policy and value iteration
A Farahmand, C Szepesvári… - Advances in neural …, 2010 - proceedings.neurips.cc
We address the question of how the approximation error/Bellman residual at each iteration
of the Approximate Policy/Value Iteration algorithms influences the quality of the resulted …
Loss dynamics of temporal difference reinforcement learning
Reinforcement learning has been successful across several applications in which agents
have to learn to act in environments with sparse feedback. However, despite this empirical …
Least-squares methods for policy iteration
Approximate reinforcement learning deals with the essential problem of applying
reinforcement learning in large and continuous state-action spaces, by using function …
Accelerated gradient temporal difference learning
The family of temporal difference (TD) methods spans a spectrum from computationally frugal
linear methods like TD(λ) to data-efficient least-squares methods. Least-squares methods …