Recent advances in deep reinforcement learning applications for solving partially observable Markov decision processes (POMDP) problems: Part 1—fundamentals …
X Xiang, S Foo - Machine Learning and Knowledge Extraction, 2021 - mdpi.com
The first part of a two-part series of papers provides a survey on recent advances in Deep
Reinforcement Learning (DRL) applications for solving partially observable Markov decision …
[BOOK][B] Neural networks and deep learning
CC Aggarwal - 2018 - Springer
“Any AI smart enough to pass a Turing test is smart enough to know to fail it.” – Ian
McDonald. Neural networks were developed to simulate the human nervous system for …
[HTML][HTML] The Hanabi challenge: A new frontier for AI research
From the early days of computing, games have been important testbeds for studying how
well machines can do sophisticated decision making. In recent years, machine learning has …
Bayesian reinforcement learning: A survey
Bayesian methods for machine learning have been widely investigated, yielding principled
methods for incorporating prior information into inference algorithms. In this survey, we …
Incremental natural actor-critic algorithms
S Bhatnagar, M Ghavamzadeh… - Advances in neural …, 2007 - proceedings.neurips.cc
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient
ideas, and provide their convergence proofs. Actor-critic reinforcement learning …
[PDF][PDF] Intelligent traffic light control
Vehicular travel is increasing throughout the world, particularly in large urban areas.
Therefore the need arises for simulating and optimizing traffic control algorithms to better …
Programming backgammon using self-teaching neural nets
G Tesauro - Artificial Intelligence, 2002 - Elsevier
TD-Gammon is a neural network that is able to teach itself to play backgammon solely by
playing against itself and learning from the results. Starting from random initial play, TD …
Learning to search with MCTSnets
Planning problems are among the most important and well-studied problems in artificial
intelligence. They are most typically solved by tree search algorithms that simulate ahead …
TD-Gammon: A self-teaching backgammon program
G Tesauro - Applications of neural networks, 1995 - Springer
Furthermore, when a set of hand-crafted features is added to the network's input
representation, the result is a truly staggering level of performance: TD-Gammon is now …
[PDF][PDF] Reinforcement learning in board games
I Ghory - Department of Computer Science, University of Bristol …, 2004 - 107.167.189.191
This project investigates the application of the TD (λ) reinforcement learning algorithm and
neural networks to the problem of producing an agent that can play board games. It provides …