Deep reinforcement learning based video games: A review
KA ElDahshan, H Farouk… - 2022 2nd International …, 2022 - ieeexplore.ieee.org
Video game development is getting increasingly effective as AI paradigms advance. Deep
Reinforcement Learning (DRL) is a promising artificial intelligence (AI) approach. It …
Adaptive ensemble q-learning: Minimizing estimation bias via error feedback
The ensemble method is a promising way to mitigate the overestimation issue in Q-learning,
where multiple function approximators are used to estimate the action values. It is known …
On the estimation bias in double q-learning
Double Q-learning is a classical method for reducing overestimation bias, which is caused
by taking maximum estimated values in the Bellman operation. Its variants in the deep Q …
Two-Step Q-Learning
A Vijesh - arXiv preprint arXiv:2407.02369, 2024 - arxiv.org
Q-learning is a stochastic approximation version of the classic value iteration. The literature
has established that Q-learning suffers from both maximization bias and slower …
ISFORS-MIX: Multi-Agent Reinforcement Learning with Importance-Sampling-Free Off-policy learning and Regularized-Softmax Mixing Network
J Rao, C Wang, M Liu, J Lei, W Giernacki - Knowledge-Based Systems, 2024 - Elsevier
In multi-agent reinforcement learning (MARL), the low quality of value function and the
estimation bias and variance in value function decomposition (VFD) are critical challenges …
A Meta-Learning Approach to Mitigating the Estimation Bias of Q-Learning
It is a longstanding problem that Q-learning suffers from the overestimation bias. This issue
originates from the fact that Q-learning uses the expectation of maximum Q-value to …
Adaptive moving average Q-learning
A variety of algorithms have been proposed to address the long-standing overestimation
bias problem of Q-learning. Reducing this overestimation bias may lead to an …
Q-learning with heterogeneous update strategy
A variety of algorithms has been proposed to mitigate the overestimation bias of Q-learning.
These algorithms reduce the estimation of maximum Q-value, i.e., homogeneous update. As …
Smoothed Q-learning
D Barber - arXiv preprint arXiv:2303.08631, 2023 - arxiv.org
In Reinforcement Learning the Q-learning algorithm provably converges to the optimal
solution. However, as others have demonstrated, Q-learning can also overestimate the …
Research on Energy-Saving Speed Curve of Heavy Haul Train Based on Reinforcement Learning
W Zhang, X Sun, Z Liu, L Yang - 2023 IEEE 26th International …, 2023 - ieeexplore.ieee.org
Total freight transportation is increasing year by year, and freight railway energy
consumption is also increasing. This paper studies the energy-saving speed curve of heavy …