Deep reinforcement learning based video games: A review

KA ElDahshan, H Farouk… - 2022 2nd International …, 2022 - ieeexplore.ieee.org
Video game development is getting increasingly effective as AI paradigms advance. Deep
Reinforcement Learning (DRL) is a promising artificial intelligence (AI) approach. It …

Adaptive ensemble Q-learning: Minimizing estimation bias via error feedback

H Wang, S Lin, J Zhang - Advances in neural information …, 2021 - proceedings.neurips.cc
The ensemble method is a promising way to mitigate the overestimation issue in Q-learning,
where multiple function approximators are used to estimate the action values. It is known …

On the estimation bias in double Q-learning

Z Ren, G Zhu, H Hu, B Han, J Chen… - Advances in Neural …, 2021 - proceedings.neurips.cc
Double Q-learning is a classical method for reducing overestimation bias, which is caused
by taking maximum estimated values in the Bellman operation. Its variants in the deep Q …

Two-Step Q-Learning

A Vijesh - arXiv preprint arXiv:2407.02369, 2024 - arxiv.org
Q-learning is a stochastic approximation version of the classic value iteration. The literature
has established that Q-learning suffers from both maximization bias and slower …

ISFORS-MIX: Multi-Agent Reinforcement Learning with Importance-Sampling-Free Off-policy learning and Regularized-Softmax Mixing Network

J Rao, C Wang, M Liu, J Lei, W Giernacki - Knowledge-Based Systems, 2024 - Elsevier
In multi-agent reinforcement learning (MARL), the low quality of value function and the
estimation bias and variance in value function decomposition (VFD) are critical challenges …

A Meta-Learning Approach to Mitigating the Estimation Bias of Q-Learning

T Tan, H Xie, X Shi, M Shang - ACM Transactions on Knowledge …, 2024 - dl.acm.org
It is a longstanding problem that Q-learning suffers from the overestimation bias. This issue
originates from the fact that Q-learning uses the expectation of maximum Q-value to …

Adaptive moving average Q-learning

T Tan, H Xie, Y Xia, X Shi, M Shang - Knowledge and Information Systems, 2024 - Springer
A variety of algorithms have been proposed to address the long-standing overestimation
bias problem of Q-learning. Reducing this overestimation bias may lead to an …

Q-learning with heterogeneous update strategy

T Tan, H Xie, L Feng - Information Sciences, 2024 - Elsevier
A variety of algorithms has been proposed to mitigate the overestimation bias of Q-learning.
These algorithms reduce the estimation of maximum Q-value, i.e., homogeneous update. As …

Smoothed Q-learning

D Barber - arXiv preprint arXiv:2303.08631, 2023 - arxiv.org
In Reinforcement Learning the Q-learning algorithm provably converges to the optimal
solution. However, as others have demonstrated, Q-learning can also overestimate the …

Research on Energy-Saving Speed Curve of Heavy Haul Train Based on Reinforcement Learning

W Zhang, X Sun, Z Liu, L Yang - 2023 IEEE 26th International …, 2023 - ieeexplore.ieee.org
Total freight transportation is increasing year by year, and freight railway energy
consumption is also increasing. This paper studies the energy-saving speed curve of heavy …