Self-correcting Q-learning

KA ElDahshan, H Farouk… - 2022 2nd International …, 2022 - ieeexplore.ieee.org

Video game development is getting increasingly effective as AI paradigms advance. Deep
Reinforcement Learning (DRL) is a promising artificial intelligence (AI) approach. It …

Cited by 11 · Related articles

Adaptive ensemble Q-learning: Minimizing estimation bias via error feedback

H Wang, S Lin, J Zhang - Advances in neural information …, 2021 - proceedings.neurips.cc

The ensemble method is a promising way to mitigate the overestimation issue in Q-learning,
where multiple function approximators are used to estimate the action values. It is known …

Cited by 23 · Related articles · All 9 versions

On the estimation bias in double Q-learning

Z Ren, G Zhu, H Hu, B Han, J Chen… - Advances in Neural …, 2021 - proceedings.neurips.cc

Double Q-learning is a classical method for reducing overestimation bias, which is caused
by taking maximum estimated values in the Bellman operation. Its variants in the deep Q …

Cited by 25 · Related articles · All 8 versions

Two-Step Q-Learning

A Vijesh - arXiv preprint arXiv:2407.02369, 2024 - arxiv.org

Q-learning is a stochastic approximation version of the classic value iteration. The literature
has established that Q-learning suffers from both maximization bias and slower …

Cited by 2 · Related articles · All 2 versions

ISFORS-MIX: Multi-Agent Reinforcement Learning with Importance-Sampling-Free Off-policy learning and Regularized-Softmax Mixing Network

J Rao, C Wang, M Liu, J Lei, W Giernacki - Knowledge-Based Systems, 2024 - Elsevier

In multi-agent reinforcement learning (MARL), the low quality of value function and the
estimation bias and variance in value function decomposition (VFD) are critical challenges …

A Meta-Learning Approach to Mitigating the Estimation Bias of Q-Learning

T Tan, H Xie, X Shi, M Shang - ACM Transactions on Knowledge …, 2024 - dl.acm.org

It is a longstanding problem that Q-learning suffers from the overestimation bias. This issue
originates from the fact that Q-learning uses the expectation of maximum Q-value to …

Adaptive moving average Q-learning

T Tan, H Xie, Y Xia, X Shi, M Shang - Knowledge and Information Systems, 2024 - Springer

A variety of algorithms have been proposed to address the long-standing overestimation
bias problem of Q-learning. Reducing this overestimation bias may lead to an …

Q-learning with heterogeneous update strategy

T Tan, H Xie, L Feng - Information Sciences, 2024 - Elsevier

A variety of algorithms has been proposed to mitigate the overestimation bias of Q-learning.
These algorithms reduce the estimation of maximum Q-value, i.e., homogeneous update. As …

Cited by 4 · Related articles · All 3 versions

Smoothed Q-learning

D Barber - arXiv preprint arXiv:2303.08631, 2023 - arxiv.org

In Reinforcement Learning the Q-learning algorithm provably converges to the optimal
solution. However, as others have demonstrated, Q-learning can also overestimate the …

Cited by 3 · Related articles · All 2 versions

Research on Energy-Saving Speed Curve of Heavy Haul Train Based on Reinforcement Learning

W Zhang, X Sun, Z Liu, L Yang - 2023 IEEE 26th International …, 2023 - ieeexplore.ieee.org

Total freight transportation is increasing year by year, and freight railway energy
consumption is also increasing. This paper studies the energy-saving speed curve of heavy …

Cited by 1 · Related articles

Deep reinforcement learning based video games: A review