TD learning with constrained gradients

T Pohlen, B Piot, T Hester, MG Azar, D Horgan… - arXiv preprint arXiv …, 2018 - arxiv.org

Despite significant advances in the field of deep Reinforcement Learning (RL), today's
algorithms still fail to learn human-level policies consistently over a set of diverse tasks such …

Cited by 140 - Related articles - All 2 versions

[PDF] arxiv.org

Towards characterizing divergence in deep Q-learning

J Achiam, E Knight, P Abbeel - arXiv preprint arXiv:1903.08894, 2019 - arxiv.org

Deep Q-Learning (DQL), a family of temporal difference algorithms for control, employs three
techniques collectively known as the 'deadly triad' in reinforcement learning: bootstrapping, off …

Cited by 113 - Related articles - All 3 versions

[PDF] frontiersin.org

Constrained deep Q-learning gradually approaching ordinary Q-learning

S Ohnishi, E Uchibe, Y Yamaguchi… - Frontiers in …, 2019 - frontiersin.org

A deep Q network (DQN) (Mnih et al.,) is an extension of Q learning, which is a typical deep
reinforcement learning method. In DQN, a Q function expresses all action values under all …

Cited by 71 - Related articles - All 8 versions

[PDF] ieee.org

Vehicle to grid frequency regulation capacity optimal scheduling for battery swapping station using deep Q-network

X Wang, J Wang, J Liu - IEEE Transactions on Industrial …, 2020 - ieeexplore.ieee.org

Battery swapping stations (BSSs) are ideal candidates for fast frequency regulation services
(FFRS) due to their large battery stock capacity. In addition, BSSs can precharge batteries …

Cited by 67 - Related articles - All 4 versions

[PDF] arxiv.org

Two-stage WECC composite load modeling: A double deep Q-learning networks approach

X Wang, Y Wang, D Shi, J Wang… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

With the increasing complexity of modern power systems, conventional dynamic load
modeling with ZIP and induction motors (ZIP + IM) is no longer adequate to address the …

Cited by 72 - Related articles - All 5 versions

[PDF] mlr.press

GRAC: Self-guided and self-regularized actor-critic

L Shao, Y You, M Yan, S Yuan… - Conference on Robot …, 2022 - proceedings.mlr.press

Deep reinforcement learning (DRL) algorithms have successfully been demonstrated on a
range of challenging decision making and control tasks. One dominant component of recent …

Cited by 25 - Related articles - All 5 versions

[PDF] arxiv.org

Adaptively calibrated critic estimates for deep reinforcement learning

N Dorka, T Welschehold, J Bödecker… - IEEE Robotics and …, 2022 - ieeexplore.ieee.org

Accurate value estimates are important for off-policy reinforcement learning. Algorithms
based on temporal difference learning typically are prone to an over- or underestimation bias …

Cited by 10 - Related articles - All 8 versions

Stabilizing deep Q-learning with Q-graph-based bounds

S Hoppe, M Giftthaler, R Krug… - … International Journal of …, 2023 - journals.sagepub.com

State-of-the-art deep reinforcement learning has enabled autonomous agents to learn
complex strategies from scratch on many problems including continuous control tasks. Deep …

Cited by 1 - Related articles - All 3 versions

[PDF] wiley.com

Deep Q‐learning: A robust control approach

B Varga, B Kulcsár… - International Journal of …, 2023 - Wiley Online Library

This work aims at constructing a bridge between robust control theory and reinforcement
learning. Although reinforcement learning has shown admirable results in complex control …

Cited by 12 - Related articles - All 5 versions

[PDF] mdpi.com

Temporal Consistency-Based Loss Function for Both Deep Q-Networks and Deep Deterministic Policy Gradients for Continuous Actions

C Kim - Symmetry, 2021 - mdpi.com

Artificial intelligence (AI) techniques in power grid control and energy management in
building automation require both deep Q-networks (DQNs) and deep deterministic policy …

Cited by 4 - Related articles - All 4 versions