Observe and look further: Achieving consistent performance on Atari
Despite significant advances in the field of deep Reinforcement Learning (RL), today's
algorithms still fail to learn human-level policies consistently over a set of diverse tasks such …
Towards characterizing divergence in deep Q-learning
Deep Q-Learning (DQL), a family of temporal difference algorithms for control, employs three
techniques collectively known as the 'deadly triad' in reinforcement learning: bootstrapping, off …
Constrained deep Q-learning gradually approaching ordinary Q-learning
S Ohnishi, E Uchibe, Y Yamaguchi… - Frontiers in …, 2019 - frontiersin.org
A deep Q network (DQN) (Mnih et al.,) is an extension of Q learning, which is a typical deep
reinforcement learning method. In DQN, a Q function expresses all action values under all …
Vehicle to grid frequency regulation capacity optimal scheduling for battery swapping station using deep Q-network
Battery swapping stations (BSSs) are ideal candidates for fast frequency regulation services
(FFRS) due to their large battery stock capacity. In addition, BSSs can precharge batteries …
Two-stage WECC composite load modeling: A double deep Q-learning networks approach
With the increasing complexity of modern power system, conventional dynamic load
modeling with ZIP and induction motors (ZIP + IM) is no longer adequate to address the …
GRAC: Self-guided and self-regularized actor-critic
Deep reinforcement learning (DRL) algorithms have successfully been demonstrated on a
range of challenging decision making and control tasks. One dominant component of recent …
Adaptively calibrated critic estimates for deep reinforcement learning
Accurate value estimates are important for off-policy reinforcement learning. Algorithms
based on temporal difference learning typically are prone to an over- or underestimation bias …
Stabilizing deep Q-learning with Q-graph-based bounds
State-of-the-art deep reinforcement learning has enabled autonomous agents to learn
complex strategies from scratch on many problems including continuous control tasks. Deep …
Deep Q‐learning: A robust control approach
This work aims at constructing a bridge between robust control theory and reinforcement
learning. Although reinforcement learning has shown admirable results in complex control …
Temporal Consistency-Based Loss Function for Both Deep Q-Networks and Deep Deterministic Policy Gradients for Continuous Actions
C Kim - Symmetry, 2021 - mdpi.com
Artificial intelligence (AI) techniques in power grid control and energy management in
building automation require both deep Q-networks (DQNs) and deep deterministic policy …