Transformers in reinforcement learning: a survey
P Agarwal, AA Rahman, PL St-Charles… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers have significantly impacted domains like natural language processing,
computer vision, and robotics, where they improve performance compared to other neural …
Exploiting the flexibility inside park-level commercial buildings considering heat transfer time delay: A memory-augmented deep reinforcement learning approach
The energy consumed by commercial buildings for heating and cooling is significantly
increased. To better cope with the uncertainty introduced by the high penetration of …
Learning belief representations for partially observable deep RL
Many important real-world Reinforcement Learning (RL) problems involve partial
observability and require policies with memory. Unfortunately, standard deep RL algorithms …
Learning reward machines: A study in partially observable reinforcement learning
Reinforcement Learning (RL) is a machine learning paradigm wherein an artificial agent
interacts with an environment with the purpose of learning behaviour that maximizes the …
Learning what to memorize: Using intrinsic motivation to form useful memory in partially observable reinforcement learning
A Demir - Applied Intelligence, 2023 - Springer
Reinforcement Learning faces an important challenge in partially observable environments
with long-term dependencies. In order to learn in an ambiguous environment, an agent has …
Geometry of Optimization in Markov Decision Processes and Neural Network-Based PDE Solvers
J Müller - 2023 - ul.qucosa.de
Abstract (EN) This thesis is divided into two parts dealing with the optimization problems in
Markov decision processes (MDPs) and different neural network-based numerical solvers …
Augmenting decision with hypothesis in reinforcement learning
MQ NGUYEN, HW LAUW - 2024 - ink.library.smu.edu.sg
Value-based reinforcement learning is the current State-Of-The-Art due to high sampling
efficiency. However, our study shows it suffers from low exploitation in early training period …
Augmenting Decision with Hypothesis in Reinforcement Learning
Value-based reinforcement learning is the current State-Of-The-Art due to high sampling
efficiency. However, our study shows it suffers from low exploitation in early training period …
How memory architecture affects learning in a simple POMDP: the two-hypothesis testing problem
Reinforcement learning is generally difficult for partially observable Markov decision
processes (POMDPs), which occurs when the agent's observation is partial or noisy. To seek …