Transformers in reinforcement learning: a survey

P Agarwal, AA Rahman, PL St-Charles… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers have significantly impacted domains like natural language processing,
computer vision, and robotics, where they improve performance compared to other neural …

Exploiting the flexibility inside park-level commercial buildings considering heat transfer time delay: A memory-augmented deep reinforcement learning approach

H Zhao, B Wang, H Liu, H Sun, Z Pan… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The energy consumed by commercial buildings for heating and cooling has increased
significantly. To better cope with the uncertainty introduced by the high penetration of …

Learning belief representations for partially observable deep RL

A Wang, AC Li, TQ Klassen, RT Icarte… - International …, 2023 - proceedings.mlr.press
Many important real-world Reinforcement Learning (RL) problems involve partial
observability and require policies with memory. Unfortunately, standard deep RL algorithms …

Learning reward machines: A study in partially observable reinforcement learning

RT Icarte, TQ Klassen, R Valenzano, MP Castro… - Artificial Intelligence, 2023 - Elsevier
Reinforcement Learning (RL) is a machine learning paradigm wherein an artificial agent
interacts with an environment with the purpose of learning behaviour that maximizes the …

Learning what to memorize: Using intrinsic motivation to form useful memory in partially observable reinforcement learning

A Demir - Applied Intelligence, 2023 - Springer
Reinforcement Learning faces an important challenge in partially observable environments
with long-term dependencies. In order to learn in an ambiguous environment, an agent has …

Geometry of Optimization in Markov Decision Processes and Neural Network-Based PDE Solvers

J Müller - 2023 - ul.qucosa.de
This thesis is divided into two parts dealing with the optimization problems in
Markov decision processes (MDPs) and different neural network-based numerical solvers …

Augmenting decision with hypothesis in reinforcement learning

MQ Nguyen, HW Lauw - 2024 - ink.library.smu.edu.sg
Value-based reinforcement learning is the current state of the art due to its high sampling
efficiency. However, our study shows it suffers from low exploitation in the early training period …

Augmenting Decision with Hypothesis in Reinforcement Learning

NM Quang, HW Lauw - Forty-first International Conference on Machine … - openreview.net
Value-based reinforcement learning is the current state of the art due to its high sampling
efficiency. However, our study shows it suffers from low exploitation in the early training period …

How memory architecture affects learning in a simple POMDP: the two-hypothesis testing problem

M Geiger, C Eloy, M Wyart - arXiv preprint arXiv:2106.08849, 2021 - arxiv.org
Reinforcement learning is generally difficult for partially observable Markov decision
processes (POMDPs), which arise when the agent's observation is partial or noisy. To seek …