- Academic resource search

Transformers in reinforcement learning: a survey

P Agarwal, AA Rahman, PL St-Charles… - arXiv preprint arXiv …, 2023 - arxiv.org

Transformers have significantly impacted domains like natural language processing,
computer vision, and robotics, where they improve performance compared to other neural …

Cited by 16 Related articles All 2 versions

Exploiting the flexibility inside park-level commercial buildings considering heat transfer time delay: A memory-augmented deep reinforcement learning approach

H Zhao, B Wang, H Liu, H Sun, Z Pan… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

The energy consumed by commercial buildings for heating and cooling is significantly
increased. To better cope with the uncertainty introduced by the high penetration of …

Cited by 26 Related articles All 2 versions

[PDF] mlr.press

Learning belief representations for partially observable deep RL

A Wang, AC Li, TQ Klassen, RT Icarte… - International …, 2023 - proceedings.mlr.press

Many important real-world Reinforcement Learning (RL) problems involve partial
observability and require policies with memory. Unfortunately, standard deep RL algorithms …

Cited by 9 Related articles All 6 versions

[PDF] arxiv.org

Learning reward machines: A study in partially observable reinforcement learning

RT Icarte, TQ Klassen, R Valenzano, MP Castro… - Artificial Intelligence, 2023 - Elsevier

Reinforcement Learning (RL) is a machine learning paradigm wherein an artificial agent
interacts with an environment with the purpose of learning behaviour that maximizes the …

Cited by 19 Related articles All 9 versions

[PDF] arxiv.org

Learning what to memorize: Using intrinsic motivation to form useful memory in partially observable reinforcement learning

A Demir - Applied Intelligence, 2023 - Springer

Reinforcement Learning faces an important challenge in partially observable environments
with long-term dependencies. In order to learn in an ambiguous environment, an agent has …

Cited by 5 Related articles All 7 versions

Geometry of Optimization in Markov Decision Processes and Neural Network-Based PDE Solvers

J Müller - 2023 - ul.qucosa.de

Abstract (EN) This thesis is divided into two parts dealing with the optimization problems in
Markov decision processes (MDPs) and different neural network-based numerical solvers …

Cited by 2 Related articles

[PDF] smu.edu.sg

Augmenting decision with hypothesis in reinforcement learning

MQ NGUYEN, HW LAUW - 2024 - ink.library.smu.edu.sg

Value-based reinforcement learning is the current State-Of-The-Art due to high sampling
efficiency. However, our study shows it suffers from low exploitation in early training period …

[PDF] openreview.net

Augmenting Decision with Hypothesis in Reinforcement Learning

NM Quang, HW Lauw - Forty-first International Conference on Machine … - openreview.net

Value-based reinforcement learning is the current State-Of-The-Art due to high sampling
efficiency. However, our study shows it suffers from low exploitation in early training period …

[PDF] arxiv.org

How memory architecture affects learning in a simple POMDP: the two-hypothesis testing problem

M Geiger, C Eloy, M Wyart - arXiv preprint arXiv:2106.08849, 2021 - arxiv.org

Reinforcement learning is generally difficult for partially observable Markov decision
processes (POMDPs), which occurs when the agent's observation is partial or noisy. To seek …