Reinforcement learning in POMDPs with memoryless options and option-observation initiation sets

MZ Alom, TM Taha, C Yakopcic, S Westberg, P Sidike… - electronics, 2019 - mdpi.com

In recent years, deep learning has garnered tremendous success in a variety of application
domains. This new field of machine learning has been growing rapidly and has been …

Cited by 1821 · Related articles · All 9 versions

[PDF] saulius-grazulis.lt

The history began from AlexNet: A comprehensive survey on deep learning approaches

MZ Alom, TM Taha, C Yakopcic, S Westberg… - arXiv preprint arXiv …, 2018 - arxiv.org

Deep learning has demonstrated tremendous success in variety of application domains in
the past few years. This new field of machine learning has been growing rapidly and applied …

Cited by 1583 · Related articles · All 8 versions

[PDF] arxiv.org

A survey and critique of multiagent deep reinforcement learning

P Hernandez-Leal, B Kartal, ME Taylor - Autonomous Agents and Multi …, 2019 - Springer

Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

Cited by 685 · Related articles · All 8 versions

[PDF] researchgate.net

POMDP and hierarchical options MDP with continuous actions for autonomous driving at intersections

Z Qiao, K Muelling, J Dolan… - 2018 21st …, 2018 - ieeexplore.ieee.org

When applying autonomous driving technology to real-world scenarios, environmental
uncertainties make the development of decision-making algorithms difficult. Modeling the …

Cited by 75 · Related articles · All 4 versions

[PDF] researchgate.net

[PDF] Multi-objective reinforcement learning for the expected utility of the return

DM Roijers, D Steckelmacher… - Proceedings of the …, 2018 - researchgate.net

Real-world decision problems often have multiple, possibly conflicting, objectives. In multi-
objective reinforcement learning, the effects of actions in terms of these objectives must be …

Cited by 49 · Related articles · All 8 versions

[PDF] springer.com

Influence-aware memory architectures for deep reinforcement learning in POMDPs

M Suau, J He, E Congeduti, RAN Starre… - Neural Computing and …, 2022 - Springer

Due to its perceptual limitations, an agent may have too little information about the
environment to act optimally. In such cases, it is important to keep track of the action …

Cited by 6 · Related articles · All 8 versions

[PDF] neurips.cc

Reinforcement learning in Newcomblike environments

J Bell, L Linsefors, C Oesterheld… - Advances in Neural …, 2021 - proceedings.neurips.cc

Newcomblike decision problems have been studied extensively in the decision theory
literature, but they have so far been largely absent in the reinforcement learning literature. In …

Cited by 17 · Related articles · All 8 versions

[PDF] vub.be

Deep multi-agent reinforcement learning in a homogeneous open population

R Rădulescu, M Legrand, K Efthymiadis… - … , BNAIC 2018,'s …, 2019 - Springer

Advances in reinforcement learning research have recently produced agents that are
competent, or sometimes exceed human performance, in complex tasks. Most interesting …

Cited by 26 · Related articles · All 10 versions

[PDF] arxiv.org

Periodic agent-state based Q-learning for POMDPs

A Sinha, M Geist, A Mahajan - arXiv preprint arXiv:2407.06121, 2024 - arxiv.org

The standard approach for Partially Observable Markov Decision Processes (POMDPs) is to
convert them to a fully observed belief-state MDP. However, the belief state depends on the …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org

Deep reinforcement learning with modulated Hebbian plus Q-network architecture

P Ladosz, E Ben-Iwhiwhu, J Dick, N Ketz… - … on Neural Networks …, 2021 - ieeexplore.ieee.org

In this article, we consider a subclass of partially observable Markov decision process
(POMDP) problems which we termed confounding POMDPs. In these types of POMDPs …

Cited by 19 · Related articles · All 8 versions