Learning a belief representation for delayed reinforcement learning

S Holt, A Hüyük, Z Qian, H Sun… - International …, 2023 - proceedings.mlr.press

Many real-world offline reinforcement learning (RL) problems involve continuous-time
environments with delays. Such environments are characterized by two distinctive features …

Cited by 10 · Related articles · All 3 versions

[PDF] arxiv.org

Boosting Long-Delayed Reinforcement Learning with Auxiliary Short-Delayed Task

Q Wu, SS Zhan, Y Wang, CW Lin, C Lv, Q Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org

Reinforcement learning is challenging in delayed scenarios, a common real-world situation
where observations and interactions occur with delays. State-of-the-art (SOTA) state …

Cited by 2 · Related articles · All 2 versions

[PDF] arxiv.org

Delays in reinforcement learning

P Liotet - arXiv preprint arXiv:2309.11096, 2023 - arxiv.org

Delays are inherent to most dynamical systems. Besides shifting the process in time, they
can significantly affect their performance. For this reason, it is usually valuable to study the …

Cited by 2 · Related articles · All 4 versions

A delay-robust method for enhanced real-time reinforcement learning

B Xia, H Sun, B Yuan, Z Li, B Liang, X Wang - Neural Networks, 2024 - Elsevier

In reinforcement learning, the Markov Decision Process (MDP) framework typically operates
under a blocking paradigm, assuming a static environment during the agent's decision …

[HTML] A pipelining task offloading strategy via delay-aware multi-agent reinforcement learning in Cybertwin-enabled 6G network

H Niu, L Wang, K Du, Z Lu, X Wen, Y Liu - Digital Communications and …, 2023 - Elsevier

Cybertwin-enabled 6th Generation (6G) network is envisioned to support artificial
intelligence-native management to meet changing demands of 6G applications. Multi-Agent …

Cited by 1 · Related articles

[PDF] arxiv.org

Variational Delayed Policy Optimization

Q Wu, SS Zhan, Y Wang, Y Wang, CW Lin, C Lv… - arXiv preprint arXiv …, 2024 - arxiv.org

In environments with delayed observation, state augmentation by including actions within
the delay window is adopted to retrieve Markovian property to enable reinforcement learning …

Cited by 2 · Related articles · All 2 versions

[PDF] arxiv.org

Cited by 1 · Related articles