Agent-temporal attention for reward redistribution in episodic multi-agent reinforcement learning

J She, JK Gupta, MJ Kochenderfer - arXiv preprint arXiv:2210.17540, 2022 - arxiv.org

Sparse and delayed rewards pose a challenge to single-agent reinforcement learning. This
challenge is amplified in multi-agent reinforcement learning (MARL) where credit …

Cited by 16 Related articles All 7 versions

[PDF] aaai.org

STAS: Spatial-Temporal Return Decomposition for Solving Sparse Rewards Problems in Multi-agent Reinforcement Learning

S Chen, Z Zhang, Y Yang, Y Du - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

Centralized Training with Decentralized Execution (CTDE) has been proven to be an
effective paradigm in cooperative multi-agent reinforcement learning (MARL). One of the …

Cited by 2 Related articles

[PDF] google.com

RevAP: A bankruptcy-based algorithm to solve the multi-agent credit assignment problem in task start threshold-based multi-agent systems

H Yarahmadi, ME Shiri, H Navidi, A Sharifi… - Robotics and …, 2024 - Elsevier

Multi-Agent Systems (MASs) are the prominent symbol of Distributed Artificial
Intelligence (DAI). Learning in MAS, which is commonly based on Reinforcement Learning …

Cited by 2 Related articles All 2 versions

[PDF] arxiv.org

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Y Qu, Y Jiang, B Wang, Y Mao, C Wang, C Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Reinforcement learning (RL) often encounters delayed and sparse feedback in real-world
applications, even with only episodic rewards. Previous approaches have made some …

Learning Individual Potential-Based Rewards in Multi-Agent Reinforcement Learning

C Yang, P Xu, J Zhang - IEEE Transactions on Games, 2024 - ieeexplore.ieee.org

A great challenge for applying multi-agent reinforcement learning (MARL) in the field of
game AI is to enable agents to learn diversified policies to handle different game-specific …

[PDF] arxiv.org

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning

S Chen, Z Zhang, Y Yang, Y Du - arXiv preprint arXiv:2304.07520, 2023 - arxiv.org

Centralized Training with Decentralized Execution (CTDE) has been proven to be an
effective paradigm in cooperative multi-agent reinforcement learning (MARL). One of the …

Cited by 5 Related articles All 3 versions

[PDF] arxiv.org

Graph Q-Learning for Combinatorial Optimization

VM Dax, J Li, K Leahy, MJ Kochenderfer - arXiv preprint arXiv:2401.05610, 2024 - arxiv.org

Graph-structured data is ubiquitous throughout natural and social sciences, and Graph
Neural Networks (GNNs) have recently been shown to be effective at solving prediction and …

Cited by 2 Related articles All 3 versions

[PDF] arxiv.org