Agent-time attention for sparse rewards multi-agent reinforcement learning

J She, JK Gupta, MJ Kochenderfer - arXiv preprint arXiv:2210.17540, 2022 - arxiv.org
Sparse and delayed rewards pose a challenge to single agent reinforcement learning. This
challenge is amplified in multi-agent reinforcement learning (MARL) where credit …

STAS: Spatial-Temporal Return Decomposition for Solving Sparse Rewards Problems in Multi-agent Reinforcement Learning

S Chen, Z Zhang, Y Yang, Y Du - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Centralized Training with Decentralized Execution (CTDE) has been proven to be an
effective paradigm in cooperative multi-agent reinforcement learning (MARL). One of the …

RevAP: A bankruptcy-based algorithm to solve the multi-agent credit assignment problem in task start threshold-based multi-agent systems

H Yarahmadi, ME Shiri, H Navidi, A Sharifi… - Robotics and …, 2024 - Elsevier
Multi-Agent Systems (MASs) are the prominent symbol of Distributed Artificial
Intelligence (DAI). Learning in MAS, which is commonly based on Reinforcement Learning …

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Y Qu, Y Jiang, B Wang, Y Mao, C Wang, C Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement learning (RL) often encounters delayed and sparse feedback in real-world
applications, even with only episodic rewards. Previous approaches have made some …

Learning Individual Potential-Based Rewards in Multi-Agent Reinforcement Learning

C Yang, P Xu, J Zhang - IEEE Transactions on Games, 2024 - ieeexplore.ieee.org
A great challenge for applying multi-agent reinforcement learning (MARL) in the field of
game AI is to enable agents to learn diversified policies to handle different game-specific …

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning

S Chen, Z Zhang, Y Yang, Y Du - arXiv preprint arXiv:2304.07520, 2023 - arxiv.org
Centralized Training with Decentralized Execution (CTDE) has been proven to be an
effective paradigm in cooperative multi-agent reinforcement learning (MARL). One of the …

Graph Q-Learning for Combinatorial Optimization

VM Dax, J Li, K Leahy, MJ Kochenderfer - arXiv preprint arXiv:2401.05610, 2024 - arxiv.org
Graph-structured data is ubiquitous throughout natural and social sciences, and Graph
Neural Networks (GNNs) have recently been shown to be effective at solving prediction and …

Agent-Temporal Credit Assignment for Optimal Policy Preservation in Sparse Multi-Agent Reinforcement Learning

A Kapoor, S Swamy, K Tessera, M Baranwal… - arXiv preprint arXiv …, 2024 - arxiv.org
In multi-agent environments, agents often struggle to learn optimal policies due to sparse or
delayed global rewards, particularly in long-horizon tasks where it is challenging to evaluate …

GOV-REK: Governed Reward Engineering Kernels for designing robust multi-agent reinforcement learning systems

A Rana, M Oesterle, J Brinkmann - arXiv preprint arXiv:2404.01131, 2024 - arxiv.org
For multi-agent reinforcement learning systems (MARLS), the problem formulation generally
involves investing massive reward engineering effort specific to a given problem. However …

Classifying ambiguous identities in hidden-role Stochastic games with multi-agent reinforcement learning

S Han, S Li, B An, W Zhao, P Liu - Autonomous Agents and Multi-Agent …, 2023 - Springer
Multi-agent reinforcement learning (MARL) is a prevalent learning paradigm for solving
stochastic games. In most MARL studies, agents in a game are defined as teammates or …