Monotonic improvement guarantees under non-stationarity for decentralized PPO

Multi-agent reinforcement learning with policy clipping and average evaluation for UAV-assisted communication Markov game

Z Feng, M Huang, D Wu, EQ Wu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Unmanned aerial vehicle (UAV)-assisted communication is a significant technology in 6G
communication. In order to cope with the dynamic trajectory optimization problem of the air …

Cited by 20 · Related articles · All 3 versions

Approximating Nash equilibrium for anti-UAV jamming Markov game using a novel event-triggered multi-agent reinforcement learning

Z Feng, M Huang, Y Wu, D Wu, J Cao, I Korovin… - Neural Networks, 2023 - Elsevier

In the downlink communication, it is currently challenging for ground users to cope with the
uncertain interference from aerial intelligent jammers. The cooperation and competition …

Cited by 21 · Related articles · All 4 versions

Order matters: Agent-by-agent policy optimization

X Wang, Z Tian, Z Wan, Y Wen, J Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

While multi-agent trust region algorithms have achieved great success empirically in solving
coordination tasks, most of them, however, suffer from a non-stationarity problem since …

Cited by 22 · Related articles · All 4 versions

Less is more: Robust robot learning via partially observable multi-agent reinforcement learning

W Zhao, EA Rantala, J Pajarinen… - arXiv preprint arXiv …, 2023 - arxiv.org

In many multi-agent and high-dimensional robotic tasks, the controller can be designed in
either a centralized or decentralized way. Correspondingly, it is possible to use either single …

Cited by 6 · Related articles · All 2 versions

TiZero: Mastering multi-agent football with curriculum learning and self-play

F Lin, S Huang, T Pearce, W Chen, WW Tu - arXiv preprint arXiv …, 2023 - arxiv.org

Multi-agent football poses an unsolved challenge in AI research. Existing work has focused
on tackling simplified scenarios of the game, or else leveraging expert demonstrations. In …

Cited by 18 · Related articles · All 5 versions

Dealing with non-stationarity in decentralized cooperative multi-agent deep reinforcement learning via multi-timescale learning

H Nekoei, A Badrinaaraayanan… - Conference on …, 2023 - proceedings.mlr.press

Decentralized cooperative multi-agent deep reinforcement learning (MARL) can be a
versatile learning framework, particularly in scenarios where centralized training is either not …

Cited by 11 · Related articles · All 4 versions

Dynamic Belief for Decentralized Multi-Agent Cooperative Learning

Y Zhai, P Peng, C Su, Y Tian - IJCAI, 2023 - ijcai.org

Decentralized multi-agent cooperative learning is a practical task due to the partially
observed setting both in training and execution. Every agent learns to cooperate without …

Cited by 1 · Related articles · All 2 versions

Decentralized policy optimization

K Su, Z Lu - arXiv preprint arXiv:2211.03032, 2022 - arxiv.org

The study of decentralized learning or independent learning in cooperative multi-agent
reinforcement learning has a history of decades. Recently empirical studies show that …

Cited by 9 · Related articles · All 3 versions

Optimistic Multi-Agent Policy Gradient for Cooperative Tasks

W Zhao, Y Zhao, Z Li, J Kannala, J Pajarinen - arXiv preprint arXiv …, 2023 - arxiv.org

Relative overgeneralization (RO) occurs in cooperative multi-agent learning tasks
when agents converge towards a suboptimal joint policy due to overfitting to suboptimal …

Cited by 1 · Related articles · All 2 versions

Counterexample-Guided Policy Refinement in Multi-Agent Reinforcement Learning

B Gangopadhyay, P Dasgupta… - Proceedings of the 2023 …, 2023 - southampton.ac.uk

Single-agent Deep Reinforcement Learning (DRL) is a popular control technique where the
policy controlling agent learns to choose actions that maximize a discounted long-term …

Cited by 1 · Related articles · All 3 versions