Settling the variance of multi-agent policy gradients

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Reinforcement learning (RL) has achieved tremendous success in many complex decision
making tasks. When it comes to deploying RL in the real world, safety concerns are usually …

Cited by 202 Related articles All 2 versions

[PDF] neurips.cc

Multi-agent reinforcement learning is a sequence modeling problem

M Wen, J Kuba, R Lin, W Zhang… - Advances in …, 2022 - proceedings.neurips.cc

Large sequence models (SM) such as GPT series and BERT have displayed outstanding
performance and generalization capabilities in natural language processing, vision and …

Cited by 133 Related articles All 7 versions

[PDF] arxiv.org

On Transforming Reinforcement Learning With Transformers: The Development Trajectory

S Hu, L Shen, Y Zhang, Y Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Transformers, originally devised for natural language processing (NLP), have also produced
significant successes in computer vision (CV). Due to their strong expression power …

Cited by 14 Related articles All 5 versions

[PDF] neurips.cc

Towards human-level bimanual dexterous manipulation with reinforcement learning

Y Chen, T Wu, S Wang, X Feng… - Advances in …, 2022 - proceedings.neurips.cc

Achieving human-level dexterity is an important open problem in robotics. However, tasks of
dexterous hand manipulation even at the baby level are challenging to solve through …

Cited by 71 Related articles All 5 versions

[PDF] google.com

Safe multi-agent reinforcement learning for multi-robot control

S Gu, JG Kuba, Y Chen, Y Du, L Yang, A Knoll… - Artificial Intelligence, 2023 - Elsevier

A challenging problem in robotics is how to control multiple robots cooperatively and safely
in real-world applications. Yet, developing multi-robot control methods from the perspective …

Cited by 35 Related articles All 6 versions

[PDF] neurips.cc

Towards a standardised performance evaluation protocol for cooperative MARL

R Gorsane, O Mahjoub, RJ de Kock… - Advances in …, 2022 - proceedings.neurips.cc

Multi-agent reinforcement learning (MARL) has emerged as a useful approach to solving
decentralised decision-making problems at scale. Research in the field has been growing …

Cited by 32 Related articles All 5 versions

[PDF] springer.com

Offline pre-trained multi-agent decision transformer

L Meng, M Wen, C Le, X Li, D Xing, W Zhang… - Machine Intelligence …, 2023 - Springer

Offline reinforcement learning leverages previously collected offline datasets to learn
optimal policies with no necessity to access the real environment. Such a paradigm is also …

Cited by 59 Related articles All 8 versions

[PDF] arxiv.org

Multi-agent constrained policy optimisation

S Gu, JG Kuba, M Wen, R Chen, Z Wang, Z Tian… - arXiv preprint arXiv …, 2021 - arxiv.org

Developing reinforcement learning algorithms that satisfy safety constraints is becoming
increasingly important in real-world applications. In multi-agent reinforcement learning …

Cited by 51 Related articles All 4 versions

[PDF] jmlr.org

[PDF] Heterogeneous-agent reinforcement learning

Y Zhong, JG Kuba, X Feng, S Hu, J Ji, Y Yang - Journal of Machine …, 2024 - jmlr.org

The necessity for cooperation among intelligent machines has popularised cooperative
multi-agent reinforcement learning (MARL) in AI research. However, many research endeavours …

Cited by 13 Related articles All 3 versions

[PDF] aaai.org

ACE: Cooperative multi-agent Q-learning with bidirectional action-dependency

C Li, J Liu, Y Zhang, Y Wei, Y Niu, Y Yang… - Proceedings of the …, 2023 - ojs.aaai.org

Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem, which
is the ever-changing targets at every iteration when multiple agents update their policies at …

Cited by 16 Related articles All 4 versions