A review of safe reinforcement learning: Methods, theory and applications

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Reinforcement learning (RL) has achieved tremendous success in many complex decision
making tasks. When it comes to deploying RL in the real world, safety concerns are usually …

Multi-agent reinforcement learning is a sequence modeling problem

M Wen, J Kuba, R Lin, W Zhang… - Advances in …, 2022 - proceedings.neurips.cc
Large sequence models (SM) such as GPT series and BERT have displayed outstanding
performance and generalization capabilities in natural language processing, vision and …

On Transforming Reinforcement Learning With Transformers: The Development Trajectory

S Hu, L Shen, Y Zhang, Y Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Transformers, originally devised for natural language processing (NLP), have also produced
significant successes in computer vision (CV). Due to their strong expression power …

Towards human-level bimanual dexterous manipulation with reinforcement learning

Y Chen, T Wu, S Wang, X Feng… - Advances in …, 2022 - proceedings.neurips.cc
Achieving human-level dexterity is an important open problem in robotics. However, tasks of
dexterous hand manipulation even at the baby level are challenging to solve through …

Safe multi-agent reinforcement learning for multi-robot control

S Gu, JG Kuba, Y Chen, Y Du, L Yang, A Knoll… - Artificial Intelligence, 2023 - Elsevier
A challenging problem in robotics is how to control multiple robots cooperatively and safely
in real-world applications. Yet, developing multi-robot control methods from the perspective …

Towards a standardised performance evaluation protocol for cooperative MARL

R Gorsane, O Mahjoub, RJ de Kock… - Advances in …, 2022 - proceedings.neurips.cc
Multi-agent reinforcement learning (MARL) has emerged as a useful approach to solving
decentralised decision-making problems at scale. Research in the field has been growing …

Offline pre-trained multi-agent decision transformer

L Meng, M Wen, C Le, X Li, D Xing, W Zhang… - Machine Intelligence …, 2023 - Springer
Offline reinforcement learning leverages previously collected offline datasets to learn
optimal policies with no necessity to access the real environment. Such a paradigm is also …

Multi-agent constrained policy optimisation

S Gu, JG Kuba, M Wen, R Chen, Z Wang, Z Tian… - arXiv preprint arXiv …, 2021 - arxiv.org
Developing reinforcement learning algorithms that satisfy safety constraints is becoming
increasingly important in real-world applications. In multi-agent reinforcement learning …

Heterogeneous-agent reinforcement learning

Y Zhong, JG Kuba, X Feng, S Hu, J Ji, Y Yang - Journal of Machine …, 2024 - jmlr.org
The necessity for cooperation among intelligent machines has popularised cooperative
multi-agent reinforcement learning (MARL) in AI research. However, many research endeavours …

ACE: Cooperative multi-agent Q-learning with bidirectional action-dependency

C Li, J Liu, Y Zhang, Y Wei, Y Niu, Y Yang… - Proceedings of the …, 2023 - ojs.aaai.org
Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem, which
is the ever-changing targets at every iteration when multiple agents update their policies at …