Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Learning agile soccer skills for a bipedal robot with deep reinforcement learning

T Haarnoja, B Moran, G Lever, SH Huang… - Science Robotics, 2024 - science.org
We investigated whether deep reinforcement learning (deep RL) is able to synthesize
sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be …

A survey on deep reinforcement learning

刘全, 翟建伟, 章宗长, 钟珊, 周倩, 章鹏, 徐进 - 计算机学报 (Chinese Journal of Computers), 2018 - cdn.jsdelivr.net
Reinforcement learning learns a mapping from environment states to actions that maximizes
the reward signal. In large-scale … Chinese Journal of Computers, Vol. 40, 2017, online publication No. 1 …

Grandmaster level in StarCraft II using multi-agent reinforcement learning

O Vinyals, I Babuschkin, WM Czarnecki, M Mathieu… - Nature, 2019 - nature.com
Many real-world applications require artificial agents to compete and coordinate with other
agents in complex environments. As a stepping stone to this goal, the domain of StarCraft …

Collaborating with humans without human data

DJ Strouse, K McKee, M Botvinick… - Advances in …, 2021 - proceedings.neurips.cc
Collaborating with humans requires rapidly adapting to their individual strengths,
weaknesses, and preferences. Unfortunately, most standard multi-agent reinforcement …

A survey and critique of multiagent deep reinforcement learning

P Hernandez-Leal, B Kartal, ME Taylor - Autonomous Agents and Multi …, 2019 - Springer
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

Nash learning from human feedback

R Munos, M Valko, D Calandriello, MG Azar… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) has emerged as the main paradigm
for aligning large language models (LLMs) with human preferences. Typically, RLHF …

A unified game-theoretic approach to multiagent reinforcement learning

M Lanctot, V Zambaldi, A Gruslys… - Advances in neural …, 2017 - proceedings.neurips.cc
There has been a resurgence of interest in multiagent reinforcement learning (MARL), due
partly to the recent success of deep neural networks. The simplest form of MARL is …

Ab initio quantum chemistry with neural-network wavefunctions

J Hermann, J Spencer, K Choo, A Mezzacapo… - Nature Reviews …, 2023 - nature.com
Deep learning methods outperform human capabilities in pattern recognition and data
processing problems and now have an increasingly important role in scientific discovery. A …