A Bayesian approach for learning and tracking switching, non-stationary opponents

A Wong, T Bäck, AV Kononova, A Plaat - Artificial Intelligence Review, 2023 - Springer

This paper surveys the field of deep multiagent reinforcement learning (RL). The
combination of deep neural networks with RL has gained increased traction in recent years …

Cited by 115 · Related articles · All 8 versions

[PDF] arxiv.org

A survey of learning in multiagent environments: Dealing with non-stationarity

P Hernandez-Leal, M Kaisers, T Baarslag… - arXiv preprint arXiv …, 2017 - arxiv.org

The key challenge in multiagent learning is learning a best response to the behaviour of
other agents, which may be non-stationary: if the other agents adapt their strategy as well …

Cited by 362 · Related articles · All 5 versions

[PDF] neurips.cc

A deep Bayesian policy reuse approach against non-stationary agents

Y Zheng, Z Meng, J Hao, Z Zhang… - Advances in neural …, 2018 - proceedings.neurips.cc

In multiagent domains, coping with non-stationary agents that change behaviors from time to
time is a challenging problem, where an agent is usually required to be able to quickly …

Cited by 95 · Related articles · All 6 versions

Detecting and learning against unknown opponents for automated negotiations

L Wu, S Chen, X Gao, Y Zheng, J Hao - Pacific Rim International …, 2021 - Springer

Learning in automated negotiations, while successful for many tasks in recent years, is still
hard when coping with different types of opponents with unknown strategies. It is critically …

Cited by 17 · Related articles · All 2 versions

[PDF] thecvf.com

Smart Help: Strategic Opponent Modeling for Proactive and Adaptive Robot Assistance in Households

Z Cao, Z Wang, S Xie, A Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Despite the significant demand for assistive technology among vulnerable groups (e.g., the
elderly, children, and the disabled) in daily tasks, research into advanced AI-driven assistive …

Cited by 2 · Related articles · All 4 versions

[PDF] arxiv.org

Conditional imitation learning for multi-agent games

A Shih, S Ermon, D Sadigh - 2022 17th ACM/IEEE International …, 2022 - ieeexplore.ieee.org

While advances in multi-agent learning have enabled the training of increasingly complex
agents, most existing techniques produce a final policy that is not designed to adapt to a …

Cited by 15 · Related articles · All 7 versions

[PDF] arxiv.org

Towards cooperation in sequential prisoner's dilemmas: a deep multiagent reinforcement learning approach

W Wang, J Hao, Y Wang, M Taylor - arXiv preprint arXiv:1803.00162, 2018 - arxiv.org

The Iterated Prisoner's Dilemma has guided research on social dilemmas for decades.
However, it distinguishes between only two atomic actions: cooperate and defect. In real …

Cited by 40 · Related articles · All 3 versions

[HTML] springer.com

Efficiently detecting switches against non-stationary opponents

P Hernandez-Leal, Y Zhan, ME Taylor… - Autonomous Agents and …, 2017 - Springer

Interactions in multiagent systems are generally more complicated than single agent ones.
Game theory provides solutions on how to act in multiagent scenarios; however, it assumes …

Cited by 45 · Related articles · All 4 versions

[PDF] researchcommons.org

Research progress of opponent modeling based on deep reinforcement learning

H Xu, L Qin, J Zeng, Y Hu… - Journal of …, 2023 - dc-china-simulation …

Deep reinforcement learning is an agent modeling method with both deep learning feature
extraction ability and reinforcement learning sequence decision-making ability, which can …

Cited by 3 · Related articles · All 3 versions

[PDF] neurips.cc

Learning others' intentional models in multi-agent settings using interactive POMDPs

Y Han, P Gmytrasiewicz - Advances in Neural Information …, 2018 - proceedings.neurips.cc

Interactive partially observable Markov decision processes (I-POMDPs) provide a principled
framework for planning and acting in a partially observable, stochastic and multi-agent …

Cited by 38 · Related articles · All 8 versions