A framework for learning and planning against switching strategies in repeated games

P Hernandez-Leal, M Kaisers, T Baarslag… - arXiv preprint arXiv …, 2017 - arxiv.org

The key challenge in multiagent learning is learning a best response to the behaviour of
other agents, which may be non-stationary: if the other agents adapt their strategy as well …

Cited by 359

Opponent modeling framework for multi-agent game confrontation (面向多智能体博弈对抗的对手建模框架)

J Luo, W Zhang, W Yuan, Z Hu, S Chen… - Journal of System Simulation (系统仿真学报), 2022 - china-simulation.com

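Opponent modeling, a key technology in multi-agent game confrontation, is a typical method for modeling agents' cognitive behavior.
This paper introduces several typical models of multi-agent game confrontation, the non-stationarity problem, and meta-game theory, and surveys opponent modeling methods …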

Cited by 8

Efficiently detecting switches against non-stationary opponents

P Hernandez-Leal, Y Zhan, ME Taylor… - Autonomous Agents and …, 2017 - Springer

Interactions in multiagent systems are generally more complicated than single agent ones.
Game theory provides solutions on how to act in multiagent scenarios; however, it assumes …

Cited by 45

Research on opponent modeling framework for multi-agent game confrontation

J Luo, W Zhang, W Yuan, Z Hu… - Journal of …, 2022 - dc-china-simulation …

As the key technology of multi-agent game confrontation, opponent modeling is a typical
cognitive modeling method of agent's behavior. Several typical models of multi-agent game …

Cited by 11

Efficient policy detecting and reusing for non-stationarity in markov games

Y Zheng, J Hao, Z Zhang, Z Meng, T Yang, Y Li… - Autonomous Agents and …, 2021 - Springer

One challenging problem in multiagent systems is to cooperate or compete with non-
stationary agents that change behavior from time to time. An agent in such a non-stationary …

Cited by 19

Towards a fast detection of opponents in repeated stochastic games

P Hernandez-Leal, M Kaisers - … 2017 Workshops, Best Papers, São Paulo …, 2017 - Springer

Multi-agent algorithms aim to find the best response in strategic interactions. While many
state-of-the-art algorithms assume repeated interaction with a fixed set of opponents (or …

Cited by 35

[PDF] Identifying and tracking switching, non-stationary opponents: A Bayesian approach

P Hernandez-Leal, ME Taylor, BS Rosman, LE Sucar… - 2016 - cdn.aaai.org

In many situations, agents are required to use a set of strategies (behaviors) and switch
among them during the course of an interaction. This work focuses on the problem of …

Cited by 39

[HTML] An online learning algorithm to play discounted repeated games in wireless networks

J Parras, PA Apellániz, S Zazo - Engineering Applications of Artificial …, 2022 - Elsevier

Discounted repeated games are currently being used to model the conflicts that arise
between the nodes in a wireless network, such as distributed resource allocation …

Cited by 6

An exploration strategy for non-stationary opponents

P Hernandez-Leal, Y Zhan, ME Taylor… - Autonomous Agents and …, 2017 - Springer

The success or failure of any learning algorithm is partially due to the exploration strategy it
exerts. However, most exploration strategies assume that the environment is stationary and …

Cited by 23

Learning adversarial policy in multiple scenes environment via multi-agent reinforcement learning

Y Li, X Wang, W Wang, Z Zhang, J Wang… - Connection …, 2021 - Taylor & Francis

Learning adversarial policy aims to learn behavioural strategies for agents with different
goals, is one of the most significant tasks in multi-agent systems. Multi-agent reinforcement …

Cited by 9