Solving transition-independent multi-agent MDPs with sparse interactions

P Hernandez-Leal, M Kaisers, T Baarslag… - arXiv preprint arXiv …, 2017 - arxiv.org

The key challenge in multiagent learning is learning a best response to the behaviour of
other agents, which may be non-stationary: if the other agents adapt their strategy as well …

Cited by 362 Related articles All 5 versions

[PDF] arxiv.org

Multi-objective multi-agent decision making: a utility-based analysis and survey

R Rădulescu, P Mannion, DM Roijers… - Autonomous Agents and …, 2020 - Springer

The majority of multi-agent system implementations aim to optimise agents' policies with
respect to a single objective, despite the fact that many real-world problem domains are …

Cited by 166 Related articles All 18 versions

[PDF] neurips.cc

Convergent policy optimization for safe reinforcement learning

M Yu, Z Yang, M Kolar, Z Wang - Advances in Neural …, 2019 - proceedings.neurips.cc

We study the safe reinforcement learning problem with nonlinear function approximation,
where policy optimization is formulated as a constrained optimization problem with both the …

Cited by 133 Related articles All 8 versions

[PDF] aaai.org

Multi-agent path finding with delay probabilities

H Ma, TKS Kumar, S Koenig - Proceedings of the AAAI Conference on …, 2017 - ojs.aaai.org

Several recently developed Multi-Agent Path Finding (MAPF) solvers scale to large
MAPF instances by searching for MAPF plans on 2 levels: The high-level search resolves …

Cited by 149 Related articles All 15 versions

[PDF] ox.ac.uk

[BOOK][B] Multi-objective decision making

DM Roijers, S Whiteson, R Brachman, P Stone - 2017 - Springer

Many real-world decision problems have multiple objectives. For example, when choosing a
medical treatment plan, we want to maximize the efficacy of the treatment, but also minimize …

Cited by 114 Related articles All 10 versions

[PDF] jair.org

Constrained multiagent Markov decision processes: A taxonomy of problems and algorithms

F De Nijs, E Walraven, M De Weerdt, M Spaan - Journal of Artificial …, 2021 - jair.org

In domains such as electric vehicle charging, smart distribution grids and autonomous
warehouses, multiple agents share the same resources. When planning the use of these …

Cited by 38 Related articles All 12 versions

[PDF] arxiv.org

Simultaneous task allocation and planning under uncertainty

F Faruq, D Parker, B Lacerda… - 2018 IEEE/RSJ …, 2018 - ieeexplore.ieee.org

We propose novel techniques for task allocation and planning in multi-robot systems
operating in uncertain environments. Task allocation is performed simultaneously with …

Cited by 69 Related articles All 18 versions

[PDF] aaai.org

Finite-time frequentist regret bounds of multi-agent Thompson sampling on sparse hypergraphs

T Jin, HL Hsu, W Chang, P Xu - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

We study the multi-agent multi-armed bandit (MAMAB) problem, where agents are factored
into overlapping groups. Each group represents a hyperedge, forming a hypergraph over …

Cited by 3 Related articles All 4 versions

A prioritized planning algorithm of trajectory coordination based on time windows for multiple AGVs with delay disturbance

R Tai, J Wang, W Chen - Assembly Automation, 2019 - emerald.com

Purpose: In the running of multiple automated guided vehicles (AGVs) in warehouses, delay
problems in motions happen unavoidably as there might exist some disabled components of …

Cited by 41 Related articles All 3 versions

[PDF] nature.com

Multi-agent Thompson sampling for bandit applications with sparse neighbourhood structures

T Verstraeten, E Bargiacchi, PJK Libin, J Helsen… - Scientific reports, 2020 - nature.com

Multi-agent coordination is prevalent in many real-world applications. However, such
coordination is challenging due to its combinatorial nature. An important observation in this …

Cited by 29 Related articles All 9 versions