A survey of learning in multiagent environments: Dealing with non-stationarity
The key challenge in multiagent learning is learning a best response to the behaviour of
other agents, which may be non-stationary: if the other agents adapt their strategy as well …
Multi-objective multi-agent decision making: a utility-based analysis and survey
The majority of multi-agent system implementations aim to optimise agents' policies with
respect to a single objective, despite the fact that many real-world problem domains are …
Convergent policy optimization for safe reinforcement learning
We study the safe reinforcement learning problem with nonlinear function approximation,
where policy optimization is formulated as a constrained optimization problem with both the …
Multi-agent path finding with delay probabilities
Several recently developed Multi-Agent Path Finding (MAPF) solvers scale to large
MAPF instances by searching for MAPF plans on 2 levels: The high-level search resolves …
[BOOK][B] Multi-objective decision making
Many real-world decision problems have multiple objectives. For example, when choosing a
medical treatment plan, we want to maximize the efficacy of the treatment, but also minimize …
Constrained multiagent Markov decision processes: A taxonomy of problems and algorithms
In domains such as electric vehicle charging, smart distribution grids and autonomous
warehouses, multiple agents share the same resources. When planning the use of these …
Simultaneous task allocation and planning under uncertainty
We propose novel techniques for task allocation and planning in multi-robot systems
operating in uncertain environments. Task allocation is performed simultaneously with …
Finite-time frequentist regret bounds of multi-agent Thompson sampling on sparse hypergraphs
We study the multi-agent multi-armed bandit (MAMAB) problem, where agents are factored
into overlapping groups. Each group represents a hyperedge, forming a hypergraph over …
A prioritized planning algorithm of trajectory coordination based on time windows for multiple AGVs with delay disturbance
Purpose: In the running of multiple automated guided vehicles (AGVs) in warehouses, delay
problems in motions happen unavoidably as there might exist some disabled components of …
Multi-agent Thompson sampling for bandit applications with sparse neighbourhood structures
Multi-agent coordination is prevalent in many real-world applications. However, such
coordination is challenging due to its combinatorial nature. An important observation in this …