11 research outputs found
Learning to Resolve Conflicts for Multi-Agent Path Finding with Conflict-Based Search
Conflict-Based Search (CBS) is a state-of-the-art algorithm for multi-agent
path finding. At the high level, CBS repeatedly detects conflicts and resolves
one of them by splitting the current problem into two subproblems. Previous
work chooses the conflict to resolve by categorizing the conflict into three
classes and always picking a conflict from the highest-priority class. In this
work, we propose an oracle for conflict selection that results in smaller
search tree sizes than the one used in previous work. However, the computation
of the oracle is slow. Thus, we propose a machine-learning framework for
conflict selection that observes the decisions made by the oracle and learns a
conflict-selection strategy represented by a linear ranking function that
imitates the oracle's decisions accurately and quickly. Experiments on
benchmark maps indicate that our method significantly improves the success
rates, search tree sizes, and runtimes over the current state-of-the-art CBS
solver.
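The conflict-selection idea above can be sketched in a few lines: score each detected conflict with a learned linear function and resolve the highest-ranked one. The features and weights below are illustrative stand-ins, not the paper's actual learned model.

```python
# Hypothetical sketch of conflict selection with a linear ranking function
# that imitates a slow oracle. Feature choices and weights are illustrative.

def rank_conflicts(conflicts, weights):
    """Score each conflict linearly and return the one to resolve first.

    `conflicts` is a list of per-conflict feature vectors; the conflict
    with the highest score is split on next, imitating the oracle.
    """
    def score(features):
        return sum(w * x for w, x in zip(weights, features))
    return max(conflicts, key=score)

# Example: conflicts described by (is_cardinal, num_agents_involved,
# time_step) features; weights would be learned from oracle decisions.
weights = [2.0, 0.5, -0.1]          # illustrative, not learned values
conflicts = [
    [0, 2, 10],   # non-cardinal, late
    [1, 2, 5],    # cardinal conflict: ranks highest under these weights
    [0, 3, 1],
]
best = rank_conflicts(conflicts, weights)
```

At search time only one dot product per conflict is needed, which is why a linear imitator can match the oracle's choices far faster than recomputing the oracle itself.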
Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search
Anytime multi-agent path finding (MAPF) is a promising approach to scalable
path optimization in large-scale multi-agent systems. State-of-the-art anytime
MAPF is based on Large Neighborhood Search (LNS), where a fast initial solution
is iteratively optimized by destroying and repairing a fixed number of parts,
i.e., the neighborhood, of the solution, using randomized destroy heuristics
and prioritized planning. Despite their recent success in various MAPF
instances, current LNS-based approaches lack exploration and flexibility due to
greedy optimization with a fixed neighborhood size, which can lead to
low-quality solutions in general. So far, these limitations have been addressed
with extensive prior effort in tuning or offline machine learning beyond actual
planning. In this paper, we focus on online learning in LNS and propose
Bandit-based Adaptive LArge Neighborhood search Combined with Exploration
(BALANCE). BALANCE uses a bi-level multi-armed bandit scheme to adapt the
selection of destroy heuristics and neighborhood sizes on the fly during
search. We evaluate BALANCE on multiple maps from the MAPF benchmark set and
empirically demonstrate cost improvements of at least 50% compared to
state-of-the-art anytime MAPF in large-scale scenarios. We find that Thompson
Sampling performs particularly well compared to alternative multi-armed bandit
algorithms.
Comment: Accepted to AAAI 202
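The bandit scheme described above can be illustrated with Thompson Sampling over destroy heuristics. This is a minimal single-level Bernoulli-bandit sketch, not BALANCE's bi-level scheme, and the reward definition (did the destroy-repair step improve the cost?) is a simplifying assumption.

```python
import random

# Minimal Thompson Sampling sketch for choosing a destroy heuristic in
# anytime LNS. Each arm keeps a Beta posterior over its success rate.

class ThompsonSampler:
    def __init__(self, n_arms):
        # Beta(1, 1) prior for each arm (destroy heuristic).
        self.successes = [1] * n_arms
        self.failures = [1] * n_arms

    def select(self):
        # Sample a success rate per arm and play the arm with the best draw.
        samples = [random.betavariate(s, f)
                   for s, f in zip(self.successes, self.failures)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, improved):
        # Reward 1 if the destroy-repair step improved the solution cost.
        if improved:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

# Toy run: heuristic 1 improves the solution 80% of the time, heuristic 0
# only 10%; the sampler should concentrate its pulls on heuristic 1.
random.seed(0)
bandit = ThompsonSampler(n_arms=2)
for _ in range(500):
    arm = bandit.select()
    improved = random.random() < (0.8 if arm == 1 else 0.1)
    bandit.update(arm, improved)
```

Because arms are chosen by posterior sampling rather than greedily, occasional exploratory pulls of weaker heuristics continue throughout the search, which is the exploration behavior the abstract argues fixed-choice LNS lacks.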
Local Branching Relaxation Heuristics for Integer Linear Programs
Large Neighborhood Search (LNS) is a popular heuristic algorithm for solving
combinatorial optimization problems (COP). It starts with an initial solution
to the problem and iteratively improves it by searching a large neighborhood
around the current best solution. LNS relies on heuristics to select
neighborhoods to search in. In this paper, we focus on designing effective and
efficient heuristics in LNS for integer linear programs (ILP) since a wide
range of COPs can be represented as ILPs. Local Branching (LB) is a heuristic
that selects the neighborhood that leads to the largest improvement over the
current solution in each iteration of LNS. LB is often slow since it needs to
solve an ILP of the same size as the input. Our proposed heuristics, LB-RELAX and
its variants, use the linear programming relaxation of LB to select
neighborhoods. Empirically, LB-RELAX and its variants compute as effective
neighborhoods as LB but run faster. They achieve state-of-the-art anytime
performance on several ILP benchmarks.
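The selection step behind LB-RELAX can be sketched as follows: given the incumbent integer solution and a solution of the LP relaxation of the Local Branching problem, destroy the variables whose relaxed values move furthest from the incumbent. The LP solve itself is assumed to happen elsewhere; the vectors below are illustrative.

```python
# Sketch of LP-relaxation-guided neighborhood selection in LNS for ILPs.
# Rather than solving the expensive Local Branching ILP, rank variables by
# how far the LP-relaxed solution moves them away from the incumbent.

def select_neighborhood(incumbent, lp_relaxed, k):
    """Return indices of the k variables to 'destroy' (re-optimize)."""
    deltas = [abs(x_lp - x_int)
              for x_int, x_lp in zip(incumbent, lp_relaxed)]
    order = sorted(range(len(deltas)), key=deltas.__getitem__, reverse=True)
    return sorted(order[:k])

incumbent  = [1, 0, 0, 1, 0]
lp_relaxed = [1.0, 0.9, 0.1, 0.2, 0.0]   # illustrative LP values
destroy = select_neighborhood(incumbent, lp_relaxed, k=2)
```

Variables 1 and 3 are selected here because their relaxed values (0.9 and 0.2) disagree most with the incumbent (0 and 1); the repair step would then re-optimize just those variables with the rest fixed.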
Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning
Integer Linear Programs (ILPs) are powerful tools for modeling and solving a
large number of combinatorial optimization problems. Recently, it has been
shown that Large Neighborhood Search (LNS), as a heuristic algorithm, can find
high quality solutions to ILPs faster than Branch and Bound. However, how to
find the right heuristics to maximize the performance of LNS remains an open
problem. In this paper, we propose a novel approach, CL-LNS, that delivers
state-of-the-art anytime performance on several ILP benchmarks measured by
metrics including the primal gap, the primal integral, survival rates and the
best performing rate. Specifically, CL-LNS collects positive and negative
solution samples from an expert heuristic that is slow to compute and learns a
new one with a contrastive loss. We use graph attention networks and a richer
set of features to further improve its performance.
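The contrastive training signal described above can be illustrated with an InfoNCE-style loss: the expert's (positive) neighborhood should score above sampled negatives. Scores are plain scalars here for brevity, standing in for the outputs of the paper's graph attention network.

```python
import math

# Sketch of a contrastive loss over neighborhood scores: push the expert's
# positive neighborhood above negative samples in softmax probability.

def contrastive_loss(pos_score, neg_scores):
    """InfoNCE-style loss: -log softmax probability of the positive."""
    scores = [pos_score] + list(neg_scores)
    z = sum(math.exp(s) for s in scores)
    return -math.log(math.exp(pos_score) / z)

# A positive that outscores its negatives yields a small loss; a positive
# buried under the negatives yields a large one.
good = contrastive_loss(5.0, [1.0, 0.5])
bad = contrastive_loss(0.5, [5.0, 4.0])
```

Minimizing this loss over many (positive, negatives) samples trains the scorer to rank expert-like neighborhoods highest, after which the slow expert is no longer needed at solve time.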
Landscape Surrogate: Learning Decision Losses for Mathematical Optimization Under Partial Information
Recent works in learning-integrated optimization have shown promise in
settings where the optimization problem is only partially observed or where
general-purpose optimizers perform poorly without expert tuning. By learning an
optimizer g to tackle these challenging problems with f as the objective, the
optimization process can be substantially accelerated by leveraging past
experience. The optimizer can be trained with supervision from known optimal
solutions or implicitly by optimizing the compound function f∘g. The implicit
approach may not require optimal solutions as labels and is capable of handling
problem uncertainty; however, it is slow to train and deploy due to frequent
calls to the optimizer g during both training and testing. The training is
further challenged by sparse gradients of g, especially for combinatorial
solvers. To address these challenges, we propose using a smooth and learnable
Landscape Surrogate M as a replacement for f∘g. This surrogate, learnable by
neural networks, can be computed faster than the solver g, provides dense and
smooth gradients during training, can generalize to unseen optimization
problems, and is efficiently learned via alternating optimization. We test our
approach on both synthetic problems, including shortest path and
multidimensional knapsack, and real-world problems such as portfolio
optimization, achieving comparable or superior objective values compared to
state-of-the-art baselines while reducing the number of calls to g. Notably,
our approach outperforms existing methods for computationally expensive
high-dimensional problems.
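The alternating-optimization idea can be illustrated with a toy one-dimensional example: a few expensive evaluations of the true landscape f∘g are used to fit a smooth surrogate M, whose cheap dense gradients then drive many optimization steps. Everything below (the quadratic landscape, the surrogate family, the fitting rule) is a deliberately simple stand-in for the paper's neural surrogate.

```python
# Toy sketch of the landscape-surrogate loop: alternate between fitting a
# smooth surrogate M on a few expensive f(g(c)) probes and optimizing c
# through M's cheap gradients.

def f_of_g(c):
    # Stand-in for an expensive solver evaluation; true landscape (c-3)^2.
    return (c - 3.0) ** 2

def fit_surrogate(points, values):
    # Fit M(c) = (c - m)^2 with m at the best sampled point (a stand-in
    # for neural-network training of the surrogate).
    m = min(zip(values, points))[1]
    return lambda c: (c - m) ** 2

def optimize_through(M, c, lr=0.1, steps=50):
    # Gradient descent on the smooth surrogate via finite differences;
    # no expensive solver calls are made here.
    for _ in range(steps):
        grad = (M(c + 1e-4) - M(c - 1e-4)) / 2e-4
        c -= lr * grad
    return c

c = 0.0
for _ in range(3):                         # alternating optimization
    points = [c - 0.5, c, c + 0.5]         # few expensive f∘g probes
    values = [f_of_g(p) for p in points]
    M = fit_surrogate(points, values)
    c = optimize_through(M, c)             # many cheap surrogate steps
```

Each round uses only three expensive landscape evaluations but fifty cheap surrogate steps, mirroring the abstract's point that the surrogate reduces calls to g while still supplying dense, smooth gradients.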
Anytime Multi-Agent Path Finding via Machine Learning-Guided Large Neighborhood Search
Multi-Agent Path Finding (MAPF) is the problem of finding a set of collision-free paths for a team of agents in a common environment. MAPF is NP-hard to solve optimally and, in some cases, also bounded-suboptimally. It is thus time-consuming for (bounded-sub)optimal solvers to solve large MAPF instances. Anytime algorithms find solutions quickly for large instances and then improve them to close-to-optimal ones over time. In this paper, we improve the current state-of-the-art anytime solver MAPF-LNS, which first finds an initial solution fast and then repeatedly replans the paths of subsets of agents via Large Neighborhood Search (LNS). It generates the subsets of agents for replanning with randomized destroy heuristics, but not all of them increase the solution quality substantially. We propose to use machine learning to learn how to select a subset of agents from a collection of subsets such that replanning it increases the solution quality more. We show experimentally that our solver, MAPF-ML-LNS, significantly outperforms MAPF-LNS on the standard MAPF benchmark set in terms of both the speed of improving the solution and the final solution quality.
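One iteration of the ML-guided LNS loop above can be sketched as: generate candidate agent subsets with destroy heuristics, score them with a learned model, replan only the top-scoring subset, and keep the result if it lowers the sum-of-costs. The scorer and replanner below are toy stand-ins for the paper's learned model and MAPF replanner.

```python
import random

# Hypothetical sketch of one MAPF-ML-LNS iteration: a learned scorer picks
# which candidate agent subset to replan.

def lns_step(current_cost, candidate_subsets, scorer, replan):
    """Replan the subset the model ranks highest; accept only improvements."""
    best_subset = max(candidate_subsets, key=scorer)
    new_cost = replan(best_subset)
    return min(current_cost, new_cost), best_subset

# Toy instance: candidate subsets come from randomized destroy heuristics;
# here the stand-in scorer prefers larger subsets and the stand-in
# replanner saves one cost unit per replanned agent.
random.seed(1)
cost = 100
subsets = [frozenset(random.sample(range(10), k)) for k in (2, 4, 3)]
scorer = len                      # stand-in for the learned ranking model
replan = lambda s: 100 - len(s)   # stand-in single-subset replanner
cost, chosen = lns_step(cost, subsets, scorer, replan)
```

The key point the abstract makes is that only the selected subset is replanned, so a scorer that reliably picks high-payoff subsets raises solution quality per unit of replanning time.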
Synthesizing Priority Planning Formulae for Multi-Agent Pathfinding
Prioritized planning is a popular approach to multi-agent pathfinding. It prioritizes the agents and then repeatedly invokes a single-agent pathfinding algorithm for each agent such that it avoids the paths of higher-priority agents. The performance of prioritized planning depends critically on cleverly ordering the agents. Such an ordering is provided by a priority function. Recent work successfully used machine learning to automatically produce such a priority function, given good orderings as the training data. In this paper, we explore a different technique for synthesizing priority functions, namely program synthesis in the space of arithmetic formulae. We synthesize priority functions expressed as arithmetic formulae over a set of meaningful problem features via a genetic search in the space induced by a context-free grammar. Furthermore, we regularize the fitness function by formula length to synthesize short, human-readable formulae. Such readability is an advantage over previous numeric machine-learning methods and may help explain the importance of features and how to combine them into a good priority function for a given domain. Moreover, our experimental results show that our formula-based priority functions outperform existing machine-learning methods on the standard benchmarks in terms of success rate, runtime, and solution quality without using more training data.
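The synthesis loop above can be illustrated by sampling formulae from a tiny arithmetic grammar and scoring them with a length-regularized fitness. The grammar, features, and target ordering below are illustrative, and random sampling stands in for a full genetic search.

```python
import random

# Toy sketch of synthesizing a priority formula over a tiny grammar, with
# fitness regularized by formula length to favor short, readable formulae.

FEATURES = ["path_len", "num_conflicts"]

def random_formula(depth=2):
    # Sample from a small context-free grammar: a feature, or a binary op.
    if depth == 0 or random.random() < 0.4:
        return random.choice(FEATURES)
    op = random.choice(["+", "*"])
    return f"({random_formula(depth - 1)} {op} {random_formula(depth - 1)})"

def fitness(formula, agents, target_order, length_penalty=0.01):
    # Higher is better: agreement with a known-good ordering, minus a
    # penalty on formula length (the regularizer described above).
    scores = [eval(formula, {}, agent) for agent in agents]
    order = sorted(range(len(agents)), key=scores.__getitem__, reverse=True)
    agreement = sum(o == t for o, t in zip(order, target_order))
    return agreement - length_penalty * len(formula)

agents = [{"path_len": 9, "num_conflicts": 4},
          {"path_len": 3, "num_conflicts": 1}]
target_order = [0, 1]   # agent 0 should receive higher priority
random.seed(2)
best = max((random_formula() for _ in range(50)),
           key=lambda f: fitness(f, agents, target_order))
```

Because the length penalty breaks ties among equally accurate formulae, the search drifts toward the shortest formula that reproduces the target ordering, which is exactly the readability property the abstract emphasizes.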
Learning a Priority Ordering for Prioritized Planning in Multi-Agent Path Finding
Prioritized Planning (PP) is a fast and popular framework for solving Multi-Agent Path Finding, but its solution quality depends heavily on the predetermined priority ordering of the agents. Current PP algorithms use either greedy policies or random assignments to determine a total priority ordering, but none of them dominates the others in terms of the success rate and solution quality (measured by the sum-of-costs). We propose a machine-learning (ML) framework to learn a good priority ordering for PP. We develop two models, namely ML-T, which is trained on a total priority ordering, and ML-P, which is trained on a partial priority ordering. We propose to boost the effectiveness of PP by further applying stochastic ranking and random restarts. The results show that our ML-guided PP algorithms outperform the existing PP algorithms in success rate, runtime, and solution quality on small maps in most cases and are competitive with them on large maps, despite the difficulty of collecting training data on these maps.
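The stochastic-ranking and random-restart boosters mentioned above can be sketched as: perturb the learned priority scores with noise, run PP once per perturbed ordering, and keep the cheapest plan. The scoring model and the planner below are hypothetical stand-ins.

```python
import random

# Sketch of ML-guided PP with stochastic ranking and random restarts:
# noisy versions of learned priority scores induce different orderings,
# and the best full PP run among the restarts is kept.

def stochastic_order(scores, noise=0.1, rng=random):
    # Perturb each agent's learned priority score, then sort descending.
    noisy = [s + rng.gauss(0.0, noise) for s in scores]
    return sorted(range(len(scores)), key=noisy.__getitem__, reverse=True)

def pp_with_restarts(scores, plan_cost, restarts=10, rng=None):
    """Run prioritized planning under several stochastic orderings."""
    rng = rng or random.Random(0)
    best_order, best_cost = None, float("inf")
    for _ in range(restarts):
        order = stochastic_order(scores, rng=rng)
        cost = plan_cost(order)          # stand-in for one full PP run
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost

# Toy instance: orderings that plan agent 2 first happen to be cheaper.
scores = [0.2, 0.5, 0.4]                 # stand-in learned priority scores
plan_cost = lambda order: 10 if order[0] == 2 else 12
order, cost = pp_with_restarts(scores, plan_cost)
```

The noise lets the restarts explore orderings near, but not identical to, the model's ranking, so a slightly misranked agent can still end up first in some restart.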