Scalable High-Quality Graph and Hypergraph Partitioning
The balanced hypergraph partitioning problem (HGP) asks for a partition of the node set
of a hypergraph into blocks of roughly equal size, such that an objective function defined
on the hyperedges is minimized. In this work, we optimize the connectivity metric,
which is the most prominent objective function for HGP.
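As a small illustration of the objective, the connectivity metric is commonly defined as the sum, over all hyperedges e, of w(e) · (λ(e) − 1), where λ(e) is the number of blocks that e touches. The sketch below computes it for a toy hypergraph; the function name and the list-based input layout are assumptions made for illustration and are not taken from the work itself.

```python
def connectivity_metric(hyperedges, block_of, weights=None):
    """Sum of w(e) * (lambda(e) - 1) over all hyperedges.

    hyperedges: list of pin lists (node ids), illustrative layout
    block_of:   block_of[v] is the block id assigned to node v
    weights:    optional list of hyperedge weights (defaults to 1 each)
    """
    total = 0
    for i, pins in enumerate(hyperedges):
        # lambda(e): number of distinct blocks spanned by this hyperedge
        spanned_blocks = {block_of[v] for v in pins}
        w = 1 if weights is None else weights[i]
        total += w * (len(spanned_blocks) - 1)
    return total


# Toy instance: six nodes split into two blocks of three nodes each.
hyperedges = [[0, 1, 2], [2, 3], [3, 4, 5], [0, 5]]
block_of = [0, 0, 0, 1, 1, 1]
print(connectivity_metric(hyperedges, block_of))
```

A hyperedge that lies entirely inside one block contributes nothing, so only the two hyperedges crossing the block boundary count here.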
The hypergraph partitioning problem is NP-hard and there exists no constant factor approximation.
Thus, heuristic algorithms are used in practice with the multilevel scheme as
the most successful approach to solve the problem: First, the input hypergraph is coarsened to
obtain a hierarchy of successively smaller and structurally similar approximations.
The smallest hypergraph is then initially partitioned into blocks, and subsequently,
the contractions are reverted level-by-level, and, on each level, local search algorithms are used
to improve the partition (refinement phase).
In recent years, several new techniques were developed for sequential multilevel partitioning
that substantially improved solution quality at the cost of an increased running time.
These developments divide the landscape of existing partitioning algorithms into systems that aim for either
speed or high solution quality, with the former often being more than an order of magnitude faster
than the latter. Due to the high running times of the best sequential algorithms, it is currently not
feasible to partition the largest real-world hypergraphs with the highest possible quality.
Thus, it becomes increasingly important to parallelize the techniques used in these algorithms.
However, existing state-of-the-art parallel partitioners currently do not achieve the same solution
quality as their sequential counterparts because they use comparatively weak components that are easier to parallelize.
Moreover, there has been a recent trend toward simpler methods for partitioning large hypergraphs
that even omit the multilevel scheme.
In contrast to this development, we present two shared-memory multilevel hypergraph partitioners
with parallel implementations of techniques used by the highest-quality sequential systems.
Our first multilevel algorithm uses a parallel clustering-based coarsening scheme,
which uses substantially fewer locking mechanisms than previous approaches.
The contraction decisions are guided by the community structure of the input hypergraph
obtained via a parallel community detection algorithm.
For initial partitioning, we implement parallel multilevel recursive bipartitioning with a
novel work-stealing approach and a portfolio of initial bipartitioning techniques to
compute an initial solution. In the refinement phase, we use three different parallel improvement
algorithms: label propagation refinement, a highly-localized direct k-way
FM algorithm, and a novel parallelization of flow-based refinement.
These algorithms build on our highly-engineered partition data structure, for which we propose
several novel techniques to compute accurate gain values of node moves in the parallel setting.
Our second multilevel algorithm parallelizes the n-level partitioning scheme used in
the highest-quality sequential partitioner KaHyPar. Here, only a single node
is contracted on each level, leading to a hierarchy with approximately n levels, where
n is the number of nodes. Correspondingly, in each refinement step, only a single node is uncontracted, allowing
a highly-localized search for improvements.
We show that this approach, which seems inherently sequential, can be parallelized efficiently without compromises in solution quality.
To this end, we design a forest-based representation of contractions from which we derive a feasible parallel
schedule of the contraction operations that we apply on a novel dynamic hypergraph data structure on-the-fly.
In the uncoarsening phase, we decompose the contraction forest into batches, each containing
a fixed number of nodes. We then uncontract each batch in parallel and use highly-localized
versions of our refinement algorithms to improve the partition around the uncontracted nodes.
We further show that existing sequential partitioning algorithms considerably struggle to find balanced partitions
for weighted real-world hypergraphs. To address this, we present a technique that enables partitioners based on recursive
bipartitioning to reliably compute balanced solutions. The idea is to preassign a small portion of the
heaviest nodes to one of the two blocks of each bipartition and optimize the objective function on the
remaining nodes. We integrated the approach into the sequential hypergraph partitioner KaHyPar
and show that our new approach can compute balanced solutions for all tested instances without negatively affecting the solution
quality and running time of KaHyPar.
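The preassignment idea can be sketched roughly as follows. The abstract does not say how the heaviest nodes are distributed, so this example assumes a simple greedy rule (each heavy node goes to the currently lighter block); the function name `preassign_heaviest` and the `fraction` parameter are hypothetical and not part of KaHyPar.

```python
def preassign_heaviest(node_weights, fraction=0.1):
    """Fix the heaviest nodes to one of two blocks before bipartitioning.

    Assumption: heavy nodes are placed greedily into the lighter block.
    Returns (fixed, free, block_weight), where `fixed` maps node -> block,
    `free` lists the nodes left for the partitioner to optimize, and
    `block_weight` holds the weight already placed in each block.
    """
    n = len(node_weights)
    num_fixed = max(1, int(n * fraction)) if n else 0
    # Nodes ordered from heaviest to lightest.
    order = sorted(range(n), key=lambda v: node_weights[v], reverse=True)

    block_weight = [0, 0]
    fixed = {}
    for v in order[:num_fixed]:
        # Put the next heavy node into whichever block is lighter so far.
        b = 0 if block_weight[0] <= block_weight[1] else 1
        fixed[v] = b
        block_weight[b] += node_weights[v]

    free = order[num_fixed:]
    return fixed, free, block_weight


weights = [50, 1, 1, 40, 1, 1, 1, 1, 1, 3]
fixed, free, block_weight = preassign_heaviest(weights, fraction=0.2)
print(fixed, block_weight)
```

In this toy instance the two heavy nodes end up in different blocks, so the remaining light nodes can be placed according to the objective without a single heavy node making balance unattainable.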
In our experimental evaluation, we compare our new shared-memory (hyper)graph partitioner Mt-KaHyPar
to different graph and hypergraph partitioners on over (hyper)graphs with up to two billion edges/pins.
The results indicate that already our fastest configuration outperforms almost all existing
hypergraph partitioners with regard to both solution quality and running time. Our highest-quality configuration
(n-level with flow-based refinement) achieves the same solution quality as the currently best
sequential partitioner KaHyPar, while being almost an order of magnitude faster with ten threads.
In addition, we optimize our data structures for graph partitioning, which improves the running times of both multilevel partitioners by
almost a factor of two for graphs. As a result, Mt-KaHyPar also outperforms most of the existing
graph partitioning algorithms. While the shared-memory graph partitioner KaMinPar is still faster than
Mt-KaHyPar, its produced solutions are worse by in the median. The best sequential graph
partitioner KaFFPa-StrongS computes slightly better partitions than Mt-KaHyPar
(median improvement is ), but is more than an order of magnitude slower on average.
Memetic Multilevel Hypergraph Partitioning
Hypergraph partitioning has a wide range of important applications such as
VLSI design or scientific computing. With focus on solution quality, we develop
the first multilevel memetic algorithm to tackle the problem. Key components of
our contribution are new effective multilevel recombination and mutation
operations that provide a large amount of diversity. We perform a wide range of
experiments on a benchmark set containing instances from application areas such
as VLSI, SAT solving, social networks, and scientific computing. Compared to the
state-of-the-art hypergraph partitioning tools hMetis, PaToH, and KaHyPar, our
new algorithm computes the best result on almost all instances.
Relaxation-Based Coarsening for Multilevel Hypergraph Partitioning
Multilevel partitioning methods that are inspired by principles of
multiscaling are the most powerful practical hypergraph partitioning solvers.
Hypergraph partitioning has many applications in disciplines ranging from
scientific computing to data science. In this paper we introduce the concept of
algebraic distance on hypergraphs and demonstrate its use as an algorithmic
component in the coarsening stage of multilevel hypergraph partitioning
solvers. The algebraic distance is a vertex distance measure that extends
hyperedge weights for capturing the local connectivity of vertices which is
critical for hypergraph coarsening schemes. The practical effectiveness of the
proposed measure and corresponding coarsening scheme is demonstrated through
extensive computational experiments on a diverse set of problems. Finally, we
propose a benchmark of hypergraph partitioning problems to compare the quality
of other solvers.
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions.
Partitioning a call graph
Splitting a large software system into smaller and more manageable units has become an important problem for many organizations. The basic structure of a software system is given by a directed graph with vertices representing the programs of the system and arcs representing calls from one program to another. Generating a good partitioning into smaller modules becomes a minimization problem for the number of programs being called by external programs. First, we formulate an equivalent integer linear programming problem with 0–1 variables. Theoretically, with this approach the problem can be solved to optimality, but this becomes very costly with increasing size of the software system. Second, we formulate the problem as a hypergraph partitioning problem. This is a heuristic method using a multilevel strategy, but it turns out to be very fast and to deliver solutions that are close to optimal.
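One possible reading of the objective described above is to count the programs that receive at least one call from a program in a different module. The sketch below evaluates that count for a given assignment of programs to modules; the function name, the edge-list input, and the toy call graph are assumptions for illustration only.

```python
def externally_called(calls, module_of):
    """Return the set of programs called from outside their own module.

    calls:     iterable of (caller, callee) arcs of the call graph
    module_of: mapping program -> module id (the candidate partition)
    """
    return {
        callee
        for caller, callee in calls
        if module_of[caller] != module_of[callee]
    }


# Toy call graph with five programs.
calls = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]

good = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
bad = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0}

print(len(externally_called(calls, good)))
print(len(externally_called(calls, bad)))
```

Under this reading, the first assignment exposes only program d to external callers, while the second exposes four programs, so the first would be preferred.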