Scalable High-Quality Graph and Hypergraph Partitioning
The balanced hypergraph partitioning problem (HGP) asks for a partition of the node set
of a hypergraph into blocks of roughly equal size, such that an objective function defined
on the hyperedges is minimized. In this work, we optimize the connectivity metric,
which is the most prominent objective function for HGP.
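To make the objective concrete: for a partition of the node set, the connectivity metric sums (λ(e) - 1) over all hyperedges e, where λ(e) is the number of blocks that the pins of e touch. A minimal sequential sketch in C++ (unit net weights assumed; the names are illustrative, not Mt-KaHyPar's actual API):

```cpp
#include <iostream>
#include <unordered_set>
#include <vector>

// Connectivity metric (lambda - 1): for each hyperedge e, count the number
// of distinct blocks lambda(e) its pins are assigned to and sum lambda(e) - 1.
// Unit net weights are assumed for simplicity.
long long connectivityMetric(const std::vector<std::vector<int>>& hyperedges,
                             const std::vector<int>& block_of_node) {
  long long total = 0;
  for (const auto& pins : hyperedges) {
    std::unordered_set<int> blocks;
    for (int v : pins) blocks.insert(block_of_node[v]);
    total += static_cast<long long>(blocks.size()) - 1;
  }
  return total;
}

int main() {
  // Tiny example: 5 nodes, 3 hyperedges, 2 blocks.
  std::vector<std::vector<int>> hyperedges = {{0, 1, 2}, {2, 3}, {3, 4}};
  std::vector<int> block_of_node = {0, 0, 0, 1, 1};
  std::cout << connectivityMetric(hyperedges, block_of_node) << "\n";  // prints 1
}
```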
The hypergraph partitioning problem is NP-hard, and no constant-factor approximation algorithm exists.
Thus, heuristic algorithms are used in practice, with the multilevel scheme being
the most successful approach to solve the problem: First, the input hypergraph is coarsened to
obtain a hierarchy of successively smaller and structurally similar approximations.
The smallest hypergraph is then initially partitioned into blocks, and subsequently,
the contractions are reverted level-by-level, and, on each level, local search algorithms are used
to improve the partition (refinement phase).
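The control flow of the scheme can be summarized in a structural sketch; the concrete components are exactly what the remainder of this work supplies, and all types and names below are placeholders:

```cpp
#include <functional>
#include <vector>

// Structural sketch of the multilevel scheme. H stands for a hypergraph type,
// C for a record that allows one level of contractions to be reverted,
// P for a partition; all components are supplied by the concrete partitioner.
template <typename H, typename C, typename P>
P multilevel(H hg,
             const std::function<bool(const H&)>& smallEnough,
             const std::function<C(H&)>& coarsen,
             const std::function<P(const H&)>& initialPartition,
             const std::function<void(H&, P&, const C&)>& uncontractAndRefine) {
  std::vector<C> hierarchy;
  while (!smallEnough(hg)) hierarchy.push_back(coarsen(hg));  // coarsening
  P partition = initialPartition(hg);                         // initial partitioning
  for (auto it = hierarchy.rbegin(); it != hierarchy.rend(); ++it)
    uncontractAndRefine(hg, partition, *it);                  // refinement, level by level
  return partition;
}

int main() {
  // Degenerate stand-ins (the "hypergraph" is just a node count) to show the flow.
  auto p = multilevel<int, int, std::vector<int>>(
      32,
      [](const int& n) { return n <= 4; },
      [](int& n) { int prev = n; n /= 2; return prev; },
      [](const int& n) { return std::vector<int>(n, 0); },
      [](int& n, std::vector<int>& part, const int& prev) {
        n = prev; part.resize(n, 0);  // real code uncontracts and runs local search
      });
  return static_cast<int>(p.size()) == 32 ? 0 : 1;
}
```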
In recent years, several new techniques have been developed for sequential multilevel partitioning
that substantially improve solution quality at the cost of increased running time.
These developments divide the landscape of existing partitioning algorithms into systems that aim either for
speed or for high solution quality, with the former often being more than an order of magnitude faster
than the latter. Due to the high running times of the best sequential algorithms, it is currently not
feasible to partition the largest real-world hypergraphs with the highest possible quality.
Thus, it becomes increasingly important to parallelize the techniques used in these algorithms.
However, existing state-of-the-art parallel partitioners currently do not achieve the same solution
quality as their sequential counterparts because they use comparatively weak components that are easier to parallelize.
Moreover, there has been a recent trend toward simpler methods for partitioning large hypergraphs
that even omit the multilevel scheme.
In contrast to this development, we present two shared-memory multilevel hypergraph partitioners
with parallel implementations of techniques used by the highest-quality sequential systems.
Our first multilevel algorithm uses a parallel clustering-based coarsening scheme
that requires substantially fewer locking mechanisms than previous approaches.
The contraction decisions are guided by the community structure of the input hypergraph
obtained via a parallel community detection algorithm.
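As an illustration of how community structure can guide contraction decisions, here is a hedged sequential sketch of one clustering pass in which nodes may only join clusters inside their own community (names are illustrative; the parallel, lock-reduced version described above is considerably more involved):

```cpp
#include <iostream>
#include <unordered_map>
#include <utility>
#include <vector>

// One pass of clustering-based coarsening restricted by communities:
// a node may only join a neighboring cluster inside its own community,
// choosing the neighbor cluster with the heaviest total edge weight.
std::vector<int> clusterWithinCommunities(
    const std::vector<std::vector<std::pair<int, int>>>& adj,  // (neighbor, weight)
    const std::vector<int>& community) {
  const int n = static_cast<int>(adj.size());
  std::vector<int> cluster(n);
  for (int v = 0; v < n; ++v) cluster[v] = v;  // start with singleton clusters
  for (int v = 0; v < n; ++v) {
    std::unordered_map<int, long long> weight_to_cluster;
    for (auto [u, w] : adj[v])
      if (community[u] == community[v]) weight_to_cluster[cluster[u]] += w;
    long long best_w = 0;
    int best_c = cluster[v];
    for (auto [c, w] : weight_to_cluster)
      if (w > best_w) { best_w = w; best_c = c; }
    cluster[v] = best_c;  // join heaviest compatible cluster (greedy, sequential)
  }
  return cluster;
}

int main() {
  // 4 nodes, two communities {0,1} and {2,3}; edge (1,2) crosses communities.
  std::vector<std::vector<std::pair<int, int>>> adj = {
      {{1, 5}}, {{0, 5}, {2, 9}}, {{1, 9}, {3, 2}}, {{2, 2}}};
  std::vector<int> community = {0, 0, 1, 1};
  for (int c : clusterWithinCommunities(adj, community)) std::cout << c << ' ';
  // Nodes 0,1 cluster together and 2,3 cluster together, despite the
  // heavy edge (1,2) that crosses the community boundary.
}
```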
For initial partitioning, we implement parallel multilevel recursive bipartitioning with a
novel work-stealing approach and a portfolio of initial bipartitioning techniques to
compute an initial solution. In the refinement phase, we use three different parallel improvement
algorithms: label propagation refinement, a highly localized direct k-way
FM algorithm, and a novel parallelization of flow-based refinement.
These algorithms build on our highly-engineered partition data structure, for which we propose
several novel techniques to compute accurate gain values of node moves in the parallel setting.
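For intuition on what an accurate gain value is: under the connectivity metric, the gain of moving a node can be read off per-hyperedge pin counts. A sequential sketch with unit net weights (the data structure names are illustrative; the parallel techniques in the text keep these counts consistent under concurrent moves):

```cpp
#include <iostream>
#include <vector>

// Gain of moving node v from block 'from' to block 'to' under the
// (lambda - 1) connectivity metric, assuming unit net weights.
// pin_count[e][b] = number of pins of hyperedge e currently in block b.
int moveGain(int v, int from, int to,
             const std::vector<std::vector<int>>& incident_edges,  // per node
             const std::vector<std::vector<int>>& pin_count) {
  int gain = 0;
  for (int e : incident_edges[v]) {
    if (pin_count[e][from] == 1) ++gain;  // v is the last pin in 'from': lambda(e) drops
    if (pin_count[e][to] == 0) --gain;    // v is the first pin in 'to': lambda(e) grows
  }
  return gain;
}

int main() {
  // Two hyperedges {0,1} and {0,2}; node 0 in block 1, nodes 1 and 2 in block 0.
  std::vector<std::vector<int>> incident_edges = {{0, 1}, {0}, {1}};
  std::vector<std::vector<int>> pin_count = {{1, 1}, {1, 1}};  // [edge][block]
  std::cout << moveGain(0, 1, 0, incident_edges, pin_count) << "\n";  // prints 2
}
```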
Our second multilevel algorithm parallelizes the n-level partitioning scheme used in
the highest-quality sequential partitioner KaHyPar. Here, only a single node
is contracted on each level, leading to a hierarchy with approximately n levels, where
n is the number of nodes. Correspondingly, in each refinement step, only a single node is uncontracted, allowing
a highly-localized search for improvements.
We show that this approach, which seems inherently sequential, can be parallelized efficiently without compromises in solution quality.
To this end, we design a forest-based representation of contractions from which we derive a feasible parallel
schedule of the contraction operations, which we apply on-the-fly on a novel dynamic hypergraph data structure.
In the uncoarsening phase, we decompose the contraction forest into batches, each containing
a fixed number of nodes. We then uncontract each batch in parallel and use highly-localized
versions of our refinement algorithms to improve the partition around the uncontracted nodes.
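A hedged sketch of the batch decomposition idea: in the contraction forest, a node can only be uncontracted once its parent (the node it was contracted into) is present again, so batches can be formed greedily from "ready" nodes. This is illustrative only; the actual schedule additionally drives the dynamic hypergraph data structure:

```cpp
#include <algorithm>
#include <iostream>
#include <queue>
#include <vector>

// Decompose a contraction forest (parent[v] = node that v was contracted
// into, -1 for roots) into uncontraction batches of at most batch_size
// nodes. A node becomes "ready" once its parent has been uncontracted
// (roots are present in the coarsest hypergraph from the start), so all
// nodes within one batch can be uncontracted in parallel.
std::vector<std::vector<int>> uncontractionBatches(const std::vector<int>& parent,
                                                   int batch_size) {
  const int n = static_cast<int>(parent.size());
  std::vector<std::vector<int>> children(n);
  for (int v = 0; v < n; ++v)
    if (parent[v] != -1) children[parent[v]].push_back(v);
  std::queue<int> ready;
  for (int v = 0; v < n; ++v)
    if (parent[v] == -1)
      for (int c : children[v]) ready.push(c);
  std::vector<std::vector<int>> result;
  while (!ready.empty()) {
    const int take = std::min(batch_size, static_cast<int>(ready.size()));
    std::vector<int> batch;
    for (int i = 0; i < take; ++i) {
      int v = ready.front(); ready.pop();
      batch.push_back(v);
      for (int c : children[v]) ready.push(c);  // ready for later batches only
    }
    result.push_back(std::move(batch));
  }
  return result;
}

int main() {
  // Node 0 is a root; 1 and 2 were contracted into 0; 3 into 1; 4 into 2.
  std::vector<int> parent = {-1, 0, 0, 1, 2};
  for (const auto& batch : uncontractionBatches(parent, 2)) {
    for (int v : batch) std::cout << v << ' ';
    std::cout << '\n';  // prints "1 2" then "3 4": siblings run in parallel
  }
}
```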
We further show that existing sequential partitioning algorithms struggle considerably to find balanced partitions
for weighted real-world hypergraphs. To address this, we present a technique that enables partitioners based on recursive
bipartitioning to reliably compute balanced solutions. The idea is to preassign a small portion of the
heaviest nodes to one of the two blocks of each bipartition and optimize the objective function on the
remaining nodes. We integrated the approach into the sequential hypergraph partitioner KaHyPar
and show that it computes balanced solutions for all tested instances without negatively affecting
solution quality or running time.
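A minimal sketch of the preassignment idea (the parameter names are illustrative and the selection rule is simplified relative to the integration described above): sort nodes by weight and fix the heaviest ones, balancing their weight across the two blocks, before the bipartitioner runs on the rest.

```cpp
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

// Preassign the heaviest nodes before bipartitioning: nodes covering the top
// 'fraction' of total node weight are fixed alternately to blocks 0 and 1
// (balancing the preassigned weight); the remaining nodes (block -1) are left
// to the optimizer. Illustrative sketch, not KaHyPar's exact procedure.
std::vector<int> preassignHeavyNodes(const std::vector<int>& node_weight,
                                     double fraction) {
  const int n = static_cast<int>(node_weight.size());
  std::vector<int> order(n);
  std::iota(order.begin(), order.end(), 0);
  std::sort(order.begin(), order.end(),
            [&](int a, int b) { return node_weight[a] > node_weight[b]; });
  long long total = std::accumulate(node_weight.begin(), node_weight.end(), 0LL);
  long long budget = static_cast<long long>(fraction * total);
  std::vector<int> block(n, -1);  // -1: free for the bipartitioner
  long long assigned[2] = {0, 0};
  for (int v : order) {
    if (budget <= 0) break;
    int b = assigned[0] <= assigned[1] ? 0 : 1;  // keep preassigned weight balanced
    block[v] = b;
    assigned[b] += node_weight[v];
    budget -= node_weight[v];
  }
  return block;
}

int main() {
  std::vector<int> w = {50, 40, 5, 3, 2};
  for (int b : preassignHeavyNodes(w, 0.9)) std::cout << b << ' ';
  // prints "0 1 -1 -1 -1": heavy nodes fixed, light nodes left free
}
```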
In our experimental evaluation, we compare our new shared-memory (hyper)graph partitioner Mt-KaHyPar
to different graph and hypergraph partitioners on a large set of (hyper)graphs with up to two billion edges/pins.
The results indicate that even our fastest configuration outperforms almost all existing
hypergraph partitioners with regard to both solution quality and running time. Our highest-quality configuration
(n-level with flow-based refinement) achieves the same solution quality as the currently best
sequential partitioner KaHyPar, while being almost an order of magnitude faster with ten threads.
In addition, we optimize our data structures for graph partitioning, which improves the running times of both multilevel partitioners by
almost a factor of two for graphs. As a result, Mt-KaHyPar also outperforms most of the existing
graph partitioning algorithms. While the shared-memory graph partitioner KaMinPar is still faster than
Mt-KaHyPar, the solutions it produces are worse in the median. The best sequential graph
partitioner KaFFPa-StrongS computes slightly better partitions than Mt-KaHyPar
in the median, but is more than an order of magnitude slower on average.
High-Quality Hypergraph Partitioning
This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer k, partition its vertex set into k disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric.
Since the problem is computationally intractable, heuristics are used in practice, the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, the contractions are undone and, at each level, refinement algorithms try to improve the current solution.
With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) n levels, where n is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the n-level paradigm, and by developing lazy-evaluation techniques, caching mechanisms, and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms. Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve k-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine k-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space.
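To illustrate the FM principle behind these refinement algorithms, here is a deliberately simplified two-way FM pass on a plain graph with an edge-cut objective (no balance constraint, unit weights, and a linear scan instead of the usual priority queue; KaHyPar's localized hypergraph variants are substantially more elaborate):

```cpp
#include <climits>
#include <iostream>
#include <vector>

// One simplified two-way FM pass: repeatedly move the unmoved node with the
// highest gain (negative gains allowed, which lets the search escape local
// minima), then roll back to the best prefix of the move sequence.
int fmPass(const std::vector<std::vector<int>>& adj, std::vector<int>& block) {
  const int n = static_cast<int>(adj.size());
  auto gain = [&](int v) {  // cut reduction if v switches blocks (unit weights)
    int g = 0;
    for (int u : adj[v]) g += (block[u] != block[v]) ? 1 : -1;
    return g;
  };
  std::vector<bool> moved(n, false);
  std::vector<int> sequence;
  int best_prefix = 0, cum = 0, best_cum = 0;
  for (int step = 0; step < n; ++step) {
    int v = -1, best_g = INT_MIN;
    for (int u = 0; u < n; ++u)  // linear scan stands in for a priority queue
      if (!moved[u] && gain(u) > best_g) { best_g = gain(u); v = u; }
    moved[v] = true;
    block[v] ^= 1;
    sequence.push_back(v);
    cum += best_g;
    if (cum > best_cum) { best_cum = cum; best_prefix = step + 1; }
  }
  for (int i = static_cast<int>(sequence.size()) - 1; i >= best_prefix; --i)
    block[sequence[i]] ^= 1;  // undo all moves beyond the best prefix
  return best_cum;            // total cut improvement of this pass
}

int main() {
  // Path 0-1-2-3, initially split {0,2} vs {1,3}: cut = 3.
  std::vector<std::vector<int>> adj = {{1}, {0, 2}, {1, 3}, {2}};
  std::vector<int> block = {0, 1, 0, 1};
  std::cout << fmPass(adj, block) << '\n';
  // Prints 3: without a balance constraint the pass happily empties one
  // block (cut 3 -> 0), which is exactly why real FM enforces balance.
}
```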
All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, already without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and as fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa, both in terms of solution quality and running time.
Parallel Graph Partitioning for Complex Networks
Processing large complex networks like social networks or web graphs has
recently attracted considerable interest. In order to do this in parallel, we
need to partition them into pieces of about equal size. Unfortunately, previous
parallel graph partitioners originally developed for more regular mesh-like
networks do not work well for these networks. This paper addresses this problem
by parallelizing and adapting the label propagation technique originally
developed for graph clustering. By introducing size constraints, label
propagation becomes applicable for both the coarsening and the refinement phase
of multilevel graph partitioning. We obtain very high quality by applying a
highly parallel evolutionary algorithm to the coarsened graph. The resulting
system is both more scalable and achieves higher quality than state-of-the-art
systems like ParMetis or PT-Scotch. For large complex networks, the performance
differences are very large. For example, our algorithm can partition a web graph
with 3.3 billion edges in less than sixteen seconds using 512 cores of a high-performance
cluster while producing a high-quality partition -- none of the
competing systems can handle this graph on our system.
High-Quality Shared-Memory Graph Partitioning
Partitioning graphs into blocks of roughly equal size such that few edges run
between blocks is a frequently needed operation when processing graphs. Recently,
the size, variety, and structural complexity of these networks have grown
dramatically. Unfortunately, previous approaches to parallel graph partitioning
have problems in this context since they often show a negative trade-off
between speed and quality. We present an approach to multi-level shared-memory
parallel graph partitioning that guarantees balanced solutions, shows high
speed-ups for a variety of large graphs and yields very good quality
independently of the number of cores used. For example, on 31 cores, our
algorithm partitions our largest test instance into 16 blocks, cutting less than
half as many edges as our main competitor when both algorithms are
given the same amount of time. Important ingredients include parallel label
propagation for both coarsening and improvement, parallel initial partitioning,
a simple yet effective approach to parallel localized local search, and fast
locality-preserving hash tables.
Partitioning Complex Networks via Size-constrained Clustering
The most commonly used method to tackle the graph partitioning problem in
practice is the multilevel approach. During a coarsening phase, a multilevel
graph partitioning algorithm reduces the graph size by iteratively contracting
nodes and edges until the graph is small enough to be partitioned by some other
algorithm. A partition of the input graph is then constructed by successively
transferring the solution to the next finer graph and applying a local search
algorithm to improve the current solution.
In this paper, we describe a novel approach to effectively partition graphs,
especially if the networks have a highly irregular structure. More precisely,
our algorithm provides graph coarsening by iteratively contracting
size-constrained clusterings that are computed using a label propagation
algorithm. The same algorithm that provides the size-constrained clusterings
can also be used during uncoarsening as a fast and simple local search
algorithm.
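A hedged sequential sketch of one round of size-constrained label propagation (the paper's algorithm is parallel and iterates until convergence; with block sizes as the constraint, the same routine doubles as the local search used during uncoarsening):

```cpp
#include <iostream>
#include <unordered_map>
#include <vector>

// Size-constrained label propagation: each node adopts the label that is
// most frequent among its neighbors, but only if the target cluster still
// has room under max_cluster_size. One sequential round; illustrative only.
void labelPropagationRound(const std::vector<std::vector<int>>& adj,
                           std::vector<int>& label,
                           std::vector<int>& cluster_size,
                           int max_cluster_size) {
  for (int v = 0; v < static_cast<int>(adj.size()); ++v) {
    std::unordered_map<int, int> freq;
    for (int u : adj[v]) ++freq[label[u]];
    int best_label = label[v], best_freq = 0;
    for (auto [l, f] : freq)
      if (f > best_freq && (l == label[v] || cluster_size[l] < max_cluster_size)) {
        best_freq = f;
        best_label = l;
      }
    if (best_label != label[v]) {
      --cluster_size[label[v]];  // leave the old cluster...
      ++cluster_size[best_label];  // ...and join the new one
      label[v] = best_label;
    }
  }
}

int main() {
  // Triangle 0-1-2 plus pendant node 3 attached to 2; size cap of 3 per cluster.
  std::vector<std::vector<int>> adj = {{1, 2}, {0, 2}, {0, 1, 3}, {2}};
  std::vector<int> label = {0, 1, 2, 3};  // singleton clusters
  std::vector<int> cluster_size = {1, 1, 1, 1};
  labelPropagationRound(adj, label, cluster_size, 3);
  for (int l : label) std::cout << l << ' ';
  // labels after one round; exact tie-breaking depends on hash-map order
}
```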
Depending on the algorithm's configuration, we are able to compute partitions
of very high quality that outperform all competitors, or partitions of quality
comparable to the best competitor, hMetis, while being
nearly an order of magnitude faster on average. The fastest configuration
partitions the largest graph available to us, with 3.3 billion edges, using a
single machine in about ten minutes, while cutting less than half as many edges
as the fastest competitor, kMetis.
Magic-State Functional Units: Mapping and Scheduling Multi-Level Distillation Circuits for Fault-Tolerant Quantum Architectures
Quantum computers have recently made great strides and are on a long-term
path towards useful fault-tolerant computation. A dominant overhead in
fault-tolerant quantum computation is the production of high-fidelity encoded
qubits, called magic states, which enable reliable error-corrected computation.
We present the first detailed designs of hardware functional units that
implement space-time optimized magic-state factories for surface code
error-corrected machines. Interactions among distant qubits require surface
code braids (physical pathways on chip) which must be routed. Magic-state
factories are circuits comprising a complex set of braids that is more
difficult to route than the quantum circuits considered in previous work [1]. This
paper explores the impact of scheduling techniques, such as gate reordering and
qubit renaming, and we propose two novel mapping techniques: braid repulsion
and dipole moment braid rotation. We combine these techniques with graph
partitioning and community detection algorithms, and further introduce a
stitching algorithm for mapping subgraphs onto a physical machine. Our results
show a factor of 5.64 reduction in space-time volume compared to the best-known
previous designs for magic-state factories.