95 research outputs found

    Evaluation of a Flow-Based Hypergraph Bipartitioning Algorithm

    Get PDF
    In this paper, we propose HyperFlowCutter, an algorithm for balanced hypergraph bipartitioning that is based on minimum S-T hyperedge cuts and maximum flows. It computes a sequence of bipartitions that optimize cut size and balance in the Pareto sense, being able to trade one for the other. HyperFlowCutter builds on the FlowCutter algorithm for partitioning graphs. We propose additional features, such as handling disconnected hypergraphs, novel methods for obtaining starting S,T pairs as well as an approach to refine a given partition with HyperFlowCutter. Our main contribution is ReBaHFC, a new algorithm which obtains an initial partition with the fast multilevel hypergraph partitioner PaToH and then improves it using HyperFlowCutter as a refinement algorithm. ReBaHFC is able to significantly improve the solution quality of PaToH at little additional running time. The solution quality is only marginally worse than that of the best-performing hypergraph partitioners KaHyPar and hMETIS, while being one order of magnitude faster. Thus ReBaHFC offers a new time-quality trade-off in the current spectrum of hypergraph partitioners. For the special case of perfectly balanced bipartitioning, only the much slower plain HyperFlowCutter yields slightly better solutions than ReBaHFC, while only PaToH is faster than ReBaHFC

    High-Quality Hypergraph Partitioning

    Get PDF
    This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer kk, partition its vertex set into kk disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric. Since the problem is computationally intractable, heuristics are used in practice - the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, refinement algorithms try to improve the current solution. With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) nn levels, where nn is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the nn-level paradigm, and developing lazy-evaluation techniques, caching mechanisms and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms. Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve kk-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine kk-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space. All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, already without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and equally fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa - both in terms of solution quality and running time

    Scalable High-Quality Graph and Hypergraph Partitioning

    Get PDF
    The balanced hypergraph partitioning problem (HGP) asks for a partition of the node set of a hypergraph into kk blocks of roughly equal size, such that an objective function defined on the hyperedges is minimized. In this work, we optimize the connectivity metric, which is the most prominent objective function for HGP. The hypergraph partitioning problem is NP-hard and there exists no constant factor approximation. Thus, heuristic algorithms are used in practice with the multilevel scheme as the most successful approach to solve the problem: First, the input hypergraph is coarsened to obtain a hierarchy of successively smaller and structurally similar approximations. The smallest hypergraph is then initially partitioned into kk blocks, and subsequently, the contractions are reverted level-by-level, and, on each level, local search algorithms are used to improve the partition (refinement phase). In recent years, several new techniques were developed for sequential multilevel partitioning that substantially improved solution quality at the cost of an increased running time. These developments divide the landscape of existing partitioning algorithms into systems that either aim for speed or high solution quality with the former often being more than an order of magnitude faster than the latter. Due to the high running times of the best sequential algorithms, it is currently not feasible to partition the largest real-world hypergraphs with the highest possible quality. Thus, it becomes increasingly important to parallelize the techniques used in these algorithms. However, existing state-of-the-art parallel partitioners currently do not achieve the same solution quality as their sequential counterparts because they use comparatively weak components that are easier to parallelize. Moreover, there has been a recent trend toward simpler methods for partitioning large hypergraphs that even omit the multilevel scheme. In contrast to this development, we present two shared-memory multilevel hypergraph partitioners with parallel implementations of techniques used by the highest-quality sequential systems. Our first multilevel algorithm uses a parallel clustering-based coarsening scheme, which uses substantially fewer locking mechanisms than previous approaches. The contraction decisions are guided by the community structure of the input hypergraph obtained via a parallel community detection algorithm. For initial partitioning, we implement parallel multilevel recursive bipartitioning with a novel work-stealing approach and a portfolio of initial bipartitioning techniques to compute an initial solution. In the refinement phase, we use three different parallel improvement algorithms: label propagation refinement, a highly-localized direct kk-way FM algorithm, and a novel parallelization of flow-based refinement. These algorithms build on our highly-engineered partition data structure, for which we propose several novel techniques to compute accurate gain values of node moves in the parallel setting. Our second multilevel algorithm parallelizes the nn-level partitioning scheme used in the highest-quality sequential partitioner KaHyPar. Here, only a single node is contracted on each level, leading to a hierarchy with approximately nn levels where nn is the number of nodes. Correspondingly, in each refinement step, only a single node is uncontracted, allowing a highly-localized search for improvements. We show that this approach, which seems inherently sequential, can be parallelized efficiently without compromises in solution quality. To this end, we design a forest-based representation of contractions from which we derive a feasible parallel schedule of the contraction operations that we apply on a novel dynamic hypergraph data structure on-the-fly. In the uncoarsening phase, we decompose the contraction forest into batches, each containing a fixed number of nodes. We then uncontract each batch in parallel and use highly-localized versions of our refinement algorithms to improve the partition around the uncontracted nodes. We further show that existing sequential partitioning algorithms considerably struggle to find balanced partitions for weighted real-world hypergraphs. To this end, we present a technique that enables partitioners based on recursive bipartitioning to reliably compute balanced solutions. The idea is to preassign a small portion of the heaviest nodes to one of the two blocks of each bipartition and optimize the objective function on the remaining nodes. We integrated the approach into the sequential hypergraph partitioner KaHyPar and show that our new approach can compute balanced solutions for all tested instances without negatively affecting the solution quality and running time of KaHyPar. In our experimental evaluation, we compare our new shared-memory (hyper)graph partitioner Mt-KaHyPar to 2525 different graph and hypergraph partitioners on over 800800 (hyper)graphs with up to two billion edges/pins. The results indicate that already our fastest configuration outperforms almost all existing hypergraph partitioners with regards to both solution quality and running time. Our highest-quality configuration (nn-level with flow-based refinement) achieves the same solution quality as the currently best sequential partitioner KaHyPar, while being almost an order of magnitude faster with ten threads. In addition, we optimize our data structures for graph partitioning, which improves the running times of both multilevel partitioners by almost a factor of two for graphs. As a result, Mt-KaHyPar also outperforms most of the existing graph partitioning algorithms. While the shared-memory graph partitioner KaMinPar is still faster than Mt-KaHyPar, its produced solutions are worse by 10%10\% in the median. The best sequential graph partitioner KaFFPa-StrongS computes slightly better partitions than Mt-KaHyPar (median improvement is 1%1\%), but is more than an order of magnitude slower on average

    Hypergraph Partitioning in the Cloud

    Get PDF
    The thesis investigates the partitioning and load balancing problem which has many applications in High Performance Computing (HPC). The application to be partitioned is described with a graph or hypergraph. The latter is of greater interest as hypergraphs, compared to graphs, have a more general structure and can be used to model more complex relationships between groups of objects such as non-symmetric dependencies. Optimal graph and hypergraph partitioning is known to be NP-Hard but good polynomial time heuristic algorithms have been proposed. In this thesis, we propose two multi-level hypergraph partitioning algorithms. The algorithms are based on rough set clustering techniques. The first algorithm, which is a serial algorithm, obtains high quality partitionings and improves the partitioning cut by up to 71\% compared to the state-of-the-art serial hypergraph partitioning algorithms. Furthermore, the capacity of serial algorithms is limited due to the rapid growth of problem sizes of distributed applications. Consequently, we also propose a parallel hypergraph partitioning algorithm. Considering the generality of the hypergraph model, designing a parallel algorithm is difficult and the available parallel hypergraph algorithms offer less scalability compared to their graph counterparts. The issue is twofold: the parallel algorithm and the complexity of the hypergraph structure. Our parallel algorithm provides a trade-off between global and local vertex clustering decisions. By employing novel techniques and approaches, our algorithm achieves better scalability than the state-of-the-art parallel hypergraph partitioner in the Zoltan tool on a set of benchmarks, especially ones with irregular structure. Furthermore, recent advances in cloud computing and the services they provide have led to a trend in moving HPC and large scale distributed applications into the cloud. Despite its advantages, some aspects of the cloud, such as limited network resources, present a challenge to running communication-intensive applications and make them non-scalable in the cloud. While hypergraph partitioning is proposed as a solution for decreasing the communication overhead within parallel distributed applications, it can also offer advantages for running these applications in the cloud. The partitioning is usually done as a pre-processing step before running the parallel application. As parallel hypergraph partitioning itself is a communication-intensive operation, running it in the cloud is hard and suffers from poor scalability. The thesis also investigates the scalability of parallel hypergraph partitioning algorithms in the cloud, the challenges they present, and proposes solutions to improve the cost/performance ratio for running the partitioning problem in the cloud. Our algorithms are implemented as a new hypergraph partitioning package within Zoltan. It is an open source Linux-based toolkit for parallel partitioning, load balancing and data-management designed at Sandia National Labs. The algorithms are known as FEHG and PFEHG algorithms

    Multilevel Hypergraph Partitioning with Vertex Weights Revisited

    Get PDF
    The balanced hypergraph partitioning problem (HGP) is to partition the vertex set of a hypergraph into k disjoint blocks of bounded weight, while minimizing an objective function defined on the hyperedges. Whereas real-world applications often use vertex and edge weights to accurately model the underlying problem, the HGP research community commonly works with unweighted instances. In this paper, we argue that, in the presence of vertex weights, current balance constraint definitions either yield infeasible partitioning problems or allow unnecessarily large imbalances and propose a new definition that overcomes these problems. We show that state-of-the-art hypergraph partitioners often struggle considerably with weighted instances and tight balance constraints (even with our new balance definition). Thus, we present a recursive-bipartitioning technique that is able to reliably compute balanced (and hence feasible) solutions. The proposed method balances the partition by pre-assigning a small subset of the heaviest vertices to the two blocks of each bipartition (using an algorithm originally developed for the job scheduling problem) and optimizes the actual partitioning objective on the remaining vertices. We integrate our algorithm into the multilevel hypergraph partitioner KaHyPar and show that our approach is able to compute balanced partitions of high quality on a diverse set of benchmark instances

    SAT-based optimal hypergraph partitioning with replication

    Get PDF
    We propose a methodology for optimal k-way partitioning with replication of directed hypergraphs via Boolean satisfiability. We begin by leveraging the power of existing and emerging SAT solvers to attack traditional logic bipartitioning and show good scaling behavior. We continue to present the first optimal partitioning results that admit generation and assignment of replicated nodes concurrently. Our framework is general enough that we also give the first published optimal results for partitioning with respect to the maximum subdomain degree metric and the sum of external degrees metric. We show that for the bipartitioning case we can feasibly solve problems of up to 150 nodes with simultaneous replication in hundreds of seconds. For other partitioning metrics, we are able to solve problems up to 40 nodes in hundreds of seconds

    Parallel and Flow-Based High Quality Hypergraph Partitioning

    Get PDF
    Balanced hypergraph partitioning is a classic NP-hard optimization problem that is a fundamental tool in such diverse disciplines as VLSI circuit design, route planning, sharding distributed databases, optimizing communication volume in parallel computing, and accelerating the simulation of quantum circuits. Given a hypergraph and an integer kk, the task is to divide the vertices into kk disjoint blocks with bounded size, while minimizing an objective function on the hyperedges that span multiple blocks. In this dissertation we consider the most commonly used objective, the connectivity metric, where we aim to minimize the number of different blocks connected by each hyperedge. The most successful heuristic for balanced partitioning is the multilevel approach, which consists of three phases. In the coarsening phase, vertex clusters are contracted to obtain a sequence of structurally similar but successively smaller hypergraphs. Once sufficiently small, an initial partition is computed. Lastly, the contractions are successively undone in reverse order, and an iterative improvement algorithm is employed to refine the projected partition on each level. An important aspect in designing practical heuristics for optimization problems is the trade-off between solution quality and running time. The appropriate trade-off depends on the specific application, the size of the data sets, and the computational resources available to solve the problem. Existing algorithms are either slow, sequential and offer high solution quality, or are simple, fast, easy to parallelize, and offer low quality. While this trade-off cannot be avoided entirely, our goal is to close the gaps as much as possible. We achieve this by improving the state of the art in all non-trivial areas of the trade-off landscape with only a few techniques, but employed in two different ways. Furthermore, most research on parallelization has focused on distributed memory, which neglects the greater flexibility of shared-memory algorithms and the wide availability of commodity multi-core machines. In this thesis, we therefore design and revisit fundamental techniques for each phase of the multilevel approach, and develop highly efficient shared-memory parallel implementations thereof. We consider two iterative improvement algorithms, one based on the Fiduccia-Mattheyses (FM) heuristic, and one based on label propagation. For these, we propose a variety of techniques to improve the accuracy of gains when moving vertices in parallel, as well as low-level algorithmic improvements. For coarsening, we present a parallel variant of greedy agglomerative clustering with a novel method to resolve cluster join conflicts on-the-fly. Combined with a preprocessing phase for coarsening based on community detection, a portfolio of from-scratch partitioning algorithms, as well as recursive partitioning with work-stealing, we obtain our first parallel multilevel framework. It is the fastest partitioner known, and achieves medium-high quality, beating all parallel partitioners, and is close to the highest quality sequential partitioner. Our second contribution is a parallelization of an n-level approach, where only one vertex is contracted and uncontracted on each level. This extreme approach aims at high solution quality via very fine-grained, localized refinement, but seems inherently sequential. We devise an asynchronous n-level coarsening scheme based on a hierarchical decomposition of the contractions, as well as a batch-synchronous uncoarsening, and later fully asynchronous uncoarsening. In addition, we adapt our refinement algorithms, and also use the preprocessing and portfolio. This scheme is highly scalable, and achieves the same quality as the highest quality sequential partitioner (which is based on the same components), but is of course slower than our first framework due to fine-grained uncoarsening. The last ingredient for high quality is an iterative improvement algorithm based on maximum flows. In the sequential setting, we first improve an existing idea by solving incremental maximum flow problems, which leads to smaller cuts and is faster due to engineering efforts. Subsequently, we parallelize the maximum flow algorithm and schedule refinements in parallel. Beyond the strive for highest quality, we present a deterministically parallel partitioning framework. We develop deterministic versions of the preprocessing, coarsening, and label propagation refinement. Experimentally, we demonstrate that the penalties for determinism in terms of partition quality and running time are very small. All of our claims are validated through extensive experiments, comparing our algorithms with state-of-the-art solvers on large and diverse benchmark sets. To foster further research, we make our contributions available in our open-source framework Mt-KaHyPar. While it seems inevitable, that with ever increasing problem sizes, we must transition to distributed memory algorithms, the study of shared-memory techniques is not in vain. With the multilevel approach, even the inherently slow techniques have a role to play in fast systems, as they can be employed to boost quality on coarse levels at little expense. Similarly, techniques for shared-memory parallelism are important, both as soon as a coarse graph fits into memory, and as local building blocks in the distributed algorithm

    Replicated partitioning for undirected hypergraphs

    Get PDF
    Cataloged from PDF version of article.Hypergraph partitioning (HP) and replication are diverse but powerful tools that are traditionally applied separately to minimize the costs of parallel and sequential systems that access related data or process related tasks. When combined together, these two techniques have the potential of achieving significant improvements in performance of many applications. In this study, we provide an approach involving a tool that simultaneously performs replication and partitioning of the vertices of an undirected hypergraph whose vertices represent data and nets represent task dependencies among these data. In this approach, we propose an iterative-improvement-based replicated bipartitioning heuristic, which is capable of move, replication, and unreplication of vertices. In order to utilize our replicated bipartitioning heuristic in a recursive bipartitioning framework, we also propose appropriate cut-net removal, cut-net splitting, and pin selection algorithms to correctly encapsulate the two most commonly used cutsize metrics. We embed our replicated bipartitioning scheme into the state-of-the-art multilevel HP tool PaToH to provide an effective and efficient replicated HP tool, rpPaToH. The performance of the techniques proposed and the tools developed is tested over the undirected hypergraphs that model the communication costs of parallel query processing in information retrieval systems. Our experimental analysis indicates that the proposed technique provides significant improvements in the quality of the partitions, especially under low replication ratios. (C) 2012 Elsevier Inc. All rights reserved
    corecore