
    Scalable High-Quality Graph and Hypergraph Partitioning

    The balanced hypergraph partitioning problem (HGP) asks for a partition of the node set of a hypergraph into k blocks of roughly equal size, such that an objective function defined on the hyperedges is minimized. In this work, we optimize the connectivity metric, which is the most prominent objective function for HGP. The hypergraph partitioning problem is NP-hard and there exists no constant factor approximation. Thus, heuristic algorithms are used in practice, with the multilevel scheme being the most successful approach: First, the input hypergraph is coarsened to obtain a hierarchy of successively smaller and structurally similar approximations. The smallest hypergraph is then initially partitioned into k blocks, and subsequently, the contractions are reverted level-by-level, and, on each level, local search algorithms are used to improve the partition (refinement phase). In recent years, several new techniques were developed for sequential multilevel partitioning that substantially improved solution quality at the cost of an increased running time. These developments divide the landscape of existing partitioning algorithms into systems that aim either for speed or for high solution quality, with the former often being more than an order of magnitude faster than the latter. Due to the high running times of the best sequential algorithms, it is currently not feasible to partition the largest real-world hypergraphs with the highest possible quality. Thus, it becomes increasingly important to parallelize the techniques used in these algorithms. However, existing state-of-the-art parallel partitioners currently do not achieve the same solution quality as their sequential counterparts because they use comparatively weak components that are easier to parallelize. Moreover, there has been a recent trend toward simpler methods for partitioning large hypergraphs that even omit the multilevel scheme. In contrast to this development, we present two shared-memory multilevel hypergraph partitioners with parallel implementations of techniques used by the highest-quality sequential systems. Our first multilevel algorithm uses a parallel clustering-based coarsening scheme, which uses substantially fewer locking mechanisms than previous approaches. The contraction decisions are guided by the community structure of the input hypergraph obtained via a parallel community detection algorithm. For initial partitioning, we implement parallel multilevel recursive bipartitioning with a novel work-stealing approach and a portfolio of initial bipartitioning techniques to compute an initial solution. In the refinement phase, we use three different parallel improvement algorithms: label propagation refinement, a highly-localized direct k-way FM algorithm, and a novel parallelization of flow-based refinement. These algorithms build on our highly-engineered partition data structure, for which we propose several novel techniques to compute accurate gain values of node moves in the parallel setting. Our second multilevel algorithm parallelizes the n-level partitioning scheme used in the highest-quality sequential partitioner KaHyPar. Here, only a single node is contracted on each level, leading to a hierarchy with approximately n levels where n is the number of nodes. Correspondingly, in each refinement step, only a single node is uncontracted, allowing a highly-localized search for improvements.
    We show that this approach, which seems inherently sequential, can be parallelized efficiently without compromises in solution quality. To this end, we design a forest-based representation of contractions from which we derive a feasible parallel schedule of the contraction operations that we apply on a novel dynamic hypergraph data structure on-the-fly. In the uncoarsening phase, we decompose the contraction forest into batches, each containing a fixed number of nodes. We then uncontract each batch in parallel and use highly-localized versions of our refinement algorithms to improve the partition around the uncontracted nodes. We further show that existing sequential partitioning algorithms considerably struggle to find balanced partitions for weighted real-world hypergraphs. To this end, we present a technique that enables partitioners based on recursive bipartitioning to reliably compute balanced solutions. The idea is to preassign a small portion of the heaviest nodes to one of the two blocks of each bipartition and optimize the objective function on the remaining nodes. We integrated the approach into the sequential hypergraph partitioner KaHyPar and show that our new approach can compute balanced solutions for all tested instances without negatively affecting the solution quality and running time of KaHyPar. In our experimental evaluation, we compare our new shared-memory (hyper)graph partitioner Mt-KaHyPar to 25 different graph and hypergraph partitioners on over 800 (hyper)graphs with up to two billion edges/pins. The results indicate that even our fastest configuration outperforms almost all existing hypergraph partitioners with regard to both solution quality and running time. Our highest-quality configuration (n-level with flow-based refinement) achieves the same solution quality as the currently best sequential partitioner KaHyPar, while being almost an order of magnitude faster with ten threads. In addition, we optimize our data structures for graph partitioning, which improves the running times of both multilevel partitioners by almost a factor of two for graphs. As a result, Mt-KaHyPar also outperforms most of the existing graph partitioning algorithms. While the shared-memory graph partitioner KaMinPar is still faster than Mt-KaHyPar, the solutions it produces are worse by 10% in the median. The best sequential graph partitioner KaFFPa-StrongS computes slightly better partitions than Mt-KaHyPar (median improvement is 1%), but is more than an order of magnitude slower on average
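
    For reference, the connectivity objective and balance constraint referred to above are commonly stated as follows; the notation (node weights c, hyperedge weights \omega, imbalance parameter \varepsilon) is the standard one from the HGP literature rather than taken verbatim from this abstract:

        \min \sum_{e \in E} \bigl(\lambda(e) - 1\bigr)\,\omega(e)
        \quad \text{subject to} \quad
        c(V_i) \le (1 + \varepsilon) \left\lceil \tfrac{c(V)}{k} \right\rceil \;\; \text{for } i = 1, \dots, k,
        \qquad \text{where } \lambda(e) = \bigl|\{\, V_i : V_i \cap e \ne \emptyset \,\}\bigr|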

    Improving Coarsening Schemes for Hypergraph Partitioning by Exploiting Community Structure

    We present an improved coarsening process for multilevel hypergraph partitioning that incorporates global information about the community structure. Community detection is performed via modularity maximization on a bipartite graph representation. The approach is made suitable for different classes of hypergraphs by defining weights for the graph edges that express structural properties of the hypergraph. We integrate our approach into a leading multilevel hypergraph partitioner with strong local search algorithms and perform extensive experiments on a large benchmark set of hypergraphs stemming from application areas such as VLSI design, SAT solving, and scientific computing. Our results indicate that respecting community structure during coarsening not only significantly improves the solutions found by the initial partitioning algorithm, but also consistently improves overall solution quality
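
    The bipartite representation used for community detection is, in essence, the star expansion of the hypergraph: every hyperedge becomes an additional graph node connected to its pins. The Python sketch below illustrates this construction; the degree-normalized edge weights are an assumed example of a structural weighting scheme, not the exact weights used in the paper.

        # Sketch (not the paper's exact scheme) of the bipartite star-expansion
        # graph on which modularity-based community detection can be run.
        def star_expansion(num_nodes, hyperedges, degree_normalized=True):
            """Return weighted edges (u, w, weight) of the bipartite graph.

            Hypergraph nodes keep ids 0..num_nodes-1; hyperedge i becomes the
            graph node num_nodes + i. Every pin (v, e) yields one graph edge.
            """
            degree = [0] * num_nodes
            for pins in hyperedges:
                for v in pins:
                    degree[v] += 1

            graph_edges = []
            for i, pins in enumerate(hyperedges):
                e_node = num_nodes + i
                for v in pins:
                    # Assumed structural weighting: discount pins of high-degree
                    # nodes so dense regions do not dominate the modularity score.
                    weight = 1.0 / degree[v] if degree_normalized else 1.0
                    graph_edges.append((v, e_node, weight))
            return graph_edges

        # usage: star_expansion(4, [[0, 1, 2], [2, 3]])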

    The PACE 2022 Parameterized Algorithms and Computational Experiments Challenge: Directed Feedback Vertex Set


    Network Flow-Based Refinement for Multilevel Hypergraph Partitioning

    We present a refinement framework for multilevel hypergraph partitioning that uses max-flow computations on pairs of blocks to improve the solution quality of a k-way partition. The framework generalizes the flow-based improvement algorithm of KaFFPa from graphs to hypergraphs and is integrated into the hypergraph partitioner KaHyPar. By reducing the size of hypergraph flow networks, improving the flow model used in KaFFPa, and developing techniques to improve the running time of our algorithm, we obtain a partitioner that computes the best solutions for a wide range of benchmark hypergraphs from different application areas while still having a running time comparable to that of hMetis
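
    A minimal Python sketch of how such pairwise flow-based refinement can be scheduled on a k-way partition is shown below. The function improve_pair is a hypothetical hook standing in for the max-flow computation on the region around the cut between two blocks; the active-pair loop only illustrates the high-level scheduling idea, not the actual implementation inside KaHyPar.

        # Sketch of scheduling flow-based refinement on block pairs of a k-way
        # partition. `improve_pair(a, b)` is a hypothetical hook for the actual
        # max-flow-min-cut improvement on the region around the cut of (a, b).
        from itertools import combinations

        def pairwise_flow_refinement(k, improve_pair, max_rounds=3):
            active = set(combinations(range(k), 2))      # start with all block pairs
            for _ in range(max_rounds):
                if not active:
                    break
                next_active = set()
                for a, b in sorted(active):
                    if improve_pair(a, b):               # the pair's cut was improved
                        # changed blocks may enable further improvements with others
                        for c in range(k):
                            if c != a:
                                next_active.add(tuple(sorted((a, c))))
                            if c != b:
                                next_active.add(tuple(sorted((b, c))))
                active = next_active

        # usage with a dummy oracle that never improves anything:
        # pairwise_flow_refinement(4, lambda a, b: False)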

    Multilevel Hypergraph Partitioning with Vertex Weights Revisited

    The balanced hypergraph partitioning problem (HGP) is to partition the vertex set of a hypergraph into k disjoint blocks of bounded weight, while minimizing an objective function defined on the hyperedges. Whereas real-world applications often use vertex and edge weights to accurately model the underlying problem, the HGP research community commonly works with unweighted instances. In this paper, we argue that, in the presence of vertex weights, current balance constraint definitions either yield infeasible partitioning problems or allow unnecessarily large imbalances and propose a new definition that overcomes these problems. We show that state-of-the-art hypergraph partitioners often struggle considerably with weighted instances and tight balance constraints (even with our new balance definition). Thus, we present a recursive-bipartitioning technique that is able to reliably compute balanced (and hence feasible) solutions. The proposed method balances the partition by pre-assigning a small subset of the heaviest vertices to the two blocks of each bipartition (using an algorithm originally developed for the job scheduling problem) and optimizes the actual partitioning objective on the remaining vertices. We integrate our algorithm into the multilevel hypergraph partitioner KaHyPar and show that our approach is able to compute balanced partitions of high quality on a diverse set of benchmark instances
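
    The following Python sketch illustrates the pre-assignment step under a simplifying assumption: the heaviest vertices are packed greedily onto the currently lighter of the two blocks, in the spirit of longest-processing-time makespan scheduling. The paper builds on a specific job-scheduling algorithm and a principled choice of how many vertices to fix; the fraction parameter and the greedy rule here are illustrative only.

        # Illustrative pre-assignment of the heaviest vertices to the two blocks
        # of a bipartition, using a greedy rule in the spirit of LPT scheduling
        # (assumption; the paper uses a dedicated scheduling algorithm).
        def preassign_heavy_vertices(vertex_weights, fraction=0.05):
            """Return (fixed_assignment, remaining_vertices)."""
            order = sorted(range(len(vertex_weights)),
                           key=lambda v: vertex_weights[v], reverse=True)
            num_fixed = max(1, int(fraction * len(order)))
            block_weight = [0, 0]
            fixed = {}
            for v in order[:num_fixed]:
                target = 0 if block_weight[0] <= block_weight[1] else 1
                fixed[v] = target                        # vertex v stays in this block
                block_weight[target] += vertex_weights[v]
            return fixed, order[num_fixed:]

        # usage: preassign_heavy_vertices([9, 7, 5, 1, 1, 1, 1])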

    Parallel Flow-Based Hypergraph Partitioning

    We present a shared-memory parallelization of flow-based refinement, which is currently considered the most powerful iterative improvement technique for hypergraph partitioning. Flow-based refinement works on bipartitions, so current sequential partitioners schedule it on different block pairs to improve k-way partitions. We investigate two different sources of parallelism: a parallel scheduling scheme and a parallel maximum flow algorithm based on the well-known push-relabel algorithm. In addition to thoroughly engineered implementations, we propose several optimizations that substantially accelerate the algorithm in practice, enabling its use on extremely large hypergraphs (up to 1 billion pins). We integrate our approach into the state-of-the-art parallel multilevel framework Mt-KaHyPar and conduct extensive experiments on a benchmark set of more than 500 real-world hypergraphs to show that the partition quality of our code is on par with the highest-quality sequential code (KaHyPar), while being an order of magnitude faster with 10 threads
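
    For reference, a compact textbook version of the push-relabel maximum flow algorithm on a capacity matrix is sketched below in Python. The parallel variant described above performs pushes and relabels concurrently and operates on hypergraph flow networks; neither aspect is shown here.

        # Generic (sequential) push-relabel max-flow on a dense capacity matrix.
        def push_relabel_max_flow(capacity, s, t):
            n = len(capacity)
            flow = [[0] * n for _ in range(n)]
            height = [0] * n
            excess = [0] * n
            height[s] = n
            for v in range(n):                        # saturate all arcs out of s
                flow[s][v] = capacity[s][v]
                flow[v][s] = -capacity[s][v]
                excess[v] = capacity[s][v]
            active = [v for v in range(n) if v not in (s, t) and excess[v] > 0]
            while active:
                u = active.pop()
                for v in range(n):                    # push along admissible arcs
                    if excess[u] == 0:
                        break
                    residual = capacity[u][v] - flow[u][v]
                    if residual > 0 and height[u] == height[v] + 1:
                        delta = min(excess[u], residual)
                        flow[u][v] += delta
                        flow[v][u] -= delta
                        excess[u] -= delta
                        excess[v] += delta
                        if v not in (s, t) and v not in active:
                            active.append(v)
                if excess[u] > 0:                     # no admissible arc left: relabel u
                    height[u] = 1 + min(height[v] for v in range(n)
                                        if capacity[u][v] - flow[u][v] > 0)
                    active.append(u)
            return sum(flow[s][v] for v in range(n))

        # usage: push_relabel_max_flow([[0, 3, 2, 0], [0, 0, 5, 2], [0, 0, 0, 3], [0, 0, 0, 0]], 0, 3)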

    Quality Hypergraph Partitioning via Max-Flow-Min-Cut Computations

    In this work, we present a framework based on max-flow-min-cut computations for improving a balanced k-way partition of a hypergraph. Currently, variants of the FM algorithm [17] are used as the local search algorithm in all modern multilevel hypergraph partitioners. Such move-based heuristics have the drawback that they only incorporate local information about the problem structure into their computations. If many node moves have the same impact on solution quality, the result often depends on random decisions made by the algorithm itself [15, 31, 36]. Flow-based approaches are not move-based and find a globally minimum cut separating two nodes s and t of a graph [18]. Our framework is inspired by the work of Sanders and Schulz [44], who successfully integrated a flow-based heuristic into their multilevel graph partitioner. We generalize many of their ideas so that they become applicable in the multilevel hypergraph partitioning context. We develop several techniques to shrink the hypergraph flow network, reducing the resulting problem size by a factor of 2 compared to the current representation [33]. In addition, we show how a flow problem on a subhypergraph can be configured such that a max-flow-min-cut computation achieves better quality than the model of Sanders and Schulz. Finally, we integrated our work as a refinement strategy into the n-level hypergraph partitioner KaHyPar [25]. We evaluated our framework on 3216 different instances. Compared with 5 other systems, our new configuration computes the best results on 73% of the instances. Compared to the current variant of KaHyPar, solution quality improves by 2.5%, while the running time becomes slower by only a factor of 2. Nevertheless, our algorithm has a running time comparable to that of hMetis and computes better results on 84% of the instances
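
    For context, the standard Lawler expansion that turns a hypergraph into a flow network is sketched below in Python: every hyperedge e is modeled by two auxiliary nodes joined by an arc of capacity w(e), and pins attach via infinite-capacity arcs. The smaller, reduced flow-network representation developed in this thesis is not reproduced here.

        # Standard Lawler expansion of a hypergraph into a flow network; the
        # reduced (smaller) representation developed in the thesis is not shown.
        INF = float("inf")

        def lawler_flow_network(num_nodes, hyperedges, edge_weights):
            """Return arcs (u, v, capacity) of the flow network.

            Hypergraph node v keeps id v; hyperedge i gets the auxiliary nodes
            e_in = num_nodes + 2*i and e_out = num_nodes + 2*i + 1.
            """
            arcs = []
            for i, pins in enumerate(hyperedges):
                e_in, e_out = num_nodes + 2 * i, num_nodes + 2 * i + 1
                arcs.append((e_in, e_out, edge_weights[i]))   # cutting e costs w(e)
                for v in pins:
                    arcs.append((v, e_in, INF))
                    arcs.append((e_out, v, INF))
            return arcs

        # usage: lawler_flow_network(4, [[0, 1, 2], [2, 3]], [1, 2])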

    The Quantile Index - Succinct Self-Index for Top-k Document Retrieval

    One of the central problems in information retrieval is that of finding the k documents in a large text collection that best match a query given by a user. A recent result of Navarro & Nekrich (SODA 2012) showed that single-term and phrase queries of length m can be solved in optimal O(m+k) time using an index of linear size in words. While a verbatim implementation of the index would be at least an order of magnitude larger than the original collection, various authors incrementally improved the index to a point where the space requirement is currently within a factor of 1.5 to 2.0 of the text size for standard collections. In this paper, we propose a new time/space trade-off for different top-k indexes. This is achieved by sampling only a quantile of the postings in the original inverted file or suffix array-based index. For those queries that cannot be answered using the sampled version of the index, we show how to compute the query results on the fly efficiently. As an example, we apply our method to the top-k framework by Navarro & Nekrich. Under probabilistic assumptions that hold for most standard texts, and for a standard scoring function called term frequency, our index can be represented with only sublinearly many bits plus the space needed for a compressed suffix array of the text, while maintaining poly-logarithmic query times. We evaluate our solution on real-world datasets and compare its practical space usage and performance against state-of-the-art implementations. Our experiments show that our index compresses below the size of the original text. To our knowledge, it is the first suffix array-based text index that is able to break this bound in practice even for non-repetitive collections, while still maintaining reasonable query times of under half a millisecond on average for top-10 queries
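
    A strongly simplified Python illustration of the quantile-sampling idea follows: for each term, only the highest-scoring fraction of its postings is kept, and a top-k query falls back to an on-the-fly computation when the sample cannot provide k results. The actual index is a succinct suffix-array-based structure; all names and the fallback hook here are illustrative assumptions.

        # Simplified quantile sampling of an inverted index plus top-k retrieval
        # with an on-the-fly fallback (illustrative only).
        import math

        def build_quantile_index(postings_by_term, q=4):
            """postings_by_term maps a term to a list of (doc_id, term_frequency)."""
            sampled = {}
            for term, postings in postings_by_term.items():
                by_score = sorted(postings, key=lambda p: p[1], reverse=True)
                sampled[term] = by_score[:math.ceil(len(by_score) / q)]
            return sampled

        def top_k(sampled, term, k, fallback):
            postings = sampled.get(term, [])
            if len(postings) >= k:
                return postings[:k]       # answered from the sampled index alone
            return fallback(term, k)      # hypothetical on-the-fly recomputation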

    Practical Range Minimum Queries Revisited

    Finding the position of the minimal element in a subarray A[i..j] of an array A of size n is a fundamental operation in many applications. In 2011, Fischer and Heun presented the first index of size 2n+o(n) bits that answers the operation in constant time for any subarray. The index can be computed in linear time and queries can be answered without consulting the original array. The most recent and currently fastest practical index is due to Ferrada and Navarro (DCC'16). It reduces the range minimum query (RMQ) to more fundamental and well-studied queries on binary vectors, namely rank and select, and an RMQ on an array of sublinear size derived from A. A range min-max tree is employed to solve this recursive RMQ call. In this paper, we review their practical design and suggest a series of changes which result in consistently faster query times. Specifically, we provide a customized select implementation, switch to two levels of recursion, and use the sparse table solution for the recursion base case instead of a range min-max tree. We provide an extensive empirical evaluation of our new implementation and also compare it to the state of the art. Our experimental study shows that our proposal significantly outperforms the previous solutions on established benchmarks (up to a factor of three) and furthermore accelerates real-world applications such as traversing a succinct tree or listing all distinct elements in an interval of an array
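
    The sparse-table solution mentioned as the recursion base case is the classic structure that uses O(n log n) words of precomputation and answers queries in constant time by overlapping the two largest power-of-two windows that cover the query range; a short Python sketch follows.

        # Classic sparse-table RMQ: precompute argmins of all power-of-two windows.
        def build_sparse_table(a):
            n = len(a)
            table = [list(range(n))]                 # table[0][i] = argmin of a[i..i]
            j = 1
            while (1 << j) <= n:
                prev, width = table[j - 1], 1 << (j - 1)
                row = []
                for i in range(n - (1 << j) + 1):
                    left, right = prev[i], prev[i + width]
                    row.append(left if a[left] <= a[right] else right)
                table.append(row)
                j += 1
            return table

        def rmq(a, table, i, j):
            """Return a position of the minimum of a[i..j] (inclusive)."""
            k = (j - i + 1).bit_length() - 1
            left, right = table[k][i], table[k][j - (1 << k) + 1]
            return left if a[left] <= a[right] else right

        # usage: a = [3, 1, 4, 1, 5]; t = build_sparse_table(a); rmq(a, t, 1, 4)  -> 1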