75 research outputs found

    Scalable Kernelization for Maximum Independent Sets

    The most efficient algorithms for finding maximum independent sets in both theory and practice use reduction rules to obtain a much smaller problem instance called a kernel. The kernel can then be solved quickly using exact or heuristic algorithms, or by kernelizing recursively in the branch-and-reduce paradigm. It is of critical importance for these algorithms that kernelization is fast and returns a small kernel. Current algorithms are either slow but produce a small kernel, or fast but give a large kernel. We attempt to accomplish both of these goals simultaneously by giving an efficient parallel kernelization algorithm based on graph partitioning and parallel bipartite maximum matching. We combine our parallelization techniques with two techniques to accelerate kernelization further: dependency checking, which prunes reductions that cannot be applied, and reduction tracking, which allows us to stop kernelization when reductions become less fruitful. Our algorithm produces kernels that are orders of magnitude smaller than those of the fastest kernelization methods, while having a similar execution time. Furthermore, our algorithm is able to compute kernels comparable in size to the smallest known kernels, but up to two orders of magnitude faster than previously possible. Finally, we show that our kernelization algorithm can be used to accelerate existing state-of-the-art heuristic algorithms, allowing us to find larger independent sets faster on large real-world networks and synthetic instances.
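
    To make the idea of a reduction rule concrete, the following is a minimal C++ sketch of one classical rule for Maximum Independent Set, the degree-one (pendant) vertex rule: a vertex with a single neighbor can always be taken into the solution, so both it and its neighbor can be removed. This is only the simplest of the many rules such kernelization algorithms apply; the function name and graph representation are illustrative, not the paper's code.

        #include <queue>
        #include <set>
        #include <vector>

        // Repeatedly applies the degree-one rule: a pendant vertex v with a
        // single neighbor u goes into the independent set; v and u leave the
        // graph. Returns the vertices forced into the solution.
        std::vector<int> degreeOneReduction(std::vector<std::set<int>>& adj) {
            std::vector<int> inSolution;
            std::vector<bool> removed(adj.size(), false);
            std::queue<int> pending;                       // degree-one candidates
            for (int v = 0; v < (int)adj.size(); ++v)
                if (adj[v].size() == 1) pending.push(v);
            while (!pending.empty()) {
                int v = pending.front(); pending.pop();
                if (removed[v] || adj[v].size() != 1) continue;
                int u = *adj[v].begin();                   // v's single neighbor
                inSolution.push_back(v);                   // taking v is always safe
                removed[v] = removed[u] = true;
                for (int w : adj[u]) {                     // delete u from the graph
                    adj[w].erase(u);
                    if (!removed[w] && adj[w].size() == 1) pending.push(w);
                }
                adj[u].clear();
                adj[v].clear();
            }
            return inSolution;
        }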

    Scalable kernelization for the maximum independent set problem

    Enabling Scalability: Graph Hierarchies and Fault Tolerance

    In this dissertation, we explore approaches to two techniques for building scalable algorithms. First, we look at graph problems and show how to exploit the input graph's inherent hierarchy to obtain scalable graph algorithms. The second technique takes a step back from concrete algorithmic problems: we consider node failures in large distributed systems and present techniques to recover from them quickly.

    In the first part of the dissertation, we investigate how hierarchies in graphs can be used to scale algorithms to large inputs. We develop algorithms for three graph problems based on two approaches to building hierarchies. The first approach reduces instance sizes for NP-hard problems by applying so-called reduction rules. These rules can be applied in polynomial time. They either find parts of the input that can be solved in polynomial time, or they identify structures that can be contracted (reduced) into smaller structures without loss of information for the specific problem. After solving the reduced instance using an exponential-time algorithm, the previously contracted structures can be uncontracted to obtain an exact solution for the original input. In addition to serving as a simple preprocessing procedure, reduction rules can also be used in branch-and-reduce algorithms, where they are successively applied after each branching step to build a hierarchy of problem kernels of increasing computational hardness. We develop reduction-based algorithms for the classical NP-hard problems Maximum Independent Set and Maximum Cut. The second approach is used for route planning in road networks, where we build a hierarchy of road segments based on their importance for long-distance shortest paths. By only considering important road segments when we are far away from the source and destination, we can substantially speed up shortest-path queries.

    In the second part of this dissertation, we take a step back from concrete graph problems and look at more general problems in high performance computing (HPC). Due to the ever increasing size and complexity of HPC clusters, we expect hardware and software failures to become more common in massively parallel computations. We present two techniques that let applications recover from failures and resume computation. Both techniques are based on in-memory storage of redundant information and a data distribution that enables fast recovery. The first technique can be used in general-purpose distributed processing frameworks: we identify data that is redundantly available on multiple machines and only introduce additional work for the remaining data that is available on one machine only. The second technique is a checkpointing library engineered for fast recovery, using a data distribution method that achieves balanced communication loads. Both techniques work in settings where computation after a failure is continued with fewer machines than before. This is in contrast to many previous approaches that, in particular for checkpointing, focus on systems that keep spare resources available to replace failed machines.

    Overall, we present different techniques that enable scalable algorithms. While some of these techniques are specific to graph problems, we also present tools for fault-tolerant algorithms and applications in a distributed setting. To show that these can be helpful in many different domains, we evaluate them on graph problems and other applications such as phylogenetic tree inference.
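
    The branch-and-reduce pattern described above (reduce, then branch, then recurse) can be summarized in a compact, self-contained C++ sketch for Maximum Independent Set. The only reduction shown is the trivial isolated-vertex rule; real solvers apply a much richer rule set, so this illustrates the paradigm under simplified assumptions rather than the dissertation's implementation.

        #include <algorithm>
        #include <set>
        #include <utility>
        #include <vector>

        using Graph = std::vector<std::set<int>>;          // adjacency sets

        // Reduction rule: isolated vertices always join the independent set.
        static int reduceIsolated(Graph& g, std::vector<bool>& alive) {
            int gained = 0;
            for (int v = 0; v < (int)g.size(); ++v)
                if (alive[v] && g[v].empty()) { alive[v] = false; ++gained; }
            return gained;
        }

        static void removeVertex(Graph& g, std::vector<bool>& alive, int v) {
            for (int u : g[v]) g[u].erase(v);
            g[v].clear();
            alive[v] = false;
        }

        // Size of a maximum independent set of the subgraph induced by 'alive'.
        int branchAndReduce(Graph g, std::vector<bool> alive) {
            int gained = reduceIsolated(g, alive);          // 1. reduce
            int pick = -1, deg = -1;                        // 2. pick a branch vertex
            for (int v = 0; v < (int)g.size(); ++v)
                if (alive[v] && (int)g[v].size() > deg) { pick = v; deg = (int)g[v].size(); }
            if (pick < 0) return gained;                    // instance fully reduced
            Graph g1 = g;                                   // 3a. branch: exclude pick
            std::vector<bool> a1 = alive;
            removeVertex(g1, a1, pick);
            int best = branchAndReduce(std::move(g1), std::move(a1));
            std::vector<int> nbrs(g[pick].begin(), g[pick].end());
            for (int u : nbrs) removeVertex(g, alive, u);   // 3b. branch: take pick
            removeVertex(g, alive, pick);
            best = std::max(best, 1 + branchAndReduce(std::move(g), std::move(alive)));
            return gained + best;
        }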

    Scalable Graph Algorithms using Practically Efficient Data Reductions

    Geometric Inhomogeneous Random Graphs for Algorithm Engineering

    The design and analysis of graph algorithms is heavily based on the worst case. In practice, however, many algorithms perform much better than the worst case would suggest. Furthermore, various problems can be tackled more efficiently if one assumes the input to be, in a sense, realistic. The field of network science, which studies the structure and emergence of real-world networks, identifies locality and heterogeneity as two frequently occurring properties. A popular model that captures these properties is the geometric inhomogeneous random graph (GIRG), a generalization of hyperbolic random graphs (HRGs). Aside from their importance to network science, GIRGs can be an immensely valuable tool in algorithm engineering. Since they convincingly mimic real-world networks, guarantees about the quality and performance of an algorithm on instances of the model can be transferred to real-world applications. They have model parameters to control the amount of heterogeneity and locality, which makes it possible to evaluate those properties in isolation while keeping the rest fixed. Moreover, they can be generated efficiently, which allows for experimental analysis. While realistic instances are often rare, generated instances are readily available. Furthermore, the underlying geometry of GIRGs helps to visualize the network, e.g., for debugging or to improve understanding of its structure.

    The aim of this work is to demonstrate the capabilities of geometric inhomogeneous random graphs in algorithm engineering and establish them as routine tools to replace previous models like the Erdős-Rényi model, where each edge exists with equal probability. We utilize geometric inhomogeneous random graphs to design, evaluate, and optimize efficient algorithms for realistic inputs. In detail, we provide the currently fastest sequential generator for GIRGs and HRGs and describe algorithms for maximum flow, directed spanning arborescence, cluster editing, and hitting set. For all four problems, our implementations beat the state of the art on realistic inputs. On top of providing crucial benchmark instances, GIRGs allow us to obtain valuable insights. Most notably, our efficient generator allows us to experimentally show sublinear running time of our flow algorithm, investigate the solution structure of cluster editing, complement our benchmark set of arborescence instances with a density for which no real-world networks are available, and generate networks with adjustable locality and heterogeneity to reveal the effects of these properties on our algorithms.
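
    For intuition, here is a definition-level C++ sketch of sampling a threshold GIRG on the one-dimensional unit torus: each vertex gets a heavy-tailed weight and a uniform position, and two vertices are joined whenever their torus distance is at most a constant times the product of their weights divided by the total weight. The naive double loop is O(n^2); the efficient generator discussed above avoids exactly this bottleneck. Parameter names and the exact weight parametrization here are assumptions for illustration, not the library's API.

        #include <cmath>
        #include <random>
        #include <utility>
        #include <vector>

        // Naive O(n^2) sampler for a threshold GIRG on the 1-d unit torus.
        // 'ple' is the weight power-law exponent (> 2), 'c' scales edge
        // density; both names are illustrative.
        std::vector<std::pair<int, int>> naiveGirg(int n, double ple, double c,
                                                   unsigned seed) {
            std::mt19937 rng(seed);
            std::uniform_real_distribution<double> unif(0.0, 1.0);
            std::vector<double> w(n), x(n);
            double W = 0.0;
            for (int v = 0; v < n; ++v) {
                // Inverse-transform sampling of a Pareto-like weight:
                // P(w > t) = t^{-(ple-1)} for t >= 1. Using 1 - unif avoids 0.
                w[v] = std::pow(1.0 - unif(rng), -1.0 / (ple - 1.0));
                x[v] = unif(rng);                          // uniform torus position
                W += w[v];
            }
            std::vector<std::pair<int, int>> edges;
            for (int u = 0; u < n; ++u)
                for (int v = u + 1; v < n; ++v) {
                    double d = std::fabs(x[u] - x[v]);
                    d = std::min(d, 1.0 - d);              // wrap-around distance
                    if (d <= c * w[u] * w[v] / W)          // threshold connection rule
                        edges.emplace_back(u, v);
                }
            return edges;
        }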

    The PACE 2022 Parameterized Algorithms and Computational Experiments Challenge: Directed Feedback Vertex Set

    The Parameterized Algorithms and Computational Experiments challenge (PACE) 2022 was devoted to engineering algorithms that solve the NP-hard Directed Feedback Vertex Set (DFVS) problem. The DFVS problem is to find a minimum subset X ⊆ V in a given directed graph G = (V, E) such that, when all vertices of X and their adjacent edges are deleted from G, the remainder is acyclic. Overall, the challenge had 90 participants from 26 teams, 12 countries, and 3 continents who submitted their implementations to this year's competition. In this report, we briefly describe the setup of the challenge, the selection of benchmark instances, and the ranking of the participating teams. We also briefly outline the approaches used in the submitted solvers.
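
    The problem definition translates directly into a small verification routine: delete X and test the remaining digraph for acyclicity, for example with Kahn's algorithm. A minimal C++ sketch follows (digraph given as an arc list over vertex ids 0..n-1; names are illustrative):

        #include <queue>
        #include <set>
        #include <utility>
        #include <vector>

        // Returns true iff deleting the vertex set X from the digraph given
        // by 'arcs' (pairs u -> v) leaves an acyclic remainder, checked by
        // Kahn's algorithm (repeatedly peeling off in-degree-zero vertices).
        bool isAcyclicAfterDeletion(int n,
                                    const std::vector<std::pair<int, int>>& arcs,
                                    const std::set<int>& X) {
            std::vector<std::vector<int>> out(n);
            std::vector<int> indeg(n, 0);
            for (auto [u, v] : arcs)                       // keep surviving arcs only
                if (!X.count(u) && !X.count(v)) { out[u].push_back(v); ++indeg[v]; }
            std::queue<int> q;
            for (int v = 0; v < n; ++v)
                if (!X.count(v) && indeg[v] == 0) q.push(v);
            int peeled = 0;
            while (!q.empty()) {
                int u = q.front(); q.pop(); ++peeled;
                for (int v : out[u])
                    if (--indeg[v] == 0) q.push(v);        // v became a source
            }
            // Acyclic iff every surviving vertex was peeled off.
            return peeled == n - (int)X.size();
        }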

    Massive Parallelization of Branching Algorithms

    Optimization and search problems are often NP-complete, and brute-force techniques must typically be employed to find exact solutions. Problems such as clustering genes in bioinformatics or finding optimal routes in delivery networks can be solved in exponential time using recursive branching strategies. Nevertheless, these algorithms become impractical above certain instance sizes due to the large number of scenarios that need to be explored, so parallelization techniques are necessary to improve performance. In previous work, centralized and decentralized techniques have been implemented to scale up parallelism in branching algorithms while attempting to reduce communication overhead, which plays a significant role in massively parallel implementations due to the messages passed across processes. Our work therefore consists of a fully generic C++ library, named GemPBA, that speeds up almost any branching algorithm with massive parallelization, along with a novel, simple dynamic load balancing tool that reduces the number of passed messages by sending high-priority tasks first. Our approach uses a hybrid centralized-decentralized strategy, in which a central process assigns worker roles via messages only a few bits in size, so that tasks do not need to pass through a central processor. Moreover, a working processor spawns new tasks if and only if there are processors available to receive them, thus guaranteeing their transfer and notably reducing communication overhead. We performed our experiments on the Minimum Vertex Cover (MVC) problem, with remarkable results: even the toughest DIMACS graphs could be solved with a simple MVC algorithm.
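
    To give a sense of the kind of recursion such a library parallelizes, below is a sequential C++ sketch of classical two-way branching for Minimum Vertex Cover: pick a maximum-degree vertex v and branch on either taking v into the cover or taking all of its neighbors. Distributing these recursive calls across processes is what GemPBA adds; this standalone sketch is illustrative only, not the library's code.

        #include <algorithm>
        #include <set>
        #include <utility>
        #include <vector>

        using Graph = std::vector<std::set<int>>;          // adjacency sets

        static void intoCover(Graph& g, int v) {           // v joins the cover:
            for (int u : g[v]) g[u].erase(v);              // all its edges are covered,
            g[v].clear();                                  // so delete v entirely
        }

        // Size of a minimum vertex cover; classical two-way branching on a
        // maximum-degree vertex.
        int mvc(Graph g) {
            int v = -1, deg = 0;
            for (int u = 0; u < (int)g.size(); ++u)
                if ((int)g[u].size() > deg) { v = u; deg = (int)g[u].size(); }
            if (v < 0) return 0;                           // no edges left
            Graph g1 = g;                                  // branch 1: take v
            intoCover(g1, v);
            int best = 1 + mvc(std::move(g1));
            std::vector<int> nbrs(g[v].begin(), g[v].end());
            for (int u : nbrs) intoCover(g, u);            // branch 2: take all of N(v)
            best = std::min(best, (int)nbrs.size() + mvc(std::move(g)));
            return best;
        }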

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    LIPIcs, Volume 274, ESA 2023, Complete Volume