25 research outputs found

    A Push-Relabel-Based Maximum Cardinality Bipartite Matching Algorithm on GPUs

    We design, develop, and evaluate an atomic- and lock-free GPU implementation of the push-relabel algorithm in the context of finding maximum cardinality matchings in bipartite graphs. The problem has applications in computer science, scientific computing, bioinformatics, and other areas. Although the GPU parallelization of the push-relabel technique has been investigated in the context of flow algorithms, to the best of our knowledge, ours is the first study that focuses on maximum cardinality matching. We compare the proposed algorithms with serial, multicore, and manycore bipartite graph matching implementations from the literature on a large set of real-life problems. Our experiments show that the proposed push-relabel-based GPU algorithm is faster than the existing parallel and sequential implementations.

    GPU accelerated maximum cardinality matching algorithms for bipartite graphs

    We design, implement, and evaluate GPU-based algorithms for the maximum cardinality matching problem in bipartite graphs. Such algorithms have a variety of applications in computer science, scientific computing, bioinformatics, and other areas. To the best of our knowledge, ours is the first study that focuses on GPU implementations of maximum cardinality matching algorithms. We compare the proposed algorithms with serial and multicore implementations from the literature on a large set of real-life problems; in the majority of cases, one of our GPU-accelerated algorithms is faster than both the sequential and multicore implementations. Comment: 14 pages, 5 figures.
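For readers unfamiliar with the problem itself, the following is a minimal serial sketch of maximum cardinality bipartite matching using the classic augmenting-path approach. It is not the GPU algorithm or the push-relabel method from the abstracts above; the function name, the adjacency-list layout, and the small example graph are assumptions chosen for illustration.

```python
def max_bipartite_matching(adj, n_right):
    """Augmenting-path matching (serial, illustrative only).

    adj[u] lists the right-side vertices adjacent to left vertex u.
    Returns (matching size, match_right), where match_right[v] is the
    left vertex matched to right vertex v, or -1 if v is unmatched.
    """
    match_right = [-1] * n_right

    def try_augment(u, visited):
        # Depth-first search for an augmenting path starting at left vertex u.
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it is free, or if its current partner can be re-matched.
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    size = 0
    for u in range(len(adj)):
        if try_augment(u, set()):
            size += 1
    return size, match_right


# Hypothetical example: 3 left vertices, 3 right vertices.
adj = [[0, 1], [0], [1, 2]]
size, match_right = max_bipartite_matching(adj, 3)
print(size, match_right)
```

The recursion is inherently sequential, which is one reason parallel formulations such as push-relabel are studied for GPUs.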

    Constraints Propagation on GPU: A Case Study for AllDifferent

    The AllDifferent constraint is a fundamental tool in Constraint Programming. It naturally arises in many problems, from puzzles to scheduling and routing applications. Such popularity has prompted an extensive literature on filtering and propagation for this constraint. Motivated by the benefits that GPUs offer to other branches of AI, this paper investigates the use of GPUs to accelerate filtering and propagation. In particular, we present an efficient parallelization of the AllDifferent constraint on GPU; we analyze different design and implementation choices and evaluate the performance of the resulting system on medium to large instances of the Travelling Salesman Problem, with encouraging results.
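To make "filtering and propagation" concrete, here is a minimal CPU sketch of the simplest form of AllDifferent propagation: whenever a variable's domain is a single value, that value is removed from every other domain. This is an assumption-laden illustration, not the filtering used in the paper (the abstract does not say which filtering level is parallelized), and it is weaker than matching-based filtering; the function name and the example domains are hypothetical.

```python
def propagate_all_different(domains):
    """Value-elimination propagation for AllDifferent (illustrative only).

    domains is a list of iterables of candidate values, one per variable.
    Returns the reduced domains as a list of sets, or None if some domain
    becomes empty (the constraint cannot be satisfied).
    """
    domains = [set(d) for d in domains]
    changed = True
    while changed:
        changed = False
        for i, d in enumerate(domains):
            if len(d) == 1:
                (value,) = d  # the single value this variable must take
                for j, other in enumerate(domains):
                    if j != i and value in other:
                        other.discard(value)
                        if not other:
                            return None  # wipe-out: no value left
                        changed = True
    return domains


# Hypothetical example: fixing the first variable cascades to the others.
reduced = propagate_all_different([{1}, {1, 2}, {1, 2, 3}])
print(reduced)
```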

    Two approximation algorithms for bipartite matching on multicore architectures

    We propose two heuristics for the bipartite matching problem that are amenable to shared-memory parallelization. The first heuristic is very intriguing from a parallelization perspective. It has no significant algorithmic synchronization overhead and no conflict resolution is needed across threads. We show that this heuristic has an approximation ratio of around 0.632 under some common conditions. The second heuristic is designed to obtain a larger matching by employing the well-known Karp-Sipser heuristic on a judiciously chosen subgraph of the original graph. We show that the Karp-Sipser heuristic always finds a maximum cardinality matching in the chosen subgraph. Although the Karp-Sipser heuristic is hard to parallelize for general graphs, we exploit the structure of the selected subgraphs to propose a specialized implementation which demonstrates very good scalability. We prove that this second heuristic has an approximation guarantee of around 0.866 under the same conditions as in the first algorithm. We discuss parallel implementations of the proposed heuristics on a multicore architecture. Experimental results, for demonstrating speed-ups and verifying the theoretical results in practice, are provided.
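A figure of about 0.632 is numerically close to 1 - 1/e. The sketch below shows one simple setting in which that value appears: every row independently picks one column, and each chosen column keeps a single row, so no coordination between rows is needed. The abstract does not state the heuristic's selection rule or its conditions, so the uniform choice over a complete n-by-n bipartite graph, the function name, and the seed are all assumptions made for illustration.

```python
import math
import random


def one_sided_choice_matching(n, rng):
    """Each of n rows picks one of n columns uniformly at random.

    Every column chosen by at least one row keeps exactly one of them,
    so the matching size equals the number of distinct chosen columns.
    """
    chosen_columns = {rng.randrange(n) for _ in range(n)}
    return len(chosen_columns)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
ratio = one_sided_choice_matching(n, rng) / n
# A perfect matching of size n exists here, so ratio is the achieved fraction.
print(ratio, 1 - 1 / math.e)
```

In this setting a column is missed by all rows with probability (1 - 1/n)^n, which tends to 1/e, so the expected matched fraction tends to 1 - 1/e.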

    Asignación óptima de recursos energéticos a través de algoritmo Húngaro y Bipartite Matching para respuesta a la demanda en microredes.

    In this document, two mathematical models were implemented to find an optimal, lowest-cost energy dispatch for an electrical distribution microgrid system, with the aim of reducing the cost of energy delivered to the demand. The results of the two algorithms are analyzed to compare which mathematical model allocates resources closer to the lowest possible cost relative to a conventional dispatch. The problem is posed for renewable and non-renewable power plants, using a heuristic based on the Hungarian model and the bipartite matching model to obtain an optimal demand response. The work sought the best heuristic for allocating energy resources, one that yields a significant saving in energy consumption.
The solution will be obtained through linear programming, with the algorithms developed on the MATLAB platform to obtain the best optimization result. The contribution is the analysis and description of an allocation and balancing algorithm based on energy costs for consumers.
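The assignment problem that the Hungarian method solves can be stated in a few lines. The sketch below finds the minimum-cost one-to-one assignment by brute force over all permutations; it is not the Hungarian algorithm and not the authors' MATLAB implementation, only a small reference for what "optimal assignment" means. The cost matrix (rows as energy sources, columns as loads) is entirely hypothetical.

```python
from itertools import permutations


def min_cost_assignment(cost):
    """Brute-force minimum-cost assignment for a small square cost matrix.

    cost[i][j] is the cost of assigning source i to load j.
    Returns (best total cost, tuple giving the load assigned to each source).
    Runs in O(n!) time, so it is only suitable for tiny examples.
    """
    n = len(cost)
    best_cost, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_cost is None or total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm


# Hypothetical costs: 3 sources (rows) by 3 loads (columns).
cost = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
best_cost, best_perm = min_cost_assignment(cost)
print(best_cost, best_perm)
```

The Hungarian method reaches the same optimum in polynomial time, which matters once the number of sources and loads grows beyond a handful.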

    Recent Advances in Fully Dynamic Graph Algorithms

    In recent years, significant advances have been made in the design and analysis of fully dynamic algorithms. However, these theoretical results have received very little attention from the practical perspective. Few of the algorithms are implemented and tested on real datasets, and their practical potential is far from understood. Here, we present a quick reference guide to recent engineering and theory results in the area of fully dynamic graph algorithms.

    Grabcuts for image segmentation: A comparative study of clustering techniques

    Image segmentation is the partitioning of a digital image into small segments such as pixels or sets of pixels. It is significant as it allows for the visualization of structures of interest, removing unnecessary information. In addition, image segmentation is used in many fields, such as healthcare (for image surgery) and construction, as it enables structure analysis. Segmentation of images can be computationally expensive, especially when a large dataset is used, hence the importance of fast and effective segmentation algorithms. Segmentation is used to locate objects and boundaries (i.e. foreground and background) in images. The aim of this study is to provide a comparison of clustering techniques that would allow the Grabcuts for image segmentation algorithm to be effective and inexpensive. The Grabcuts based method, which is an extension of the graph cut based method, has been instrumental in solving many problems in computer vision, such as image restoration, image segmentation, object recognition, tracking and analysis. According to Ramirez et al. [47], the Grabcuts approach is an iterative algorithm requiring minimal user interaction, as it chooses a segmentation by iteratively revising the foreground and background pixel assignments. The method uses the min-cut/max-flow algorithm proposed by Boykov and Jolly [9] to segment digital images. The input of this approach is a digital image with a selected region of interest (ROI). The ROI is selected using a rectangular bounding box. The pixels inside the bounding box are assigned to the foreground, while the others are assigned to the background. In this study, variants of the Grabcuts for image segmentation algorithm designed by [48], with a Gaussian Mixture Model (GMM) based on the Kmeans and Kmedoids clustering techniques, are developed and compared. In addition, the algorithms developed are run on the Central Processing Unit (CPU) under two scenarios.
In scenario 1, the Kmeans and Kmedoids clustering techniques use the Squared Euclidean distance measure to calculate the similarities and dissimilarities between pixels in an image. In scenario 2, the Kmeans and Kmedoids clustering techniques use the City Block distance measure to calculate similarities as well as dissimilarities between pixels in a given image. The same images from the Berkeley Segmentation Dataset and Benchmark 500 (BSDS500) were used as input to the algorithms, and the number of clusters, K, was varied from 2 to 5. It was observed that the Kmeans clustering technique outperformed the Kmedoids clustering technique in terms of runtime under the two scenarios for all the test images with K varied from 2 to 5. In addition, the Kmeans clustering technique obtained more compact and separate clusters under scenario 1 than its counterpart. On the other hand, the Kmedoids obtained more compact and separate clusters than the Kmeans clustering technique under scenario 2. The silhouette validity index favoured the smallest number of clusters for both clustering techniques, as it suggested that the optimal number of clusters for the Kmeans and Kmedoids clustering techniques under the two scenarios was 2. Although the Kmeans required less computation time than its counterpart, the generation of foreground and background took longer for the GMM based on Kmeans than it did for the GMM based on the Kmedoids clustering technique. Furthermore, the Grabcuts for image segmentation algorithm with a GMM based on the Kmedoids clustering technique was computationally less expensive than the Grabcuts for image segmentation algorithm with a GMM based on the Kmeans clustering technique. This was observed to be true under both scenarios 1 and 2. The Grabcuts for image segmentation algorithm with the GMM based on the Kmeans clustering technique obtained slightly better segmentation results in terms of visual quality than its counterpart under the two scenarios considered.
On the other hand, the BF-scores showed that the Grabcuts for image segmentation algorithm with the GMM based on Kmedoids produces images with higher BF-scores than its counterpart when K was varied from 2 to 5 for most of the test images. In addition, most of the images obtained the majority of their best segmentation results when K=2. This was observed to be true under scenario 1 as well as scenario 2. Therefore, the Kmedoids clustering technique under scenario 2 with K=2 would be the best option for the segmentation of difficult images in BSDS500. This is due to its ability to generate GMMs and segment difficult images more efficiently (in terms of time complexity, higher BF-scores, and more under-segmented rather than over-segmented images, inter alia) while producing visual segmentation results comparable to those obtained by the Grabcuts for image segmentation: GMM-Kmeans.
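The two distance measures that define the scenarios in this abstract can be sketched as follows. Representing a pixel as a tuple of channel values is an assumption (the abstract does not specify the feature vector), and the function names and sample pixels are hypothetical.

```python
def squared_euclidean(p, q):
    """Sum of squared per-channel differences (scenario 1 measure)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def city_block(p, q):
    """Sum of absolute per-channel differences (scenario 2 measure)."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical RGB pixels.
pixel_a = (10, 20, 30)
pixel_b = (13, 16, 30)
print(squared_euclidean(pixel_a, pixel_b))  # 3**2 + 4**2 + 0**2 = 25
print(city_block(pixel_a, pixel_b))         # 3 + 4 + 0 = 7
```

Squaring penalizes large per-channel differences more heavily than the City Block measure does, which is one plausible reason the two scenarios can yield different clusterings.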

    Towards Performance Portable Graph Algorithms

    In today's data-driven world, our computational resources have become heterogeneous, making the processing of large-scale graphs in an architecture-agnostic manner crucial. Traditionally, hand-optimized high-performance computing (HPC) solutions have been studied and used to implement highly efficient and scalable graph algorithms. In recent years, several graph processing and management systems have also been proposed. Hand-optimized HPC approaches require high levels of expertise, and graph processing frameworks suffer from limited expressibility and performance. Portability is a major concern for both approaches. The main thesis of this work is that block-based graph algorithms offer a compromise between efficient parallelism and architecture-agnostic algorithm design for a wide class of graph problems. This dissertation seeks to prove this thesis by focusing the work on three pillars: data/computation partitioning, block-based algorithm design, and performance portability. In this dissertation, we first show how we can partition the computation and the data to design efficient block-based algorithms for solving graph merging and triangle counting problems. Then, generalizing from our experiences, we propose PGAbB, an algorithmic framework for implementing block-based graph algorithms on shared-memory, heterogeneous machines. PGAbB aims to maximally leverage different architectures by implementing a task-based execution on top of a block-based programming model. In this talk we will discuss PGAbB's programming model, algorithmic optimizations for scheduling, and load-balancing strategies for graph problems on real-world and synthetic inputs. Ph.D.

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    LIPIcs, Volume 274, ESA 2023, Complete Volume