4 research outputs found
GPU accelerated maximum cardinality matching algorithms for bipartite graphs
We design, implement, and evaluate GPU-based algorithms for the maximum
cardinality matching problem in bipartite graphs. Such algorithms have a
variety of applications in computer science, scientific computing,
bioinformatics, and other areas. To the best of our knowledge, ours is the
first study which focuses on GPU implementation of the maximum cardinality
matching algorithms. We compare the proposed algorithms with serial and
multicore implementations from the literature on a large set of real-life
problems where in majority of the cases one of our GPU-accelerated algorithms
is demonstrated to be faster than both the sequential and multicore
implementations.Comment: 14 pages, 5 figure
Computing Maximum Cardinality Matchings in Parallel on Bipartite Graphs via Tree-Grafting
It is difficult to obtain high performance when computing matchings on parallel processors because matching algorithms explicitly or implicitly search for paths in the graph, and when these paths become long, there is little concurrency. In spite of this limitation, we present a new algorithm and its shared-memory parallelization that achieves good performance and scalability in computing maximum cardinality matchings in bipartite graphs. Our algorithm searches for augmenting paths via specialized breadth-first searches (BFS) from multiple source vertices, hence creating more parallelism than single source algorithms. Algorithms that employ multiple-source searches cannot discard a search tree once no augmenting path is discovered from the tree, unlike algorithms that rely on single-source searches. We describe a novel tree-grafting method that eliminates most of the redundant edge traversals resulting from this property of multiple-source searches. We also employ the recent direction-optimizing BFS algorithm as a subroutine to discover augmenting paths faster. Our algorithm compares favorably with the current best algorithms in terms of the number of edges traversed, the average augmenting path length, and the number of iterations. We provide a proof of correctness for our algorithm. Our NUMA-aware implementation is scalable to 80 threads of an Intel multiprocessor and to 240 threads on an Intel Knights Corner coprocessor. On average, our parallel algorithm runs an order of magnitude faster than the fastest algorithms available. The performance improvement is more significant on graphs with small matching number
Petri-Netz-basierte Simulation biologischer Prozesse mit OpenModelica
Ochel L. Petri-Netz-basierte Simulation biologischer Prozesse mit OpenModelica. Bielefeld: Universität Bielefeld; 2017