19,980 research outputs found

    Solving Graph Coloring Problems with Abstraction and Symmetry

    Get PDF
    This paper introduces a general methodology, based on abstraction and symmetry, that applies to solve hard graph edge-coloring problems and demonstrates its use to provide further evidence that the Ramsey number R(4,3,3)=30R(4,3,3)=30. The number R(4,3,3)R(4,3,3) is often presented as the unknown Ramsey number with the best chances of being found "soon". Yet, its precise value has remained unknown for more than 50 years. We illustrate our approach by showing that: (1) there are precisely 78{,}892 (3,3,3;13)(3,3,3;13) Ramsey colorings; and (2) if there exists a (4,3,3;30)(4,3,3;30) Ramsey coloring then it is (13,8,8) regular. Specifically each node has 13 edges in the first color, 8 in the second, and 8 in the third. We conjecture that these two results will help provide a proof that no (4,3,3;30)(4,3,3;30) Ramsey coloring exists implying that R(4,3,3)=30R(4,3,3)=30

    Parallel pivoting combined with parallel reduction

    Get PDF
    Parallel algorithms for triangularization of large, sparse, and unsymmetric matrices are presented. The method combines the parallel reduction with a new parallel pivoting technique, control over generations of fill-ins and a check for numerical stability, all done in parallel with the work being distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds

    The Reverse Cuthill-McKee Algorithm in Distributed-Memory

    Full text link
    Ordering vertices of a graph is key to minimize fill-in and data structure size in sparse direct solvers, maximize locality in iterative solvers, and improve performance in graph algorithms. Except for naturally parallelizable ordering methods such as nested dissection, many important ordering methods have not been efficiently mapped to distributed-memory architectures. In this paper, we present the first-ever distributed-memory implementation of the reverse Cuthill-McKee (RCM) algorithm for reducing the profile of a sparse matrix. Our parallelization uses a two-dimensional sparse matrix decomposition. We achieve high performance by decomposing the problem into a small number of primitives and utilizing optimized implementations of these primitives. Our implementation shows strong scaling up to 1024 cores for smaller matrices and up to 4096 cores for larger matrices

    Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication

    Full text link
    Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdos-Renyi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first ever implementation of the 3D SpGEMM formulation that also exploits multiple (intra-node and inter-node) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrencies. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research

    Distributed-Memory Breadth-First Search on Massive Graphs

    Full text link
    This chapter studies the problem of traversing large graphs using the breadth-first search order on distributed-memory supercomputers. We consider both the traditional level-synchronous top-down algorithm as well as the recently discovered direction optimizing algorithm. We analyze the performance and scalability trade-offs in using different local data structures such as CSR and DCSC, enabling in-node multithreading, and graph decompositions such as 1D and 2D decomposition.Comment: arXiv admin note: text overlap with arXiv:1104.451
    • …
    corecore