    Parallel Unsmoothed Aggregation Algebraic Multigrid Algorithms on GPUs

    We design and implement a parallel algebraic multigrid method for isotropic graph Laplacian problems on multicore Graphical Processing Units (GPUs). The proposed AMG method is based on the aggregation framework. The setup phase of the algorithm uses a parallel maximal independent set algorithm in forming aggregates and the resulting coarse level hierarchy is then used in a K-cycle iteration solve phase with a $\ell^1$-Jacobi smoother. Numerical tests of a parallel implementation of the method for graphics processors are presented to demonstrate its effectiveness. Comment: 18 pages, 3 figures

    Tsirelson's problem and Kirchberg's conjecture

    Tsirelson's problem asks whether the set of nonlocal quantum correlations with a tensor product structure for the Hilbert space coincides with the one where only commutativity between observables located at different sites is assumed. Here it is shown that Kirchberg's QWEP conjecture on tensor products of C*-algebras would imply a positive answer to this question for all bipartite scenarios. This remains true also if one considers not only spatial correlations, but also spatiotemporal correlations, where each party is allowed to apply their measurements in temporal succession; we provide an example of a state together with observables such that ordinary spatial correlations are local, while the spatiotemporal correlations reveal nonlocality. Moreover, we find an extended version of Tsirelson's problem which, for each nontrivial Bell scenario, is equivalent to the QWEP conjecture. This extended version can be conveniently formulated in terms of steering the system of a third party. Finally, a comprehensive mathematical appendix offers background material on complete positivity, tensor products of C*-algebras, group C*-algebras, and some simple reformulations of the QWEP conjecture. Comment: 57 pages, to appear in Rev. Math. Phys.

    cuIBM -- A GPU-accelerated Immersed Boundary Method

    A projection-based immersed boundary method is dominated by sparse linear algebra routines. Using the open-source Cusp library, we observe a speedup (with respect to a single CPU core) which reflects the constraints of a bandwidth-dominated problem on the GPU. Nevertheless, GPUs offer the capacity to solve large problems on commodity hardware. This work includes validation and a convergence study of the GPU-accelerated IBM, and various optimizations. Comment: Extended paper post-conference, presented at the 23rd International Conference on Parallel Computational Fluid Dynamics (http://www.parcfd.org), ParCFD 2011, Barcelona (unpublished)

    Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices

    A task-based formulation of Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC). The novel features of our formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and (2) fine-grained task-based composition. These features make it tolerant of the load imbalance due to the irregular matrix structure and eliminate all artifactual sources of global synchronization. Scalability of iterative computation of square-root inverse of block-rank-sparse QC matrices is demonstrated; for full-rank (dense) matrices the performance of our SUMMA formulation usually exceeds that of the state-of-the-art dense MM implementations (ScaLAPACK and Cyclops Tensor Framework). Comment: 8 pages, 6 figures, accepted to IA3 2015. arXiv admin note: text overlap with arXiv:1504.0504