4,033 research outputs found

    Metropolis-Hastings within Partially Collapsed Gibbs Samplers

    Full text link
    The Partially Collapsed Gibbs (PCG) sampler offers a new strategy for improving the convergence of a Gibbs sampler. PCG achieves faster convergence by reducing the conditioning in some of the draws of its parent Gibbs sampler. Although this can significantly improve convergence, care must be taken to ensure that the stationary distribution is preserved. The conditional distributions sampled in a PCG sampler may be incompatible and permuting their order may upset the stationary distribution of the chain. Extra care must be taken when Metropolis-Hastings (MH) updates are used in some or all of the updates. Reducing the conditioning in an MH within Gibbs sampler can change the stationary distribution, even when the PCG sampler would work perfectly if MH were not used. In fact, a number of samplers of this sort that have been advocated in the literature do not actually have the target stationary distributions. In this article, we illustrate the challenges that may arise when using MH within a PCG sampler and develop a general strategy for using such updates while maintaining the desired stationary distribution. Theoretical arguments provide guidance when choosing between different MH within PCG sampling schemes. Finally we illustrate the MH within PCG sampler and its computational advantage using several examples from our applied work

    A GPU-based hyperbolic SVD algorithm

    Get PDF
    A one-sided Jacobi hyperbolic singular value decomposition (HSVD) algorithm, using a massively parallel graphics processing unit (GPU), is developed. The algorithm also serves as the final stage of solving a symmetric indefinite eigenvalue problem. Numerical testing demonstrates the gains in speed and accuracy over sequential and MPI-parallelized variants of similar Jacobi-type HSVD algorithms. Finally, possibilities of hybrid CPU--GPU parallelism are discussed.Comment: Accepted for publication in BIT Numerical Mathematic

    Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation

    Get PDF
    Sparse matrix-vector multiplication (spMVM) is the dominant operation in many sparse solvers. We investigate performance properties of spMVM with matrices of various sparsity patterns on the nVidia "Fermi" class of GPGPUs. A new "padded jagged diagonals storage" (pJDS) format is proposed which may substantially reduce the memory overhead intrinsic to the widespread ELLPACK-R scheme. In our test scenarios the pJDS format cuts the overall spMVM memory footprint on the GPGPU by up to 70%, and achieves 95% to 130% of the ELLPACK-R performance. Using a suitable performance model we identify performance bottlenecks on the node level that invalidate some types of matrix structures for efficient multi-GPGPU parallelization. For appropriate sparsity patterns we extend previous work on distributed-memory parallel spMVM to demonstrate a scalable hybrid MPI-GPGPU code, achieving efficient overlap of communication and computation.Comment: 10 pages, 5 figures. Added reference to other recent sparse matrix format
    corecore