154 research outputs found

    Optimization of 3-D Wavelet Decomposition on Multiprocessors

    In this work we discuss various ideas for the optimization of 3-D wavelet/subband decomposition on shared memory MIMD computers. We theoretically evaluate the characteristics of these approaches and verify the results on parallel computers. Experiments are conducted on a shared memory as well as a virtual shared memory architecture.

    ARKCoS: Artifact-Suppressed Accelerated Radial Kernel Convolution on the Sphere

    We describe a hybrid Fourier/direct space convolution algorithm for compact radial (azimuthally symmetric) kernels on the sphere. For high resolution maps covering a large fraction of the sky, our implementation takes advantage of the inexpensive massive parallelism afforded by consumer graphics processing units (GPUs). Applications involve modeling of instrumental beam shapes in terms of compact kernels, computation of fine-scale wavelet transformations, and optimal filtering for the detection of point sources. Our algorithm works for any pixelization where pixels are grouped into isolatitude rings. Even for kernels that are not bandwidth limited, ringing features are completely absent on an ECP grid. We demonstrate that they can be highly suppressed on the popular HEALPix pixelization, for which we develop a freely available implementation of the algorithm. As an example application, we show that, running on a high-end consumer graphics card, our method speeds up beam convolution for simulations of a characteristic Planck high frequency instrument channel by two orders of magnitude compared to the commonly used HEALPix implementation on one CPU core, while typically maintaining a fractional RMS accuracy of about 1 part in 10^5. Comment: 10 pages, 6 figures. Submitted to Astronomy and Astrophysics. Replaced to match published version. Code can be downloaded at https://github.com/elsner/arkco

    Parallelization of image similarity analysis

    The algorithmic architecture and structure are presented for the parallelization of image similarity analysis, based on obtaining multiple digital signatures for each image, where each signature is composed of the most representative coefficients of the wavelet transform of the corresponding image area. In the present paper, image representation by wavelet transform coefficients is analyzed, as well as the convenience or necessity of using multiple coefficients to study the similarity of images that may have transferred components, with changes of size, color, or texture. The complexity of the computation involved justifies parallelization, and the suggested solution is a pipeline of multiprocessors, each of them a homogeneous parallel architecture that obtains (wavelet) signature coefficients. Partial reusability of computations for successive signatures makes pipelining these architectures compulsory. Facultad de Informátic

    Accelerating Wavelet-Based Video Coding on Graphics Hardware using CUDA
