
    Fourier Based Fast Multipole Method for the Helmholtz Equation

    The fast multipole method (FMM) has had great success in reducing the computational complexity of solving the boundary integral form of the Helmholtz equation. We present a formulation of the Helmholtz FMM that uses Fourier basis functions rather than spherical harmonics. By modifying the transfer function in the precomputation stage of the FMM, we accelerate the time-critical stages of the algorithm: the interpolation operators become straightforward applications of fast Fourier transforms, the transfer function remains diagonal, and the error analysis is simplified. Using Fourier analysis, we derive constructive algorithms that determine, a priori, an integration quadrature for a given error tolerance. Sharp error bounds are derived and verified numerically. Various optimizations are considered to reduce the number of quadrature points and the cost of computing the transfer function. Comment: 24 pages, 13 figures
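The abstract notes that, in a Fourier basis, interpolation between quadrature resolutions reduces to FFTs. A minimal sketch of that general idea follows, assuming a periodic, band-limited function sampled on an odd number of equispaced points. The function name `fft_interpolate` and the test signal are illustrative choices, not taken from the paper.

```python
import numpy as np

def fft_interpolate(samples, m):
    """Resample one period of a band-limited signal from n to m >= n points.

    Assumes n is odd, so the spectrum has no unpaired Nyquist mode.
    """
    samples = np.asarray(samples)
    n = samples.size
    if n % 2 == 0 or m < n:
        raise ValueError("need odd n and m >= n")
    k = n // 2                              # highest resolved frequency
    spectrum = np.fft.fft(samples)
    padded = np.zeros(m, dtype=complex)
    padded[:k + 1] = spectrum[:k + 1]       # frequencies 0 .. k
    padded[m - k:] = spectrum[n - k:]       # frequencies -k .. -1
    fine = np.fft.ifft(padded) * (m / n)    # rescale for the longer inverse FFT
    return fine.real if np.isrealobj(samples) else fine

def f(theta):
    # Hypothetical test signal with frequencies 1 and 3 (both below n // 2 = 4).
    return np.cos(3 * theta) + 0.5 * np.sin(theta)

coarse = f(2 * np.pi * np.arange(9) / 9)
fine = fft_interpolate(coarse, 32)
```

Because the test signal is band-limited below the coarse grid's highest frequency, zero-padding the spectrum reproduces it on the fine grid up to rounding error.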

    Spectral Ewald Acceleration of Stokesian Dynamics for polydisperse suspensions

    In this work we develop the Spectral Ewald Accelerated Stokesian Dynamics (SEASD), a novel computational method for dynamic simulations of polydisperse colloidal suspensions with full hydrodynamic interactions. SEASD is based on the framework of Stokesian Dynamics (SD) with an extension to compressible solvents, and uses the Spectral Ewald (SE) method [Lindbo & Tornberg, J. Comput. Phys. 229 (2010) 8994] for the wave-space mobility computation. To meet the performance requirements of dynamic simulations, we use graphics processing units (GPUs) to evaluate the suspension mobility, and achieve an order of magnitude speedup compared to a CPU implementation. For further speedup, we develop a novel far-field block-diagonal preconditioner to reduce the far-field evaluations in the iterative solver, and SEASD-nf, a polydisperse extension of the mean-field Brownian approximation of Banchio & Brady [J. Chem. Phys. 118 (2003) 10323]. We extensively discuss implementation and parameter selection strategies in SEASD, and demonstrate the spectral accuracy in the mobility evaluation and the overall O(N log N) computational scaling. We present three computational examples to further validate SEASD and SEASD-nf in monodisperse and bidisperse suspensions: the short-time transport properties, the equilibrium osmotic pressure and viscoelastic moduli, and the steady shear Brownian rheology. Our validation results show that the agreement between SEASD and SEASD-nf is satisfactory over a wide range of parameters, and also provide significant insight into the dynamics of polydisperse colloidal suspensions. Comment: 39 pages, 21 figures

    Fast algorithms and efficient GPU implementations for the Radon transform and the back-projection operator represented as convolution operators

    The Radon transform and its adjoint, the back-projection operator, can both be expressed as convolutions in log-polar coordinates. Hence, fast algorithms for applying these operators can be constructed using the FFT, provided the data is resampled at log-polar coordinates. Radon data is typically measured on an equally spaced grid in polar coordinates, and reconstructions are represented (as images) in Cartesian coordinates. Therefore, in addition to the FFT, several interpolation steps are needed in order to apply the Radon transform and the back-projection operator by means of convolutions. Both the interpolation and the FFT operations can be efficiently implemented on graphics processing units (GPUs). For the interpolation, it is possible to exploit the fact that linear interpolation is hard-wired on GPUs, meaning that it has the same computational cost as direct memory access. Cubic-order interpolation schemes can be constructed by combining linear interpolation steps, which provides a substantial computational speedup. We provide details about how the Radon transform and the back-projection can be implemented efficiently as convolution operators on GPUs. For large data sizes, speedups of about 10 times are obtained relative to the computational times of other software packages based on GPU implementations of the Radon transform and the back-projection operator. Moreover, speedups of more than 1000 times are obtained against the CPU implementations provided in the MATLAB image processing toolbox.
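One well-known way to build cubic interpolation from linear fetches is to merge the four cubic B-spline weights into two weighted linear lookups. The sketch below emulates this in one dimension on the CPU. It illustrates the general trick and is not necessarily the exact scheme used by the authors; the helper names (`lerp_fetch`, `cubic_direct`, `cubic_from_two_lerps`) and the sample coefficients are made up for this example.

```python
import math

def bspline_weights(t):
    """Cubic B-spline weights for a fractional offset t in [0, 1)."""
    w0 = (1 - t) ** 3 / 6
    w1 = (3 * t**3 - 6 * t**2 + 4) / 6
    w2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    w3 = t**3 / 6
    return w0, w1, w2, w3

def lerp_fetch(c, p):
    """CPU stand-in for a hardware linear texture fetch at position p."""
    j = math.floor(p)
    f = p - j
    return (1 - f) * c[j] + f * c[j + 1]

def cubic_direct(c, x):
    """Reference: explicit four-tap weighted sum."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    return w0 * c[i - 1] + w1 * c[i] + w2 * c[i + 1] + w3 * c[i + 2]

def cubic_from_two_lerps(c, x):
    """Same result from two linear fetches at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3           # combined weight of each tap pair
    h0 = (i - 1) + w1 / g0              # fetch position between c[i-1], c[i]
    h1 = (i + 1) + w3 / g1              # fetch position between c[i+1], c[i+2]
    return g0 * lerp_fetch(c, h0) + g1 * lerp_fetch(c, h1)

coeffs = [0.0, 1.0, 4.0, 9.0, 7.0, 2.0, 3.0, 8.0]   # hypothetical data
points = [1.0, 1.25, 2.5, 3.9, 4.75]
pairs = [(cubic_direct(coeffs, x), cubic_from_two_lerps(coeffs, x)) for x in points]
```

The two routines agree up to rounding error, while the second needs only two fetches per sample instead of four. In two or three dimensions the same pairing is applied per axis.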

    High Performance Reconstruction Framework for Straight Ray Tomography: from Micro to Nano Resolution Imaging

    We develop a high-performance scheme to reconstruct straight-ray tomographic scans. We preserve the quality of the state-of-the-art schemes typically found in traditional computed tomography but reduce the computational cost substantially. Our approach is based on 1) a rigorous discretization of the forward model using a generalized sampling scheme; 2) a variational formulation of the reconstruction problem; and 3) iterative reconstruction algorithms that use the alternating-direction method of multipliers. To improve the quality of the reconstruction, we take advantage of total-variation regularization and its higher-order variants. In addition, prior information on the support and the positivity of the refractive index is taken into account, which yields significant improvements. The two challenging applications to which we apply the methods of our framework are grating-based x-ray imaging (GI) and single-particle analysis (SPA). In the context of micro-resolution GI, three complementary characteristics are measured: the conventional absorption contrast, the differential phase contrast, and the small-angle scattering contrast. While these three measurements provide powerful insights into biological samples, until now they have required a large dose deposition that could harm the specimens (e.g., in small-rodent scanners). As it turns out, we are able to preserve the image quality of filtered back-projection-type methods despite the fewer acquisition angles and the lower signal-to-noise ratio implied by a reduction in the total dose of in vivo grating interferometry. To achieve this, we first apply our reconstruction framework to differential phase-contrast imaging (DPCI). We then add Jacobian-type regularization to simultaneously reconstruct phase and absorption. The experimental results confirm the power of our method. This is a crucial step toward the deployment of DPCI in medicine and biology.
    Our algorithms have been implemented in the TOMCAT laboratory of the Paul Scherrer Institute. In the context of near-atomic-resolution SPA, we need to cope with hundreds or thousands of noisy projections of macromolecules onto different micrographs. Moreover, each projection has an unknown orientation and is blurred by some space-dependent point-spread function of the microscope. Consequently, the determination of the structure of a macromolecule involves not only a reconstruction task, but also the deconvolution of each projection image. We formulate this problem as a constrained regularized reconstruction. We are able to directly include the contrast transfer function in the system matrix without any extra computational cost. The experimental results suggest that our approach brings a significant improvement in the quality of the reconstruction. Our framework also provides an important step toward the application of SPA for the de novo generation of macromolecular models. The corresponding algorithms have been implemented in Xmipp.
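The combination of a variational formulation, total-variation regularization, and the alternating-direction method of multipliers (ADMM) can be illustrated on a much simpler problem than tomography. The sketch below assumes the forward operator is the identity (plain 1D denoising), and the names `tv_denoise_admm`, `lam`, and `rho` as well as the test signal are illustrative choices, not the notation or implementation of the work described above.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def tv_denoise_admm(y, lam, rho=1.0, iters=300):
    """Approximately minimize 0.5*||x - y||^2 + lam*||D x||_1 with scaled ADMM."""
    n = y.size
    D = np.diff(np.eye(n), axis=0)          # (n-1) x n forward differences
    A = np.eye(n) + rho * D.T @ D           # system matrix of the x-update
    x = y.copy()
    z = D @ x                               # splitting variable z = D x
    u = np.zeros(n - 1)                     # scaled dual variable
    for _ in range(iters):
        x = np.linalg.solve(A, y + rho * D.T @ (z - u))   # quadratic subproblem
        z = soft_threshold(D @ x + u, lam / rho)          # shrinkage on gradients
        u = u + D @ x - z                                 # dual ascent step
    return x

def objective(x, y, lam):
    return 0.5 * np.sum((x - y) ** 2) + lam * np.sum(np.abs(np.diff(x)))

# Hypothetical test signal: a unit step plus a small deterministic ripple.
idx = np.arange(50)
noisy = (idx >= 25).astype(float) + 0.1 * np.sin(7 * idx)
denoised = tv_denoise_admm(noisy, lam=0.5)
```

In a tomographic setting the identity in the data term would be replaced by the discretized forward model, which changes only the linear system solved in the x-update.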

    A Dual-space Multilevel Kernel-splitting Framework for Discrete and Continuous Convolution

    We introduce a new class of multilevel, adaptive, dual-space methods for computing fast convolutional transforms. These methods can be applied to a broad class of kernels, from the Green's functions for classical partial differential equations (PDEs) to power functions and radial basis functions such as those used in statistics and machine learning. The DMK (dual-space multilevel kernel-splitting) framework uses a hierarchy of grids, computing a smoothed interaction at the coarsest level, followed by a sequence of corrections at finer and finer scales until the problem is entirely local, at which point direct summation is applied. The main novelty of DMK is that the interaction at each scale is diagonalized by a short Fourier transform, permitting the use of separation of variables, but without requiring the FFT for its asymptotic performance. The DMK framework substantially simplifies the algorithmic structure of the fast multipole method (FMM) and unifies the FMM, Ewald summation, and multilevel summation, achieving speeds comparable to the FFT in work per gridpoint, even in a fully adaptive context. For continuous source distributions, the evaluation of local interactions is further accelerated by approximating the kernel at the finest level as a sum of Gaussians with a highly localized remainder. The Gaussian convolutions are calculated using tensor product transforms, and the remainder term is calculated using asymptotic methods. We illustrate the performance of DMK for both continuous and discrete sources with extensive numerical examples in two and three dimensions. Comment: 53 pages, 15 figures
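The multilevel structure described here (a smooth interaction at the coarsest level, corrections at successively finer scales, and a purely local remainder) can be illustrated with a telescoping split of the 1/r kernel. The sketch below assumes an erf-based, Ewald-style split with widths halving at each level. This is one common choice made for illustration and may differ from the splitting actually used in DMK; the function name `split_inverse_r` is hypothetical.

```python
import math

def split_inverse_r(r, sigma0, levels):
    """Split 1/r into a smooth part, per-level corrections, and a local residual.

    Uses the telescoping identity
      1/r = erf(r/s_0)/r + sum_l [erf(r/s_{l+1}) - erf(r/s_l)]/r + erfc(r/s_L)/r
    with widths s_l = sigma0 / 2**l (an assumed, illustrative choice).
    """
    sigmas = [sigma0 / 2**level for level in range(levels + 1)]
    smooth = math.erf(r / sigmas[0]) / r            # coarsest-level interaction
    corrections = [
        (math.erf(r / sigmas[l + 1]) - math.erf(r / sigmas[l])) / r
        for l in range(levels)
    ]                                               # one difference kernel per level
    local = math.erfc(r / sigmas[-1]) / r           # short-range remainder
    return smooth, corrections, local

smooth, corrections, local = split_inverse_r(0.3, sigma0=1.0, levels=4)
total = smooth + sum(corrections) + local

# At six finest-level widths the remainder is negligible.
_, _, far_residual = split_inverse_r(6 * 1.0 / 2**4, sigma0=1.0, levels=4)
```

The pieces sum back to 1/r, and the remainder decays so quickly with distance that it only needs to be evaluated between near neighbors, which is where direct summation takes over.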