Fourier Based Fast Multipole Method for the Helmholtz Equation
The fast multipole method (FMM) has had great success in reducing the
computational complexity of solving the boundary integral form of the Helmholtz
equation. We present a formulation of the Helmholtz FMM that uses Fourier basis
functions rather than spherical harmonics. By modifying the transfer function
in the precomputation stage of the FMM, time-critical stages of the algorithm
are accelerated by causing the interpolation operators to become
straightforward applications of fast Fourier transforms, retaining the
diagonality of the transfer function, and providing a simplified error
analysis. Using Fourier analysis, constructive algorithms are derived to a
priori determine an integration quadrature for a given error tolerance. Sharp
error bounds are derived and verified numerically. Various optimizations are
considered to reduce the number of quadrature points and reduce the cost of
computing the transfer function.
Comment: 24 pages, 13 figures
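As a rough illustration of why a Fourier basis turns interpolation into FFTs, the sketch below upsamples a band-limited periodic function by zero-padding its spectrum. It is not the paper's implementation: the name `fft_interpolate` and the test signal are hypothetical, and the sketch assumes an odd number of input samples so that no Nyquist coefficient has to be split.

```python
import numpy as np

def fft_interpolate(samples, m):
    """Resample n equispaced periodic samples onto m >= n equispaced points.

    Assumption of this sketch: n is odd, so the spectrum has no Nyquist term.
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    if n % 2 == 0 or m < n:
        raise ValueError("this sketch assumes odd n and m >= n")
    spectrum = np.fft.rfft(samples)               # non-negative frequencies
    padded = np.zeros(m // 2 + 1, dtype=complex)  # room for the finer grid
    padded[: spectrum.size] = spectrum            # higher modes stay zero
    # irfft divides by m while rfft summed over n samples, hence the rescale.
    return np.fft.irfft(padded, n=m) * (m / n)

def signal(theta):
    # Hypothetical test function with highest frequency 5 (below 15 // 2 = 7).
    return np.cos(3 * theta) + 0.5 * np.sin(5 * theta)

coarse = 2 * np.pi * np.arange(15) / 15
fine = 2 * np.pi * np.arange(40) / 40
upsampled = fft_interpolate(signal(coarse), 40)
max_error = np.max(np.abs(upsampled - signal(fine)))
```

For a signal whose bandwidth fits on the coarse grid, the interpolation is exact up to rounding, which is the property that keeps the error analysis simple in a Fourier setting.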
Spectral Ewald Acceleration of Stokesian Dynamics for polydisperse suspensions
In this work we develop the Spectral Ewald Accelerated Stokesian Dynamics
(SEASD), a novel computational method for dynamic simulations of polydisperse
colloidal suspensions with full hydrodynamic interactions. SEASD is based on
the framework of Stokesian Dynamics (SD) with extension to compressible
solvents, and uses the Spectral Ewald (SE) method [Lindbo & Tornberg, J.
Comput. Phys. 229 (2010) 8994] for the wave-space mobility computation. To meet
the performance requirement of dynamic simulations, we use Graphics Processing
Units (GPUs) to evaluate the suspension mobility, and achieve an order of
magnitude speedup compared to a CPU implementation. For further speedup, we
develop a novel far-field block-diagonal preconditioner to reduce the far-field
evaluations in the iterative solver, and SEASD-nf, a polydisperse extension of
the mean-field Brownian approximation of Banchio & Brady [J. Chem. Phys. 118
(2003) 10323]. We extensively discuss implementation and parameter selection
strategies in SEASD, and demonstrate the spectral accuracy in the mobility
evaluation and the overall computation scaling. We
present three computational examples to further validate SEASD and SEASD-nf in
monodisperse and bidisperse suspensions: the short-time transport properties,
the equilibrium osmotic pressure and viscoelastic moduli, and the steady shear
Brownian rheology. Our validation results show that the agreement between SEASD
and SEASD-nf is satisfactory over a wide range of parameters, and also provide
significant insight into the dynamics of polydisperse colloidal suspensions.
Comment: 39 pages, 21 figures
Fast algorithms and efficient GPU implementations for the Radon transform and the back-projection operator represented as convolution operators
The Radon transform and its adjoint, the back-projection operator, can both
be expressed as convolutions in log-polar coordinates. Hence, fast algorithms
for the application of the operators can be constructed by using FFT, if data
is resampled at log-polar coordinates. Radon data is typically measured on an
equally spaced grid in polar coordinates, and reconstructions are represented
(as images) in Cartesian coordinates. Therefore, in addition to FFT, several
steps of interpolation have to be conducted in order to apply the Radon
transform and the back-projection operator by means of convolutions.
Both the interpolation and the FFT operations can be efficiently implemented
on graphics processing units (GPUs). For the interpolation, it is possible to
make use of the fact that linear interpolation is hard-wired on GPUs, meaning
that it has the same computational cost as direct memory access. Cubic-order
interpolation schemes can be constructed by combining linear interpolation
steps, which provides a significant computational speedup.
We provide details about how the Radon transform and the back-projection can
be implemented efficiently as convolution operators on GPUs. For large data
sizes, speedups of about 10 times are obtained in relation to the computational
times of other software packages based on GPU implementations of the Radon
transform and the back-projection operator. Moreover, speedups of more than
1000 times are obtained against the CPU implementations provided in the MATLAB
Image Processing Toolbox.
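The idea of building cubic interpolation from linear interpolation steps can be sketched in one dimension. The abstract does not say which cubic scheme is used; the example below assumes cubic B-spline weights and uses `np.interp` as a CPU stand-in for the hardware linear fetch. All function names are hypothetical. Two linear lookups at shifted positions reproduce the four-tap cubic sum.

```python
import numpy as np

def bspline_weights(t):
    """Cubic B-spline weights for a fractional offset t in [0, 1)."""
    w0 = (1 - t) ** 3 / 6
    w1 = (3 * t**3 - 6 * t**2 + 4) / 6
    w2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    w3 = t**3 / 6
    return w0, w1, w2, w3

def cubic_direct(f, x):
    """Reference: explicit four-tap weighted sum around x."""
    i = int(np.floor(x))
    w = bspline_weights(x - i)
    return sum(w[k] * f[i - 1 + k] for k in range(4))

def cubic_via_linear(f, x):
    """Same value from two linear interpolations (stand-ins for GPU fetches)."""
    grid = np.arange(len(f))
    i = int(np.floor(x))
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3       # combined weight of each sample pair
    h0 = i - 1 + w1 / g0            # lookup position between f[i-1] and f[i]
    h1 = i + 1 + w3 / g1            # lookup position between f[i+1] and f[i+2]
    return g0 * np.interp(h0, grid, f) + g1 * np.interp(h1, grid, f)

# Hypothetical sample data; valid query range is 1 <= x < len(f) - 2.
f = np.sin(np.linspace(0.0, 3.0, 16))
points = [1.0, 2.3, 7.75, 12.5]
differences = [abs(cubic_direct(f, x) - cubic_via_linear(f, x)) for x in points]
```

In two dimensions the same trick replaces sixteen nearest-neighbour reads with four bilinear reads, which is where the saving on a GPU comes from.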
High Performance Reconstruction Framework for Straight Ray Tomography: from Micro to Nano Resolution Imaging
We develop a high-performance scheme to reconstruct straight-ray tomographic scans. We preserve the quality of the state-of-the-art schemes typically found in traditional computed tomography but reduce the computational cost substantially. Our approach is based on 1) a rigorous discretization of the forward model using a generalized sampling scheme; 2) a variational formulation of the reconstruction problem; and 3) iterative reconstruction algorithms that use the alternating-direction method of multipliers. To improve the quality of the reconstruction, we take advantage of total-variation regularization and its higher-order variants. In addition, the prior information on the support and the positivity of the refractive index are both considered, which yields significant improvements. The two challenging applications to which we apply the methods of our framework are grating-based x-ray imaging (GI) and single-particle analysis (SPA). In the context of micro-resolution GI, three complementary characteristics are measured: the conventional absorption contrast, the differential phase contrast, and the small-angle scattering contrast. While these three measurements provide powerful insights on biological samples, until now they have required a large dose deposition that could harm the specimens (e.g., in small-rodent scanners). As it turns out, we are able to preserve the image quality of filtered back-projection-type methods despite the fewer acquisition angles and the lower signal-to-noise ratio implied by a reduction in the total dose of in vivo grating interferometry. To achieve this, we first apply our reconstruction framework to differential phase-contrast imaging (DPCI). We then add Jacobian-type regularization to simultaneously reconstruct phase and absorption. The experimental results confirm the power of our method. This is a crucial step toward the deployment of DPCI in medicine and biology.
Our algorithms have been implemented in the TOMCAT laboratory of the Paul Scherrer Institute. In the context of near-atomic-resolution SPA, we need to cope with hundreds or thousands of noisy projections of macromolecules onto different micrographs. Moreover, each projection has an unknown orientation and is blurred by some space-dependent point-spread function of the microscope. Consequently, the determination of the structure of a macromolecule involves not only a reconstruction task, but also the deconvolution of each projection image. We formulate this problem as a constrained regularized reconstruction. We are able to directly include the contrast transfer function in the system matrix without any extra computational cost. The experimental results suggest that our approach brings a significant improvement in the quality of the reconstruction. Our framework also provides an important step toward the application of SPA for the de novo generation of macromolecular models. The corresponding algorithms have been implemented in Xmipp.
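The combination of a variational formulation, total-variation regularization, and the alternating-direction method of multipliers (ADMM) can be illustrated on a toy problem. The sketch below denoises a 1D signal by minimizing 0.5*||x - y||^2 + lam*||Dx||_1; it uses an identity forward model rather than a tomographic one, and the function names, parameter values, and test signal are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def tv_denoise_admm(y, lam=0.5, rho=1.0, iterations=200):
    """Toy 1D TV denoising via ADMM with the splitting z = D x."""
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), axis=0)      # forward differences, shape (n-1, n)
    A = np.eye(n) + rho * D.T @ D       # system matrix of the x-update
    x = y.copy()
    z = D @ x
    u = np.zeros(n - 1)                 # scaled dual variable
    for _ in range(iterations):
        x = np.linalg.solve(A, y + rho * D.T @ (z - u))  # quadratic step
        z = soft_threshold(D @ x + u, lam / rho)         # shrinkage step
        u = u + D @ x - z                                # dual ascent
    return x

def total_variation(x):
    return np.abs(np.diff(x)).sum()

# Hypothetical test signal: a noisy step.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(20), np.ones(20)])
noisy = clean + 0.1 * rng.standard_normal(clean.size)
denoised = tv_denoise_admm(noisy)
```

In a reconstruction setting the identity in the data term would be replaced by the discretized forward operator, which changes only the linear solve in the x-update.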
A Dual-space Multilevel Kernel-splitting Framework for Discrete and Continuous Convolution
We introduce a new class of multilevel, adaptive, dual-space methods for
computing fast convolutional transforms. These methods can be applied to a
broad class of kernels, from the Green's functions for classical partial
differential equations (PDEs) to power functions and radial basis functions
such as those used in statistics and machine learning. The DMK (dual-space
multilevel kernel-splitting) framework uses a hierarchy of grids, computing a
smoothed interaction at the coarsest level, followed by a sequence of
corrections at finer and finer scales until the problem is entirely local, at
which point direct summation is applied. The main novelty of DMK is that the
interaction at each scale is diagonalized by a short Fourier transform,
permitting the use of separation of variables, but without requiring the FFT
for its asymptotic performance. The DMK framework substantially simplifies the
algorithmic structure of the fast multipole method (FMM) and unifies the FMM,
Ewald summation, and multilevel summation, achieving speeds comparable to the
FFT in work per gridpoint, even in a fully adaptive context. For continuous
source distributions, the evaluation of local interactions is further
accelerated by approximating the kernel at the finest level as a sum of
Gaussians with a highly localized remainder. The Gaussian convolutions are
calculated using tensor product transforms, and the remainder term is
calculated using asymptotic methods. We illustrate the performance of DMK for
both continuous and discrete sources with extensive numerical examples in two
and three dimensions.
Comment: 53 pages, 15 figures
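The multilevel splitting described above (a smooth coarse interaction, a sequence of finer corrections, and a local remainder) can be sketched for the 1/r kernel. The abstract does not state which splitting functions DMK uses; the erf-based, Ewald-type split below is an assumption chosen for simplicity, and the function name and length scales are hypothetical.

```python
from math import erf, erfc

def split_coulomb(r, sigmas):
    """Split 1/r into smooth, correction, and local parts.

    sigmas: decreasing length scales, coarsest first (assumed for this sketch).
    """
    smooth = erf(r / sigmas[0]) / r                  # coarsest, smooth part
    corrections = [
        (erf(r / fine) - erf(r / coarse)) / r        # difference kernels
        for coarse, fine in zip(sigmas, sigmas[1:])
    ]
    local = erfc(r / sigmas[-1]) / r                 # short-range remainder
    return smooth, corrections, local

sigmas = [1.0, 0.5, 0.25, 0.125]
radii = [0.05, 0.3, 1.0, 4.0]
reconstructed = []
for r in radii:
    smooth, corrections, local = split_coulomb(r, sigmas)
    reconstructed.append(smooth + sum(corrections) + local)

# The remainder is negligible a few length scales away from the source.
far_local = split_coulomb(6 * sigmas[-1], sigmas)[2]
```

The pieces telescope back to 1/r exactly, while each correction and the remainder decay rapidly beyond their own length scale, so the remainder can be handled by direct summation over near neighbours.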