Computational Physics on Graphics Processing Units
The use of graphics processing units for scientific computations is an
emerging strategy that can significantly speed up a variety of algorithms.
In this review, we discuss advances made in the field of computational physics,
focusing on classical molecular dynamics, and on quantum simulations for
electronic structure calculations using the density functional theory, wave
function techniques, and quantum field theory.
Comment: Proceedings of the 11th International Conference, PARA 2012,
Helsinki, Finland, June 10-13, 201
QCD simulations with staggered fermions on GPUs
We report on our implementation of the RHMC algorithm for the simulation of
lattice QCD with two staggered flavors on Graphics Processing Units, using the
NVIDIA CUDA programming language. The main feature of our code is that the GPU
is not used just as an accelerator, but instead the whole Molecular Dynamics
trajectory is performed on it. After pointing out the main bottlenecks and how
to circumvent them, we discuss the performance obtained. We present some
preliminary results regarding OpenCL and multi-GPU extensions of our code and
discuss future perspectives.
Comment: 22 pages, 14 eps figures, final version to be published in Computer
Physics Communication
Parallel accelerated cyclic reduction preconditioner for three-dimensional elliptic PDEs with variable coefficients
We present a robust and scalable preconditioner for the solution of
large-scale linear systems that arise from the discretization of elliptic PDEs
amenable to rank compression. The preconditioner is based on hierarchical
low-rank approximations and the cyclic reduction method. The setup and
application phases of the preconditioner achieve log-linear complexity in
memory footprint and number of operations, and numerical experiments exhibit
good weak and strong scalability at large processor counts in a distributed
memory environment. Numerical experiments with linear systems that feature
symmetry and nonsymmetry, definiteness and indefiniteness, constant and
variable coefficients demonstrate the preconditioner's applicability and
robustness. Furthermore, it is possible to control the number of iterations via
the accuracy threshold of the hierarchical matrix approximations and their
arithmetic operations, and the tuning of the admissibility condition parameter.
Together, these parameters allow for optimization of the memory requirements
and performance of the preconditioner.
Comment: 24 pages, Elsevier Journal of Computational and Applied Mathematics,
Dec 201
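For orientation, the cyclic reduction idea mentioned in the abstract above can be sketched on a scalar tridiagonal system. This is only an illustrative simplification: the paper applies cyclic reduction to block systems with hierarchical low-rank blocks, which is not reproduced here. The function name `cyclic_reduction` and the coefficient convention (`a` sub-diagonal, `b` diagonal, `c` super-diagonal, with `a[0] = 0` and `c[-1] = 0`) are assumptions made for this sketch.

```python
import numpy as np

def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] (hypothetical helper).

    Assumes a[0] == 0, c[-1] == 0 and a diagonally dominant matrix,
    so that no pivoting is needed.
    """
    n = len(b)
    if n == 1:
        return d / b
    odd = np.arange(1, n, 2)
    # Eliminate the even-indexed neighbours of every odd-indexed equation.
    alpha = a[odd] / b[odd - 1]
    has_right = odd + 1 < n
    right = np.where(has_right, odd + 1, odd)  # dummy index if no right neighbour
    gamma = np.where(has_right, c[odd] / b[right], 0.0)
    ra = -alpha * a[odd - 1]
    rb = b[odd] - alpha * c[odd - 1] - gamma * a[right]
    rc = -gamma * c[right]
    rd = d[odd] - alpha * d[odd - 1] - gamma * d[right]
    # The odd-indexed unknowns satisfy a tridiagonal system of half the size.
    x = np.zeros(n)
    x[odd] = cyclic_reduction(ra, rb, rc, rd)
    # Back-substitute for the even-indexed unknowns.
    even = np.arange(0, n, 2)
    xp = np.concatenate(([0.0], x, [0.0]))  # xp[k + 1] == x[k], zero padding at both ends
    x[even] = (d[even] - a[even] * xp[even] - c[even] * xp[even + 2]) / b[even]
    return x

# Small diagonally dominant test problem (illustrative data only).
n = 7
a = np.full(n, -1.0); a[0] = 0.0
c = np.full(n, -1.0); c[-1] = 0.0
b = np.full(n, 4.0)
d = np.arange(1.0, n + 1.0)
x = cyclic_reduction(a, b, c, d)

# Dense reference solve for comparison.
T = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
x_ref = np.linalg.solve(T, d)
```

Each level halves the number of unknowns, and the eliminations within a level are independent of one another, which is what makes the method attractive for parallel execution.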
Faster Inversion and Other Black Box Matrix Computations Using Efficient Block Projections
Block projections have been used, in [Eberly et al. 2006], to obtain an
efficient algorithm to find solutions for sparse systems of linear equations. A
bound of softO(n^(2.5)) machine operations is obtained assuming that the input
matrix can be multiplied by a vector with constant-sized entries in softO(n)
machine operations. Unfortunately, the correctness of this algorithm depends on
the existence of efficient block projections, which has only been conjectured. In
this paper we establish the correctness of the algorithm from [Eberly et al.
2006] by proving the existence of efficient block projections over sufficiently
large fields. We demonstrate the usefulness of these projections by deriving
improved bounds for the cost of several matrix problems, considering, in
particular, "sparse" matrices that can be multiplied by a vector using
softO(n) field operations. We show how to compute the inverse of a sparse
matrix over a field F using an expected number of softO(n^(2.27)) operations in
F. A basis for the null space of a sparse matrix, and a certification of its
rank, are obtained at the same cost. An application to Kaltofen and Villard's
Baby-Steps/Giant-Steps algorithms for the determinant and Smith Form of an
integer matrix yields algorithms requiring softO(n^(2.66)) machine operations.
The derived algorithms are all probabilistic of the Las Vegas type
Efficient approximation of functions of some large matrices by partial fraction expansions
Some important applied problems require the evaluation of functions $\Psi$
of large and sparse and/or \emph{localized} matrices $A$. Popular and
interesting techniques for computing $\Psi(A)$ and $\Psi(A)\mathbf{v}$, where
$\mathbf{v}$ is a vector, are based on partial fraction expansions. However,
some of these techniques require solving several linear systems whose matrices
differ from $A$ by a complex multiple of the identity matrix $I$ for computing
$\Psi(A)\mathbf{v}$, or require inverting sequences of matrices with the same
characteristics for computing $\Psi(A)$. Here we study the use and the
convergence of a recent technique for generating sequences of incomplete
factorizations of matrices in order to address both of these issues. The
solution of the sequences of linear systems and approximate matrix inversions
above can be computed efficiently provided that $A^{-1}$ shows certain decay
properties. These strategies have good potential for parallelization. Our claims are
confirmed by numerical tests
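As a rough illustration of the partial fraction approach described in the abstract above, the sketch below applies a rational function r(z) = sum_j residues[j] / (z - poles[j]) of a matrix A to a vector v by solving one shifted linear system per pole. It uses dense direct solves in place of the incomplete factorizations studied in the paper, and the helper name `apply_partial_fractions`, the test matrix, and the choice r(z) = 1/(z^2 + 1) are all assumptions made for this example.

```python
import numpy as np

def apply_partial_fractions(A, v, poles, residues):
    """Return r(A) v for r(z) = sum_j residues[j] / (z - poles[j]) (hypothetical helper)."""
    n = A.shape[0]
    identity = np.eye(n)
    result = np.zeros(n, dtype=complex)
    for pole, residue in zip(poles, residues):
        # Each term needs a system whose matrix differs from A
        # by a (possibly complex) multiple of the identity.
        result += residue * np.linalg.solve(A - pole * identity, v.astype(complex))
    return result

# Illustrative data: a small symmetric tridiagonal matrix and a random vector.
rng = np.random.default_rng(0)
n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
v = rng.standard_normal(n)

# r(z) = 1 / (z^2 + 1) has poles at +i and -i with residues 1/(2i) and -1/(2i).
poles = [1j, -1j]
residues = [1 / 2j, -1 / 2j]
y = apply_partial_fractions(A, v, poles, residues)

# Reference: r(A) v = (A^2 + I)^{-1} v, computed directly.
y_ref = np.linalg.solve(A @ A + np.eye(n), v)
```

The shifted systems are independent of one another, so they can be solved concurrently. For real A and v the two terms here are complex conjugates, so in practice a single complex solve would suffice.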