A hierarchically blocked Jacobi SVD algorithm for single and multiple graphics processing units
We present a hierarchically blocked one-sided Jacobi algorithm for the
singular value decomposition (SVD), targeting both single and multiple graphics
processing units (GPUs). The blocking structure reflects the levels of the GPU's
memory hierarchy. The algorithm may outperform MAGMA's dgesvd, while retaining
high relative accuracy. To this end, we developed a family of parallel pivot
strategies for the GPU's shared address space that also apply to inter-GPU
communication. Unlike common hybrid approaches, our algorithm in a single-GPU
setting needs a CPU for control purposes only, while utilizing the GPU's
resources to the fullest extent permitted by the hardware. When required by the
problem size, the algorithm, in principle, scales to an arbitrary number of GPU
nodes. The scalability is demonstrated by a more than twofold speedup for
sufficiently large matrices on a Tesla S2050 system with four GPUs vs. a single
Fermi card.
Comment: Accepted for publication in SIAM Journal on Scientific Computing
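To make the term "one-sided Jacobi" concrete, here is a minimal serial sketch of the textbook (Hestenes) method in Python with NumPy. It is not the hierarchically blocked GPU algorithm of the paper: there is no blocking, no parallel pivot strategy, and a plain row-cyclic ordering is assumed. The function name `one_sided_jacobi_svd` and the parameters `tol` and `max_sweeps` are illustrative choices.

```python
import numpy as np

def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=60):
    """Textbook one-sided Jacobi SVD; returns U, s, V with A ~ U @ diag(s) @ V.T."""
    G = np.array(A, dtype=float)   # columns of G are orthogonalized in place
    n = G.shape[1]
    V = np.eye(n)                  # accumulates the applied plane rotations
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):     # row-cyclic pivot ordering (an assumption)
            for q in range(p + 1, n):
                alpha = G[:, p] @ G[:, p]
                beta = G[:, q] @ G[:, q]
                gamma = G[:, p] @ G[:, q]
                # Skip pairs of columns that are already numerically orthogonal.
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                if zeta == 0.0:
                    t = 1.0
                else:
                    # Smaller root of t^2 + 2*zeta*t - 1 = 0.
                    t = np.sign(zeta) / (abs(zeta) + np.sqrt(1.0 + zeta * zeta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                for M in (G, V):   # apply the same rotation to G and V
                    col_p = M[:, p].copy()
                    M[:, p] = c * col_p - s * M[:, q]
                    M[:, q] = s * col_p + c * M[:, q]
        if not rotated:            # a full sweep without rotations: converged
            break
    sigma = np.linalg.norm(G, axis=0)        # singular values = column norms
    order = np.argsort(sigma)[::-1]
    sigma, G, V = sigma[order], G[:, order], V[:, order]
    U = G / np.where(sigma > 0, sigma, 1.0)  # normalize the nonzero columns
    return U, sigma, V
```

Each rotation touches only two columns, which is why disjoint column pairs can be processed independently; that independence is what parallel pivot strategies exploit.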
Efficient implementation of the Hardy-Ramanujan-Rademacher formula
We describe how the Hardy-Ramanujan-Rademacher formula can be implemented to
allow the partition function p(n) to be computed with softly optimal
complexity O(n^{1/2+o(1)}) and very little overhead. A new implementation
based on these techniques achieves speedups in excess of a factor 500 over
previously published software and has been used by the author to calculate
p(10^{19}), an exponent twice as large as in previously reported
computations.
We also investigate performance for multi-evaluation of p(n), where our
implementation of the Hardy-Ramanujan-Rademacher formula becomes superior to
power series methods on far denser sets of indices than previous
implementations. As an application, we determine over 22 billion new
congruences for the partition function, extending Weaver's tabulation of 76,065
congruences.
Comment: updated version containing an unconditional complexity proof;
accepted for publication in LMS Journal of Computation and Mathematics
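As a rough illustration of the formula itself, the sketch below evaluates a truncated Hardy-Ramanujan-Rademacher series in double precision and checks it against Euler's pentagonal number recurrence. It is a toy under stated assumptions, not the paper's implementation: the Dedekind sums are computed naively, the fixed truncation `terms=50` is an arbitrary choice, and double precision limits it to small n. All function names are illustrative.

```python
import math

def partitions_exact(n):
    """Exact p(0), ..., p(n) via Euler's pentagonal number recurrence."""
    p = [1] + [0] * n
    for m in range(1, n + 1):
        total, k = 0, 1
        while k * (3 * k - 1) // 2 <= m:
            sign = 1 if k % 2 else -1
            total += sign * p[m - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= m:
                total += sign * p[m - k * (3 * k + 1) // 2]
            k += 1
        p[m] = total
    return p

def exponential_sum(k, n):
    """A_k(n) = sum over h coprime to k of cos(pi * (s(h, k) - 2*h*n/k))."""
    total = 0.0
    for h in range(k):
        if math.gcd(h, k) != 1:
            continue
        # Dedekind sum s(h, k) = S / (2 k^2), with S an exact integer.
        S = sum(r * (2 * ((h * r) % k) - k) for r in range(1, k))
        # Reduce the angle exactly: cos(pi * num / (2 k^2)) has period 4 k^2 in num.
        num = (S - 4 * h * n * k) % (4 * k * k)
        total += math.cos(math.pi * num / (2 * k * k))
    return total

def partitions_rademacher(n, terms=50):
    """Approximate p(n) for n >= 1 by a truncated series, rounded to an integer."""
    c = math.pi / 6 * math.sqrt(24 * n - 1)
    total = 0.0
    for k in range(1, terms + 1):
        x = c / k
        u = math.cosh(x) - math.sinh(x) / x
        total += math.sqrt(3 / k) * exponential_sum(k, n) * u
    return round(4 / (24 * n - 1) * total)
```

The k = 1 term already carries almost all of the magnitude of p(n); the remaining terms supply the correction needed to round to the exact integer.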
Complex and Hypercomplex Discrete Fourier Transforms Based on Matrix Exponential Form of Euler's Formula
We show that the discrete complex, and numerous hypercomplex, Fourier
transforms defined and used so far by a number of researchers can be unified
into a single framework based on a matrix exponential version of Euler's
formula e^{j\theta} = \cos\theta + j\sin\theta, and a matrix root of -1
isomorphic to the imaginary root j. The transforms thus defined can be
computed using standard matrix multiplications and additions with no
hypercomplex code, the complex or hypercomplex algebra being represented by the
form of the matrix root of -1, so that the matrix multiplications are
equivalent to multiplications in the appropriate algebra. We present examples
from the complex, quaternion and biquaternion algebras, and from Clifford
algebras Cl_{1,1} and Cl_{2,0}. The significance of this result is both in the
theoretical unification, and also in the scope it affords for insight into the
structure of the various transforms, since the formulation is such a simple
generalization of the classic complex case. It also shows that hypercomplex
discrete Fourier transforms may be computed using standard matrix arithmetic
packages without the need for a hypercomplex library, which is of importance in
providing a reference implementation for verifying implementations based on
hypercomplex code.
Comment: The paper has been revised since the second version to make some of
the reasons for the paper clearer, to include reviews of prior hypercomplex
transforms, and to clarify some points in the conclusion
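For the classic complex case, the idea can be sketched in a few lines of Python with NumPy: a 2x2 real matrix J with J @ J = -I stands in for the imaginary unit, and the DFT kernel becomes cos(theta) I + sin(theta) J. This is a minimal illustration under assumed conventions (forward transform with a negative exponent and no scaling, as in `numpy.fft.fft`); the helper names are hypothetical, and the quaternion, biquaternion, and Clifford cases from the paper are not covered.

```python
import numpy as np

J = np.array([[0.0, -1.0], [1.0, 0.0]])   # a real matrix root of -1: J @ J == -I
I2 = np.eye(2)

def euler_matrix(theta):
    """Matrix form of Euler's formula: exp(J * theta) = cos(theta) I + sin(theta) J."""
    return np.cos(theta) * I2 + np.sin(theta) * J

def to_matrix(z):
    """Represent the complex number a + jb as the real matrix a I + b J."""
    return z.real * I2 + z.imag * J

def matrix_dft(x):
    """DFT computed only with real 2x2 matrix products and sums."""
    N = len(x)
    out = []
    for k in range(N):
        acc = np.zeros((2, 2))
        for n in range(N):
            acc += euler_matrix(-2.0 * np.pi * k * n / N) @ to_matrix(x[n])
        # Read a + jb back from the first column of a I + b J.
        out.append(complex(acc[0, 0], acc[1, 0]))
    return np.array(out)
```

Because the result can be compared directly with `numpy.fft.fft`, the same pattern can serve as a reference check of the kind the abstract mentions.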
The exponentially convergent trapezoidal rule
It is well known that the trapezoidal rule converges geometrically when applied to analytic functions on periodic intervals or the real line. The mathematics and history of this phenomenon are reviewed, and it is shown that, far from being a curiosity, it is linked with computational methods all across scientific computing, including algorithms related to inverse Laplace transforms, special functions, complex analysis, rational approximation, integral equations, and the computation of functions and eigenvalues of matrices and operators.
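A small numerical experiment shows the effect. The sketch below applies the periodic trapezoidal rule to exp(cos x) over one period, whose exact integral is 2π I_0(1), with I_0 the modified Bessel function. The integrand is an illustrative choice, not taken from the paper, and the helper names are hypothetical.

```python
import math

def periodic_trapezoid(f, n, period=2.0 * math.pi):
    """Trapezoidal rule over one period; the two endpoint values coincide."""
    h = period / n
    return h * sum(f(j * h) for j in range(n))

def bessel_i0(x, terms=30):
    """Modified Bessel function I_0 from its power series."""
    return sum((x * x / 4.0) ** k / math.factorial(k) ** 2 for k in range(terms))

def integrand(x):
    return math.exp(math.cos(x))

exact = 2.0 * math.pi * bessel_i0(1.0)
for n in (2, 4, 8, 16):
    err = abs(periodic_trapezoid(integrand, n) - exact)
    print(f"n = {n:2d}   error = {err:.3e}")
```

Doubling the number of points far more than halves the error here, and with only 16 points the error is already at the level of double-precision rounding.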
Quantum algorithm and circuit design solving the Poisson equation
The Poisson equation occurs in many areas of science and engineering. Here we
focus on its numerical solution for an equation in d dimensions. In particular
we present a quantum algorithm and a scalable quantum circuit design which
approximates the solution of the Poisson equation on a grid with error
\varepsilon. We assume we are given a superposition of function evaluations of
the right hand side of the Poisson equation. The algorithm produces a quantum
state encoding the solution. The number of quantum operations and the number of
qubits used by the circuit are almost linear in d and polylog in
\varepsilon^{-1}. We present quantum circuit modules together with performance
guarantees which can also be used for other problems.
Comment: 30 pages, 9 figures. This is the revised version for publication in
New Journal of Physics
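To give a sense of what approximating the solution "on a grid" means, here is a purely classical sketch for d = 1: a second-order finite-difference discretization of -u'' = f on (0, 1) with zero boundary values, solved as a dense linear system. It is not the quantum algorithm or circuit from the paper, and the discretization, boundary conditions, and function name are assumptions made for illustration.

```python
import numpy as np

def solve_poisson_1d(f, n):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, on n interior grid points."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    off = -np.ones(n - 1)
    # Standard three-point stencil (-1, 2, -1) / h^2 for the negative Laplacian.
    L = (2.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    return x, np.linalg.solve(L, f(x))

# Manufactured example: u(x) = sin(pi x), so f(x) = pi^2 sin(pi x).
x, u = solve_poisson_1d(lambda t: np.pi**2 * np.sin(np.pi * t), 63)
print("max error:", np.max(np.abs(u - np.sin(np.pi * x))))
```

The grid error shrinks like h^2, so reaching accuracy \varepsilon classically takes on the order of \varepsilon^{-1/2} points per dimension, and the total number of grid points grows exponentially with d.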