Applications and accuracy of the parallel diagonal dominant algorithm
The Parallel Diagonal Dominant (PDD) algorithm is a highly efficient, ideally scalable tridiagonal solver. In this paper, a detailed study of the PDD algorithm is given. First the PDD algorithm is introduced. Then the algorithm is extended to solve periodic tridiagonal systems. A variant, the reduced PDD algorithm, is also proposed. Accuracy analysis is provided for a class of tridiagonal systems: the symmetric and anti-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error, and that the algorithm is a good candidate for emerging massively parallel machines.
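For orientation, the following is a minimal sketch of the sequential Thomas algorithm, a standard serial tridiagonal solver. It is not the PDD algorithm described in the abstract; it only illustrates the kind of system being solved. The names `thomas_solve` and `tridiag_matvec`, the storage convention (`a[0]` and `c[-1]` ignored), and the test matrix (a symmetric Toeplitz system with 4 on the diagonal and 1 off it) are assumptions chosen for illustration.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system (sub-diagonal a, diagonal b, super-diagonal c).

    a[0] and c[-1] are ignored. No pivoting is performed, so the matrix is
    assumed to be diagonally dominant (an assumption of this sketch).
    """
    n = len(d)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    # Forward elimination
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Back substitution
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def tridiag_matvec(a, b, c, x):
    """Multiply the tridiagonal matrix by x (used only to build a test problem)."""
    n = len(x)
    y = []
    for i in range(n):
        s = b[i] * x[i]
        if i > 0:
            s += a[i] * x[i - 1]
        if i < n - 1:
            s += c[i] * x[i + 1]
        y.append(s)
    return y


# Hypothetical test problem: symmetric Toeplitz tridiagonal matrix [1, 4, 1]
n = 6
a = [1.0] * n
b = [4.0] * n
c = [1.0] * n
x_true = [float(i + 1) for i in range(n)]
d = tridiag_matvec(a, b, c, x_true)

x = thomas_solve(a, b, c, d)
max_err = max(abs(u - v) for u, v in zip(x, x_true))
print("max abs error:", max_err)
```

The forward sweep here is inherently sequential, since each step depends on the previous one; that dependency is what parallel tridiagonal solvers have to work around.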
Optimization viewpoint on Kalman smoothing, with applications to robust and sparse estimation
In this paper, we present the optimization formulation of the Kalman
filtering and smoothing problems, and use this perspective to develop a variety
of extensions and applications. We first formulate classic Kalman smoothing as
a least squares problem, highlight special structure, and show that the classic
filtering and smoothing algorithms are equivalent to a particular algorithm for
solving this problem. Once this equivalence is established, we present
extensions of Kalman smoothing to systems with nonlinear process and
measurement models, systems with linear and nonlinear inequality constraints,
systems with outliers in the measurements or sudden changes in the state, and
systems where the sparsity of the state sequence must be accounted for. All
extensions preserve the computational efficiency of the classic algorithms, and
most of the extensions are illustrated with numerical examples, which are part
of an open source Kalman smoothing Matlab/Octave package.
Comment: 46 pages, 11 figures
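The equivalence between classic smoothing and least squares can be checked numerically on a toy problem. The sketch below assumes a scalar random-walk model (x[k+1] = x[k] + w, z[k] = x[k] + v) with hypothetical noise variances `q` and `r` and a Gaussian prior (`m0`, `p0`) on the first state. It runs a Kalman filter followed by a Rauch-Tung-Striebel backward pass, then evaluates the gradient of the corresponding weighted least squares objective at the smoothed estimate. All function names and data values are illustrative and are not taken from the package mentioned in the abstract.

```python
def kalman_rts_smoother(z, m0, p0, q, r):
    """Smoothed means for the scalar model x[k+1] = x[k] + w, z[k] = x[k] + v."""
    n = len(z)
    m_f = [0.0] * n   # filtered means
    p_f = [0.0] * n   # filtered variances
    preds = []        # predicted (mean, variance) before each update
    m_pred, p_pred = m0, p0
    for k in range(n):
        preds.append((m_pred, p_pred))
        gain = p_pred / (p_pred + r)
        m_f[k] = m_pred + gain * (z[k] - m_pred)
        p_f[k] = (1.0 - gain) * p_pred
        m_pred, p_pred = m_f[k], p_f[k] + q   # random-walk prediction
    # Backward (RTS) pass
    m_s = m_f[:]
    for k in range(n - 2, -1, -1):
        mp, pp = preds[k + 1]
        g = p_f[k] / pp
        m_s[k] = m_f[k] + g * (m_s[k + 1] - mp)
    return m_s


def least_squares_gradient(x, z, m0, p0, q, r):
    """Gradient of 0.5 * J(x), where
    J(x) = (x[0]-m0)^2/p0 + sum_k (z[k]-x[k])^2/r + sum_k (x[k+1]-x[k])^2/q.
    """
    n = len(x)
    g = [(x[k] - z[k]) / r for k in range(n)]
    g[0] += (x[0] - m0) / p0
    for k in range(n - 1):
        diff = (x[k + 1] - x[k]) / q
        g[k] -= diff
        g[k + 1] += diff
    return g


# Hypothetical measurements and model parameters
z = [0.9, 1.4, 1.1, 2.0, 2.3]
m0, p0, q, r = 0.0, 10.0, 0.5, 1.0

x_smooth = kalman_rts_smoother(z, m0, p0, q, r)
grad = least_squares_gradient(x_smooth, z, m0, p0, q, r)
grad_norm = max(abs(v) for v in grad)
print("smoothed states:", x_smooth)
print("max |gradient| at smoothed estimate:", grad_norm)
```

A gradient that vanishes up to rounding error indicates that the smoother output minimizes the quadratic objective, whose Hessian is tridiagonal for this scalar model.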
Parallel Factorizations in Numerical Analysis
In this paper we review the parallel solution of sparse linear systems,
usually arising from the discretization of ODE-IVPs or ODE-BVPs. The approach is
based on the concept of parallel factorization of a (block) tridiagonal matrix.
This makes it possible to obtain efficient parallel extensions of many known matrix
factorizations, and to derive, as a by-product, a unifying approach to the
parallel solution of ODEs.
Comment: 15 pages, 5 figures
Some fast elliptic solvers on parallel architectures and their complexities
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm, which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
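To make the idea of cyclic reduction concrete, here is a minimal sketch of the scalar version applied to an ordinary tridiagonal system. It is not the block algorithm (BCR) or the distributed-memory mapping discussed in the abstract; the function name `cyclic_reduction`, the storage convention (`a[0]` and `c[-1]` ignored), and the test matrix are assumptions for illustration. At each level the even-indexed unknowns are eliminated, leaving a tridiagonal system of about half the size in the odd-indexed unknowns.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive (scalar) cyclic reduction.

    a: sub-diagonal, b: diagonal, c: super-diagonal, d: right-hand side.
    a[0] and c[-1] are ignored. No pivoting is done, so diagonal dominance
    is assumed (an assumption of this sketch).
    """
    n = len(d)
    if n == 1:
        return [d[0] / b[0]]

    odd = list(range(1, n, 2))
    ra, rb, rc, rd = [], [], [], []
    # Reduction: eliminate even-indexed neighbours from each odd-indexed row.
    # The iterations of this loop are independent of one another.
    for i in odd:
        alpha = a[i] / b[i - 1]
        na = -alpha * a[i - 1]
        nb = b[i] - alpha * c[i - 1]
        nd = d[i] - alpha * d[i - 1]
        nc = 0.0
        if i + 1 < n:
            gamma = c[i] / b[i + 1]
            nb -= gamma * a[i + 1]
            nd -= gamma * d[i + 1]
            nc = -gamma * c[i + 1]
        ra.append(na)
        rb.append(nb)
        rc.append(nc)
        rd.append(nd)

    # Solve the half-size system for the odd-indexed unknowns.
    x_odd = cyclic_reduction(ra, rb, rc, rd)

    x = [0.0] * n
    for j, i in enumerate(odd):
        x[i] = x_odd[j]
    # Back substitution: recover even-indexed unknowns (also independent).
    for i in range(0, n, 2):
        s = d[i]
        if i > 0:
            s -= a[i] * x[i - 1]
        if i + 1 < n:
            s -= c[i] * x[i + 1]
        x[i] = s / b[i]
    return x


# Hypothetical test problem: Toeplitz tridiagonal matrix [-1, 4, -1], n = 7
n = 7
a = [-1.0] * n
b = [4.0] * n
c = [-1.0] * n
x_true = [float(i + 1) for i in range(n)]
d = []
for i in range(n):
    s = b[i] * x_true[i]
    if i > 0:
        s += a[i] * x_true[i - 1]
    if i < n - 1:
        s += c[i] * x_true[i + 1]
    d.append(s)

x = cyclic_reduction(a, b, c, d)
max_err = max(abs(u - v) for u, v in zip(x, x_true))
print("max abs error:", max_err)
```

Because the eliminations within a level do not depend on each other, each level can in principle be distributed across processors, with roughly log2(n) levels in total.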
Structure Preserving Parallel Algorithms for Solving the Bethe-Salpeter Eigenvalue Problem
The Bethe-Salpeter eigenvalue problem is a dense structured eigenvalue
problem arising from the discretized Bethe-Salpeter equation in the context of
computing exciton energies and states. A computational challenge is that at
least half of the eigenvalues and the associated eigenvectors are desired in
practice. We establish the equivalence between Bethe-Salpeter eigenvalue
problems and real Hamiltonian eigenvalue problems. Based on theoretical
analysis, structure preserving algorithms for a class of Bethe-Salpeter
eigenvalue problems are proposed. We also show that for this class of problems
all eigenvalues obtained from the Tamm-Dancoff approximation are overestimated.
In order to solve large scale problems of practical interest, we discuss
parallel implementations of our algorithms targeting distributed memory
systems. Several numerical examples are presented to demonstrate the efficiency
and accuracy of our algorithms.
Fast Algorithms for the computation of Fourier Extensions of arbitrary length
Fourier series of smooth, non-periodic functions on [-1, 1] are known to
exhibit the Gibbs phenomenon and overall slow convergence. One way of
overcoming these problems is by using a Fourier series on a larger domain, say
[-T, T] with T > 1, a technique called Fourier extension or Fourier
continuation. When constructed as the discrete least squares minimizer in
equidistant points, the Fourier extension has been shown to converge
geometrically in the truncation parameter N. A fast O(N log^2 N) algorithm has been described to compute Fourier extensions for the case
where T = 2, compared to O(N^3) for solving the dense discrete
least squares problem. We present two O(N log^2 N) algorithms for
the computation of these approximations for the case of general T, made
possible by exploiting the connection between Fourier extensions and Prolate
Spheroidal Wave theory. The first algorithm is based on the explicit
computation of so-called periodic discrete prolate spheroidal sequences, while
the second algorithm is purely algebraic and only implicitly based on the
theory.