
    Applications and accuracy of the parallel diagonal dominant algorithm

    The Parallel Diagonal Dominant (PDD) algorithm is a highly efficient, ideally scalable tridiagonal solver. In this paper, a detailed study of the PDD algorithm is given. First, the PDD algorithm is introduced. Then the algorithm is extended to solve periodic tridiagonal systems. A variant, the reduced PDD algorithm, is also proposed. Accuracy analysis is provided for a class of tridiagonal systems: the symmetric and anti-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error, and that the algorithm is a good candidate for the emerging massively parallel machines.
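
    For orientation, the kind of system this abstract refers to can be sketched with the classic sequential Thomas algorithm. This is not the PDD algorithm itself, only a serial baseline of the sort that partition-based parallel solvers are usually measured against. The function names and the test matrix (sub-diagonal 1, diagonal 4, super-diagonal 1) are illustrative assumptions, not taken from the paper.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the sequential Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused);
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    No pivoting is performed, so diagonal dominance is assumed.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination of the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def toeplitz_tridiag_matvec(sub, main, sup, x):
    """Multiply the Toeplitz tridiagonal matrix (sub, main, sup) by x."""
    n = len(x)
    y = []
    for i in range(n):
        total = main * x[i]
        if i > 0:
            total += sub * x[i - 1]
        if i < n - 1:
            total += sup * x[i + 1]
        y.append(total)
    return y


# Hypothetical example: a symmetric, diagonally dominant Toeplitz system.
x_true = [float(i) for i in range(1, 9)]
b = toeplitz_tridiag_matvec(1.0, 4.0, 1.0, x_true)
n = len(x_true)
x = thomas_solve([1.0] * n, [4.0] * n, [1.0] * n, b)
print(max(abs(xi - ti) for xi, ti in zip(x, x_true)))
```

    The forward sweep is inherently sequential. That is the motivation for partitioned solvers, which split the rows across processors and couple the pieces through a much smaller system.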

    Optimization viewpoint on Kalman smoothing, with applications to robust and sparse estimation

    In this paper, we present the optimization formulation of the Kalman filtering and smoothing problems, and use this perspective to develop a variety of extensions and applications. We first formulate classic Kalman smoothing as a least squares problem, highlight special structure, and show that the classic filtering and smoothing algorithms are equivalent to a particular algorithm for solving this problem. Once this equivalence is established, we present extensions of Kalman smoothing to systems with nonlinear process and measurement models, systems with linear and nonlinear inequality constraints, systems with outliers in the measurements or sudden changes in the state, and systems where the sparsity of the state sequence must be accounted for. All extensions preserve the computational efficiency of the classic algorithms, and most of the extensions are illustrated with numerical examples, which are part of an open source Kalman smoothing Matlab/Octave package. Comment: 46 pages, 11 figures
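
    The equivalence described in this abstract can be checked numerically on a toy problem. The sketch below assumes a scalar random-walk model (state x_{k+1} = x_k + w_k, measurement y_k = x_k + v_k, with variances q and r and a Gaussian prior with mean x0 and variance p0). It compares the minimizer of the corresponding least squares objective with the output of a Kalman filter followed by a Rauch-Tung-Striebel backward pass. The model, the function names, and the data are illustrative assumptions and are not taken from the paper or its software package.

```python
def smooth_least_squares(y, q, r, x0, p0):
    """Minimize (x_1-x0)^2/p0 + sum (y_k-x_k)^2/r + sum (x_{k+1}-x_k)^2/q.

    The normal equations are tridiagonal, so they are solved directly
    by forward elimination and back substitution.
    """
    n = len(y)
    off = -1.0 / q  # constant off-diagonal entry
    diag = [
        1.0 / r
        + (1.0 / p0 if k == 0 else 1.0 / q)
        + (1.0 / q if k < n - 1 else 0.0)
        for k in range(n)
    ]
    rhs = [yk / r for yk in y]
    rhs[0] += x0 / p0
    for k in range(1, n):
        w = off / diag[k - 1]
        diag[k] -= w * off
        rhs[k] -= w * rhs[k - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (rhs[k] - off * x[k + 1]) / diag[k]
    return x


def smooth_kalman_rts(y, q, r, x0, p0):
    """Kalman filter forward pass followed by an RTS backward pass."""
    n = len(y)
    m_pred, p_pred = [0.0] * n, [0.0] * n
    m_filt, p_filt = [0.0] * n, [0.0] * n
    m, p = x0, p0
    for k in range(n):
        m_pred[k], p_pred[k] = m, p
        gain = p / (p + r)            # measurement update
        m = m + gain * (y[k] - m)
        p = (1.0 - gain) * p
        m_filt[k], p_filt[k] = m, p
        p = p + q                     # prediction: mean stays, variance grows
    xs = m_filt[:]
    for k in range(n - 2, -1, -1):
        g = p_filt[k] / p_pred[k + 1]
        xs[k] = m_filt[k] + g * (xs[k + 1] - m_pred[k + 1])
    return xs


# Hypothetical measurements, chosen only for illustration.
y = [0.3, 1.1, 0.8, 1.9, 2.4, 2.1]
a = smooth_least_squares(y, q=0.5, r=1.0, x0=0.0, p0=10.0)
b = smooth_kalman_rts(y, q=0.5, r=1.0, x0=0.0, p0=10.0)
print(max(abs(ai - bi) for ai, bi in zip(a, b)))
```

    The two estimates should agree to rounding error, since for a linear Gaussian model the smoothed means are the minimizer of the quadratic objective.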

    Parallel Factorizations in Numerical Analysis

    In this paper we review the parallel solution of sparse linear systems, usually arising from the discretization of ODE-IVPs or ODE-BVPs. The approach is based on the concept of parallel factorization of a (block) tridiagonal matrix. This makes it possible to obtain efficient parallel extensions of many known matrix factorizations, and to derive, as a by-product, a unifying approach to the parallel solution of ODEs. Comment: 15 pages, 5 figures

    Some fast elliptic solvers on parallel architectures and their complexities

    The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm, which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.

    Structure Preserving Parallel Algorithms for Solving the Bethe-Salpeter Eigenvalue Problem

    The Bethe-Salpeter eigenvalue problem is a dense structured eigenvalue problem arising from the discretized Bethe-Salpeter equation in the context of computing exciton energies and states. A computational challenge is that at least half of the eigenvalues and the associated eigenvectors are desired in practice. We establish the equivalence between Bethe-Salpeter eigenvalue problems and real Hamiltonian eigenvalue problems. Based on theoretical analysis, structure preserving algorithms for a class of Bethe-Salpeter eigenvalue problems are proposed. We also show that for this class of problems all eigenvalues obtained from the Tamm-Dancoff approximation are overestimated. In order to solve large scale problems of practical interest, we discuss parallel implementations of our algorithms targeting distributed memory systems. Several numerical examples are presented to demonstrate the efficiency and accuracy of our algorithms.

    Fast Algorithms for the computation of Fourier Extensions of arbitrary length

    Fourier series of smooth, non-periodic functions on $[-1,1]$ are known to exhibit the Gibbs phenomenon, and exhibit overall slow convergence. One way of overcoming these problems is by using a Fourier series on a larger domain, say $[-T,T]$ with $T>1$, a technique called Fourier extension or Fourier continuation. When constructed as the discrete least squares minimizer in equidistant points, the Fourier extension has been shown to converge geometrically in the truncation parameter $N$. A fast ${\mathcal O}(N \log^2 N)$ algorithm has been described to compute Fourier extensions for the case where $T=2$, compared to ${\mathcal O}(N^3)$ for solving the dense discrete least squares problem. We present two ${\mathcal O}(N \log^2 N)$ algorithms for the computation of these approximations for the case of general $T$, made possible by exploiting the connection between Fourier extensions and Prolate Spheroidal Wave theory. The first algorithm is based on the explicit computation of so-called periodic discrete prolate spheroidal sequences, while the second algorithm is purely algebraic and only implicitly based on the theory.
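
    As a hedged illustration of the discrete least squares construction mentioned in this abstract, the problem can be written as follows. The notation is an assumption introduced here and may differ from the paper's: $f$ is the function to approximate, $x_1, \dots, x_M$ are equidistant points in $[-1,1]$, and $c_k$ are the coefficients of the extension, which is $2T$-periodic.

$$
c \;=\; \arg\min_{c \in \mathbb{C}^{2N+1}} \; \sum_{j=1}^{M} \Bigl|\, f(x_j) \;-\; \sum_{k=-N}^{N} c_k \, e^{\, i \pi k x_j / T} \Bigr|^2
$$

    Solving this with a dense least squares solver is what leads to the ${\mathcal O}(N^3)$ cost quoted above, assuming $M$ is proportional to $N$.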