
    Minimizing Communication for Eigenproblems and the Singular Value Decomposition

    Algorithms have two costs: arithmetic and communication. The latter is the cost of moving data, either between levels of a memory hierarchy or between processors over a network. Communication often dominates arithmetic and represents a rapidly increasing proportion of the total cost, so we seek algorithms that minimize communication. In \cite{BDHS10}, lower bounds were presented on the amount of communication required for essentially all O(n^3)-like algorithms for linear algebra, including eigenvalue problems and the SVD. Conventional algorithms, including those currently implemented in (Sca)LAPACK, perform asymptotically more communication than these lower bounds require. In this paper we present parallel and sequential eigenvalue algorithms (for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms that do attain these lower bounds, and analyze their convergence and communication costs. Comment: 43 pages, 11 figures.
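    For orientation, the \cite{BDHS10} bounds referenced above take the following form for a sequential machine with a fast memory of size M (the parallel versions substitute the per-processor memory size); this restates the known bounds rather than adding anything new:

        W = \Omega\!\left( \frac{n^3}{\sqrt{M}} \right) \text{ words moved}, \qquad
        S = \Omega\!\left( \frac{n^3}{M^{3/2}} \right) \text{ messages}.

    Since classical unblocked algorithms move on the order of n^3 words, attaining these bounds saves a factor on the order of \sqrt{M} in data movement.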

    Efficient numerical diagonalization of Hermitian 3x3 matrices

    A very common problem in science is the numerical diagonalization of symmetric or Hermitian 3x3 matrices. Since standard "black box" packages may be too inefficient if the number of matrices is large, we study several alternatives. We consider optimized implementations of the Jacobi, QL, and Cuppen algorithms and compare them with an analytical method relying on Cardano's formula for the eigenvalues and on vector cross products for the eigenvectors. Jacobi is the most accurate, but also the slowest, method, while QL and Cuppen are good general-purpose algorithms. The analytical algorithm outperforms the others by more than a factor of 2, but becomes inaccurate or may even fail completely if the matrix entries differ greatly in magnitude. This can mostly be circumvented by using a hybrid method, which falls back to QL if conditions are such that the analytical calculation might become too inaccurate. For all algorithms, we give an overview of the underlying mathematical ideas and present detailed benchmark results. C and Fortran implementations of our code are available for download from http://www.mpi-hd.mpg.de/~globes/3x3/. Comment: 13 pages, no figures; new hybrid algorithm added; matches published version; typo in Eq. (39) corrected. Software library available at http://www.mpi-hd.mpg.de/~globes/3x3
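    As an illustration of the analytical method, the following is a minimal Python sketch of the trigonometric form of Cardano's formula for the eigenvalues of a real symmetric 3x3 matrix; it is a textbook formulation, not the authors' optimized C/Fortran code:

        import numpy as np

        def eigvals_sym3x3(A):
            # Eigenvalues of a real symmetric 3x3 matrix, ascending order,
            # via the trigonometric form of Cardano's formula.
            p1 = A[0, 1]**2 + A[0, 2]**2 + A[1, 2]**2
            if p1 == 0.0:
                return np.sort(np.diag(A))     # A is already diagonal
            q = np.trace(A) / 3.0              # mean of the eigenvalues
            p2 = ((A[0, 0] - q)**2 + (A[1, 1] - q)**2
                  + (A[2, 2] - q)**2 + 2.0 * p1)
            p = np.sqrt(p2 / 6.0)
            B = (A - q * np.eye(3)) / p        # eigenvalues of B lie in [-2, 2]
            r = np.clip(np.linalg.det(B) / 2.0, -1.0, 1.0)
            phi = np.arccos(r) / 3.0
            lam_max = q + 2.0 * p * np.cos(phi)
            lam_min = q + 2.0 * p * np.cos(phi + 2.0 * np.pi / 3.0)
            return np.array([lam_min, 3.0 * q - lam_max - lam_min, lam_max])

    The np.clip call guards against |r| marginally exceeding 1 in floating point; the loss of accuracy mentioned in the abstract enters through the subtractions forming B when entries differ greatly in magnitude, which is what the hybrid method's fallback to QL protects against.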

    Jacobians and rank 1 perturbations relating to unitary Hessenberg matrices

    In a recent work, Killip and Nenciu gave random recurrences for the characteristic polynomials of certain unitary and real orthogonal upper Hessenberg matrices. The corresponding eigenvalue p.d.f.'s are beta-generalizations of the classical groups. Left open was the direct calculation of certain Jacobians. We provide the sought direct calculation. Furthermore, we show how a multiplicative rank 1 perturbation of the unitary Hessenberg matrices provides a joint eigenvalue p.d.f. generalizing the circular beta-ensemble, and we show how this joint density is related to known inter-relations between circular ensembles. Projecting the joint density onto the real line leads to the derivation of a random three-term recurrence for polynomials with zeros distributed according to the circular Jacobi beta-ensemble. Comment: 23 pages.
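    For reference, the circular beta-ensemble that this joint density generalizes has the standard eigenvalue p.d.f.

        p(\theta_1, \dots, \theta_N) \;=\; \frac{1}{Z_{N,\beta}} \prod_{1 \le j < k \le N} \left| e^{i\theta_j} - e^{i\theta_k} \right|^{\beta},

    with beta = 1, 2, 4 recovering the classical circular orthogonal, unitary, and symplectic ensembles.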

    An O(log^2 N) parallel algorithm for computing the eigenvalues of a symmetric tridiagonal matrix

    An O(log^2 N) parallel algorithm is presented for computing the eigenvalues of a symmetric tridiagonal matrix using a parallel algorithm for computing the zeros of the characteristic polynomial. The method is based on a quadratic recurrence in which the characteristic polynomial is constructed on a binary tree from polynomials whose degree doubles at each level. Intervals that contain exactly one zero are determined by the zeros of polynomials at the previous level, which ensures that different processors compute different zeros. The exact behavior of the polynomials at the interval endpoints is used to eliminate the usual problems induced by finite-precision arithmetic.
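    The quadratic recurrence in question is the standard divide-and-conquer determinant identity for a symmetric tridiagonal matrix T with diagonal a and off-diagonal b. The following sequential Python sketch evaluates the characteristic polynomial at a point; the paper instead builds the polynomials bottom-up across a binary tree of processors, reusing the overlapping sub-blocks that this naive recursion recomputes:

        def charpoly(a, b, x, i, j):
            # det(T[i..j] - x*I) via the split
            #   p[i,j] = p[i,k] * p[k+1,j] - b[k]^2 * p[i,k-1] * p[k+2,j],
            # where empty blocks contribute 1.
            if i > j:
                return 1.0
            if i == j:
                return a[i] - x
            k = (i + j) // 2
            return (charpoly(a, b, x, i, k) * charpoly(a, b, x, k + 1, j)
                    - b[k]**2
                    * charpoly(a, b, x, i, k - 1)
                    * charpoly(a, b, x, k + 2, j))

        # Example: for a = [2.0, 2.0, 2.0, 2.0] and b = [-1.0, -1.0, -1.0],
        # charpoly(a, b, x, 0, 3) equals det(T - x*I).

    Sign changes of this polynomial between interval endpoints bracket individual eigenvalues, which is how disjoint zeros are assigned to different processors.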

    Subsampling Algorithms for Semidefinite Programming

    We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration, and the subsampling ratio explicitly controls granularity, i.e. the tradeoff between cost per iteration and total number of iterations. Furthermore, the total computational cost is directly proportional to the complexity (i.e. rank) of the solution. We study numerical performance on some large-scale problems arising in statistical learning. Comment: Final version, to appear in Stochastic Systems.
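    As a rough illustration of the subsampling idea only (this is not the paper's algorithm; the quadratic objective and spectahedron constraint below are hypothetical stand-ins), a projected stochastic gradient method with a subsampled gradient estimate might look like:

        import numpy as np

        def project_spectahedron(X):
            # Euclidean projection onto {X symmetric PSD, trace(X) = 1}:
            # eigendecompose, then project the eigenvalues onto the simplex.
            w, V = np.linalg.eigh((X + X.T) / 2.0)
            u = np.sort(w)[::-1]
            css = np.cumsum(u)
            rho = np.max(np.nonzero(u * np.arange(1, len(u) + 1) > css - 1.0)[0])
            theta = (css[rho] - 1.0) / (rho + 1)
            return (V * np.maximum(w - theta, 0.0)) @ V.T

        def subsampled_sgd(A, b, n, iters=300, batch=8, step=1.0, seed=0):
            # Minimize (1/m) * sum_i (<A_i, X> - b_i)^2 over the spectahedron,
            # estimating the gradient from `batch` random terms per iteration
            # (requires batch <= len(A)).
            rng = np.random.default_rng(seed)
            X = np.eye(n) / n                       # feasible starting point
            for t in range(iters):
                idx = rng.choice(len(A), size=batch, replace=False)
                G = (2.0 / batch) * sum(
                    (np.tensordot(A[i], X) - b[i]) * A[i] for i in idx)
                X = project_spectahedron(X - step / np.sqrt(t + 1.0) * G)
            return X

    The ratio batch/m plays the granularity role described in the abstract: smaller subsamples make each iteration cheaper at the price of requiring more iterations.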

    Computation of all eigenvalues of matrices used in restricted maximum likelihood estimation of variance components using sparse matrix techniques

    Restricted maximum likelihood (REML) estimates of variance components have desirable properties but can be very expensive computationally. The large cost results from the need for repeated inversion of the large coefficient matrix of the mixed-model equations. This paper presents a method based on the computation of all eigenvalues using the Lanczos method, a technique that reduces a large sparse symmetric matrix to tridiagonal form. Dense matrix inversion is not required. The method is accurate and makes only modest demands on storage. The Lanczos method, the computation of eigenvalues, its application in a genetic context, and an example are presented.
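    For concreteness, a minimal dense Python sketch of the Lanczos reduction to tridiagonal form, with full reorthogonalization for numerical stability (the actual computations exploit sparsity, which this sketch does not attempt):

        import numpy as np

        def lanczos_eigvals(A, m, seed=0):
            # Reduce symmetric A to an m x m tridiagonal T by the Lanczos
            # recurrence; the eigenvalues of T (Ritz values) approximate those
            # of A, and in exact arithmetic equal them when m = A.shape[0].
            n = A.shape[0]
            rng = np.random.default_rng(seed)
            Q = np.zeros((n, m))
            alpha, beta = np.zeros(m), np.zeros(m - 1)
            q = rng.standard_normal(n)
            Q[:, 0] = q / np.linalg.norm(q)
            for j in range(m):
                w = A @ Q[:, j]
                alpha[j] = Q[:, j] @ w
                # full reorthogonalization against all previous vectors
                w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
                if j + 1 < m:
                    beta[j] = np.linalg.norm(w)   # breakdown (= 0) ignored here
                    Q[:, j + 1] = w / beta[j]
            T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
            return np.linalg.eigvalsh(T)

    Only matrix-vector products with A are required, so the large sparse coefficient matrix never has to be inverted or formed densely, which is the point of the approach.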

    Lanczos eigensolution method for high-performance computers

    The theory, computational analysis, and applications of a Lanczos algorithm on high-performance computers are presented. The computationally intensive steps of the algorithm are identified as the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplies. These computational steps are optimized to exploit the vector and parallel capabilities of high-performance computers. The savings in computational time from applying optimization techniques such as variable-band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large-scale structural analysis applications are described: the buckling of a composite blade-stiffened panel with a cutout, and the vibration analysis of a high-speed civil transport. The sequential computational time of 181.6 seconds for the panel problem executed on a CONVEX computer was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time for the transport problem, with 17,000 degrees of freedom, was 23 seconds on the Cray Y-MP using an average of 3.63 processors.
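    The three computational kernels identified above are exactly those of shift-inverted Lanczos. A small SciPy sketch on a stand-in problem (the matrices and shift are hypothetical, not the structural models from the paper):

        import numpy as np
        import scipy.sparse as sp
        import scipy.sparse.linalg as spla

        # Stand-in generalized eigenproblem K x = lam * M x
        n = 2000
        K = sp.diags([-np.ones(n - 1), 2.0 * np.ones(n), -np.ones(n - 1)],
                     [-1, 0, 1], format='csc')              # "stiffness"
        M = sp.diags(np.full(n, 1.0 / n), 0, format='csc')  # "mass"

        # Shift-inverted Lanczos: ARPACK factorizes (K - sigma*M) once, then
        # each Lanczos step costs one forward/backward solve plus one
        # multiply by M -- the three kernels named in the abstract.
        vals, vecs = spla.eigsh(K, k=8, M=M, sigma=0.0, which='LM')
        print(vals)   # the 8 eigenvalues nearest the shift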