
    Minimizing Communication for Eigenproblems and the Singular Value Decomposition

    Algorithms have two costs: arithmetic and communication. The latter is the cost of moving data, either between levels of a memory hierarchy or between processors over a network. Communication often dominates arithmetic and represents a rapidly increasing proportion of the total cost, so we seek algorithms that minimize communication. In \cite{BDHS10}, lower bounds were presented on the amount of communication required for essentially all O(n^3)-like algorithms for linear algebra, including eigenvalue problems and the SVD. Conventional algorithms, including those currently implemented in (Sca)LAPACK, perform asymptotically more communication than these lower bounds require. In this paper we present parallel and sequential eigenvalue algorithms (for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms that do attain these lower bounds, and analyze their convergence and communication costs. Comment: 43 pages, 11 figures.
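    For orientation, the \cite{BDHS10} bounds referenced above take the following form for a sequential machine with a fast memory of size M (the parallel versions substitute the per-processor memory size); this restates the known bounds rather than adding anything new:

        W = \Omega\!\left( \frac{n^3}{\sqrt{M}} \right) \text{ words moved}, \qquad
        S = \Omega\!\left( \frac{n^3}{M^{3/2}} \right) \text{ messages}.

    Since classical unblocked algorithms move on the order of n^3 words, attaining these bounds saves a factor on the order of \sqrt{M} in data movement.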

    Efficient numerical diagonalization of Hermitian 3x3 matrices

    A very common problem in science is the numerical diagonalization of symmetric or Hermitian 3x3 matrices. Since standard "black box" packages may be too inefficient if the number of matrices is large, we study several alternatives. We consider optimized implementations of the Jacobi, QL, and Cuppen algorithms and compare them with an analytical method relying on Cardano's formula for the eigenvalues and on vector cross products for the eigenvectors. Jacobi is the most accurate, but also the slowest, method, while QL and Cuppen are good general-purpose algorithms. The analytical algorithm outperforms the others by more than a factor of 2, but becomes inaccurate or may even fail completely if the matrix entries differ greatly in magnitude. This can mostly be circumvented by using a hybrid method, which falls back to QL if conditions are such that the analytical calculation might become too inaccurate. For all algorithms, we give an overview of the underlying mathematical ideas and present detailed benchmark results. C and Fortran implementations of our code are available for download from http://www.mpi-hd.mpg.de/~globes/3x3/. Comment: 13 pages, no figures; new hybrid algorithm added; matches published version; typo in Eq. (39) corrected. Software library available at http://www.mpi-hd.mpg.de/~globes/3x3
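    As an illustration of the analytical method, the following is a minimal Python sketch of the trigonometric form of Cardano's formula for the eigenvalues of a real symmetric 3x3 matrix; it is a textbook formulation, not the authors' optimized C/Fortran code:

        import numpy as np

        def eigvals_sym3x3(A):
            # Eigenvalues of a real symmetric 3x3 matrix, ascending order,
            # via the trigonometric form of Cardano's formula.
            p1 = A[0, 1]**2 + A[0, 2]**2 + A[1, 2]**2
            if p1 == 0.0:
                return np.sort(np.diag(A))     # A is already diagonal
            q = np.trace(A) / 3.0              # mean of the eigenvalues
            p2 = ((A[0, 0] - q)**2 + (A[1, 1] - q)**2
                  + (A[2, 2] - q)**2 + 2.0 * p1)
            p = np.sqrt(p2 / 6.0)
            B = (A - q * np.eye(3)) / p        # eigenvalues of B lie in [-2, 2]
            r = np.clip(np.linalg.det(B) / 2.0, -1.0, 1.0)
            phi = np.arccos(r) / 3.0
            lam_max = q + 2.0 * p * np.cos(phi)
            lam_min = q + 2.0 * p * np.cos(phi + 2.0 * np.pi / 3.0)
            return np.array([lam_min, 3.0 * q - lam_max - lam_min, lam_max])

    The np.clip call guards against |r| marginally exceeding 1 in floating point; the loss of accuracy mentioned in the abstract enters through the subtractions forming B when entries differ greatly in magnitude, which is what the hybrid method's fallback to QL protects against.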

    Jacobians and rank 1 perturbations relating to unitary Hessenberg matrices

    In a recent work, Killip and Nenciu gave random recurrences for the characteristic polynomials of certain unitary and real orthogonal upper Hessenberg matrices. The corresponding eigenvalue p.d.f.'s are beta-generalizations of the classical groups. Left open was the direct calculation of certain Jacobians. We provide the sought direct calculation. Furthermore, we show how a multiplicative rank 1 perturbation of the unitary Hessenberg matrices provides a joint eigenvalue p.d.f. generalizing the circular beta-ensemble, and we show how this joint density is related to known inter-relations between circular ensembles. Projecting the joint density onto the real line leads to the derivation of a random three-term recurrence for polynomials with zeros distributed according to the circular Jacobi beta-ensemble. Comment: 23 pages.
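    For reference, the circular beta-ensemble that this joint density generalizes has the standard eigenvalue p.d.f.

        p(\theta_1, \dots, \theta_N) \;=\; \frac{1}{Z_{N,\beta}} \prod_{1 \le j < k \le N} \left| e^{i\theta_j} - e^{i\theta_k} \right|^{\beta},

    with beta = 1, 2, 4 recovering the classical circular orthogonal, unitary, and symplectic ensembles.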

    An O(log^2 N) parallel algorithm for computing the eigenvalues of a symmetric tridiagonal matrix

    An O(log^2 N) parallel algorithm is presented for computing the eigenvalues of a symmetric tridiagonal matrix using a parallel algorithm for computing the zeros of the characteristic polynomial. The method is based on a quadratic recurrence in which the characteristic polynomial is constructed on a binary tree from polynomials whose degree doubles at each level. Intervals that contain exactly one zero are determined by the zeros of polynomials at the previous level, which ensures that different processors compute different zeros. The exact behavior of the polynomials at the interval endpoints is used to eliminate the usual problems induced by finite-precision arithmetic.
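    The quadratic recurrence in question is the standard divide-and-conquer determinant identity for a symmetric tridiagonal matrix T with diagonal a and off-diagonal b. The following sequential Python sketch evaluates the characteristic polynomial at a point; the paper instead builds the polynomials bottom-up across a binary tree of processors, reusing the overlapping sub-blocks that this naive recursion recomputes:

        def charpoly(a, b, x, i, j):
            # det(T[i..j] - x*I) via the split
            #   p[i,j] = p[i,k] * p[k+1,j] - b[k]^2 * p[i,k-1] * p[k+2,j],
            # where empty blocks contribute 1.
            if i > j:
                return 1.0
            if i == j:
                return a[i] - x
            k = (i + j) // 2
            return (charpoly(a, b, x, i, k) * charpoly(a, b, x, k + 1, j)
                    - b[k]**2
                    * charpoly(a, b, x, i, k - 1)
                    * charpoly(a, b, x, k + 2, j))

        # Example: for a = [2.0, 2.0, 2.0, 2.0] and b = [-1.0, -1.0, -1.0],
        # charpoly(a, b, x, 0, 3) equals det(T - x*I).

    Sign changes of this polynomial between interval endpoints bracket individual eigenvalues, which is how disjoint zeros are assigned to different processors.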

    Subsampling Algorithms for Semidefinite Programming

    We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration, and the subsampling ratio explicitly controls granularity, i.e. the tradeoff between cost per iteration and total number of iterations. Furthermore, the total computational cost is directly proportional to the complexity (i.e. rank) of the solution. We study numerical performance on some large-scale problems arising in statistical learning. Comment: Final version, to appear in Stochastic Systems.
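    As a rough illustration of the subsampling idea only (this is not the paper's algorithm; the quadratic objective and spectahedron constraint below are hypothetical stand-ins), a projected stochastic gradient method with a subsampled gradient estimate might look like:

        import numpy as np

        def project_spectahedron(X):
            # Euclidean projection onto {X symmetric PSD, trace(X) = 1}:
            # eigendecompose, then project the eigenvalues onto the simplex.
            w, V = np.linalg.eigh((X + X.T) / 2.0)
            u = np.sort(w)[::-1]
            css = np.cumsum(u)
            rho = np.max(np.nonzero(u * np.arange(1, len(u) + 1) > css - 1.0)[0])
            theta = (css[rho] - 1.0) / (rho + 1)
            return (V * np.maximum(w - theta, 0.0)) @ V.T

        def subsampled_sgd(A, b, n, iters=300, batch=8, step=1.0, seed=0):
            # Minimize (1/m) * sum_i (<A_i, X> - b_i)^2 over the spectahedron,
            # estimating the gradient from `batch` random terms per iteration
            # (requires batch <= len(A)).
            rng = np.random.default_rng(seed)
            X = np.eye(n) / n                       # feasible starting point
            for t in range(iters):
                idx = rng.choice(len(A), size=batch, replace=False)
                G = (2.0 / batch) * sum(
                    (np.tensordot(A[i], X) - b[i]) * A[i] for i in idx)
                X = project_spectahedron(X - step / np.sqrt(t + 1.0) * G)
            return X

    The ratio batch/m plays the granularity role described in the abstract: smaller subsamples make each iteration cheaper at the price of requiring more iterations.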

    Computation of all eigenvalues of matrices used in restricted maximum likelihood estimation of variance components using sparse matrix techniques

    Restricted maximum likelihood (REML) estimates of variance components have desirable properties but can be very expensive computationally. The large cost results from the need for repeated inversion of the large coefficient matrix of the mixed-model equations. This paper presents a method based on the computation of all eigenvalues using the Lanczos method, a technique that reduces a large sparse symmetric matrix to tridiagonal form. Dense matrix inversion is not required. The method is accurate and makes only modest demands on storage. The Lanczos method, the computation of eigenvalues, its application in a genetic context, and an example are presented.
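    For concreteness, a minimal dense Python sketch of the Lanczos reduction to tridiagonal form, with full reorthogonalization for numerical stability (the actual computations exploit sparsity, which this sketch does not attempt):

        import numpy as np

        def lanczos_eigvals(A, m, seed=0):
            # Reduce symmetric A to an m x m tridiagonal T by the Lanczos
            # recurrence; the eigenvalues of T (Ritz values) approximate those
            # of A, and in exact arithmetic equal them when m = A.shape[0].
            n = A.shape[0]
            rng = np.random.default_rng(seed)
            Q = np.zeros((n, m))
            alpha, beta = np.zeros(m), np.zeros(m - 1)
            q = rng.standard_normal(n)
            Q[:, 0] = q / np.linalg.norm(q)
            for j in range(m):
                w = A @ Q[:, j]
                alpha[j] = Q[:, j] @ w
                # full reorthogonalization against all previous vectors
                w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
                if j + 1 < m:
                    beta[j] = np.linalg.norm(w)   # breakdown (= 0) ignored here
                    Q[:, j + 1] = w / beta[j]
            T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
            return np.linalg.eigvalsh(T)

    Only matrix-vector products with A are required, so the large sparse coefficient matrix never has to be inverted or formed densely, which is the point of the approach.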

    Lanczos eigensolution method for high-performance computers

    The theory, computational analysis, and applications of a Lanczos algorithm on high-performance computers are presented. The computationally intensive steps of the algorithm are identified as the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplies. These computational steps are optimized to exploit the vector and parallel capabilities of high-performance computers. The savings in computational time from applying optimization techniques such as variable-band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large-scale structural analysis applications are described: the buckling of a composite blade-stiffened panel with a cutout, and the vibration analysis of a high-speed civil transport. The sequential computational time of 181.6 seconds for the panel problem executed on a CONVEX computer was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time for the transport problem, with 17,000 degrees of freedom, was 23 seconds on the Cray Y-MP using an average of 3.63 processors.
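    The three computational kernels identified above are exactly those of shift-inverted Lanczos. A small SciPy sketch on a stand-in problem (the matrices and shift are hypothetical, not the structural models from the paper):

        import numpy as np
        import scipy.sparse as sp
        import scipy.sparse.linalg as spla

        # Stand-in generalized eigenproblem K x = lam * M x
        n = 2000
        K = sp.diags([-np.ones(n - 1), 2.0 * np.ones(n), -np.ones(n - 1)],
                     [-1, 0, 1], format='csc')              # "stiffness"
        M = sp.diags(np.full(n, 1.0 / n), 0, format='csc')  # "mass"

        # Shift-inverted Lanczos: ARPACK factorizes (K - sigma*M) once, then
        # each Lanczos step costs one forward/backward solve plus one
        # multiply by M -- the three kernels named in the abstract.
        vals, vecs = spla.eigsh(K, k=8, M=M, sigma=0.0, which='LM')
        print(vals)   # the 8 eigenvalues nearest the shift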