753 research outputs found

    Differential qd algorithm with shifts for rank-structured matrices

    Although QR iterations dominate eigenvalue computations, there are several important cases in which alternative LR-type algorithms may be preferable. One is the symmetric tridiagonal case, where the differential qd algorithm with shifts (dqds) proposed by Fernando and Parlett often converges faster while preserving high relative accuracy (which the QR algorithm does not guarantee). In eigenvalue computations for rank-structured matrices the QR algorithm is also a popular choice since, in the symmetric case, the rank structure is preserved. In the unsymmetric case, however, the QR algorithm destroys the rank structure and, hence, LR-type algorithms come into play once again. In the current paper we discover several variants of qd algorithms for quasiseparable matrices. Remarkably, one of them, when applied to Hessenberg matrices, becomes a direct generalization of the dqds algorithm for tridiagonal matrices. Therefore, it can be applied to such important matrices as companion and confederate matrices, and provides an alternative algorithm for finding the roots of a polynomial represented in a basis of orthogonal polynomials. Results of preliminary numerical experiments are presented.
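
For readers unfamiliar with dqds, the following is a minimal sketch of the classical tridiagonal dqds sweep of Fernando and Parlett, not the quasiseparable variants the abstract describes. It assumes the tridiagonal matrix is given through qd variables `q` and `e`, so that it is similar to the symmetric matrix with diagonal `q[0], q[1] + e[0], ...` and off-diagonal `sqrt(q[i] * e[i])`. The function names are illustrative, and the driver uses zero shifts only, which is the slowest but simplest choice.

```python
def dqds_step(q, e, shift=0.0):
    """One differential qd sweep with shift on qd variables (q, e).

    The eigenvalues of the underlying tridiagonal matrix are reduced
    by `shift`; no subtraction other than the shift itself is used.
    """
    n = len(q)
    q_new = [0.0] * n
    e_new = [0.0] * (n - 1)
    d = q[0] - shift
    for i in range(n - 1):
        q_new[i] = d + e[i]
        t = q[i + 1] / q_new[i]
        e_new[i] = e[i] * t
        d = d * t - shift
    q_new[n - 1] = d
    return q_new, e_new


def dqd_eigenvalues(q, e, sweeps=100):
    """Zero-shift iteration: e tends to 0 and q tends to the eigenvalues."""
    for _ in range(sweeps):
        q, e = dqds_step(q, e)
    return q


# q = [4, 1], e = [1] corresponds to the matrix [[4, 2], [2, 2]].
approx = dqd_eigenvalues([4.0, 1.0], [1.0])
```

A practical implementation would choose shifts adaptively and deflate converged entries; both are omitted here.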

    Minimizing Communication for Eigenproblems and the Singular Value Decomposition

    Algorithms have two costs: arithmetic and communication. The latter represents the cost of moving data, either between levels of a memory hierarchy, or between processors over a network. Communication often dominates arithmetic and represents a rapidly increasing proportion of the total cost, so we seek algorithms that minimize communication. In \cite{BDHS10} lower bounds were presented on the amount of communication required for essentially all $O(n^3)$-like algorithms for linear algebra, including eigenvalue problems and the SVD. Conventional algorithms, including those currently implemented in (Sca)LAPACK, perform asymptotically more communication than these lower bounds require. In this paper we present parallel and sequential eigenvalue algorithms (for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms that do attain these lower bounds, and analyze their convergence and communication costs. Comment: 43 pages, 11 figures

    A new approximate matrix factorization for implicit time integration in air pollution modeling

    Implicit time stepping typically requires the solution of one or several linear systems with a matrix I−τJ per time step, where J is the Jacobian matrix. If solving these systems is expensive, replacing I−τJ with its approximate matrix factorization (AMF) (I−τR)(I−τV), R+V=J, often leads to a good compromise between stability and accuracy of the time integration on the one hand and its efficiency on the other. For example, in air pollution modeling, AMF has been successfully used in the framework of Rosenbrock schemes. The standard AMF approximates I−τJ with the error τ²RV, which can be significant in norm. In this paper we propose a new AMF. Under the assumption that −V is an M-matrix, the error of the new AMF can be shown to have an upper bound τ||R||, while still being asymptotically O(τ²). This new AMF, called AMF+, is equal in cost to the standard AMF and, as both analysis and numerical experiments reveal, provides better accuracy. We also report on our experience with another, cheaper AMF and with AMF-preconditioned GMRES.
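
The τ²RV error term of the standard AMF follows from expanding the product: (I−τR)(I−τV) = I − τ(R+V) + τ²RV. The sketch below checks this identity numerically on a small example. The 2×2 matrices and the helper names are made up for illustration and do not come from the paper; the AMF+ variant is not reproduced here.

```python
def matmul(A, B):
    """Dense matrix product for small lists-of-lists matrices."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def eye_minus(tau, M):
    """Return I - tau * M for a square matrix M."""
    n = len(M)
    return [[(1.0 if i == j else 0.0) - tau * M[i][j] for j in range(n)]
            for i in range(n)]


def amf_error(R, V, tau):
    """Return (I - tau R)(I - tau V) - (I - tau (R + V))."""
    n = len(R)
    J = [[R[i][j] + V[i][j] for j in range(n)] for i in range(n)]
    product = matmul(eye_minus(tau, R), eye_minus(tau, V))
    exact = eye_minus(tau, J)
    return [[product[i][j] - exact[i][j] for j in range(n)] for i in range(n)]


# Hypothetical splitting J = R + V, chosen only for illustration.
R = [[-2.0, 1.0], [0.0, -1.0]]
V = [[-1.0, 0.0], [1.0, -3.0]]
err = amf_error(R, V, tau=0.1)
```

Comparing `err` with `tau**2` times `matmul(R, V)` entry by entry confirms the error term stated in the abstract.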

    Solution of partial differential equations on vector and parallel computers

    The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

    Mixed-Precision Numerical Linear Algebra Algorithms: Integer Arithmetic Based LU Factorization and Iterative Refinement for Hermitian Eigenvalue Problem

    Mixed-precision algorithms are a class of algorithms that use low precision in part of the algorithm in order to save time and energy through less accurate computation and communication. These algorithms usually use iterative refinement to improve the approximate solution obtained in low precision to the accuracy expected from doing all the computation in high precision. Driven by the demands of deep learning applications, hardware developments now offer several low-precision formats, including half precision (FP16), Bfloat16, and integer operations on quantized integers, which use integers with a shared scalar to represent a set of equally spaced numbers. As new hardware architectures focus on performance in these formats, mixed-precision algorithms have more potential to leverage them and outperform traditional fixed-precision algorithms. This dissertation consists of two articles. In the first article, we adapt one of the most fundamental algorithms in numerical linear algebra, LU factorization with partial pivoting, to use integer arithmetic. With the goal of obtaining a low-accuracy factorization as the preconditioner of the generalized minimal residual method (GMRES) for solving systems of linear equations, the LU factorization is adapted to use two different fixed-point formats for the matrices L and U. A left-looking variant is also proposed for matrices with unbounded column growth. Finally, GMRES iterative refinement is shown to work on matrices with condition numbers up to 10000 with the algorithm that uses int16 as input and an int32 accumulator for the update step. The second article targets symmetric and Hermitian eigenvalue problems. In this section we revisit the SICE algorithm from Dongarra et al. By applying the Sherman-Morrison formula to the diagonally shifted tridiagonal systems, we propose an updated SICE-SM algorithm. By incorporating the latest two-stage algorithms from the PLASMA and MAGMA software libraries for numerical linear algebra, we achieved up to 3.6x speedup using the mixed-precision eigensolver with the blocked SICE-SM algorithm for iterative refinement, compared with full double complex precision solvers, for the cases where only a portion of the eigenvalues and eigenvectors is requested.
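
The general idea of mixed-precision iterative refinement can be illustrated with a much simpler setup than the one in the dissertation. The sketch below is classical iterative refinement, not the integer-arithmetic LU or GMRES-based refinement described above: the low-precision solve is simulated by rounding every operation to float32 via the standard `struct` module, and residuals are computed in double precision. It assumes a small, well-conditioned, diagonally dominant matrix so that elimination without pivoting is safe; all function names are illustrative.

```python
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_low(A, b):
    """Gaussian elimination without pivoting, each operation rounded to float32."""
    n = len(b)
    M = [[f32(v) for v in row] for row in A]
    y = [f32(v) for v in b]
    for k in range(n):
        for i in range(k + 1, n):
            m = f32(M[i][k] / M[k][k])
            for j in range(k, n):
                M[i][j] = f32(M[i][j] - f32(m * M[k][j]))
            y[i] = f32(y[i] - f32(m * y[k]))
    x = [0.0] * n
    for i in reversed(range(n)):
        s = y[i]
        for j in range(i + 1, n):
            s = f32(s - f32(M[i][j] * x[j]))
        x[i] = f32(s / M[i][i])
    return x


def refine(A, b, iters=6):
    """Low-precision solves corrected by double-precision residuals."""
    n = len(b)
    x = solve_low(A, b)
    for _ in range(iters):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        d = solve_low(A, r)  # a real code would reuse the factorization
        x = [x[i] + d[i] for i in range(n)]
    return x


A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [6.0, 12.0, 14.0]  # A times [1, 2, 3]
x = refine(A, b)
```

For simplicity the sketch repeats the elimination in every correction step; in practice the low-precision factorization is computed once and reused, which is where the time savings come from.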

    A bibliography on parallel and vector numerical algorithms

    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.

    Fast solvers for tridiagonal Toeplitz linear systems

    Let A be a tridiagonal Toeplitz matrix denoted by A=Tritoep(β,α,γ). The matrix A is said to be: strictly diagonally dominant if |α|>|β|+|γ|, weakly diagonally dominant if |α|≥|β|+|γ|, subdiagonally dominant if |β|≥|α|+|γ|, and superdiagonally dominant if |γ|≥|α|+|β|. In this paper, we consider the solution of a tridiagonal Toeplitz system Ax=b, where A is subdiagonally dominant, superdiagonally dominant, or weakly diagonally dominant, respectively. We first consider the case of A being subdiagonally dominant. We transform A into a block 2×2 matrix by an elementary transformation and then solve the resulting linear system using the block LU factorization. Compared with the LU factorization method with pivoting, our algorithm requires fewer flops, less memory storage, and less data transmission. In particular, our algorithm outperforms the LU factorization method with pivoting in terms of computing efficiency. Then, we deal with the superdiagonally dominant and weakly diagonally dominant cases, respectively. Numerical experiments are finally given to illustrate the effectiveness of our algorithms. National Natural Science Foundation of China under Grant no. 11371075, the Hunan Key Laboratory of mathematical modeling and analysis in engineering, the research innovation program of Changsha University of Science and Technology for postgraduate students under Grant (CX2019SS34), and the Portuguese Funds through FCT-Fundação para a Ciência, within the Project UIDB/00013/2020 and UIDP/00013/202
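
As a point of reference for the strictly diagonally dominant case, the sketch below solves Ax=b for A=Tritoep(β,α,γ) with the standard Thomas algorithm (LU elimination without pivoting). It is not the block 2×2 method proposed in the paper for the subdiagonally dominant case. It assumes β is the subdiagonal entry, α the diagonal entry, and γ the superdiagonal entry, which is how the dominance definitions above read; the function name is illustrative.

```python
def solve_tritoep(beta, alpha, gamma, b):
    """Solve Tritoep(beta, alpha, gamma) x = b by the Thomas algorithm.

    Assumes |alpha| > |beta| + |gamma|, so no pivoting is needed.
    """
    n = len(b)
    c = [0.0] * n  # scaled superdiagonal after forward elimination
    d = [0.0] * n  # scaled right-hand side after forward elimination
    c[0] = gamma / alpha
    d[0] = b[0] / alpha
    for i in range(1, n):
        denom = alpha - beta * c[i - 1]
        c[i] = gamma / denom
        d[i] = (b[i] - beta * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Tritoep(1, 4, 1) times [1, 2, 3, 4] gives [6, 12, 18, 19].
x = solve_tritoep(1.0, 4.0, 1.0, [6.0, 12.0, 18.0, 19.0])
```

When A is only subdiagonally or superdiagonally dominant, the pivots in this elimination can become small, which is the situation the paper's transformation is designed to handle.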