Sparse approximate inverse preconditioners on high performance GPU platforms
Simulation with models based on partial differential equations often requires the solution of (sequences of) large and sparse algebraic linear systems. In multidimensional domains, preconditioned Krylov iterative solvers are often appropriate for this task. Therefore, the search for efficient preconditioners for Krylov subspace methods is a crucial theme. Recent developments, especially in computing hardware, have renewed the interest in approximate inverse preconditioners in factorized form, because their application during the solution process can be more efficient. We present here some experiences focused on the approximate inverse preconditioners proposed by Benzi and Tůma in 1996 and the sparsification and inversion proposed by van Duin in 1999. Computational costs, reorderings and implementation issues are considered both on conventional and innovative computing architectures like Graphics Processing Units (GPUs).
Lecture 10: Preconditioned Iterative Methods for Linear Systems
Iterative methods for the solution of linear systems of equations – such as stationary, semi-iterative, and Krylov subspace methods – are classical methods taught in numerical analysis courses, but adapting these methods to run efficiently at large scale on high-performance computers is challenging and a constantly evolving topic. Preconditioners – necessary to aid the convergence of iterative methods – come in many forms, from algebraic to physics-based, are regularly being developed for linear systems from different classes of problems, and similarly are evolving with high-performance computers. This lecture will cover the background and some recent developments on iterative methods and preconditioning in the context of high-performance parallel computers. Topics include asynchronous iterative methods that avoid the potentially high synchronization cost where there are very large numbers of computational threads, parallel sparse approximate inverse preconditioners, parallel incomplete factorization preconditioners and sparse triangular solvers, and preconditioning with hierarchical rank-structured matrices for kernel matrix equations.
Restarted Hessenberg method for solving shifted nonsymmetric linear systems
It is known that the restarted full orthogonalization method (FOM)
outperforms the restarted generalized minimum residual (GMRES) method in
several circumstances for solving shifted linear systems when the shifts are
handled simultaneously. Many variants of them have been proposed to enhance
their performance. We show that another restarted method, the restarted
Hessenberg method [M. Heyouni, Méthode de Hessenberg Généralisée et
Applications, Ph.D. Thesis, Université des Sciences et Technologies de Lille,
France, 1996] based on the Hessenberg procedure, can be employed effectively and
can accelerate convergence with respect to the number of
restarts. Theoretical analysis shows that the new residuals of the shifted restarted
Hessenberg method are still collinear with each other. In these cases the
proposed algorithm needs less CPU time to converge than the
earlier established restarted shifted FOM, weighted restarted shifted FOM, and
some other popular shifted iterative solvers based on short-term vector
recurrences, as shown via extensive numerical experiments involving the recently
popular application of solving time-fractional differential equations.
Comment: 19 pages, 7 tables. Some corrections for updating the reference
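As a brief illustration of why several shifts can be handled simultaneously, the standard shift-invariance property of Krylov subspaces can be written as follows. The notation (shifts \sigma_i, scalars \beta_m^{(i)}, seed residual r_m) and the sign convention A - \sigma_i I are assumptions made here for the sketch and are not taken from the abstract.

\[
(A - \sigma_i I)\, x^{(i)} = b, \quad i = 1, \dots, s,
\qquad
\mathcal{K}_m(A - \sigma_i I,\, r_0) = \mathcal{K}_m(A,\, r_0),
\]
\[
r_m^{(i)} = \beta_m^{(i)}\, r_m , \quad \beta_m^{(i)} \in \mathbb{C}.
\]

The first identity means one basis serves every shifted system; the second is the collinearity condition that lets a restarted method keep sharing that basis after each restart.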
MATEX: A Distributed Framework for Transient Simulation of Power Distribution Networks
We propose MATEX, a distributed framework for transient simulation of power
distribution networks (PDNs). MATEX utilizes a matrix exponential kernel with
Krylov subspace approximations to solve the differential equations of linear
circuits. First, the whole simulation task is divided into subtasks based on
decompositions of current sources, in order to reduce the computational
overheads. Then these subtasks are distributed to different computing nodes and
processed in parallel. Within each node, after the matrix factorization at the
beginning of simulation, the adaptive time stepping solver is performed without
extra matrix re-factorizations. MATEX overcomes the stiffness limitation of
previous matrix exponential-based circuit simulators by using the rational Krylov subspace
method, which leads to larger step sizes with smaller dimensions of Krylov
subspace bases and greatly accelerates the whole computation. MATEX outperforms
both traditional fixed and adaptive time stepping methods, e.g., achieving
around a 13X speedup over the trapezoidal framework with fixed time step for the IBM
power grid benchmarks.
Comment: ACM/IEEE DAC 2014. arXiv admin note: substantial text overlap with
arXiv:1505.0669
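For orientation, the generic Krylov approximation of a matrix exponential action, and the shift-and-invert form commonly used as a rational Krylov variant, can be sketched as below. This is textbook notation and not necessarily the exact formulation used in MATEX; the symbols A, v, h, V_m, H_m and the parameter \gamma are assumptions introduced for the sketch.

\[
e^{hA} v \;\approx\; \|v\|_2 \, V_m \, e^{h H_m} e_1,
\qquad
\operatorname{range}(V_m) = \mathcal{K}_m(A, v), \quad H_m = V_m^{T} A V_m,
\]
\[
\text{rational (shift-and-invert) variant:} \quad
\operatorname{range}(V_m) = \mathcal{K}_m\!\big((I - \gamma A)^{-1},\, v\big), \quad \gamma > 0 .
\]

In the shift-and-invert form the matrix I - \gamma A is factorized once and reused for every basis vector, which is consistent with the abstract's remark that no extra re-factorizations are needed during time stepping.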
Application of vector-valued rational approximations to the matrix eigenvalue problem and connections with Krylov subspace methods
Let F(z) be a vector-valued function F: C → C^N, which is analytic at z=0 and meromorphic in a neighborhood of z=0, and let its Maclaurin series be given. We use vector-valued rational approximation procedures for F(z) that are based on its Maclaurin series in conjunction with power iterations to develop bona fide generalizations of the power method for an arbitrary N × N matrix that may be diagonalizable or not. These generalizations can be used to obtain simultaneously several of the largest distinct eigenvalues and the corresponding invariant subspaces, and we present a detailed convergence theory for them. In addition, it is shown that the generalized power methods of this work are equivalent to some Krylov subspace methods, among them the methods of Arnoldi and Lanczos. Thus, the theory provides a set of completely new results and constructions for these Krylov subspace methods. This theory suggests at the same time a new mode of usage for these Krylov subspace methods that was observed to possess computational advantages over their common mode of usage.
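As a point of reference, the classical power method that the abstract generalizes can be sketched in a few lines. This is only the textbook single-eigenvalue baseline, not the vector-valued rational approximation procedure of the paper; the function names `matvec` and `power_method`, the tolerance, and the 2 × 2 test matrix are all illustrative assumptions.

```python
import math


def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def power_method(A, x0, tol=1e-12, max_iter=1000):
    """Classical power iteration (illustrative baseline only).

    Returns an estimate of the dominant eigenvalue and a unit-norm
    eigenvector estimate. Assumes a single eigenvalue of largest modulus.
    """
    norm = math.sqrt(sum(v * v for v in x0))
    x = [v / norm for v in x0]
    lam = 0.0
    for _ in range(max_iter):
        y = matvec(A, x)
        # Rayleigh quotient x^T A x (x has unit norm).
        lam_new = sum(xi * yi for xi, yi in zip(x, y))
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            raise ValueError("iterate fell into the null space of A")
        x = [v / norm for v in y]
        if abs(lam_new - lam) <= tol * max(1.0, abs(lam_new)):
            return lam_new, x
        lam = lam_new
    return lam, x


# Hypothetical 2 x 2 symmetric example with eigenvalues (5 +/- sqrt(5)) / 2.
A = [[2.0, 1.0], [1.0, 3.0]]
lam, vec = power_method(A, [1.0, 1.0])
```

The iterates A^k x0 generated here span the same Krylov subspace that Arnoldi and Lanczos work with, which is the connection the abstract refers to.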
Minimizing synchronizations in sparse iterative solvers for distributed supercomputers
Eliminating synchronizations is one of the important techniques related to minimizing communications for modern high performance computing. This paper discusses principles of reducing communications due to global synchronizations in sparse iterative solvers on distributed supercomputers. We demonstrate how to minimize global synchronizations by rescheduling a typical Krylov subspace method. The benefit of minimizing synchronizations is shown in theoretical analysis and is verified by numerical experiments using up to 900 processors. The experiments also show that the communication complexities for some structured sparse matrix-vector multiplications and for global communications on the underlying supercomputer are of the order P^(1/2.5) and P^(4/5), respectively, where P is the number of processors. The experiments were carried out on a Dawning 5000A.
Order reduction approaches for the algebraic Riccati equation and the LQR problem
We explore order reduction techniques for solving the algebraic Riccati
equation (ARE), and investigate the numerical solution of the
linear-quadratic regulator problem (LQR). A classical approach is to build a
surrogate low dimensional model of the dynamical system, for instance by means
of balanced truncation, and then solve the corresponding ARE. Alternatively,
iterative methods can be used to directly solve the ARE and use its approximate
solution to estimate quantities associated with the LQR. We propose a class of
Petrov-Galerkin strategies that simultaneously reduce the dynamical system
while approximately solving the ARE by projection. This methodology
significantly generalizes a recently developed Galerkin method by using a pair
of projection spaces, as it is often done in model order reduction of dynamical
systems. Numerical experiments illustrate the advantages of the new class of
methods over classical approaches when dealing with large matrices