3,468 research outputs found

    A methodology for exploiting parallelism in the finite element process

    A methodology is described for developing a parallel system using a top-down approach that takes the user's requirements into account. Substructuring, a popular technique in structural analysis, is used to illustrate this approach.

    ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers

    Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures. Comment: 55 pages, 14 figures, 2 tables

    Minimizing Communication for Eigenproblems and the Singular Value Decomposition

    Algorithms have two costs: arithmetic and communication. The latter represents the cost of moving data, either between levels of a memory hierarchy, or between processors over a network. Communication often dominates arithmetic and represents a rapidly increasing proportion of the total cost, so we seek algorithms that minimize communication. In [BDHS10] lower bounds were presented on the amount of communication required for essentially all $O(n^3)$-like algorithms for linear algebra, including eigenvalue problems and the SVD. Conventional algorithms, including those currently implemented in (Sca)LAPACK, perform asymptotically more communication than these lower bounds require. In this paper we present parallel and sequential eigenvalue algorithms (for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms that do attain these lower bounds, and analyze their convergence and communication costs. Comment: 43 pages, 11 figures

    Efficient approximation of functions of some large matrices by partial fraction expansions

    Some important applied problems require the evaluation of functions $\Psi$ of large and sparse and/or localized matrices $A$. Popular and interesting techniques for computing $\Psi(A)$ and $\Psi(A)\mathbf{v}$, where $\mathbf{v}$ is a vector, are based on partial fraction expansions. However, some of these techniques require solving several linear systems whose matrices differ from $A$ by a complex multiple of the identity matrix $I$ for computing $\Psi(A)\mathbf{v}$, or require inverting sequences of matrices with the same characteristics for computing $\Psi(A)$. Here we study the use and the convergence of a recent technique for generating sequences of incomplete factorizations of matrices in order to address both of these issues. The solution of the sequences of linear systems and approximate matrix inversions above can be computed efficiently provided that $A^{-1}$ shows certain decay properties. These strategies have good potential for parallelization. Our claims are confirmed by numerical tests.
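
For orientation, a partial fraction expansion of the kind the abstract refers to can be written generically as below. The number of terms $N$, the weights $c_j$, and the poles $\xi_j$ are notation assumed here for illustration; the abstract does not fix them, and some expansions also carry an additional polynomial term.

```latex
\Psi(A)\,\mathbf{v} \;\approx\; \sum_{j=1}^{N} c_j \,\bigl(A + \xi_j I\bigr)^{-1}\mathbf{v},
\qquad c_j,\ \xi_j \in \mathbb{C}.
```

Each term requires one linear solve with a matrix of the form $A + \xi_j I$, which is the family of shifted systems mentioned in the abstract, and the $N$ solves are independent of one another.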

    Solution of partial differential equations on vector and parallel computers

    The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

    The use of Lanczos's method to solve the large generalized symmetric definite eigenvalue problem

    The generalized eigenvalue problem, $Kx = \lambda Mx$, is of significant practical importance, especially in structural engineering, where it arises as the vibration and buckling problem. A new algorithm, LANZ, based on Lanczos's method is developed. LANZ uses a technique called dynamic shifting to improve the efficiency and reliability of the Lanczos algorithm. A new algorithm for solving the tridiagonal eigenvalue problems that arise when using Lanczos's method is described. A modification of Parlett and Scott's selective orthogonalization algorithm is proposed. Results from an implementation of LANZ on a Convex C-220 show it to be superior to a subspace iteration code.
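
As a rough illustration of the problem class only, the sketch below solves a small symmetric definite problem $Kx = \lambda Mx$ with SciPy's `eigsh`, which wraps ARPACK's implicitly restarted Lanczos method in shift-invert mode. This is not LANZ: the shift `sigma` is fixed rather than chosen dynamically, and the matrices (a 1D stiffness-like `K` and a mass-like `M`), the size `n`, and the helper name `smallest_modes` are assumptions made for the example.

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import eigh
from scipy.sparse.linalg import eigsh


def smallest_modes(K, M, k=3, sigma=0.0):
    """Return the k eigenpairs of K x = lambda M x closest to sigma.

    Hypothetical helper: uses ARPACK shift-invert Lanczos through SciPy,
    with a single fixed shift (no dynamic shifting as in LANZ).
    """
    vals, vecs = eigsh(K, k=k, M=M, sigma=sigma, which="LM")
    order = np.argsort(vals)
    return vals[order], vecs[:, order]


# Assumed test matrices: tridiagonal, symmetric, M positive definite.
n = 200
K = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
M = sp.diags([1.0, 4.0, 1.0], [-1, 0, 1], shape=(n, n), format="csc") / 6.0

vals, vecs = smallest_modes(K, M)

# Residual of K x - lambda M x for each computed eigenpair.
residual = np.linalg.norm(K @ vecs - (M @ vecs) * vals, axis=0)
print("eigenvalues:", vals)
print("residual norms:", residual)
```

With `sigma=0.0` the solver targets the eigenvalues nearest zero, which for vibration problems correspond to the lowest modes; a dense `scipy.linalg.eigh(K.toarray(), M.toarray())` call can be used to cross-check the values on a problem this small.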

    Numerical methods for large-scale Lyapunov equations with symmetric banded data

    The numerical solution of large-scale Lyapunov matrix equations with symmetric banded data has so far received little attention in the rich literature on Lyapunov equations. We aim to contribute to this open problem by introducing two efficient solution methods, which respectively address the cases of well-conditioned and ill-conditioned coefficient matrices. The proposed approaches conveniently exploit the possibly hidden structure of the solution matrix so as to deliver memory- and computation-saving approximate solutions. Numerical experiments are reported to illustrate the potential of the described methods.