488 research outputs found

    High-Performance Solvers for Dense Hermitian Eigenproblems

    We introduce a new collection of solvers, subsequently called EleMRRR, for large-scale dense Hermitian eigenproblems. EleMRRR solves various types of problems: generalized, standard, and tridiagonal eigenproblems. Among these, the last is of particular importance, as it is a solver in its own right as well as the computational kernel for the first two; we present a fast and scalable tridiagonal solver, referred to as PMRRR, based on the Algorithm of Multiple Relatively Robust Representations. Like the other EleMRRR solvers, PMRRR is part of the freely available Elemental library and is designed to fully support both message-passing (MPI) and multithreading parallelism (SMP). As a result, the solvers can equally be used in pure MPI or in hybrid MPI-SMP fashion. We conducted a thorough performance study of EleMRRR and ScaLAPACK's solvers on two supercomputers. This study, performed with up to 8,192 cores, provides precise guidelines for assembling the fastest solver within the ScaLAPACK framework; it also indicates that EleMRRR outperforms even the fastest solvers built from ScaLAPACK's components.

    Fast modal extraction in NASTRAN via the FEER computer program

    A new eigensolution routine, FEER (Fast Eigensolution Extraction Routine), used in conjunction with NASTRAN at Israel Aircraft Industries, is described. The FEER program is based on an automatic matrix reduction scheme whereby the lower modes of structures with many degrees of freedom can be accurately extracted from a tridiagonal eigenvalue problem whose size is of the same order of magnitude as the number of required modes. The process is carried out without arbitrary lumping of masses at selected node points or selection of nodes to be retained in the analysis set. The results of computational efficiency studies are presented, showing major arithmetic operation counts and actual computer run times of FEER as compared to other methods of eigenvalue extraction, including those available in the NASTRAN READ module. It is concluded that the tridiagonal reduction method used in FEER would serve as a valuable addition to NASTRAN for highly increased efficiency in obtaining structural vibration modes.

    Minimizing Communication for Eigenproblems and the Singular Value Decomposition

    Algorithms have two costs: arithmetic and communication. The latter represents the cost of moving data, either between levels of a memory hierarchy, or between processors over a network. Communication often dominates arithmetic and represents a rapidly increasing proportion of the total cost, so we seek algorithms that minimize communication. In \cite{BDHS10} lower bounds were presented on the amount of communication required for essentially all O(n^3)-like algorithms for linear algebra, including eigenvalue problems and the SVD. Conventional algorithms, including those currently implemented in (Sca)LAPACK, perform asymptotically more communication than these lower bounds require. In this paper we present parallel and sequential eigenvalue algorithms (for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms that do attain these lower bounds, and analyze their convergence and communication costs. Comment: 43 pages, 11 figures

    Improved Accuracy and Parallelism for MRRR-based Eigensolvers -- A Mixed Precision Approach

    The real symmetric tridiagonal eigenproblem is of outstanding importance in numerical computations; it arises frequently as part of eigensolvers for standard and generalized dense Hermitian eigenproblems that are based on a reduction to tridiagonal form. For its solution, the algorithm of Multiple Relatively Robust Representations (MRRR) is among the fastest methods. Although fast, the solvers based on MRRR do not deliver the same accuracy as competing methods like Divide & Conquer or the QR algorithm. In this paper, we demonstrate that the use of mixed precisions leads to improved accuracy of MRRR-based eigensolvers with limited or no performance penalty. As a result, we obtain eigensolvers that are not only as accurate as or more accurate than the best available methods, but also (in most circumstances) faster and more scalable than the competition.

    ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers

    Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large-scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats; and, in the future, (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures. Comment: 55 pages, 14 figures, 2 tables

    Conditional quasi-exact solvability of the quantum planar pendulum and of its anti-isospectral hyperbolic counterpart

    We have subjected the planar pendulum eigenproblem to a symmetry analysis with the goal of explaining the relationship between its conditional quasi-exact solvability (C-QES) and the topology of its eigenenergy surfaces, established in our earlier work [Frontiers in Physical Chemistry and Chemical Physics 2, 1-16, (2014)]. The present analysis revealed that this relationship can be traced to the structure of the tridiagonal matrices representing the symmetry-adapted pendular Hamiltonian, and it enabled us to identify many more analytic solutions (forty in total, to be exact). Furthermore, an analogous analysis of the hyperbolic counterpart of the planar pendulum, the Razavy problem, which was also shown to be C-QES [American Journal of Physics 48, 285 (1980)], confirmed that it is anti-isospectral with the pendular eigenproblem. Of key importance for both eigenproblems proved to be the topological index κ, as it determines the loci of the intersections (genuine and avoided) of the eigenenergy surfaces spanned by the dimensionless interaction parameters η and ζ. It also encapsulates the conditions under which analytic solutions to the two eigenproblems obtain and provides the number of analytic solutions. At a given κ, the anti-isospectrality occurs for single states only (i.e., not for doublets), holds, like C-QES, solely for integer values of κ, and occurs only for the lowest eigenvalues of the pendular and Razavy Hamiltonians, with the order of the eigenvalues reversed for the latter. For all other states, the pendular and Razavy spectra become in fact qualitatively different, as higher pendular states appear as doublets whereas all higher Razavy states are singlets.

    Parallel implementation for large and sparse eigenproblems

    This paper analyses and evaluates the computational aspects of an efficient parallel implementation for the eigenproblem. This parallel implementation makes it possible to solve the eigenproblem for symmetric, sparse, and very large matrices. Mathematically, the algorithm is supported by the Lanczos and Divide and Conquer methods. The Lanczos method transforms the eigenproblem of a symmetric matrix into an eigenproblem of a tridiagonal matrix, which is easier to solve. The Divide and Conquer method provides the solution for the eigenproblem of a large tridiagonal matrix by decomposing it into a set of smaller subproblems. The method has been implemented for a distributed memory multiprocessor system with the PVM parallel interface. A Cray T3E system with up to 32 nodes has been used to evaluate the performance of our parallel implementation. Because superlinear speed-up values were obtained for all the studied matrices, a detailed analysis of the experimental results is carried out. It will be shown that the management of the memory hierarchy plays an important role in the performance of the parallel implementation.
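
The Lanczos step described in this abstract can be illustrated with a minimal serial sketch. It assumes a small dense symmetric matrix stored as nested lists and uses full reorthogonalization for numerical safety. It is not the paper's parallel PVM implementation, and the function name `lanczos_tridiagonalize` is hypothetical.

```python
import math
import random


def lanczos_tridiagonalize(A, seed=0):
    """Reduce a symmetric matrix A (list of lists) to tridiagonal form T.

    Returns (alpha, beta): the diagonal and the off-diagonal of T.
    Illustrative only: dense storage, serial, full reorthogonalization.
    """
    n = len(A)
    rng = random.Random(seed)

    # Random normalized starting vector (an assumption of this sketch).
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    basis = [[x / norm for x in v]]

    alpha, beta = [], []
    for j in range(n):
        vj = basis[j]
        # w = A * v_j
        w = [sum(A[i][k] * vj[k] for k in range(n)) for i in range(n)]
        # Diagonal entry: alpha_j = v_j^T A v_j
        alpha.append(sum(wi * vi for wi, vi in zip(w, vj)))
        # Orthogonalize w against every Lanczos vector computed so far.
        for q in basis:
            c = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - c * qi for wi, qi in zip(w, q)]
        b = math.sqrt(sum(wi * wi for wi in w))
        # Stop at the last step or when the Krylov space is exhausted.
        if j == n - 1 or b < 1e-12:
            break
        beta.append(b)
        basis.append([wi / b for wi in w])
    return alpha, beta
```

When the iteration runs for all n steps, T is orthogonally similar to A, so the two matrices share their eigenvalues; the tridiagonal problem can then be handed to a dedicated solver such as Divide and Conquer.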