Parallel eigensolvers in plane-wave Density Functional Theory

Author: Levitt Antoine
Torrent Marc
Publication venue: 'Elsevier BV'
Publication date: 07/10/2014

We consider the problem of parallelizing electronic structure computations in plane-wave Density Functional Theory. Because of the limited scalability of Fourier transforms, parallelism has to be found at the eigensolver level. We show how a recently proposed algorithm based on Chebyshev polynomials can scale to tens of thousands of processors, outperforming block conjugate gradient algorithms for large computations.

arXiv.org e-Print Archive

CiteSeerX

Crossref

HAL-CEA

Lanczos eigensolution method for high-performance computers

Author: Bostic Susan W.

The theory, computational analysis, and applications of a Lanczos algorithm on high-performance computers are presented. The computationally intensive steps of the algorithm are identified as the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplications. These computational steps are optimized to exploit the vector and parallel capabilities of high-performance computers. The savings in computational time from applying optimization techniques such as variable-band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large-scale structural analysis applications are described: the buckling of a composite blade-stiffened panel with a cutout, and the vibration analysis of a high-speed civil transport. The sequential computational time for the panel problem on a CONVEX computer, 181.6 seconds, was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time for the transport problem, which has 17,000 degrees of freedom, was 23 seconds on the Cray-YMP using an average of 3.63 processors.
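As a rough illustration of the Lanczos recurrence this abstract refers to, the following is a minimal sketch and not the report's implementation. The names `lanczos` and `apply_op` and the 3x3 test matrix are assumptions made for this example. In a shift-and-invert setting, `apply_op` would wrap the matrix factorization and forward/backward solves the abstract lists as the expensive steps. Here it is a plain matrix-vector product, and no reorthogonalization is performed.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def lanczos(apply_op: Callable[[Vector], Vector], start: Vector,
            steps: int) -> Tuple[List[float], List[float]]:
    """Return the diagonal and off-diagonal of the Lanczos tridiagonal matrix."""
    norm = math.sqrt(sum(s * s for s in start))
    v = [s / norm for s in start]      # current Lanczos vector
    v_prev = [0.0] * len(start)        # previous Lanczos vector
    beta = 0.0
    alphas: List[float] = []
    betas: List[float] = []
    for _ in range(steps):
        w = apply_op(v)
        # Remove the component along the previous Lanczos vector.
        w = [w_i - beta * p_i for w_i, p_i in zip(w, v_prev)]
        alpha = sum(w_i * v_i for w_i, v_i in zip(w, v))
        # Remove the component along the current Lanczos vector.
        w = [w_i - alpha * v_i for w_i, v_i in zip(w, v)]
        alphas.append(alpha)
        beta = math.sqrt(sum(w_i * w_i for w_i in w))
        if beta < 1e-12:               # invariant subspace found, stop early
            break
        betas.append(beta)
        v_prev, v = v, [w_i / beta for w_i in w]
    # A k-by-k tridiagonal matrix has only k - 1 off-diagonal entries.
    return alphas, betas[:len(alphas) - 1]


# Hypothetical test operator: a small symmetric tridiagonal matrix.
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]


def apply_a(v: Vector) -> Vector:
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


alphas, betas = lanczos(apply_a, [1.0, 0.0, 0.0], steps=3)
```

The eigenvalues of the small tridiagonal matrix built from `alphas` and `betas` approximate the extreme eigenvalues of the operator. A production code would add reorthogonalization to control the loss of orthogonality among the Lanczos vectors.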

NASA Technical Reports Server

A bibliography on parallel and vector numerical algorithms

Author: Ortega J. M.
Voigt R. G.

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

NASA Technical Reports Server

Parallel Self-Consistent-Field Calculations via Chebyshev-Filtered Subspace Acceleration

Author: B. Fornberg
B. N. Parlett
J. R. Chelikowsky
James R. Chelikowsky
Murilo L. Tiago
R. B. Lehoucq
R. M. Martin
W. G. Aulbur
W. Koch
Yousef Saad
Yunkai Zhou
Publication venue: 'American Physical Society (APS)'
Publication date: 08/03/2007

Solving the Kohn-Sham eigenvalue problem constitutes the most computationally expensive part of self-consistent density functional theory (DFT) calculations. In a previous paper, we proposed a nonlinear Chebyshev-filtered subspace iteration method, which avoids computing explicit eigenvectors except at the first SCF iteration. The method may be viewed as an approach to solving the original nonlinear Kohn-Sham equation by a nonlinear subspace iteration technique, without emphasizing the intermediate linearized Kohn-Sham eigenvalue problem. It reaches self-consistency within a similar number of SCF iterations as eigensolver-based approaches. However, replacing the standard diagonalization at each SCF iteration by a Chebyshev subspace filtering step results in a significant speedup over methods based on standard diagonalization. Here, we discuss an approach for implementing this method in a multi-processor, parallel environment. Numerical results are presented to show that the method makes it possible to perform a class of highly challenging DFT calculations that were not feasible before.
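The filtering step mentioned in this abstract can be sketched with the Chebyshev three-term recurrence. This is a minimal illustration under stated assumptions, not the authors' code. The function name `chebyshev_filter`, the matrix-free callable `apply_h`, and the toy diagonal operator are all hypothetical, and the interval [a, b] to be damped is assumed to be known. A full subspace iteration would also orthonormalize the filtered block, which is omitted here.

```python
from typing import Callable, List

Vector = List[float]


def chebyshev_filter(apply_h: Callable[[Vector], Vector], x: Vector,
                     degree: int, a: float, b: float) -> Vector:
    """Apply T_degree((H - c) / e) to x, where [a, b] is the unwanted interval.

    Eigencomponents with eigenvalues inside [a, b] stay bounded by 1 in
    magnitude, while those below a grow rapidly with the degree.
    """
    if degree < 1:
        raise ValueError("degree must be at least 1")
    e = (b - a) / 2.0  # half-width of the damped interval
    c = (b + a) / 2.0  # centre of the damped interval

    def shifted(v: Vector) -> Vector:
        # (H - c I) v / e maps the spectrum in [a, b] onto [-1, 1].
        hv = apply_h(v)
        return [(hv_i - c * v_i) / e for hv_i, v_i in zip(hv, v)]

    prev, cur = list(x), shifted(x)  # T_0 and T_1 applied to x
    for _ in range(2, degree + 1):
        s = shifted(cur)
        # Three-term recurrence: T_{k+1} = 2 t T_k - T_{k-1}.
        prev, cur = cur, [2.0 * s_i - p_i for s_i, p_i in zip(s, prev)]
    return cur


# Toy diagonal operator with eigenvalues 0, 1, 2, 3 (hypothetical data).
EIGENVALUES = [0.0, 1.0, 2.0, 3.0]


def apply_diag(v: Vector) -> Vector:
    return [lam * v_i for lam, v_i in zip(EIGENVALUES, v)]


# Damp the interval [1, 3]; the eigenvalue 0 lies below it and is amplified.
filtered = chebyshev_filter(apply_diag, [1.0, 1.0, 1.0, 1.0],
                            degree=4, a=1.0, b=3.0)
```

In this toy case the component belonging to eigenvalue 0 is scaled by T_4(-2) = 97, while the three components inside [1, 3] keep magnitude 1. The filter needs only operator applications, which is why it parallelizes more readily than an explicit diagonalization.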

arXiv.org e-Print Archive

Crossref

A Lanczos eigenvalue method on a parallel computer

Author: Bostic Susan W.
Fulton Robert E.

Eigenvalue analysis of complex structures is a computationally intensive task that can benefit significantly from new and impending parallel computers. This study reports on a parallel computer implementation of the Lanczos method for free vibration analysis. The approach used here subdivides the major Lanczos calculation tasks into subtasks and introduces parallelism down to the subtask level, such as matrix decomposition and forward/backward substitution. The method was implemented on a commercial parallel computer and results were obtained for a long flexible space structure. While parallel computing efficiency for the Lanczos method was good for a moderate number of processors for the test problem, the greatest reduction in time was realized for the decomposition of the stiffness matrix, a calculation which took 70 percent of the time in the sequential program and 25 percent of the time on eight processors. For a sample calculation of the twenty lowest frequencies of a 486-degree-of-freedom problem, the total sequential computing time was reduced by almost a factor of ten using 16 processors.

NASA Technical Reports Server

An Optimized and Scalable Eigensolver for Sequences of Eigenvalue Problems

Author: Berljafa Mario
Di Napoli Edoardo
Wortmann Daniel
Publication date: 01/01/2014

In many scientific applications the solution of non-linear differential equations is obtained through the set-up and solution of a number of successive eigenproblems. These eigenproblems can be regarded as a sequence whenever the solution of one problem fosters the initialization of the next. In addition, in some eigenproblem sequences there is a connection between the solutions of adjacent eigenproblems. Whenever it is possible to unravel the existence of such a connection, the eigenproblem sequence is said to be correlated. When faced with a sequence of correlated eigenproblems, the current strategy amounts to solving each eigenproblem in isolation. We propose an alternative approach which exploits such correlation through the use of an eigensolver based on subspace iteration and accelerated with Chebyshev polynomials (ChFSI). The resulting eigensolver is optimized by minimizing the number of matrix-vector multiplications and parallelized using the Elemental library framework. Numerical results show that ChFSI achieves excellent scalability and is competitive with current dense linear algebra parallel eigensolvers. Comment: 23 pages, 6 figures. First revision of an invited submission to special issue of Concurrency and Computation: Practice and Experienc

arXiv.org e-Print Archive

CiteSeerX

Crossref

MIMS EPrints

Juelich Shared Electronic Resources

The University of Manchester - Institutional Repository