Parallel Self-Consistent-Field Calculations via Chebyshev-Filtered Subspace Acceleration
Solving the Kohn-Sham eigenvalue problem constitutes the most computationally
expensive part in self-consistent density functional theory (DFT) calculations.
In a previous paper, we have proposed a nonlinear Chebyshev-filtered subspace
iteration method, which avoids computing explicit eigenvectors except at the
first SCF iteration. The method may be viewed as an approach to solve the
original nonlinear Kohn-Sham equation by a nonlinear subspace iteration
technique, without emphasizing the intermediate linearized Kohn-Sham eigenvalue
problem. It reaches self-consistency within a similar number of SCF iterations
as eigensolver-based approaches. However, replacing the standard
diagonalization at each SCF iteration by a Chebyshev subspace filtering step
results in a significant speedup over methods based on standard
diagonalization. Here, we discuss an approach for implementing this method in
a multi-processor, parallel environment. Numerical results are presented to show
that the method enables a class of highly challenging DFT calculations that
were not feasible before.
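The filtering step described in this abstract can be illustrated with a minimal sketch. It assumes a small, fixed symmetric matrix `H` in place of a real Kohn-Sham Hamiltonian, a random starting block instead of eigenvectors from a first diagonalization, and no SCF loop or parallelism. The function names and the filter bounds `a` and `b` are illustrative choices, not taken from the paper.

```python
import numpy as np

def chebyshev_filter(H, X, degree, a, b):
    """Return T_degree((H - c*I)/e) applied to the columns of X (degree >= 1).

    Eigencomponents with eigenvalues inside [a, b] are damped (|T| <= 1);
    components below a are strongly amplified.
    """
    e = (b - a) / 2.0  # half-width of the damped interval
    c = (b + a) / 2.0  # centre of the damped interval
    Y = (H @ X - c * X) / e  # degree-1 term
    for _ in range(2, degree + 1):
        # three-term Chebyshev recurrence
        Y_next = 2.0 * (H @ Y - c * Y) / e - X
        X, Y = Y, Y_next
    return Y

def filtered_subspace_iteration(H, n_states, a, b, degree=8, n_iter=10, seed=0):
    """Hypothetical driver: filter, re-orthonormalize, repeat on a fixed H."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((H.shape[0], n_states))
    for _ in range(n_iter):
        X = chebyshev_filter(H, X, degree, a, b)
        X, _ = np.linalg.qr(X)  # keep the basis orthonormal
    # Small Rayleigh-Ritz step on the filtered subspace
    ritz_values = np.linalg.eigvalsh(X.T @ H @ X)
    return ritz_values, X

# Toy "Hamiltonian" with eigenvalues 0, 1, ..., 9; target the lowest three.
H = np.diag(np.arange(10.0))
ritz_values, basis = filtered_subspace_iteration(H, n_states=3, a=2.5, b=9.0)
```

In this toy setting the lower bound `a` sits between the wanted and unwanted parts of the spectrum, and `b` is an upper bound on the largest eigenvalue. A production code would estimate both bounds rather than hard-code them.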
Two-level Chebyshev filter based complementary subspace method: pushing the envelope of large-scale electronic structure calculations
We describe a novel iterative strategy for Kohn-Sham density functional
theory calculations aimed at large systems (> 1000 electrons), applicable to
metals and insulators alike. In lieu of explicit diagonalization of the
Kohn-Sham Hamiltonian on every self-consistent field (SCF) iteration, we employ
a two-level Chebyshev polynomial filter based complementary subspace strategy
to: 1) compute a set of vectors that span the occupied subspace of the
Hamiltonian; 2) reduce subspace diagonalization to just partially occupied
states; and 3) obtain those states in an efficient, scalable manner via an
inner Chebyshev-filter iteration. By reducing the necessary computation to just
partially occupied states, and obtaining these through an inner Chebyshev
iteration, our approach reduces the cost of large metallic calculations
significantly, while eliminating subspace diagonalization for insulating
systems altogether. We describe the implementation of the method within the
framework of the Discontinuous Galerkin (DG) electronic structure method and
show that this results in a computational scheme that can effectively tackle
bulk and nano systems containing tens of thousands of electrons, with chemical
accuracy, within a few minutes or less of wall clock time per SCF iteration on
large-scale computing platforms. We anticipate that our method will be
instrumental in pushing the envelope of large-scale ab initio molecular
dynamics. As a demonstration of this, we simulate a bulk silicon system
containing 8,000 atoms at finite temperature, and obtain an average SCF step
wall time of 51 seconds on 34,560 processors; thus allowing us to carry out 1.0
ps of ab initio molecular dynamics in approximately 28 hours (of wall time). Comment: Resubmitted version (version 2
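As a rough consistency check on the figures quoted above, and assuming (this is not stated in the abstract) that essentially all of the wall time is spent in SCF steps, the reported totals imply the following number of SCF steps per picosecond of dynamics:

```latex
\[
  N_{\mathrm{SCF}} \approx \frac{28\ \mathrm{h} \times 3600\ \mathrm{s/h}}{51\ \mathrm{s}}
  = \frac{100{,}800\ \mathrm{s}}{51\ \mathrm{s}} \approx 1976 .
\]
```

The abstract does not give the molecular dynamics time step or the number of SCF steps per MD step, so this estimate cannot be broken down further.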
Chebyshev polynomial filtered subspace iteration in the Discontinuous Galerkin method for large-scale electronic structure calculations
The Discontinuous Galerkin (DG) electronic structure method employs an
adaptive local basis (ALB) set to solve the Kohn-Sham equations of density
functional theory (DFT) in a discontinuous Galerkin framework. The adaptive
local basis is generated on-the-fly to capture the local material physics, and
can systematically attain chemical accuracy with only a few tens of degrees of
freedom per atom. A central issue for large-scale calculations, however, is the
computation of the electron density (and subsequently, ground state properties)
from the discretized Hamiltonian in an efficient and scalable manner. We show
in this work how Chebyshev polynomial filtered subspace iteration (CheFSI) can
be used to address this issue and push the envelope in large-scale materials
simulations in a discontinuous Galerkin framework. We describe how the subspace
filtering steps can be performed in an efficient and scalable manner using a
two-dimensional parallelization scheme, thanks to the orthogonality of the DG
basis set and block-sparse structure of the DG Hamiltonian matrix. The
on-the-fly nature of the ALBs requires additional care in carrying out the
subspace iterations. We demonstrate the parallel scalability of the DG-CheFSI
approach in calculations of large-scale two-dimensional graphene sheets and
bulk three-dimensional lithium-ion electrolyte systems. Employing 55,296
computational cores, the time per self-consistent field iteration for a sample
of the bulk 3D electrolyte containing 8,586 atoms is 90 seconds, and the time
for a graphene sheet containing 11,520 atoms is 75 seconds. Comment: Submitted to The Journal of Chemical Physic
Cyclic Density Functional Theory: A route to the first principles simulation of bending in nanostructures
We formulate and implement Cyclic Density Functional Theory (Cyclic DFT) -- a
self-consistent first principles simulation method for nanostructures with
cyclic symmetries. Using arguments based on Group Representation Theory, we
rigorously demonstrate that the Kohn-Sham eigenvalue problem for such systems
can be reduced to a fundamental domain (or cyclic unit cell) augmented with
cyclic-Bloch boundary conditions. Analogously, the equations of electrostatics
appearing in Kohn-Sham theory can be reduced to the fundamental domain
augmented with cyclic boundary conditions. By making use of this symmetry cell
reduction, we show that the electronic ground-state energy and the
Hellmann-Feynman forces on the atoms can be calculated using quantities defined
over the fundamental domain. We develop a symmetry-adapted finite-difference
discretization scheme to obtain a fully functional numerical realization of the
proposed approach. We verify that our formulation and implementation of Cyclic
DFT are both accurate and efficient through selected examples.
The connection of cyclic symmetries with uniform bending deformations
provides an elegant route to the ab-initio study of bending in nanostructures
using Cyclic DFT. As a demonstration of this capability, we simulate the
uniform bending of a silicene nanoribbon and obtain its energy-curvature
relationship from first principles. A self-consistent ab-initio simulation of
this nature is unprecedented and well outside the scope of any other systematic
first principles method in existence. Our simulations reveal that the bending
stiffness of the silicene nanoribbon is intermediate between that of graphene
and molybdenum disulphide. We describe several future avenues and applications
of Cyclic DFT, including its extension to the study of non-uniform bending
deformations and its possible use in the study of the nanoscale flexoelectric
effect. Comment: Version 3 of the manuscript, accepted for publication in Journal of
the Mechanics and Physics of Solids,
http://www.sciencedirect.com/science/article/pii/S002250961630368
Scalable electronic structure methods to solve the Kohn-Sham equation
From a single hydrogen atom to proteins in the hundreds of thousands of kilodaltons, scientists can use the electronic structure of interacting atoms to predict material properties. Knowing the material properties by solving the electronic structure problem would allow for the controlled prediction and corresponding design of materials. The Kohn-Sham equations, based on density functional theory, transform a many-body problem that is impossible to solve for anything but the smallest molecules into a practical problem that can be used to predict material properties. Although KSDFT scales as the cube of the number of electrons in the system, there are additional well-documented approximations that further reduce the number of electrons, such as the pseudopotential method.
The incoming exascale era will bring unavoidable challenges in solving the Kohn-Sham equations. These challenges include communication and hardware considerations. Old paradigms, epitomized by repeated series of globally forced synchronization points, will give way to new breeds of algorithms that maximize scaling performance while maintaining portability.
This thesis focuses on the solution of Kohn-Sham DFT in real space at scale. Key to this effort is a parallel treatment of numerical elements involving the Rayleigh-Ritz method. At minimum, the Rayleigh-Ritz projection requires a number of distributed matrix-vector operations equal to the number of electrons solved for in a system. Furthermore, the projection requires that number, squared and then halved, of dot products. The memory cost for such an algorithm also grows very large quickly, and explicit intelligent management is not an option. I demonstrate the computational requirements for the various steps in solving the electronic structure problem for both large and small molecular systems. This thesis also discusses opportunities in real space Kohn-Sham DFT to further utilize floating-point-optimized hardware with higher-order stencils. Chemical Engineerin
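The operation counts quoted for the Rayleigh-Ritz projection can be made concrete with a small serial sketch. It assumes a dense symmetric matrix `H` and an orthonormal block `V` with one column per state. The function name and the explicit counters are illustrative only, and the distributed-memory aspects discussed in the thesis are not modelled. Exploiting symmetry gives exactly N(N+1)/2 dot products for N states, which is roughly the N squared over two quoted above.

```python
import numpy as np

def rayleigh_ritz_projection(H, V):
    """Build the projected matrix G = V^T H V, counting the basic operations.

    Hypothetical helper: returns (G, number of matvecs, number of dot products).
    """
    n_states = V.shape[1]
    HV = np.empty_like(V)
    matvecs = 0
    for j in range(n_states):
        HV[:, j] = H @ V[:, j]  # one matrix-vector product per state
        matvecs += 1
    G = np.empty((n_states, n_states))
    dots = 0
    for i in range(n_states):
        for j in range(i, n_states):  # upper triangle only; G is symmetric
            G[i, j] = G[j, i] = V[:, i] @ HV[:, j]
            dots += 1
    return G, matvecs, dots

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
H = A + A.T  # symmetric stand-in for a Hamiltonian
V, _ = np.linalg.qr(rng.standard_normal((20, 6)))  # 6 orthonormal states
G, matvecs, dots = rayleigh_ritz_projection(H, V)
```

The eigenvalues of `G` (for example via `np.linalg.eigvalsh(G)`) are the Ritz values. In a parallel code each matrix-vector product and each dot product would also involve communication, which is where the synchronization costs mentioned above arise.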
Self-consistent-field calculations using Chebyshev-filtered subspace iteration
The power of density functional theory is often limited by the high computational demand in solving an eigenvalue problem at each self-consistent-field (SCF) iteration. The method presented in this paper replaces the explicit eigenvalue calculations by an approximation of the wanted invariant subspace, obtained with the help of well-selected Chebyshev polynomial filters. In this approach, only the initial SCF iteration requires solving an eigenvalue problem, in order to provide a good initial subspace. In the remaining SCF iterations, no iterative eigensolvers are involved. Instead, Chebyshev polynomials are used to refine the subspace. The subspace iteration at each step is easily five to ten times faster than solving a corresponding eigenproblem by the most efficient eigen-algorithms. Moreover, the subspace iteration reaches self-consistency within roughly the same number of steps as an eigensolver-based approach. This results in a significantly faster SCF iteration