Efficient calculation of molecular integrals over London atomic orbitals
The use of London atomic orbitals (LAOs) in a non-perturbative manner enables the determination of gauge-origin invariant energies and properties for molecular species in arbitrarily strong magnetic fields. Central to the efficient implementation of such calculations for molecular systems is the evaluation of molecular integrals, particularly the electron repulsion integrals (ERIs). We present an implementation of several different algorithms for the evaluation of ERIs over Gaussian-type LAOs at arbitrary magnetic field strengths. The efficiency of generalized McMurchie-Davidson (MD), Head-Gordon-Pople (HGP) and Rys quadrature schemes is compared. For the Rys quadrature implementation, we avoid the use of high-precision arithmetic and interpolation schemes in the computation of the quadrature roots and weights, enabling the application of this algorithm seamlessly to a wide range of magnetic fields. The efficiency of each generalized algorithm is compared by numerical application, classifying the ERIs according to their total angular momenta and evaluating their performance for primitive and contracted basis sets. In common with zero-field integral evaluation, no single algorithm is optimal for all angular momenta; thus a simple mixed scheme is put forward, which selects the most efficient approach to calculate the ERIs for each shell quartet. The mixed approach is significantly more efficient than the exclusive use of any individual algorithm.
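The mixed scheme described in this abstract can be pictured as a lookup that maps each shell quartet to whichever algorithm was fastest for its angular-momentum class. The following is a minimal sketch under stated assumptions, not the authors' implementation: the timing table holds placeholder numbers rather than measurements, and the names `TIMINGS_US`, `build_dispatch` and `select_algorithm` are hypothetical.

```python
from typing import Dict, Tuple

# Placeholder per-quartet costs in microseconds, keyed by total angular
# momentum L = la + lb + lc + ld. These values are invented for illustration
# and are NOT measurements from the paper.
TIMINGS_US: Dict[int, Dict[str, float]] = {
    0: {"MD": 0.9, "HGP": 0.5, "Rys": 0.7},
    2: {"MD": 2.1, "HGP": 1.8, "Rys": 2.0},
    4: {"MD": 6.5, "HGP": 7.2, "Rys": 5.9},
    8: {"MD": 40.0, "HGP": 55.0, "Rys": 31.0},
}


def build_dispatch(timings: Dict[int, Dict[str, float]]) -> Dict[int, str]:
    """For each tabulated class, keep the name of the cheapest algorithm."""
    return {L: min(costs, key=costs.get) for L, costs in timings.items()}


def select_algorithm(quartet: Tuple[int, int, int, int],
                     dispatch: Dict[int, str]) -> str:
    """Pick an algorithm for a shell quartet given its four angular momenta.

    Classes that were not timed fall back to the nearest tabulated class.
    """
    total = sum(quartet)
    nearest = min(dispatch, key=lambda L: abs(L - total))
    return dispatch[nearest]


DISPATCH = build_dispatch(TIMINGS_US)
```

In a real code the table would be filled from benchmarks on the target machine, and the classification could also distinguish primitive from contracted shells, as the abstract indicates.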
Memory-Efficient Recursive Evaluation of 3-Center Gaussian Integrals
To improve the efficiency of Gaussian integral evaluation on modern
accelerated architectures, FLOP-efficient Obara-Saika-based recursive evaluation
schemes are optimized for the memory footprint. For the 3-center 2-particle
integrals that are key for the evaluation of Coulomb and other 2-particle
interactions in the density-fitting approximation, the use of multi-quantal
recurrences (in which multiple quanta are created or transferred at once) is
shown to produce significant memory savings. Other innovations include
leveraging register memory for reduced memory footprint and direct compile-time
generation of optimized kernels (instead of custom code generation) with
compile-time features of modern C++/CUDA. High efficiency of the CPU- and
CUDA-based implementation of the proposed schemes is demonstrated for both the
individual batches of integrals involving up to Gaussians with low and high
angular momenta (up to ) and contraction degrees, as well as for the
density-fitting-based evaluation of the Coulomb potential. The computer
implementation is available in the open-source LibintX library. Comment: 37 pages, 2 figures, 6 tables
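For context, the role of the 3-center integrals in density fitting can be written in the standard Coulomb-metric form below. The notation is an assumption made here and is not taken from the abstract: $a, b, c, d$ label orbital basis functions and $P, Q$ label auxiliary (fitting) functions.

```latex
(ab|cd) \;\approx\; \sum_{P,Q} (ab|P)\,\bigl[\mathbf{V}^{-1}\bigr]_{PQ}\,(Q|cd),
\qquad V_{PQ} = (P|Q)
```

The 4-center integral on the left is never formed explicitly; only the 3-center quantities $(ab|P)$ and the 2-center matrix $\mathbf{V}$ are evaluated, which is why the cost and memory footprint of the 3-center kernels matter.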
Development and Optimization of Computational Chemistry Algorithms
The challenges specific to the development of computational chemistry software are discussed. Selected solutions are presented, including examples of algorithmic optimizations and improved load-balancing for parallel calculations. A software framework for development of new quantum-chemical algorithms is proposed. Key design points are discussed. Optimization techniques are briefly described. Important implementation aspects, like automatic code generation, are highlighted.
Computing and Compressing Electron Repulsion Integrals on FPGAs
The computation of electron repulsion integrals (ERIs) over Gaussian-type
orbitals (GTOs) is a challenging problem in quantum-mechanics-based atomistic
simulations. In practical simulations, several trillions of ERIs may have to be
computed for every time step.
In this work, we investigate FPGAs as accelerators for the ERI computation.
We use template parameters, here within the Intel oneAPI tool flow, to create
customized designs for 256 different ERI quartet classes, based on their
orbitals. To maximize data reuse, all intermediates are buffered in FPGA
on-chip memory with customized layout. The pre-calculation of intermediates
also helps to overcome data dependencies caused by multi-dimensional recurrence
relations. The involved loop structures are partially or even fully unrolled
for high throughput of FPGA kernels. Furthermore, a lossy compression algorithm
utilizing arbitrary bitwidth integers is integrated in the FPGA kernels. To our
best knowledge, this is the first work on ERI computation on FPGAs that
supports more than just the single most basic quartet class. Also, the
integration of ERI computation and compression is a novelty that is not even
covered by CPU or GPU libraries so far.
Our evaluation shows that, using 16-bit integers for the ERI compression, the
fastest FPGA kernels exceed the performance of 10 GERIS ($10 \times 10^9$ ERIs
per second) on one Intel Stratix 10 GX 2800 FPGA, with maximum absolute errors
around - Hartree. The measured throughput can be accurately
explained by a performance model. The FPGA kernels deployed on 2 FPGAs
outperform similar computations using the widely used libint reference on a
two-socket server with 40 Xeon Gold 6148 CPU cores of the same process
technology by factors up to 6.0x and on a new two-socket server with 128 EPYC
7713 CPU cores by up to 1.9x.
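To illustrate what lossy compression of ERIs into 16-bit integers can look like, here is a minimal sketch of block-wise fixed-point quantization. It is a generic technique chosen for illustration, not the algorithm from the paper, whose details the abstract does not give. The function names and the choice of one scale factor per block are assumptions.

```python
from typing import List, Tuple

INT16_MAX = 32767  # largest magnitude representable in a signed 16-bit integer


def compress_block(values: List[float]) -> Tuple[List[int], float]:
    """Quantize a block of ERIs to signed 16-bit codes plus one shared scale.

    Assumption: a single scale per block, chosen so that the largest
    magnitude in the block maps to INT16_MAX.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / INT16_MAX if max_abs > 0.0 else 1.0
    codes = [round(v / scale) for v in values]
    return codes, scale


def decompress_block(codes: List[int], scale: float) -> List[float]:
    """Recover approximate ERI values from the integer codes."""
    return [c * scale for c in codes]
```

With this scheme the absolute error of each restored value is bounded by half the scale factor, so blocks whose integrals are of similar magnitude compress with less loss than blocks mixing large and small values.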
Tensor hypercontraction: A universal technique for the resolution of matrix elements of local, finite-range N-body potentials in many-body quantum problems
Configuration-space matrix elements of N-body potentials arise naturally and
ubiquitously in the Ritz-Galerkin solution of many-body quantum problems. For
the common specialization of local, finite-range potentials, we develop the
eXact Tensor HyperContraction (X-THC) method, which provides a quantized
renormalization of the coordinate-space form of the N-body potential, allowing
for a highly separable tensor factorization of the configuration-space matrix
elements. This representation allows for substantial computational savings in
chemical, atomic, and nuclear physics simulations, particularly with respect to
difficult "exchange-like" contractions. Comment: Third version of the manuscript after referee's comments. In press in
PRL. Main text: 4 pages, 2 figures, 1 table; Supplemental material (also
included): 14 pages, 2 figures, 2 tables
Modernizing the core quantum chemistry algorithms
This document covers the basics of computational chemistry and how the theory can be implemented efficiently on digital computers using modern programming techniques.
The computer implementations are developed from the core two-electron integrals to many-body and coupled cluster algorithms. Particular attention is paid to the physical constraints of the computer resources and the emergence of novel architectures.