693 research outputs found

    Efficient calculation of molecular integrals over London atomic orbitals

    The use of London atomic orbitals (LAOs) in a non-perturbative manner enables the determination of gauge-origin invariant energies and properties for molecular species in arbitrarily strong magnetic fields. Central to the efficient implementation of such calculations for molecular systems is the evaluation of molecular integrals, particularly the electron repulsion integrals (ERIs). We present an implementation of several different algorithms for the evaluation of ERIs over Gaussian-type LAOs at arbitrary magnetic field strengths. The efficiency of generalized McMurchie-Davidson (MD), Head-Gordon-Pople (HGP) and Rys quadrature schemes is compared. For the Rys quadrature implementation, we avoid the use of high precision arithmetic and interpolation schemes in the computation of the quadrature roots and weights, enabling the application of this algorithm seamlessly to a wide range of magnetic fields. The efficiency of each generalized algorithm is compared by numerical application, classifying the ERIs according to their total angular momenta and evaluating their performance for primitive and contracted basis sets. In common with zero-field integral evaluation, no single algorithm is optimal for all angular momenta; thus a simple mixed scheme is put forward, which selects the most efficient approach to calculate the ERIs for each shell quartet. The mixed approach is significantly more efficient than the exclusive use of any individual algorithm.

    Memory-Efficient Recursive Evaluation of 3-Center Gaussian Integrals

    To improve the efficiency of Gaussian integral evaluation on modern accelerated architectures, FLOP-efficient Obara-Saika-based recursive evaluation schemes are optimized for the memory footprint. For the 3-center 2-particle integrals that are key for the evaluation of Coulomb and other 2-particle interactions in the density-fitting approximation, the use of multi-quantal recurrences (in which multiple quanta are created or transferred at once) is shown to produce significant memory savings. Other innovations include leveraging register memory for reduced memory footprint and direct compile-time generation of optimized kernels (instead of custom code generation) with compile-time features of modern C++/CUDA. High efficiency of the CPU- and CUDA-based implementation of the proposed schemes is demonstrated both for individual batches of integrals involving Gaussians with low and high angular momenta (up to $L=6$) and contraction degrees, and for the density-fitting-based evaluation of the Coulomb potential. The computer implementation is available in the open-source LibintX library. Comment: 37 pages, 2 figures, 6 tables

    Development and Optimization of Computational Chemistry Algorithms

    The challenges specific to the development of computational chemistry software are discussed. Selected solutions are presented, including examples of algorithmic optimizations and improved load-balancing for parallel calculations. A software framework for development of new quantum-chemical algorithms is proposed. Key design points are discussed. Optimization techniques are briefly described. Important implementation aspects, such as automatic code generation, are highlighted.

    Computing and Compressing Electron Repulsion Integrals on FPGAs

    The computation of electron repulsion integrals (ERIs) over Gaussian-type orbitals (GTOs) is a challenging problem in quantum-mechanics-based atomistic simulations. In practical simulations, several trillions of ERIs may have to be computed for every time step. In this work, we investigate FPGAs as accelerators for the ERI computation. We use template parameters, here within the Intel oneAPI tool flow, to create customized designs for 256 different ERI quartet classes, based on their orbitals. To maximize data reuse, all intermediates are buffered in FPGA on-chip memory with customized layout. The pre-calculation of intermediates also helps to overcome data dependencies caused by multi-dimensional recurrence relations. The involved loop structures are partially or even fully unrolled for high throughput of FPGA kernels. Furthermore, a lossy compression algorithm utilizing arbitrary bitwidth integers is integrated in the FPGA kernels. To the best of our knowledge, this is the first work on ERI computation on FPGAs that supports more than just the single most basic quartet class. Also, the integration of ERI computation and compression is a novelty that is not even covered by CPU or GPU libraries so far. Our evaluation shows that, using 16-bit integers for the ERI compression, the fastest FPGA kernels exceed the performance of 10 GERIS ($10 \times 10^9$ ERIs per second) on one Intel Stratix 10 GX 2800 FPGA, with maximum absolute errors of around $10^{-7}$ to $10^{-5}$ Hartree. The measured throughput can be accurately explained by a performance model. The FPGA kernels deployed on 2 FPGAs outperform similar computations using the widely used libint reference on a two-socket server with 40 Xeon Gold 6148 CPU cores of the same process technology by factors up to 6.0x and on a new two-socket server with 128 EPYC 7713 CPU cores by up to 1.9x.
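
    The idea of lossy compression to fixed-width integers mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `compress` and `decompress`, the single shared scale factor per batch, and the sample values are all assumptions made for illustration only.

```python
def compress(values, bits=16):
    """Quantize floats to signed integer codes sharing one scale factor.

    Hypothetical helper: one scale per batch is an assumption,
    not the scheme described in the paper.
    """
    qmax = 2 ** (bits - 1) - 1                      # 32767 for 16 bits
    peak = max((abs(v) for v in values), default=0.0)
    scale = peak / qmax if peak > 0.0 else 1.0      # avoid division by zero
    codes = [round(v / scale) for v in values]      # integer codes in [-qmax, qmax]
    return codes, scale


def decompress(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Illustrative batch of integral values (made-up numbers).
batch = [0.75, -0.0312, 1.2e-4, 0.0]
codes, scale = compress(batch)
restored = decompress(codes, scale)
max_abs_error = max(abs(a - b) for a, b in zip(batch, restored))
```

    With this kind of uniform quantization the absolute error per value is bounded by half the scale factor, so the accuracy depends on the largest magnitude in the batch and on the chosen bit width.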

    Tensor hypercontraction: A universal technique for the resolution of matrix elements of local, finite-range N-body potentials in many-body quantum problems

    Configuration-space matrix elements of N-body potentials arise naturally and ubiquitously in the Ritz-Galerkin solution of many-body quantum problems. For the common specialization of local, finite-range potentials, we develop the eXact Tensor HyperContraction (X-THC) method, which provides a quantized renormalization of the coordinate-space form of the N-body potential, allowing for a highly separable tensor factorization of the configuration-space matrix elements. This representation allows for substantial computational savings in chemical, atomic, and nuclear physics simulations, particularly with respect to difficult "exchange-like" contractions. Comment: Third version of the manuscript after referee's comments. In press in PRL. Main text: 4 pages, 2 figures, 1 table; Supplemental material (also included): 14 pages, 2 figures, 2 tables

    Modernizing the core quantum chemistry algorithms

    This document covers the basics of computational chemistry and how, using modern programming techniques, the theory can be efficiently implemented on digital computers. The computer implementations are developed from the core two-electron integrals to many-body and coupled cluster algorithms. Particular attention is paid to the physical constraints of the computer resources and the emergence of novel architectures.