24 research outputs found

    Efficient computation of the second-Born self-energy using tensor-contraction operations

    In the nonequilibrium Green's function approach, the approximation of the correlation self-energy at the second-Born level is of particular interest, since it allows for a maximal speed-up in computational scaling when used together with the Generalized Kadanoff-Baym Ansatz for the Green's function. Present-day numerical time-propagation algorithms for the Green's function are able to tackle first-principles simulations of atoms and molecules, but they are limited to relatively small systems due to the unfavourable scaling of the self-energy diagrams with respect to the basis size. We propose an efficient computation of the self-energy diagrams that uses tensor-contraction operations to map the internal summations onto calls to external low-level linear algebra libraries. We discuss the computational speed-up achieved for transient electron dynamics in selected molecular systems.
    Comment: 9 pages, 4 figures, 1 tabl
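As a rough illustration of the idea in the abstract above (recasting an internal summation as a matrix product that a linear algebra library could perform), the following minimal sketch contracts two generic three-index tensors over a shared index pair. The tensors `a` and `b`, the function names, and the pure-Python `matmul` helper standing in for a BLAS GEMM call are all assumptions made for illustration; this is not the self-energy expression or the implementation from the paper.

```python
from itertools import product


def contract_loops(a, b):
    """C[i][j] = sum over k, l of a[i][k][l] * b[k][l][j], as explicit nested loops."""
    ni, nk, nl, nj = len(a), len(a[0]), len(a[0][0]), len(b[0][0])
    c = [[0] * nj for _ in range(ni)]
    for i, j, k, l in product(range(ni), range(nj), range(nk), range(nl)):
        c[i][j] += a[i][k][l] * b[k][l][j]
    return c


def matmul(x, y):
    """Plain matrix product; a hypothetical stand-in for an optimized GEMM routine."""
    return [
        [sum(row[p] * y[p][j] for p in range(len(y))) for j in range(len(y[0]))]
        for row in x
    ]


def contract_as_matmul(a, b):
    """Same contraction, with the index pair (k, l) fused into one summation index."""
    nk, nl = len(b), len(b[0])
    # a becomes an (ni x nk*nl) matrix: row i holds a[i][k][l] in (k, l) order.
    a_mat = [[a_i[k][l] for k in range(nk) for l in range(nl)] for a_i in a]
    # b becomes an (nk*nl x nj) matrix: row (k, l) is the vector b[k][l][:].
    b_mat = [b[k][l] for k in range(nk) for l in range(nl)]
    return matmul(a_mat, b_mat)


# Small integer example (made-up data) so the two routes can be compared exactly.
a = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
b = [[[1, 0], [0, 1]], [[2, 1], [1, 2]]]
c_loops = contract_loops(a, b)
c_gemm = contract_as_matmul(a, b)
```

Both routes perform the same arithmetic. The potential benefit of the second form is that the single matrix product can be handed to a tuned library routine instead of being executed as interpreted nested loops.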

    Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices

    A task-based formulation of the Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC). The novel features of our formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and (2) fine-grained task-based composition. These features make it tolerant of the load imbalance due to the irregular matrix structure and eliminate all artifactual sources of global synchronization. Scalability of the iterative computation of the square-root inverse of block-rank-sparse QC matrices is demonstrated; for full-rank (dense) matrices the performance of our SUMMA formulation usually exceeds that of the state-of-the-art dense MM implementations (ScaLAPACK and Cyclops Tensor Framework).
    Comment: 8 pages, 6 figures, accepted to IA3 2015. arXiv admin note: text overlap with arXiv:1504.0504
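For readers unfamiliar with SUMMA, the sketch below simulates the data flow of the standard dense algorithm serially: in iteration `k`, block column `k` of A is shared along each row of a block grid, block row `k` of B along each column, and every grid position accumulates a local product. It assumes square matrices whose size is divisible by the block size, and the names `summa_serial`, `get_block`, and `matmul_add` are hypothetical. The iterations run strictly one after another, so this does not show the task-based, concurrently scheduled, block-rank-sparse formulation that the abstract describes.

```python
def matmul_add(c, a, b):
    """In-place local update c += a * b for small dense blocks (lists of lists)."""
    for i in range(len(a)):
        for j in range(len(b[0])):
            c[i][j] += sum(a[i][p] * b[p][j] for p in range(len(b)))


def get_block(m, bi, bj, bs):
    """Return block (bi, bj) of matrix m, using square blocks of size bs."""
    return [row[bj * bs:(bj + 1) * bs] for row in m[bi * bs:(bi + 1) * bs]]


def summa_serial(a, b, bs):
    """Serial simulation of SUMMA on an nb x nb grid of blocks.

    Assumes a and b are square n x n lists of lists with n divisible by bs.
    """
    n = len(a)
    nb = n // bs
    # Each grid position (i, j) owns one block of the result.
    c_blocks = {
        (i, j): [[0] * bs for _ in range(bs)]
        for i in range(nb)
        for j in range(nb)
    }
    for k in range(nb):
        # Stand-ins for the broadcasts: block column k of A, block row k of B.
        a_panel = [get_block(a, i, k, bs) for i in range(nb)]
        b_panel = [get_block(b, k, j, bs) for j in range(nb)]
        # Every grid position performs its local block update.
        for (i, j), c_ij in c_blocks.items():
            matmul_add(c_ij, a_panel[i], b_panel[j])
    # Assemble the distributed blocks into one full matrix.
    c = [[0] * n for _ in range(n)]
    for (bi, bj), blk in c_blocks.items():
        for r in range(bs):
            for s in range(bs):
                c[bi * bs + r][bj * bs + s] = blk[r][s]
    return c


# Made-up 4 x 4 example with 2 x 2 blocks.
a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
c = summa_serial(a, b, 2)
```

In a distributed setting each `(i, j)` entry of `c_blocks` would live on a different process, and the two panel lists would be replaced by row-wise and column-wise broadcasts.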