A Parallel Iterative Method for Computing Molecular Absorption Spectra
We describe a fast parallel iterative method for computing molecular
absorption spectra within TDDFT linear response and using the LCAO method. We
use a local basis of "dominant products" to parametrize the space of orbital
products that occur in the LCAO approach. In this basis, the dynamical
polarizability is computed iteratively within an appropriate Krylov subspace.
The iterative procedure uses a matrix-free GMRES method to determine the
(interacting) density response. The resulting code is about one order of
magnitude faster than our previous full-matrix method. This acceleration makes
the speed of our TDDFT code comparable with codes based on Casida's equation.
The implementation of our method uses hybrid MPI and OpenMP parallelization in
which load balancing and memory access are optimized. To validate our approach
and to establish benchmarks, we compute spectra of large molecules on various
types of parallel machines.
The methods developed here are fairly general and we believe they will find
useful applications in molecular physics/chemistry, even for problems that are
beyond TDDFT, such as organic semiconductors, particularly in photovoltaics.
Comment: 20 pages, 17 figures, 3 tables
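The matrix-free solve at the heart of this method can be illustrated in a few lines. The sketch below uses toy stand-ins for the non-interacting response and the interaction kernel (all names, sizes, and values are our own illustrative assumptions, not the paper's code) to show how a Dyson-like equation for the interacting density response can be solved with SciPy's GMRES without ever assembling the system matrix:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

# Sketch: solve the Dyson-like equation (1 - chi0 K) dn = chi0 v_ext for the
# interacting density response dn. chi0 (non-interacting response, here a toy
# diagonal) and K (interaction kernel, here a small random matrix) are
# illustrative stand-ins; only their matrix-vector action is ever needed.
rng = np.random.default_rng(0)
n = 200
K = 0.01 * rng.standard_normal((n, n))      # toy weak interaction kernel
chi0 = -1.0 / (1.0 + np.arange(n))          # toy diagonal response function

def apply_dyson(x):
    """Matrix-free action of (1 - chi0 K) on a vector."""
    return x - chi0 * (K @ x)

A = LinearOperator((n, n), matvec=apply_dyson)
v_ext = rng.standard_normal(n)              # external perturbation
rhs = chi0 * v_ext
dn, info = gmres(A, rhs)                    # Krylov solve, no matrix assembled
residual = np.linalg.norm(apply_dyson(dn) - rhs) / np.linalg.norm(rhs)
```

Because GMRES only needs the operator's action on a vector, the cost per iteration is a handful of matrix-vector products, which is what makes the approach attractive when the full response matrix is too large to build.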
Quantum ESPRESSO: a modular and open-source software project for quantum simulations of materials
Quantum ESPRESSO is an integrated suite of computer codes for
electronic-structure calculations and materials modeling, based on
density-functional theory, plane waves, and pseudopotentials (norm-conserving,
ultrasoft, and projector-augmented wave). Quantum ESPRESSO stands for "opEn
Source Package for Research in Electronic Structure, Simulation, and
Optimization". It is freely available to researchers around the world under the
terms of the GNU General Public License. Quantum ESPRESSO builds upon
newly-restructured electronic-structure codes that have been developed and
tested by some of the original authors of novel electronic-structure algorithms
and applied in the last twenty years by some of the leading materials modeling
groups worldwide. Innovation and efficiency are still its main focus, with
special attention paid to massively-parallel architectures, and a great effort
being devoted to user friendliness. Quantum ESPRESSO is evolving towards a
distribution of independent and inter-operable codes in the spirit of an
open-source project, where researchers active in the field of
electronic-structure calculations are encouraged to participate in the project
by contributing their own codes or by implementing their own ideas into
existing codes.
Comment: 36 pages, 5 figures, resubmitted to J. Phys.: Condens. Matter
Compilation techniques for irregular problems on parallel machines
Massively parallel computers have ushered in the era of teraflop computing. Even though large and powerful machines are being built, they are used by only a fraction of the computing community. The fundamental reason for this situation is that parallel machines are difficult to program. Development of compilers that automatically parallelize programs will greatly increase the use of these machines.

A large class of scientific problems can be categorized as irregular computations. In this class of computation, the data access patterns are known only at runtime, creating significant difficulties for a parallelizing compiler to generate efficient parallel codes. Some compilers with very limited abilities to parallelize simple irregular computations exist, but the methods used by these compilers fail for any non-trivial application code.

This research presents the development of compiler transformation techniques that can be used to effectively parallelize an important class of irregular programs. A central aim of these transformation techniques is to generate code that aggressively prefetches data. Program slicing methods are used as part of the code generation process. In this approach, a program written in a data-parallel language, such as HPF, is transformed so that it can be executed on a distributed-memory machine. An efficient compiler runtime support system has been developed that performs data movement and software caching.
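A standard way to handle access patterns known only at run time, and the kind of code a compiler like the one described here must generate, is the inspector/executor pattern: inspect the index array once to build a communication schedule, prefetch the remote data, then run the loop against local data plus a software cache. The sketch below (the function names and the two-"process" setup are our own illustration, not the dissertation's compiler output) shows the idea for an irregular gather:

```python
# Minimal inspector/executor sketch for an irregular gather, the pattern a
# parallelizing compiler must handle when index arrays are known only at
# runtime. All names here are illustrative, not from the dissertation.

def inspector(index_array, my_lo, my_hi):
    """Inspect the runtime access pattern: find the referenced elements that
    lie outside this process's local block [my_lo, my_hi) and must be fetched."""
    remote = sorted({i for i in index_array if not (my_lo <= i < my_hi)})
    return remote  # communication schedule: remote indices to prefetch

def executor(x_local, index_array, my_lo, my_hi, remote_cache):
    """Execute the loop body using local data plus prefetched remote values."""
    out = []
    for i in index_array:
        if my_lo <= i < my_hi:
            out.append(x_local[i - my_lo])    # local access
        else:
            out.append(remote_cache[i])       # software-cached remote access
    return out

# Toy run: a global array of 8 values split across two "processes".
x_global = [10, 11, 12, 13, 14, 15, 16, 17]
idx = [0, 5, 2, 7, 1]                 # irregular indices, known only at runtime
lo, hi = 0, 4                         # this process owns x_global[0:4]
schedule = inspector(idx, lo, hi)     # which elements to fetch from elsewhere
cache = {i: x_global[i] for i in schedule}   # stands in for the data movement
result = executor(x_global[lo:hi], idx, lo, hi, cache)
```

The key property is that the inspector's (potentially expensive) analysis is amortized: as long as the index array does not change, the same schedule and cache can be reused across loop executions.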
First-principle molecular dynamics with ultrasoft pseudopotentials: parallel implementation and application to extended bio-inorganic systems
We present a plane-wave ultrasoft pseudopotential implementation of
first-principle molecular dynamics, which is well suited to model large
molecular systems containing transition metal centers. We describe an efficient
strategy for parallelization that includes special features to deal with the
augmented charge in the context of Vanderbilt's ultrasoft pseudopotentials. We
also discuss a simple approach to model molecular systems with a net charge
and/or large dipole/quadrupole moments. We present test applications to
manganese and iron porphyrins representative of a large class of biologically
relevant metallorganic systems. Our results show that accurate
Density-Functional Theory calculations on systems with several hundred atoms
are feasible with access to moderate computational resources.
Comment: 29 pages, 4 Postscript figures, RevTeX
Run-time optimization of adaptive irregular applications
Compared to traditional compile-time optimization, run-time optimization can offer significant performance improvements when parallelizing and optimizing adaptive irregular applications, because it performs program analysis and adaptive optimizations during program execution. Run-time techniques can succeed where static techniques fail because they exploit the characteristics of the input data, the program's dynamic behavior, and the underlying execution environment. When optimizing adaptive irregular applications for parallel execution, a common observation is that the effectiveness of the optimizing transformations depends on the program's input data and its dynamic phases. This dissertation presents a set of run-time optimization techniques that match the characteristics of a program's dynamic memory access patterns with the appropriate optimization (parallelization) transformations.

First, we present a general adaptive algorithm selection framework that automatically and adaptively selects at run-time the best performing, functionally equivalent algorithm for each of its execution instances. The selection process is based on off-line, automatically generated prediction models and on characteristics (collected and analyzed dynamically) of the algorithm's input data. In this dissertation, we specialize this framework for the automatic selection of reduction algorithms. We have identified a small set of machine-independent, high-level characterization parameters and deployed an off-line, systematic experimental process to generate prediction models. These models, in turn, match the parameters to the best optimization transformations for a given machine. The technique has been evaluated thoroughly in terms of applications, platforms, and programs' dynamic behaviors. Specifically, for reduction algorithm selection, the selected performance is within 2% of optimal and on average 60% better than "Replicated Buffer," the default parallel reduction algorithm specified by the OpenMP standard.

To reduce the overhead of speculative run-time parallelization, we have developed an adaptive run-time parallelization technique that dynamically chooses efficient shadow structures to record a program's dynamic memory access patterns for parallelization. This technique complements the original speculative run-time parallelization technique, the LRPD test, in parallelizing loops with sparse memory accesses. The techniques presented in this dissertation have been implemented in an optimizing research compiler and can be viewed as effective building blocks for comprehensive run-time optimization systems, e.g., feedback-directed optimization systems and dynamic compilation systems.
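The shape of such a selection framework can be sketched in a few lines. The version below is a deliberately simplified stand-in: it picks between two toy reduction algorithms by timing them on a sample of the input rather than by consulting trained prediction models, and every name in it is our own illustrative assumption, not the dissertation's implementation:

```python
# Illustrative sketch of run-time algorithm selection between two
# functionally equivalent reduction algorithms. Selection here is by timing
# a sample of the input; the dissertation instead uses off-line prediction
# models. All names and the candidates themselves are illustrative.
import time
from collections import defaultdict

def reduce_replicated_buffer(size, indices, values, n_chunks=4):
    """Each 'thread' accumulates into a private full-size buffer; the
    buffers are merged at the end (good for dense access patterns)."""
    chunk = max(1, len(indices) // n_chunks)
    buffers = []
    for c in range(0, len(indices), chunk):
        buf = [0.0] * size                      # private replicated buffer
        for i, v in zip(indices[c:c + chunk], values[c:c + chunk]):
            buf[i] += v
        buffers.append(buf)
    return [sum(col) for col in zip(*buffers)]  # merge step

def reduce_hashed(size, indices, values):
    """Accumulate into a hash map; cheaper when accesses are sparse."""
    acc = defaultdict(float)
    for i, v in zip(indices, values):
        acc[i] += v
    return [acc.get(i, 0.0) for i in range(size)]

def select_and_run(size, indices, values):
    """Time each candidate on a small input sample, then run the winner."""
    sample = max(1, len(indices) // 10)
    best, best_t = None, float("inf")
    for algo in (reduce_replicated_buffer, reduce_hashed):
        t0 = time.perf_counter()
        algo(size, indices[:sample], values[:sample])
        dt = time.perf_counter() - t0
        if dt < best_t:
            best, best_t = algo, dt
    return best(size, indices, values), best.__name__

idx = [1, 3, 1, 0]
vals = [1.0, 2.0, 3.0, 0.5]
result, chosen = select_and_run(5, idx, vals)
```

Both candidates compute the same reduction; only their cost profiles differ, which is exactly why a run-time (or model-based) choice between them can pay off.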
A Sparse SCF algorithm and its parallel implementation: Application to DFTB
We present an algorithm and its parallel implementation for solving a self
consistent problem as encountered in Hartree Fock or Density Functional Theory.
The algorithm takes advantage of the sparsity of matrices through the use of
local molecular orbitals. The implementation efficiently exploits modern
symmetric multiprocessing (SMP) computer architectures. As a first
application, the algorithm is used within the density functional based tight
binding method, for which most of the computational time is spent in the linear
algebra routines (diagonalization of the Fock/Kohn-Sham matrix). We show that
with this algorithm (i) single-point calculations on very large systems
(millions of atoms) can be performed on large SMP machines, (ii) calculations
involving intermediate-size systems (1,000–100,000 atoms) are also strongly
accelerated and can run efficiently on standard servers, and (iii) the error
on the total energy due to the use of a cut-off in the molecular orbital
coefficients can be controlled such that it remains smaller than the SCF
convergence criterion.
Comment: 13 pages, 11 figures
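The claim in point (iii), that a coefficient cut-off yields a controllable total-energy error, can be illustrated on a toy localized-orbital model. Everything in the sketch below (the disordered tight-binding chain, the sizes, the threshold) is our own illustrative assumption, not taken from the paper; the point is that when orbitals are localized, most coefficients are negligible and zeroing them barely changes tr(Cᵀ H C):

```python
import numpy as np

# Toy illustration of the coefficient cut-off: a disordered tight-binding
# chain has exponentially localized eigenvectors, so most molecular-orbital
# coefficients are negligibly small. All sizes and thresholds here are
# illustrative assumptions, not taken from the paper.
rng = np.random.default_rng(1)
n, n_occ = 200, 50
onsite = rng.uniform(-5.0, 5.0, n)              # strong diagonal disorder
hop = 0.1 * np.ones(n - 1)                      # weak nearest-neighbor hopping
H = np.diag(onsite) + np.diag(hop, 1) + np.diag(hop, -1)

eps, C = np.linalg.eigh(H)
C_occ = C[:, :n_occ]                            # "occupied" orbital coefficients

def energy(Cmat):
    """Band-structure-like energy tr(C^T H C)."""
    return float(np.trace(Cmat.T @ H @ Cmat))

cutoff = 1e-8
C_sparse = np.where(np.abs(C_occ) > cutoff, C_occ, 0.0)  # apply the cut-off
sparsity = 1.0 - np.count_nonzero(C_sparse) / C_sparse.size
energy_error = abs(energy(C_sparse) - energy(C_occ))
```

In this toy model the vast majority of coefficients can be dropped while the energy error stays many orders of magnitude below any practical SCF convergence threshold, which is the behavior the abstract describes.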
Chebyshev polynomial filtered subspace iteration in the Discontinuous Galerkin method for large-scale electronic structure calculations
The Discontinuous Galerkin (DG) electronic structure method employs an
adaptive local basis (ALB) set to solve the Kohn-Sham equations of density
functional theory (DFT) in a discontinuous Galerkin framework. The adaptive
local basis is generated on-the-fly to capture the local material physics, and
can systematically attain chemical accuracy with only a few tens of degrees of
freedom per atom. A central issue for large-scale calculations, however, is the
computation of the electron density (and subsequently, ground state properties)
from the discretized Hamiltonian in an efficient and scalable manner. We show
in this work how Chebyshev polynomial filtered subspace iteration (CheFSI) can
be used to address this issue and push the envelope in large-scale materials
simulations in a discontinuous Galerkin framework. We describe how the subspace
filtering steps can be performed in an efficient and scalable manner using a
two-dimensional parallelization scheme, thanks to the orthogonality of the DG
basis set and block-sparse structure of the DG Hamiltonian matrix. The
on-the-fly nature of the ALBs requires additional care in carrying out the
subspace iterations. We demonstrate the parallel scalability of the DG-CheFSI
approach in calculations of large-scale two-dimensional graphene sheets and
bulk three-dimensional lithium-ion electrolyte systems. Employing 55,296
computational cores, the time per self-consistent field iteration for a sample
of the bulk 3D electrolyte containing 8,586 atoms is 90 seconds, and the time
for a graphene sheet containing 11,520 atoms is 75 seconds.
Comment: Submitted to The Journal of Chemical Physics
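The core CheFSI iteration, filter a subspace with a Chebyshev polynomial in the Hamiltonian, re-orthonormalize, then do a Rayleigh–Ritz rotation, can be sketched compactly. The version below is a minimal dense-matrix illustration under our own assumptions (a small random symmetric matrix, exact spectral bounds taken from a reference diagonalization; in practice the bounds are estimated, e.g., with a few Lanczos steps), and it omits the DG-specific block-sparse two-dimensional parallelism:

```python
import numpy as np

# Minimal Chebyshev-filtered subspace iteration (CheFSI) sketch for the
# lowest eigenpairs of a symmetric "Hamiltonian". Matrix size, polynomial
# degree, iteration count, and the way spectral bounds are obtained are all
# illustrative choices, not the paper's settings.
rng = np.random.default_rng(2)
n, n_states, degree = 120, 10, 15
A = rng.standard_normal((n, n))
A = 0.5 * (A + A.T)                                    # toy symmetric matrix

def chebyshev_filter(A, X, degree, lb, ub):
    """Apply a degree-m Chebyshev polynomial in A to X, scaled so the
    unwanted spectrum [lb, ub] is damped and everything below lb grows."""
    e = (ub - lb) / 2.0            # half-width of the damped interval
    c = (ub + lb) / 2.0            # its center
    Y_prev, Y = X, (A @ X - c * X) / e
    for _ in range(2, degree + 1):
        Y_prev, Y = Y, 2.0 * (A @ Y - c * Y) / e - Y_prev
    return Y

# Spectral bounds for the filter (from a reference diagonalization here,
# for illustration only; real codes estimate these cheaply).
evals_exact = np.linalg.eigvalsh(A)
lb, ub = evals_exact[n_states], evals_exact[-1] + 0.1

X = rng.standard_normal((n, n_states))
for _ in range(50):                                    # SCF-like outer loop
    X = chebyshev_filter(A, X, degree, lb, ub)
    X, _ = np.linalg.qr(X)                             # re-orthonormalize
    w, V = np.linalg.eigh(X.T @ A @ X)                 # Rayleigh-Ritz step
    X = X @ V                                          # rotate to Ritz vectors
```

The appeal for large-scale codes is that the filter needs only matrix-block products with the Hamiltonian, and the orthonormalization and Rayleigh–Ritz steps act on tall skinny matrices, both of which parallelize well, which is the structure the two-dimensional parallelization scheme described above exploits.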