20 research outputs found

    An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor

    Full text link
    Modern OpenMP threading techniques are used to convert the MPI-only Hartree-Fock code in the GAMESS program into a hybrid MPI/OpenMP algorithm. Two implementations are considered, differing in whether key data structures, the density and Fock matrices, are shared or replicated among threads. All implementations are benchmarked on a supercomputer of 3,000 Intel Xeon Phi processors; with 64 cores per processor, scaling numbers are reported on up to 192,000 cores. The hybrid MPI/OpenMP implementation reduces the memory footprint by approximately 200 times compared to the legacy code, and was shown to run up to six times faster than the original for a range of molecular system sizes. Comment: SC17 conference paper, 12 pages, 7 figures.
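The shared-versus-replicated tradeoff described above can be illustrated with a toy sketch (a Python stand-in, not the GAMESS Fortran code; the matrix contents and sizes are placeholders): each thread either owns a private copy of the Fock matrix that is reduced at the end, or all threads update a single shared matrix under synchronization, trading memory for contention.

```python
from threading import Lock, Thread

n = 4          # toy basis size; real calculations use thousands of functions
n_threads = 3

def partial_fock(t):
    # Stand-in for one thread's batch of two-electron-integral contributions.
    return [[(t + 1) * (i + j) for j in range(n)] for i in range(n)]

# Replicated scheme: each thread owns a private Fock copy, reduced at the end.
# Memory grows linearly with the thread count.
copies = [partial_fock(t) for t in range(n_threads)]
fock_replicated = [[sum(c[i][j] for c in copies) for j in range(n)]
                   for i in range(n)]

# Shared scheme: one Fock matrix, updates guarded by a lock (a real code
# would use finer-grained synchronization; this only shows the idea).
fock_shared = [[0] * n for _ in range(n)]
lock = Lock()

def accumulate(t):
    part = partial_fock(t)
    with lock:
        for i in range(n):
            for j in range(n):
                fock_shared[i][j] += part[i][j]

threads = [Thread(target=accumulate, args=(t,)) for t in range(n_threads)]
for th in threads:
    th.start()
for th in threads:
    th.join()

# Both schemes produce the same Fock matrix.
assert fock_shared == fock_replicated
```

The shared scheme holds one matrix regardless of thread count, which is the source of the memory-footprint reduction reported above.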

    Knowledge is power: Quantum chemistry on novel computer architectures

    Get PDF
    In the first chapter of this thesis, a background of fundamental quantum chemistry concepts is provided. Chapter two contains an analysis of the performance and energy efficiency of various modern computer processor architectures while performing computational chemistry calculations. In chapter three, the processor architectural study is expanded to include parallel computational chemistry algorithms executed across multiple-node computer clusters. Chapter four describes a novel computational implementation of the fundamental Hartree-Fock method which significantly reduces computer memory requirements. In chapter five, a case study of quantum chemistry two-electron integral code interoperability is described. The final chapters of this work discuss applications of quantum chemistry. In chapter six, an investigation of the esterification of acetic acid on acid-functionalized silica is presented. In chapter seven, the application of ab initio molecular dynamics to study the photoisomerization and photocyclization of stilbene is discussed. Final concluding remarks are noted in chapter eight.

    Complexity Reduction in Density Functional Theory: Locality in Space and Energy

    Full text link
    We present recent developments of the NTChem program for performing large-scale hybrid Density Functional Theory calculations on the supercomputer Fugaku. We combine these developments with our recently proposed Complexity Reduction Framework to assess the impact of basis-set and functional choice on its measures of fragment quality and interaction. We further exploit the all-electron representation to study system fragmentation in various energy envelopes. Building on this analysis, we propose two algorithms for computing the orbital energies of the Kohn-Sham Hamiltonian. We demonstrate that these algorithms can be applied efficiently to systems composed of thousands of atoms, and that they serve as an analysis tool revealing the origin of spectral properties. Comment: Accepted Manuscript.

    Investigation of charge migration/transfer in radical cations using Ehrenfest method with fully quantum nuclear motion

    Get PDF
    The main focus of this thesis is to investigate the effect of charge migration on molecular dynamics. Upon the creation of a superposition of cationic states by a short ionizing pulse in an attosecond pump-probe experiment, the electronic wavefunction is in a non-stationary state and the initial dynamics are purely electronic, driven by Charge Migration (CM) before the onset of any nuclear motion. CM can be simulated within a frozen-nuclei framework, but its impact on long-term dynamics and its competition with vibrationally mediated charge motion (i.e. Charge Transfer (CT)) remain unknown. Unravelling the mechanism behind CM and its role in electronic and nuclear coherence can help in designing an initial superposition of electronic states that steers nuclear motion toward a specific product. Further control of the photo-reactivity could be achieved with probe/control laser pulses, opening the door to more direct comparison with experimental results. In order to investigate the dynamics upon photoionization with an attosecond pump pulse, the coupled electron-nuclear dynamics of the system are simulated using nonadiabatic quantum dynamics techniques within the sudden approximation. A single-set approach is adopted for the expansion of the nuclear wavefunction using a linear combination of Gaussian Wavepackets (GWP). The calculation is done using the Quantum-Ehrenfest method (QuEh), and the time-dependent Potential Energy Surfaces (PES) are evaluated with the Complete Active Space Configuration Interaction (CAS-CI) method. The resulting dynamics are analyzed with adiabatic/diabatic state populations, Normal Mode (NM) displacements, and bond lengths averaged over the nuclear wavepacket using Gross Gaussian populations (GGP). To reduce the cost of computation, the algorithm implemented in QUANTICS is parallelized using the Message Passing Interface (MPI). 
Further, the section of code that interacts with the database of previously calculated points on the PES is rewritten using the Structured Query Language (SQL) and the SQLite engine. For the purpose of unravelling the mechanism behind CM, the nonadiabatic dynamics of a model retinal Protonated Schiff Base (rPSB) and of benzene are investigated by defining the initial electronic wavefunction in a systematic way. As demonstrated by the results on rPSB, relaxation mechanisms such as single- and double-bond-length alternation and isomerization can be controlled by varying the initial composition of electronic states. With the rich symmetry of benzene, the initial nuclear dynamics, which are controlled by an initial gradient and by electron dynamics, can be analyzed using symmetry rules. The initial gradient is a combination of totally symmetric and non-symmetric components, which correspond to the intra-state (eigenstate) and inter-state (coupling) gradients, respectively. The electron dynamics and its associated nuclear motion can be examined by grouping together the localized holes where the CM occurs. With the initial gradient and CM, one can predict the initial nuclear relaxation and possibly control the photo-products formed by designing a specific superposition of electronic eigenstates. To explore the effect of laser pulses on the dynamics, the dipole-electric-field dot product is implemented within the dipole approximation in the GAUSSIAN program. The dynamics in the presence of an infrared probe pulse are simulated for model systems such as allene and the ethylene cation. The pulse is able to induce changes in the electron and nuclear dynamics of the system, and some of its effects can be explained using irreducible representations and the alignment of the electric fields. The work presented in this thesis offers an insight into the photocontrol of molecules and opens the door to further investigation of charge-directed dynamics.
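The idea of an SQLite-backed cache of previously computed PES points can be sketched as follows (a minimal illustration, not the QUANTICS implementation; the table layout, geometry key, and energy value are hypothetical): look a geometry up first, and only call the expensive electronic-structure routine on a miss.

```python
import sqlite3

# In-memory database standing in for the on-disk PES cache.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pes (geom TEXT PRIMARY KEY, energy REAL)")

def energy_at(geom_key, compute):
    """Return a cached energy for this geometry, computing it on a miss."""
    row = con.execute("SELECT energy FROM pes WHERE geom = ?",
                      (geom_key,)).fetchone()
    if row is not None:
        return row[0]
    e = compute()
    con.execute("INSERT INTO pes VALUES (?, ?)", (geom_key, e))
    return e

calls = []
def expensive_point():
    calls.append(1)   # stands in for a CAS-CI single-point calculation
    return -78.5      # hypothetical energy in hartree

e1 = energy_at("r=1.40", expensive_point)
e2 = energy_at("r=1.40", expensive_point)  # second request hits the cache
assert e1 == e2 and len(calls) == 1
```

Because SQLite handles locking and durability itself, such a cache can be shared safely between runs without custom file-format code.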

    Dataflow Programming Paradigms for Computational Chemistry Methods

    Get PDF
    The transition to multicore and heterogeneous architectures has shaped the High Performance Computing (HPC) landscape over the past decades. With the increase in scale, complexity, and heterogeneity of modern HPC platforms, one of the grim challenges for traditional programming models is to sustain the expected performance at scale. By contrast, dataflow programming models have been growing in popularity as a means to deliver a good balance between performance and portability in the post-petascale era. This work introduces dataflow programming models for computational chemistry methods and compares different dataflow executions in terms of programmability, resource utilization, and scalability. This effort is driven by computational chemistry applications, considering that they comprise one of the driving forces of HPC. In particular, many-body methods, such as Coupled Cluster (CC) methods, which are the gold standard for computing energies in quantum chemistry, are of particular interest to the applied chemistry community. On that account, the latest development for CC methods is used as the primary vehicle for this research, but the effort is not limited to CC and can be applied across other application domains. Two programming paradigms for expressing CC methods in a dataflow form, in order to make them capable of utilizing task-scheduling systems, are presented. Explicit dataflow, the programming model in which the dataflow is specified explicitly by the developer, is contrasted with implicit dataflow, in which a task-scheduling runtime derives the dataflow. An abstract model is derived to explore the limits of the different dataflow programming paradigms.
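The explicit-dataflow idea can be sketched in a few lines (task names and values are invented, and real task-scheduling runtimes are far more sophisticated): each task declares its inputs, and a scheduler runs any task whose dependencies are already satisfied, regardless of the order in which tasks were written down.

```python
# Explicit dataflow: the developer states each task's dependencies;
# the scheduler may run any task whose inputs are ready.
tasks = {
    "t2_amplitudes": {"deps": ["integrals"],
                      "run": lambda env: env["integrals"] * 0.5},
    "integrals":     {"deps": [],
                      "run": lambda env: 2.0},
    "energy":        {"deps": ["t2_amplitudes"],
                      "run": lambda env: env["t2_amplitudes"] + 1.0},
}

def execute(tasks):
    """Run tasks in dependency order; results accumulate in env."""
    env, done = {}, set()
    while len(done) < len(tasks):
        for name, t in tasks.items():
            if name not in done and all(d in done for d in t["deps"]):
                env[name] = t["run"](env)
                done.add(name)
    return env

env = execute(tasks)
assert env["energy"] == 2.0
```

In an implicit-dataflow system the `deps` lists would not be written by hand; the runtime would infer them from the data each task reads and writes.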

    Modeling Energy Consumption of High-Performance Applications on Heterogeneous Computing Platforms

    Get PDF
    Achieving Exascale computing is one of the current leading challenges in High Performance Computing (HPC). Obtaining this next level of performance will allow more complex simulations to be run on larger datasets and offer researchers better tools for data processing and analysis. In the dawn of Big Data, the need for supercomputers will only increase. However, these systems are costly to maintain because power is expensive, so a better understanding of power and energy consumption is required so that future hardware can benefit. Available power models accurately capture the relationship to the number of cores and clock rate; however, the relationship between workload and power is less understood. Thus, investigation and analysis of power measurements has been a focal point of this work, with the aim of improving the general understanding of energy consumption in the context of HPC. This dissertation investigates the power and energy consumption of many different parallel applications on several hardware platforms while varying a number of execution characteristics. Multicore and manycore hardware devices are investigated in homogeneous and heterogeneous computing environments. Further, common techniques for reducing power and energy consumption are applied to each of these devices. Well-known power and performance models have been combined to form the Execution-Phase model, which may be used to quantify energy contributions by execution phase and has been used to predict energy consumption to within 10%. However, due to limitations in the measurement procedure, a less intrusive approach is required. The Empirical Mode Decomposition (EMD) and Hilbert-Huang Transform analysis technique has been applied in innovative ways to model, analyze, and visualize power and energy measurements. 
EMD is widely used in other research areas, including earthquake, brain-wave, speech-recognition, and sea-level-rise analysis, and this is the first time it has been applied to power traces to analyze the complex interactions occurring within HPC systems. Probability distributions may be used to represent power and energy traces, thereby providing an alternative means of predicting energy consumption while retaining the fact that power is not constant over time. Further, these distributions may be used to define the cost of a workload for a given computing platform.
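The distribution view of a power trace can be sketched as follows (the wattage samples and sampling interval are hypothetical; real traces come from hardware power meters): for a uniformly sampled trace, the energy obtained by integrating the samples equals the sample mean times the elapsed time, so a distribution summarizing the samples suffices to predict energy.

```python
import statistics

# Hypothetical power trace: samples in watts at a fixed 0.1 s interval.
dt = 0.1
power = [95.0, 110.0, 130.0, 120.0, 100.0, 90.0]

# Direct numerical integration of the trace: E = sum(P_i * dt).
energy_joules = sum(p * dt for p in power)

# Distribution view: energy as the sample mean times elapsed time,
# which matches the integral for uniform sampling.
energy_from_mean = statistics.mean(power) * dt * len(power)

assert abs(energy_joules - energy_from_mean) < 1e-9
```

The advantage of keeping the full distribution rather than only the mean is that variance and tails of the power draw, which matter for provisioning, are preserved.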

    Accelerating Dynamical Density Response Code on Summit and Its Application for Computing the Density Response Function of Vanadium Sesquioxide

    Get PDF
    This thesis details the process of porting the Eguiluz group's dynamical density response computational platform to the hybrid CPU+GPU environment of the Summit supercomputer at the Oak Ridge Leadership Computing Facility, Oak Ridge National Laboratory (ORNL). The baseline CPU-only version is a Gordon Bell-winning platform within the formally exact time-dependent density functional theory (TD-DFT) framework using the linearly augmented plane wave (LAPW) basis set. The code is accelerated using a combination of the OpenACC programming model and GPU libraries -- namely, the Matrix Algebra on GPU and Multicore Architectures (MAGMA) library -- as well as by exploiting the sparsity pattern of the matrices involved in the matrix-matrix multiplication. Benchmarks show a 12.3x speedup compared to the CPU-only version. This performance boost should accelerate discovery in materials and condensed-matter physics through computational means. After the hybrid CPU+GPU code has been sufficiently optimized, it is used to study the dynamical density response function of vanadium sesquioxide, and the results are compared with spectroscopic data from non-resonant inelastic X-ray scattering (NIXS) experiments.

    Optimisation of the first principle code Octopus for massive parallel architectures: application to light harvesting complexes

    Get PDF
    Computer simulation has become a powerful technique for assisting scientists in developing novel insights into the basic phenomena underlying a wide variety of complex physical systems. The work reported in this thesis concerns the use of massively parallel computers to simulate, at the electronic-structure level, the fundamental features that control the initial stages of harvesting and transfer of solar energy in green plants, which initiate the photosynthetic process. Currently available supercomputer facilities offer the possibility of using hundreds of thousands of computing cores. However, obtaining a linear speed-up from HPC systems is far from trivial, so great effort must be devoted to understanding the nature of the scientific code, the methods of parallel execution, the data communication requirements of multi-process calculations, the efficient use of available memory, and so on. This thesis deals with all of these themes, with a clear objective in mind: the electronic-structure simulation of complete macro-molecular complexes, namely the Light Harvesting Complex II, with the aim of understanding its physical behaviour. In order to simulate this complex, we have used (with the assistance of the PRACE consortium) some of the most powerful supercomputers in Europe to run Octopus, a scientific software package for Density Functional Theory and Time-Dependent Density Functional Theory calculations. Results obtained with Octopus have been analysed in depth in order to identify the main obstacles to optimal scaling on thousands of cores. Many problems emerged, mainly the poor performance of the Poisson solver, high memory requirements, and the transfer of large quantities of complex data structures among processes. All of these problems have been overcome, and the new version achieves very high performance on massively parallel systems. 
Tests run efficiently on up to 128K processors, and we have thus been able to complete the largest TDDFT calculations performed to date. At the conclusion of this work it has been possible to study the Light Harvesting Complex II as originally envisioned. 
Funding and support: University of the Basque Country (UPV/EHU), University of Coimbra, Red Española de Supercomputación (RES), Jülich Supercomputing Centre (JSC), Rechenzentrum Garching, Cineca, Barcelona Supercomputing Center (BSC), CeSViMa, European Research Council Advanced Grant DYNamo (ERC-2010-AdG-267374), Spanish Grant FIS2013-46159-C3-1-P, Grupos Consolidados UPV/EHU del Gobierno Vasco (IT578-13 and IT395-10), European Community FP7 project CRONOS (Grant number 280879-2), COST Actions CM1204 (XLIC) and MP1306 (EUSpec); the ALDAPA research group belongs to the Basque Advanced Informatics Laboratory (BAILab), supported by the University of the Basque Country UPV/EHU (grant UFI11/45). Peer Reviewed