64 research outputs found

    Development, validation and use of computational space weather models (orig. Laskennallisten avaruussäämallien kehittäminen, validointi ja käyttö)

    Currently the majority of space-based assets are located inside the Earth's magnetosphere, where they must endure the effects of the near-Earth space environment, i.e. space weather, which is driven by the supersonic flow of plasma from the Sun. Space weather refers to the day-to-day changes in the temperature, magnetic field and other parameters of near-Earth space, similarly to ordinary weather, which refers to changes in the atmosphere above ground level. Space weather can also cause adverse effects on the ground, for example by inducing large direct currents in power transmission systems. The performance of computers has been growing exponentially for many decades and, as a result, the importance of numerical modeling in science has also increased rapidly. Numerical modeling is especially important in space plasma physics because there are no in-situ observations of space plasmas outside of the heliosphere and it is not feasible to study all aspects of space plasmas in a terrestrial laboratory. With the increasing number of computational cores in supercomputers, the parallel performance of numerical models on distributed-memory hardware is also becoming crucial. This thesis consists of an introduction and four peer-reviewed articles, and describes the process of developing numerical space environment/weather models and the use of such models to study near-Earth space. A complete model development chain is presented, starting from initial planning and design, through distributed-memory parallelization and optimization, to testing, verification and validation of numerical models. A grid library that provides good parallel scalability on distributed-memory hardware and several novel features, the distributed Cartesian cell-refinable grid (DCCRG), is designed and developed. DCCRG is presently used in two numerical space weather models being developed at the Finnish Meteorological Institute.
The first global magnetospheric test particle simulation based on the Vlasov description of plasma is carried out using the Vlasiator model. The test shows that the Vlasov equation for plasma in six-dimensional phase space is solved correctly by Vlasiator, that results are obtained beyond those of the magnetohydrodynamic (MHD) description of plasma, and that global magnetospheric simulations using a hybrid-Vlasov model are feasible on current hardware. For the first time, four global magnetospheric models using the MHD description of plasma (BATS-R-US, GUMICS, OpenGGCM, LFM) are run with identical solar wind input and the results are compared to observations in the ionosphere and outer magnetosphere. Based on the results of the global magnetospheric MHD model GUMICS, a hypothesis is formulated for a new mechanism of plasmoid formation in the Earth's magnetotail.
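The Vlasov description referred to above evolves the particle distribution function f(r, v, t) in six-dimensional phase space (three position and three velocity coordinates). In its standard form, the Vlasov equation for a species of charge q and mass m reads:

```latex
\frac{\partial f}{\partial t}
  + \mathbf{v}\cdot\nabla_{\mathbf{r}} f
  + \frac{q}{m}\left(\mathbf{E} + \mathbf{v}\times\mathbf{B}\right)\cdot\nabla_{\mathbf{v}} f = 0
```

In a hybrid-Vlasov model such as Vlasiator, ions are treated kinetically with this equation while electrons are modeled as a fluid, which is what makes global magnetospheric runs tractable compared to a fully kinetic approach.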

    Software for Exascale Computing - SPPEXA 2016-2019

    This open access book summarizes the research done and the results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG), presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer's series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA's first funding phase, and provides an overview of SPPEXA's contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

    FEMPAR: an object-oriented parallel finite element framework

    FEMPAR is an open source, object-oriented Fortran200X scientific software library for the high-performance, scalable simulation of complex multiphysics problems governed by partial differential equations at large scales, exploiting state-of-the-art supercomputing resources. It is a highly modularized, flexible, and extensible library that provides a set of modules that can be combined to carry out the different steps of the simulation pipeline. FEMPAR includes a rich set of algorithms for the discretization step, namely (arbitrary-order) grad-, div-, and curl-conforming finite element methods, discontinuous Galerkin methods, B-splines, and unfitted finite element techniques on cut cells, combined with h-adaptivity. The linear solver module relies on state-of-the-art bulk-asynchronous implementations of multilevel domain decomposition solvers for the different discretization alternatives and block-preconditioning techniques for multiphysics problems. FEMPAR is a framework that provides users with out-of-the-box, state-of-the-art discretization techniques and highly scalable solvers for the simulation of complex applications, hiding the dramatic complexity of the underlying algorithms. But it is also a highly extensible framework for researchers who want to experiment with new algorithms and solvers. In this work, the first in a series of articles about FEMPAR, we provide a detailed introduction to the software abstractions used in the discretization module and the related geometrical module. We also provide some ingredients about the assembly of linear systems arising from finite element discretizations, but the software design of complex scalable multilevel solvers is postponed to a subsequent work.
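The assembly and solution of a finite element linear system, mentioned at the end of the abstract, can be illustrated with a deliberately minimal 1D example. This is a generic Python sketch, not FEMPAR's Fortran interfaces; all names are illustrative. It assembles the tridiagonal stiffness system for -u''(x) = 1 on (0,1) with u(0) = u(1) = 0 using linear "hat" elements, and solves it with the Thomas algorithm.

```python
# Hypothetical minimal sketch of 1D linear finite element assembly:
# solve -u''(x) = 1 on (0,1), u(0) = u(1) = 0, on n uniform elements.
def assemble_and_solve(n):
    h = 1.0 / n
    # Each element contributes the local stiffness matrix (1/h)*[[1,-1],[-1,1]];
    # summing contributions gives a tridiagonal system over the n-1 interior nodes.
    main = [2.0 / h] * (n - 1)   # diagonal entries
    off  = [-1.0 / h] * (n - 2)  # sub-/super-diagonal entries (symmetric)
    rhs  = [h] * (n - 1)         # exact integration of the load f = 1
    # Thomas algorithm: forward elimination of the sub-diagonal...
    for i in range(1, n - 1):
        w = off[i - 1] / main[i - 1]
        main[i] -= w * off[i - 1]
        rhs[i]  -= w * rhs[i - 1]
    # ...then back substitution.
    u = [0.0] * (n - 1)
    u[-1] = rhs[-1] / main[-1]
    for i in range(n - 3, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / main[i]
    return u  # interior nodal values

u = assemble_and_solve(8)
# The exact solution is u(x) = x(1-x)/2; for this problem linear FEM
# with exact load integration is exact at the nodes, so u[3] (x = 0.5) = 0.125.
```

A production framework like FEMPAR hides exactly this loop structure (local element matrices, global assembly, solver) behind its module abstractions, and replaces the direct tridiagonal solve with scalable multilevel domain decomposition preconditioners.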


    Resilience for large ensemble computations

    With the increasing power of supercomputers, ever more detailed models of physical systems can be simulated, and ever larger problem sizes can be considered for any kind of numerical system. During the last twenty years the performance of the fastest clusters went from the teraFLOPS domain (ASCI RED: 2.3 teraFLOPS) to the pre-exaFLOPS domain (Fugaku: 442 petaFLOPS), and we will soon have the first supercomputer with a peak performance cracking the exaFLOPS barrier (El Capitan: 1.5 exaFLOPS). Ensemble techniques are experiencing a renaissance with the availability of those extreme scales, and recent techniques such as particle filters will especially benefit from them. Current ensemble methods in climate science, such as ensemble Kalman filters, exhibit a linear dependency between the problem size and the ensemble size, while particle filters show an exponential dependency. Nevertheless, with the prospect of massive computing power come challenges such as power consumption and fault tolerance. The mean time between failures shrinks with the number of components in the system, and failures are expected every few hours at exascale. In this thesis, we explore and develop techniques to protect large ensemble computations from failures. We present novel approaches in differential checkpointing, elastic recovery, fully asynchronous checkpointing, and checkpoint compression. Furthermore, we design and implement a fault-tolerant particle filter with pre-emptive particle prefetching and caching. And finally, we design and implement a framework for the automatic validation and application of lossy compression in ensemble data assimilation. Altogether, we present five contributions in this thesis, where the first two improve state-of-the-art checkpointing techniques and the last three address the resilience of ensemble computations. The contributions represent stand-alone fault-tolerance techniques; however, they can also be used to improve the properties of each other.
For instance, we utilize elastic recovery (2nd contribution) to improve resilience in an online ensemble data assimilation framework (3rd contribution), and we build our validation framework (5th contribution) on top of our particle filter implementation (4th contribution). We further demonstrate that our contributions improve resilience and performance with experiments on various architectures such as Intel, IBM, and ARM processors.
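The core idea of differential checkpointing (the first contribution above: persist only the parts of the state that changed since the last checkpoint) can be sketched with block-level hashing. This is a generic illustration assuming a flat byte buffer as application state; the class and method names are hypothetical and do not reflect the thesis's actual implementation or interfaces.

```python
import hashlib

# Hypothetical sketch of differential (block-level) checkpointing.
class DiffCheckpointer:
    def __init__(self, block_size=64):
        self.block_size = block_size
        self.hashes = {}   # block index -> hash of the last written content
        self.stored = {}   # block index -> block bytes (stands in for disk)

    def checkpoint(self, state: bytes):
        """Persist only blocks whose content changed since the last checkpoint."""
        written = 0
        for i in range(0, len(state), self.block_size):
            block = state[i:i + self.block_size]
            digest = hashlib.sha256(block).digest()
            idx = i // self.block_size
            if self.hashes.get(idx) != digest:
                self.stored[idx] = block   # only dirty blocks hit storage
                self.hashes[idx] = digest
                written += 1
        return written  # number of blocks actually written

    def restore(self) -> bytes:
        """Reassemble the full state from the stored blocks."""
        return b"".join(self.stored[i] for i in sorted(self.stored))

ckpt = DiffCheckpointer(block_size=4)
w1 = ckpt.checkpoint(b"aaaabbbbcccc")  # first checkpoint writes all 3 blocks
w2 = ckpt.checkpoint(b"aaaaBBBBcccc")  # only the middle block changed: 1 write
```

The payoff is largest when checkpoints are frequent (as the shrinking mean time between failures forces at exascale) but only a small fraction of the state changes between them.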

    On the suitability of Exascale machines for algorithms implementing the Reverse Time Migration method (orig. Étude de l'adéquation des machines Exascale pour les algorithmes implémentant la méthode du Reverse Time Migration)

    As Exascale systems are expected in the 2018-2020 time frame, performance analysis and characterization of applications for new processor architectures and large-scale systems are important tasks that make it possible to anticipate the changes required to efficiently exploit future HPC systems. This thesis focuses on seismic imaging applications used for modeling complex physical phenomena, in particular the depth imaging application called Reverse Time Migration (RTM). My first contribution consists in characterizing and modeling the performance of the computational core of RTM, which is based on finite-difference time-domain (FDTD) computations. I identify and explore the major tuning parameters influencing performance and the interaction between the architecture and the application. The second contribution is an analysis to identify the challenges for a hybrid and heterogeneous implementation of FDTD for manycore architectures. We target Intel's first Xeon Phi co-processor, the Knights Corner. This architecture is an interesting proxy for our study since it contains some of the expected features of an Exascale system: concurrency and heterogeneity. My third contribution is an extension of the performance analysis and modeling to the full RTM. This adds communications and IOs to the computation part. RTM is a data-intensive application and requires the storage of intermediate values of the computational field, resulting in expensive IO accesses. My fourth contribution is the final measurement and model validation of my hybrid RTM implementation on a large system.
This has been done on Stampede, a machine of the Texas Advanced Computing Center (TACC), which allows us to test the scalability up to 64 nodes, each containing one 61-core Xeon Phi and two 8-core CPUs, for a total of close to 5000 heterogeneous cores.
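The FDTD kernel at the core of RTM can be illustrated with a minimal 1D second-order leapfrog update. The production kernels are 3D with high-order stencils, so this Python sketch (with illustrative names) only shows the structure of the time step whose memory traffic the thesis models.

```python
# Minimal 1D acoustic FDTD sketch (illustrative; real RTM kernels are 3D,
# high-order, and heavily tuned for caches, bandwidth, and prefetching).
def fdtd_step(u_prev, u_curr, c2dt2_dx2):
    """One leapfrog step: u_next = 2*u - u_prev + (c*dt/dx)^2 * laplacian(u)."""
    n = len(u_curr)
    u_next = [0.0] * n
    for i in range(1, n - 1):  # fixed (zero) boundaries
        lap = u_curr[i - 1] - 2.0 * u_curr[i] + u_curr[i + 1]
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + c2dt2_dx2 * lap
    return u_next

# At the 1D stability limit (c*dt/dx)^2 = 1, a right-traveling unit pulse
# advances exactly one cell per step.
u_curr = [0.0] * 101; u_curr[30] = 1.0
u_prev = [0.0] * 101; u_prev[29] = 1.0   # the pulse one cell to the left at t - dt
u_next = fdtd_step(u_prev, u_curr, 1.0)  # pulse now at index 31
```

Each grid point reads three neighbors and two time levels per update, which is why the thesis's performance model centers on DRAM traffic: the arithmetic intensity of such stencils is low, and RTM additionally streams intermediate wavefields to storage.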

    Computational Electromagnetism and Acoustics

    It is hardly necessary to stress the significance of accurate and fast numerical methods for the simulation of electromagnetic fields and sound propagation for modern technology. This has triggered a surge of research in mathematical modeling and numerical analysis aimed at devising and improving methods for computational electromagnetism and acoustics. Numerical techniques for solving the initial boundary value problems underlying both computational electromagnetics and acoustics comprise a wide array of approaches, ranging from integral equation methods to finite differences. Their development faces a few typical challenges: highly oscillatory solutions, control of numerical dispersion, infinite computational domains, ill-conditioned discrete operators, lack of strong ellipticity, and hysteresis phenomena, to name only a few. Profound mathematical analysis is indispensable for tackling these issues. Many outstanding contributions at this Oberwolfach conference on Computational Electromagnetism and Acoustics strikingly confirmed the immense recent progress made in the field. To name only a few highlights: there have been breakthroughs in the application and understanding of phase modulation and extraction approaches for the discretization of boundary integral equations at high frequencies. Much has been achieved in the development and analysis of discontinuous Galerkin methods. New insights have been gained into the construction and relationships of absorbing boundary conditions, also for periodic media. Considerable progress has been made in the design of stable and space-time adaptive discretization techniques for wave propagation. New ideas have emerged for the fast and robust iterative solution of discrete quasi-static electromagnetic boundary value problems.

    Software Roadmap to Plug and Play Petaflop/s
