    The art of solving a large number of non-stiff, low-dimensional ordinary differential equation systems on GPUs and CPUs

    This paper discusses the main performance barriers for solving a large number of independent ordinary differential equation systems on processors (CPUs) and graphics cards (GPUs). With a naïve approach, for instance, the utilisation of a CPU can be as low as 4% of its theoretical peak processing power. The main barriers, identified by a detailed analysis of the hardware architectures and by profiling with hardware performance monitoring units, are as follows. First, exploitation of the SIMD capabilities of the CPU via vector registers; the solution is to implement or enforce explicit vectorisation. Second, hiding instruction latencies on both CPUs and GPUs, which can be achieved by increasing instruction-level parallelism. Third, the efficient handling of large timescale differences or event handling using the massively parallel architecture of GPUs; a viable option to overcome this difficulty is asynchronous time stepping. The above optimisation techniques and their implementation possibilities are discussed and tested on three program packages: MPGOS, written in C++ and specialised only for GPUs; ODEINT, implemented in C++, which supports execution on both CPUs and GPUs; and DifferentialEquations.jl, written in Julia, which also supports execution on both CPUs and GPUs. The tested systems (Lorenz equation, Keller–Miksis equation and a pressure relief valve model) are non-stiff and have low dimension. Thus, the performance of the codes is not limited by memory bandwidth, and Runge–Kutta type solvers are efficient and suitable choices. The hardware employed is an Intel Core i7-4820K CPU with 30.4 GFLOPS peak double-precision performance per core and an Nvidia GeForce Titan Black GPU with a total of 1707 GFLOPS peak double-precision performance.
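
The problem class in this abstract (many independent, low-dimensional, non-stiff systems advanced with a Runge–Kutta scheme) can be illustrated with a minimal sketch. It is not taken from MPGOS, ODEINT or DifferentialEquations.jl; the function names, the fixed step size and the parameter values are assumptions chosen for illustration. The plain per-system loop corresponds to the naïve baseline that the paper's vectorised and GPU implementations aim to improve on.

```python
def lorenz(state, rho, sigma=10.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system for one parameter value rho."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt, *params):
    """One classical fourth-order Runge-Kutta step with fixed step size dt."""
    k1 = f(state, *params)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def solve_ensemble(rhos, initial_state, dt, n_steps):
    """Integrate one independent Lorenz system per entry of rhos.

    The systems never exchange data, so the outer loop is the part that
    SIMD lanes or GPU threads would execute in parallel.
    """
    finals = []
    for rho in rhos:
        state = tuple(initial_state)
        for _ in range(n_steps):
            state = rk4_step(lorenz, state, dt, rho)
        finals.append(state)
    return finals


if __name__ == "__main__":
    # Hypothetical parameter sweep: 8 systems differing only in rho.
    rhos = [0.5 + 1.5 * i for i in range(8)]
    for rho, final in zip(rhos, solve_ensemble(rhos, (1.0, 1.0, 1.0), 0.01, 2000)):
        print(rho, final)
```

Because each system depends only on its own state and parameter, the work per system is small and arithmetic-bound, which is why the abstract's focus is on vector registers and instruction latency rather than memory bandwidth.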

    emgr - The Empirical Gramian Framework

    System Gramian matrices are a well-known encoding for properties of input-output systems such as controllability, observability or minimality. These so-called system Gramians were developed in linear system theory for applications such as model order reduction of control systems. Empirical Gramians are an extension of the system Gramians to parametric and nonlinear systems, as well as a data-driven method of computation. The empirical Gramian framework - emgr - implements the empirical Gramians in a uniform and configurable manner, with applications such as Gramian-based (nonlinear) model reduction, decentralized control, sensitivity analysis, parameter identification and combined state and parameter reduction.
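
The data-driven idea behind an empirical controllability Gramian can be sketched for the simplest case, a stable linear system x' = Ax + Bu excited by unit impulses. This is a simplified illustration under that assumption, not emgr's interface or its general definition (which, per the abstract, also covers parametric and nonlinear systems); all function names and the test system are hypothetical.

```python
def simulate(A, x0, dt, n_steps):
    """Integrate x' = A x with classical RK4 and return the whole trajectory."""
    def f(x):
        return [sum(a * xj for a, xj in zip(row, x)) for row in A]

    x = list(x0)
    trajectory = [list(x)]
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f([xi + 0.5 * dt * k for xi, k in zip(x, k1)])
        k3 = f([xi + 0.5 * dt * k for xi, k in zip(x, k2)])
        k4 = f([xi + dt * k for xi, k in zip(x, k3)])
        x = [
            xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
        ]
        trajectory.append(list(x))
    return trajectory


def empirical_controllability_gramian(A, B, dt, n_steps):
    """Accumulate W = sum over inputs of the time integral of x(t) x(t)^T.

    A unit impulse on input m is equivalent to starting the unforced
    system from x(0) = B[:, m], so only state snapshots are needed.
    """
    n = len(A)
    W = [[0.0] * n for _ in range(n)]
    for m in range(len(B[0])):
        x0 = [B[i][m] for i in range(n)]
        trajectory = simulate(A, x0, dt, n_steps)
        for k, x in enumerate(trajectory):
            weight = 0.5 * dt if k in (0, n_steps) else dt  # trapezoidal rule
            for i in range(n):
                for j in range(n):
                    W[i][j] += weight * x[i] * x[j]
    return W


if __name__ == "__main__":
    # Hypothetical two-state diagonal system with a single input.
    A = [[-1.0, 0.0], [0.0, -2.0]]
    B = [[1.0], [1.0]]
    for row in empirical_controllability_gramian(A, B, 0.01, 2000):
        print(row)
```

For this diagonal test system the analytic infinite-horizon Gramian has entries 1/(a_i + a_j), that is [[1/2, 1/3], [1/3, 1/4]], which the snapshot-based estimate should approach as the time horizon grows and the step size shrinks.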

    Memory-friendly fixed-point iteration method for nonlinear surface mode oscillations of acoustically driven bubbles: from the perspective of high-performance GPU programming

    A fixed-point iteration technique is presented to handle the implicit nature of the governing equations of nonlinear surface mode oscillations of acoustically excited microbubbles. The model is adopted from the theoretical work of Shaw [1], where the dynamics of the mean bubble radius and the surface modes are bi-directionally coupled via nonlinear terms. The model comprises a set of second-order ordinary differential equations. It extends the classic Keller–Miksis equation and the linearized dynamical equations for each surface mode. Only the implicit parts (containing the second derivatives) are reevaluated during the iteration process. The performance of the technique is tested at various parameter combinations. The majority of the test cases need only a single reevaluation to reach an error of 10^-9. Although the arithmetic operation count is higher than that of Gaussian elimination, the technique is memory-friendly and matrix-free, which makes it a viable alternative for high-performance GPU computations of massive parameter studies.
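
The matrix-free idea can be sketched on a generic implicit relation x = b - C(x), where x stands for the unknown second derivatives, b for the explicit part and C for a weak implicit coupling. The bubble model itself is not reproduced here; the coupling function, its strength `eps` and the tolerance are hypothetical, and convergence assumes that the coupling is a contraction.

```python
def fixed_point_solve(apply_coupling, b, tol=1e-9, max_iter=100):
    """Solve x = b - C(x) by fixed-point iteration without storing a matrix.

    apply_coupling(x) returns C(x); only this implicit part is reevaluated
    in each pass, while the explicit part b is computed once.
    """
    x = list(b)  # initial guess: neglect the implicit coupling entirely
    for iteration in range(1, max_iter + 1):
        coupling = apply_coupling(x)
        x_new = [bi - ci for bi, ci in zip(b, coupling)]
        change = max(abs(new - old) for new, old in zip(x_new, x))
        x = x_new
        if change < tol:
            return x, iteration
    raise RuntimeError("fixed-point iteration did not converge")


def weak_coupling(x, eps=1e-3):
    """Hypothetical coupling: each component feels the sum of the others."""
    total = sum(x)
    return [eps * (total - xi) for xi in x]


if __name__ == "__main__":
    b = [1.0, 2.0, 3.0]
    x, iterations = fixed_point_solve(weak_coupling, b)
    print(x, iterations)
```

The trade-off named in the abstract shows up directly: each pass costs another evaluation of the coupling terms, but no matrix is assembled or factorised, so the per-thread memory footprint stays small.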

    Software for Exascale Computing - SPPEXA 2016-2019

    This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG), presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer’s series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA’s first funding phase, and provides an overview of SPPEXA’s contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

    High-performance computing for impact-induced fracture analysis exploiting octree mesh patterns

    Impact-induced fracture analysis has a wide range of engineering and defence applications, including aerospace, manufacturing and construction. An accurate simulation of impact events often requires modelling large-scale complex geometries along with dynamic stress waves and damage propagation. To perform such simulations in a timely manner, a highly efficient and scalable computational framework is necessary. This thesis aims to develop a high-performance computational framework for analysing large-scale structural problems pertaining to impact-induced fracture events. A hierarchical grid-based mesh containing octree cells is utilised for discretising the problem domain. The scaled boundary finite element method (SBFEM) is employed, which can efficiently handle the octree cells by eliminating the hanging node issues. The octree mesh is used in balanced form with a limited number of octree cell patterns. The master element matrices of each pattern are pre-computed, while the storage of the individual element matrices is avoided, leading to a significant reduction in memory requirements, especially for large-scale models. Further, the advantages of octree cells are leveraged by an automatic mesh generation and local refinement process, which enables efficient pre-processing of models with complex geometries. To handle the matrix operations associated with large-scale simulation, a pattern-by-pattern (PBP) approach is proposed. In this technique, the octree patterns are exploited to recast a majority of the computational work into pattern-level dense matrix operations. This avoids global matrix assembly, allows better cache utilisation, and aids the associated memory-bandwidth-limited computations, resulting in significant performance gains in matrix operations. The PBP approach also supports large-scale parallelism. In this work, the parallel computation is carried out using the mesh-partitioning strategy and implemented using the message passing technique.
It is shown that the developed solvers can simulate large-scale and complex structural problems, e.g. delamination/fracture in sandwich panels with approximately a billion unknowns (or DOFs). Massive scaling can be achieved with more than ten thousand cores in a distributed computing environment, which reduces the computation time from months (on a single core) to a few minutes.

    Towards Cognition-Guided Patient-Specific Numerical Simulation for Cardiac Surgery Assistance

    Motivation. Patient-specific, knowledge-based, holistic surgical treatment planning is of utmost importance when dealing with complex surgery. Surgeons need to account for all available medical patient data, keep track of technical developments, and stay on top of current surgical expert knowledge to define a suitable surgical treatment strategy. There is large potential for computer assistance, in particular for surgery simulation, which gives surgeons the opportunity not only to plan but also to simulate some steps of an intervention and to forecast relevant surgical situations. Purpose. In this work, we particularly look at mitral valve reconstruction (MVR) surgery, which is to re-establish the functionality of an incompetent mitral valve (MV) through implantation of an artificial ring that reshapes the valvular morphology. We aim at supporting MVR by providing surgeons with biomechanical FEM-based MVR surgery simulations that enable them to assess the simulated behavior of the MV after an MVR. However, according to the above requirements, such surgery simulation is really beneficial to surgeons only if it is patient-specific, surgical expert knowledge-based, comprehensive in terms of the underlying model and the patient’s data, and if its setup and execution are fully automated and integrated into the surgical treatment workflow. Methods. This PhD work conducts research on simulation-enhanced, cognition-guided, patient-specific cardiac surgery assistance. First, we derive a biomechanical MV/MVR model and develop an FEM-based MVR surgery simulation using the FEM software toolkit HiFlow3. Next, we outline the functionality and features of the Medical Simulation Markup Language (MSML) and how it simplifies the biomechanical modeling workflow.
It is then detailed how, by means of the MSML and a set of dedicated MVR simulation reprocessing operators, patient-individual medical data can be comprehensively analyzed and processed for the fully automated setup of MVR simulation scenarios. Finally, the presented work is integrated into the cognitive system architecture of the joint research project Cognition-Guided Surgery. We particularly look at its semantic knowledge and data infrastructure as well as at the setup of its cognitive software components, which eventually facilitate cognition-guidance and patient-specificity for the overall simulation-enhanced MVR assistance pipeline. Results and Discussion. We have proposed and implemented, for the first time, a prototypic system for simulation-enhanced, cognition-guided, patient-specific cardiac surgery assistance. The overall system was evaluated in terms of functionality and performance. Through its cognitive, data-driven pipeline setup, medical patient data and surgical information are analyzed and processed comprehensively, efficiently and fully automatically, and the resulting simulation scenarios yield reliable, patient-specific MVR surgery simulation results. This indicates the system’s usability and applicability. The proposed work thus presents an important step towards simulation-enhanced, cognition-guided, patient-specific cardiac surgery assistance, and can, once operative, be expected to significantly enhance MVR surgery. In conclusion, we discuss possible further research topics and promising applications that build upon the presented work.