
1,571 research outputs found

Hybrid CPU/GPU implementation for the FE2 multi-scale method for composite problems

Author: Giuntoli Guido
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2020

This thesis aims to develop a High-Performance Computing implementation to solve large composite-material problems with the FE2 multi-scale method. Previous works have not been able to scale the FE2 strategy to real-size problems with mesh resolutions of more than 10K elements at the macro-scale and 100^3 elements at the micro-scale, because of the computational requirements of these calculations. This work identifies the most computationally intensive parts of the FE2 algorithm and ports several parts of the micro-scale computations to GPUs. The cases considered assume small deformations and steady-state equilibrium conditions. The work provides a feasible parallel strategy that can be used in real engineering cases to optimize the design of composite-material structures. To this end, it presents a coupling scheme between the MPI multi-physics code Alya (macro-scale) and the CPU/GPU-accelerated code Micropp (micro-scale). The coupled system is designed to work on multi-GPU architectures and to exploit GPU overloading. In addition, a Multi-Zone coupling methodology combined with weighted partitioning is proposed to reduce the computational cost and to solve the load-balancing problem. The thesis demonstrates that the proposed method scales notably well for the target problems, especially on hybrid architectures with distributed CPU nodes connected to multiple GPUs. Moreover, it clarifies the advantages of the CPU/GPU-accelerated version with respect to the pure CPU approach.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

High-performance and hardware-aware computing: proceedings of the second International Workshop on New Frontiers in High-performance and Hardware-aware Computing (HipHaC'11), San Antonio, Texas, USA, February 2011 (in conjunction with HPCA-17)

Author: Buchty Rainer
Weiß Jan-Philipp
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2011

High-performance system architectures are increasingly exploiting heterogeneity. The HipHaC workshop aims at combining new aspects of parallel, heterogeneous, and reconfigurable microprocessor technologies with concepts of high-performance computing and, particularly, numerical solution methods. Compute- and memory-intensive applications can only benefit from the full hardware potential if all features on all levels are taken into account in a holistic approach.

KITopen

HPC-enabling technologies for high-fidelity combustion simulations

Author: Borrell Pol Ricard
Houzeaux Guillaume
Mira Daniel
Pérez Sánchez Eduardo J.
Publication venue: Elsevier BV
Publication date: 01/01/2022

With the increase in computational power in the last decade and the forthcoming Exascale supercomputers, a new horizon in computational modelling and simulation is envisioned in combustion science. Given the multiscale and multiphysics characteristics of turbulent reacting flows, combustion simulations are among the most computationally demanding applications running on cutting-edge supercomputers. Exascale computing opens new frontiers for the simulation of combustion systems, as more realistic conditions can be achieved with high-fidelity methods. However, efficient use of these computing architectures requires methodologies that can exploit all levels of parallelism. The efficient utilization of the next generation of supercomputers needs to be considered from a global perspective, that is, involving physical modelling and numerical methods with methodologies based on High-Performance Computing (HPC) and hardware architectures. This review introduces recent developments in numerical methods for large-eddy simulations (LES) and direct numerical simulations (DNS) of combustion systems, with a focus on computational performance and algorithmic capabilities. Because of the broad scope, a first section describes the fundamentals of turbulent combustion and is followed by a general description of state-of-the-art computational strategies for solving these problems. These applications require advanced HPC approaches to exploit modern supercomputers, which are addressed in the third section. The increasing complexity of new computing architectures, with tightly coupled CPUs and GPUs as well as high levels of parallelism, requires new parallel models and algorithms that expose the required level of concurrency. Advances in dynamic load balancing, vectorization, GPU acceleration and mesh adaptation have made it possible to achieve highly efficient combustion simulations with data-driven methods in HPC environments. Dedicated sections therefore cover the use of high-order methods for reacting flows, the integration of detailed chemistry, and two-phase flows. Final remarks and directions of future work are given at the end.
The research leading to these results has received funding from the European Union's Horizon 2020 Programme under the CoEC project, grant agreement No. 952181, and the CoE RAISE project, grant agreement No. 951733. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

A hybrid parallel framework for computational solid mechanics

Author: Fidkowski Piotr
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2011

Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 95-98).
A novel, hybrid parallel C++ framework for computational solid mechanics is developed and presented. The modular and extensible design of this framework allows it to support a wide variety of numerical schemes, including discontinuous Galerkin formulations and higher-order methods, multiphysics problems, hybrid meshes made of different types of elements, and a number of different linear and non-linear solvers. In addition, native, seamless support is included for hardware acceleration by Graphics Processing Units (GPUs) via NVIDIA's CUDA architecture, for both single-GPU workstations and heterogeneous clusters of GPUs. The capabilities of the framework are demonstrated through a series of sample problems, including a laser-induced cylindrical shock propagation, a dynamic problem involving a micro-truss array made of millions of elements, and a tension problem involving a shape memory alloy with a multifield formulation to model the superelastic effect.
by Piotr Fidkowski. S.M.

DSpace@MIT

Molecular dynamics simulation: a tool for exploration and discovery using simple models

Author: Rapaport D. C.
Publication venue: IOP Publishing
Publication date: 01/01/2014

Emergent phenomena share the fascinating property of not being obvious consequences of the design of the system in which they appear. This characteristic is no less relevant when attempting to simulate such phenomena, given that the outcome is not always a foregone conclusion. The present survey focuses on several simple model systems that exhibit surprisingly rich emergent behavior, all studied by MD simulation. The examples are taken from the disparate fields of fluid dynamics, granular matter and supramolecular self-assembly. In studies of fluids modeled at the detailed microscopic level using discrete particles, the simulations demonstrate that complex hydrodynamic phenomena in rotating and convecting fluids, the Taylor-Couette and Rayleigh-Bénard instabilities, can not only be observed within the limited length and time scales accessible to MD, but can even be reproduced with quantitative agreement. Simulation of highly counterintuitive segregation phenomena in granular mixtures, again using MD methods, but now augmented by forces producing damping and friction, leads to results that resemble experimentally observed axial and radial segregation in the case of a rotating cylinder, and to a novel form of horizontal segregation in a vertically vibrated layer. Finally, when modeling self-assembly processes analogous to the formation of the polyhedral shells that package spherical viruses, simulation of suitably shaped particles reveals the ability to produce complete, error-free assembly, and leads to the important general observation that reversible growth steps contribute to the high yield. While there are limitations to the MD approach, both computational and conceptual, the results offer a tantalizing hint of the kinds of phenomena that can be explored, and what might be discovered when sufficient resources are brought to bear on a problem.
Comment: 21 pages, 20 figures (v2 - minor text addition)

arXiv.org e-Print Archive

CiteSeerX