233 research outputs found

    Parallel computing in automation of decoupled fluid-thermostructural simulation approach

    The decoupling approach offers a novel alternative to the highly time-consuming coupled fluid-thermal-structural simulation procedures used to analyze thermal effects and the resulting displacements in machine tools. Using high-dimensional Characteristic Diagrams (CDs) together with a clustering algorithm that greatly reduces the data needed for training, a limited number of CFD simulations suffices to effectively decouple the fluid and thermal-structural simulations. This approach becomes especially valuable when complex geometries or moving components are considered. However, there is still scope for reducing the time needed to train CDs. Parallel computation can be exploited in the decoupling approach through the simultaneous execution of (i) the CFD simulations and data export, and (ii) the clustering technique, which combines a Genetic Algorithm with Radial Basis Function interpolation to cluster and optimize the training data for the CDs. Parallelization reduces the total computation time from several days to a few hours, thereby improving the efficiency and ease of use of the decoupling simulation approach.
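    The overlap of the two stages named above can be sketched with Python's standard library; `run_cfd_case` and `cluster_training_data` are hypothetical stand-ins for the CFD/export and GA + RBF clustering steps, not the authors' tools:

```python
# Illustrative sketch of overlapping the two stages of the decoupling
# workflow. Both worker functions below are hypothetical stand-ins.
from concurrent.futures import ThreadPoolExecutor

def run_cfd_case(case_id):
    # stand-in for one CFD simulation followed by data export
    return {"case": case_id, "temperature": 20.0 + case_id}

def cluster_training_data(samples):
    # stand-in for the clustering step that condenses the CD training data
    return sorted(set(samples))

def parallel_decoupled_workflow(case_ids, samples):
    with ThreadPoolExecutor() as pool:
        cfd_jobs = pool.map(run_cfd_case, case_ids)  # stage (i), in parallel
        clustered = cluster_training_data(samples)   # stage (ii), overlapped
        return list(cfd_jobs), clustered
```

A real deployment would replace the stand-ins with calls to the CFD solver and the clustering code, but the concurrency structure is the same.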

    Task-based Runtime Optimizations Towards High Performance Computing Applications

    The last decades have witnessed a rapid improvement of computational capabilities in high-performance computing (HPC) platforms thanks to hardware technology scaling. HPC architectures benefit from mainstream hardware advances: many-core systems, deep hierarchical memory subsystems, non-uniform memory access, and an ever-increasing gap between computational power and memory bandwidth. This has necessitated continuous adaptations across the software stack to maintain high hardware utilization. In this HPC landscape of potentially million-way parallelism, task-based programming models associated with dynamic runtime systems are becoming more popular; they foster developer productivity at extreme scale by abstracting the underlying hardware complexity. In this context, this dissertation highlights how a software bundle powered by a task-based programming model can address the heterogeneous workloads engendered by HPC applications; three are considered here: data redistribution, geostatistical modeling, and 3D unstructured mesh deformation. Data redistribution reshuffles data to optimize some objective for an algorithm; the objective can be multi-dimensional, such as improving computational load balance or decreasing communication volume or cost, with the ultimate goal of increasing efficiency and therefore reducing the time-to-solution. Geostatistical modeling, one of the prime motivating applications for exascale computing, is a technique for predicting desired quantities from geographically distributed data, based on statistical models and optimization of parameters. Meshing the deformable contour of moving 3D bodies is an expensive operation that poses major computational challenges in fluid-structure interaction (FSI) applications. 
Therefore, in this dissertation, Redistribute-PaRSEC, ExaGeoStat-PaRSEC and HiCMA-PaRSEC are proposed to efficiently tackle these HPC applications at extreme scale, respectively, and they are evaluated on multiple HPC clusters, including AMD-based, Intel-based, and Arm-based CPU systems and an IBM-based multi-GPU system. This multidisciplinary work emphasizes the need for runtime systems to go beyond their primary responsibility of task scheduling on massively parallel hardware in order to serve next-generation scientific applications.
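The task-based model that such runtimes implement can be illustrated with a toy dependency-driven executor built on Python's standard library; a production runtime such as PaRSEC would dispatch ready tasks to many workers concurrently rather than run them serially:

```python
# Toy illustration of a task-based programming model: tasks declare their
# dependencies and the "runtime" executes them in a valid order.
from graphlib import TopologicalSorter

def run_task_graph(tasks, deps):
    """Execute tasks once their dependencies are done.

    tasks: name -> callable taking the dict of results so far
    deps:  name -> iterable of prerequisite task names
    """
    results = {}
    # static_order yields one valid execution order; a real runtime would
    # instead track readiness dynamically and schedule tasks in parallel
    for name in TopologicalSorter(deps).static_order():
        results[name] = tasks[name](results)
    return results
```

The point of the abstraction is visible even in the toy: the program states only what depends on what, and scheduling decisions belong to the runtime.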

    Parallel optimization algorithms for high performance computing: application to thermal systems

    The need for optimization is present in every field of engineering. Moreover, applications requiring a multidisciplinary approach in order to make progress are increasing, which leads to the need to solve complex optimization problems that exceed the capacity of the human brain or intuition. A standard way of proceeding is to use evolutionary algorithms, among which genetic algorithms hold a prominent place. These are characterized by their robustness and versatility, as well as by their high computational cost and low convergence speed. Many optimization packages are available under free software licenses and are representative of the current state of the art in optimization technology. However, the ability of optimization algorithms to adapt to massively parallel computers while reaching satisfactory efficiency levels is still an open issue. Even packages suited for multilevel parallelism encounter difficulties when dealing with objective functions involving long and variable simulation times. This variability is common in Computational Fluid Dynamics and Heat Transfer (CFD & HT), nonlinear mechanics, etc., and is nowadays a dominant concern for large-scale applications. Current research on improving the performance of evolutionary algorithms is mainly focused on developing new search algorithms. Nevertheless, there is a vast body of well-performing sequential algorithms suitable for implementation on parallel computers; the gap to be covered is efficient parallelization. Moreover, advances in the research of new search algorithms and of efficient parallelization are additive, so the enhancement of current state-of-the-art optimization software can be accelerated if both fronts are tackled simultaneously. 
The motivation of this Doctoral Thesis is to take a step forward towards the successful integration of Optimization and High Performance Computing capabilities, which has the potential to boost technological development by providing better designs, shortening product development times and minimizing the required resources. After a thorough study of the state of the art of the mathematical optimization techniques available to date, a generic mathematical optimization tool has been developed, with a special focus on the application of the library to the field of Computational Fluid Dynamics and Heat Transfer (CFD & HT). Then the main shortcomings of the standard parallelization strategies available for genetic algorithms and similar population-based optimization methods have been analyzed. Computational load imbalance has been identified as the key factor degrading the optimization algorithm's scalability (i.e. parallel efficiency) whenever the average makespan of a batch of individuals is greater than the average time required by the optimizer for inter-processor communications. It occurs because processors are often unable to finish the evaluation of their queues of individuals simultaneously and need to be synchronized before the next batch of individuals is created. Consequently, the computational load imbalance is translated into idle time on some processors. Several load balancing algorithms have been proposed and exhaustively tested, and they are extendable to any other population-based optimization method that needs to synchronize all processors after the evaluation of each batch of individuals. 
Finally, a real-world engineering application consisting of the optimization of the refrigeration system of a power electronic device is presented as an illustrative example in which the use of the proposed load balancing algorithms reduces the simulation time required by the optimization tool.
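The synchronization-induced idle time described above can be made concrete with a toy makespan model: static assignment suffers when one individual is much more expensive to evaluate, while a dynamic, queue-like assignment evens out the load. The greedy longest-first heuristic below is a simple stand-in for a shared work queue, not one of the thesis's algorithms:

```python
# Toy comparison of static vs. dynamic assignment of individuals whose
# evaluation times vary (makespan = time until the slowest worker finishes).
def makespan_static(times, n_workers):
    # round-robin static assignment: each worker keeps its fixed chunk
    loads = [0.0] * n_workers
    for i, t in enumerate(times):
        loads[i % n_workers] += t
    return max(loads)

def makespan_dynamic(times, n_workers):
    # greedy longest-first assignment to the least-loaded worker,
    # a crude model of workers pulling individuals from a shared queue
    loads = [0.0] * n_workers
    for t in sorted(times, reverse=True):
        loads[loads.index(min(loads))] += t
    return max(loads)
```

With one individual costing 8 time units among eight costing 1, two statically loaded workers finish at very different times, whereas the dynamic policy balances both workers.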

    Estimation of Execution Parameters for k-Wave Simulations

    Estimation of execution parameters takes centre stage in the automatic offloading of complex biomedical workflows to cloud and high-performance facilities. Since ordinary users have no or very limited knowledge of the performance characteristics of particular tasks in the workflow, the scheduling system has to be able to select an appropriate amount of compute resources, e.g., compute nodes, GPUs, or processor cores, and to estimate the execution time and cost. The presented approach considers a fixed set of executables that can be used to create custom workflows, and collects performance data of successfully computed tasks. Since the workflows may differ in the structure and size of the input data, the execution parameters can only be obtained by searching the performance database and interpolating between similar tasks. This paper shows it is possible to predict the execution time and cost with high confidence. If the task parameters are found in the performance database, the mean interpolation error stays below 2.29%. If only similar tasks are found, the mean interpolation error may grow to 15%. Nevertheless, this is still an acceptable error, since cluster performance itself may vary on the order of a few percent.
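The database lookup described above can be sketched as a one-dimensional interpolation; the real scheduler matches on richer task parameters, so the size-keyed database here is an assumption for illustration:

```python
# Illustrative sketch: estimate run time by linear interpolation between the
# closest entries of a performance database keyed on problem size.
import bisect

def estimate_runtime(db, size):
    """db: list of (problem_size, measured_seconds) tuples, sorted by size."""
    sizes = [s for s, _ in db]
    i = bisect.bisect_left(sizes, size)
    if i < len(db) and db[i][0] == size:
        return db[i][1]                      # exact hit in the database
    if i == 0:
        return db[0][1]                      # below the recorded range
    if i == len(db):
        return db[-1][1]                     # above the recorded range
    (s0, t0), (s1, t1) = db[i - 1], db[i]    # interpolate between neighbours
    return t0 + (t1 - t0) * (size - s0) / (s1 - s0)
```

An exact hit corresponds to the sub-2.29% error case reported above; interpolation between merely similar tasks corresponds to the larger-error case.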

    Computational methods for wave-structure interaction: Numerical analysis of a RBF-based method


    Elucidating the Constitutive Relationship of Calcium-Silicate-Hydrate Gel Using High Throughput Reactive Molecular Simulations and Machine Learning

    Prediction of material behavior using machine learning (ML) requires consistent, accurate, and representative large datasets for training. However, such consistent and reliable experimental datasets are not always available for materials. To address this challenge, we synergistically integrate ML with high-throughput reactive molecular dynamics (MD) simulations to elucidate the constitutive relationship of calcium–silicate–hydrate (C–S–H) gel—the primary binding phase in concrete formed via the hydration of ordinary Portland cement. Specifically, a highly consistent dataset on the nine elastic constants of more than 300 compositions of C–S–H gel is developed using high-throughput reactive simulations. From a comparative analysis of various ML algorithms including neural networks (NN) and Gaussian processes (GP), we observe that NN provides excellent predictions. To interpret the predictions of the NN, we employ SHapley Additive exPlanations (SHAP), which reveals that the influence of the silicate network on all the elastic constants of C–S–H is significantly higher than that of the water and CaO content. Additionally, the water content is found to have a more prominent influence on the shear components than on the normal components along the direction of the interlayer spaces within C–S–H. This result suggests that the in-plane elastic response is controlled by water molecules, whereas the transverse response is mainly governed by the silicate network. Overall, by seamlessly integrating MD simulations with ML, this paper can be used as a starting point toward the accelerated optimization of C–S–H nanostructures to design efficient cementitious binders with targeted properties.
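As a heavily simplified stand-in for the NN + SHAP pipeline (reproducing the paper's results would require its dataset and models), the sketch below fits a two-feature least-squares model and reads coefficient magnitudes as a crude importance proxy; all feature names and numbers are hypothetical:

```python
# Crude illustration of "fit a surrogate, then rank feature influence".
# Features could stand for e.g. silicate-network content and water content;
# the data in the test is made up and carries no physical meaning.
def fit_two_feature_ols(xs, ys):
    """Least-squares fit of y ~ w1*x1 + w2*x2 (no intercept) via the
    2x2 normal equations X^T X w = X^T y, solved in closed form."""
    a11 = sum(x[0] * x[0] for x in xs)
    a12 = sum(x[0] * x[1] for x in xs)
    a22 = sum(x[1] * x[1] for x in xs)
    b1 = sum(x[0] * y for x, y in zip(xs, ys))
    b2 = sum(x[1] * y for x, y in zip(xs, ys))
    det = a11 * a22 - a12 * a12
    w1 = (a22 * b1 - a12 * b2) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2   # larger |w| = stronger (linear) influence on the target
```

SHAP generalizes this idea to nonlinear models by attributing each prediction to feature contributions; the linear coefficients here are only the simplest analogue.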


    Immiscible thermo-viscous fingering in Hele-Shaw cells

    We investigate immiscible radial displacement in a Hele-Shaw cell with a temperature-dependent viscosity using two coupled high-resolution numerical methods. Thermal gradients created in the domain through the injection of a low-viscosity fluid at a different temperature to the resident high-viscosity fluid can lead to the formation of unstable thermo-viscous fingers, which we explore in the context of immiscible flows. The transient, multi-zone heat transfer is evaluated using a newly developed auxiliary radial basis function-finite collocation (RBF-FC) method, which locally captures the variation in flux and field variable over the moving interface without the need for ghost-node extrapolation. The viscosity couples the transient heat transfer to the Darcy pressure/velocity field, which is solved using a boundary element-RBF-FC method, providing an accurate and robust interface-tracking scheme for the full thermo-viscous problem. We explore the thermo-viscous problem space using systematic numerical experiments, revealing that the early-stage finger growth is controlled by the pressure gradient induced by the varying temperature and mobility field. In hot injection regimes, negative temperature gradients normal to the interface act to accelerate the interface, promoting finger bifurcation and enhancing the viscous fingering instability. Correspondingly, cold injection regimes stabilise the flow compared to isothermal cases, hindering finger formation. The interfacial mobility distribution controls the late-stage bifurcation mode, with non-uniformities induced by the thermal diffusivity creating alternate bifurcation modes. Further numerical experiments reveal the neutral stability of the thermal effects on the fingering evolution, with classical viscous fingering dynamics eventually dominating the evolution. We conclude the paper with a mechanistic summary of the immiscible thermo-viscous fingering regime, providing the first detailed analysis of the thermal problem in immiscible flows.
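The collocation idea at the core of RBF-FC methods can be illustrated in one dimension: interpolate scattered data with radial basis functions by solving the small dense collocation system. This is a generic sketch of Gaussian-RBF interpolation, not the authors' finite-collocation scheme:

```python
# Minimal 1D radial basis function collocation: build the dense matrix of
# pairwise basis evaluations, solve for the weights, return the interpolant.
import math

def solve(A, b):
    # naive Gaussian elimination with partial pivoting (fine for tiny systems)
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def rbf_interpolant(centers, values, shape=1.0):
    phi = lambda r: math.exp(-((shape * r) ** 2))   # Gaussian basis
    A = [[phi(abs(ci - cj)) for cj in centers] for ci in centers]
    w = solve(A, values)
    return lambda x: sum(wi * phi(abs(x - ci)) for wi, ci in zip(w, centers))
```

Local RBF-FC schemes apply the same construction on small overlapping stencils, which is what lets them capture fluxes and field variables near a moving interface.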

    Software for Exascale Computing - SPPEXA 2016-2019

    This open access book summarizes the research done and the results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG), presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer's series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA's first funding phase, and provides an overview of SPPEXA's contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

    A multi-scale method for complex flows of non-Newtonian fluids

    We introduce a new heterogeneous multi-scale method for the simulation of flows of non-Newtonian fluids in general geometries and present its application to paradigmatic two-dimensional flows of polymeric fluids. Our method combines micro-scale data from non-equilibrium molecular dynamics (NEMD) with macro-scale continuum equations to achieve a data-driven prediction of complex flows. At the continuum level, the method is model-free, since the Cauchy stress tensor is determined locally in space and time from NEMD data. The modelling effort is thus limited to the identification of suitable interaction potentials at the micro-scale. Compared to previous proposals, our approach takes into account the fact that the material response can depend strongly on the local flow type, and we show that this is a necessary feature to correctly capture the macroscopic dynamics. In particular, we highlight the importance of extensional rheology in simulating generic flows of polymeric fluids.
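The coupling loop of a heterogeneous multi-scale method can be sketched schematically: the macroscopic update queries a micro-scale model for the local stress instead of using a closed-form constitutive law. Here the "micro model" is a toy linear (Newtonian) response standing in for NEMD, and the 1D momentum update is purely illustrative:

```python
# Schematic heterogeneous multi-scale loop: the macro solver has no
# constitutive law of its own and asks the micro model for each local stress.
def micro_stress(strain_rate):
    # stand-in for a NEMD estimate of the stress; a real micro model would
    # run a particle simulation driven at this local strain rate
    viscosity = 2.0   # toy value
    return viscosity * strain_rate

def macro_step(velocities, dx, dt, stress_model):
    # 1D toy momentum update: du/dt = d(stress)/dx on a uniform grid,
    # with stresses supplied cell-by-cell from the micro-scale model
    rates = [(velocities[i + 1] - velocities[i]) / dx
             for i in range(len(velocities) - 1)]
    stresses = [stress_model(r) for r in rates]   # micro-scale queries
    new = velocities[:]
    for i in range(1, len(velocities) - 1):
        new[i] += dt * (stresses[i] - stresses[i - 1]) / dx
    return new
```

The flow-type dependence emphasized in the abstract enters precisely at the `stress_model` call: a NEMD-backed model can return different responses for shear-dominated and extension-dominated local flows, which a fixed closed-form law cannot.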