Search CORE

88 research outputs found

Compilation of Abstracts for SC12 Conference Proceedings

Author: Morello Gina Francine
Publication venue
Publication date
Field of study

1 A Breakthrough in Rotorcraft Prediction Accuracy Using Detached Eddy Simulation; 2 Adjoint-Based Design for Complex Aerospace Configurations; 3 Simulating Hypersonic Turbulent Combustion for Future Aircraft; 4 From a Roar to a Whisper: Making Modern Aircraft Quieter; 5 Modeling of Extended Formation Flight on High-Performance Computers; 6 Supersonic Retropropulsion for Mars Entry; 7 Validating Water Spray Simulation Models for the SLS Launch Environment; 8 Simulating Moving Valves for Space Launch System Liquid Engines; 9 Innovative Simulations for Modeling the SLS Solid Rocket Booster Ignition; 10 Solid Rocket Booster Ignition Overpressure Simulations for the Space Launch System; 11 CFD Simulations to Support the Next Generation of Launch Pads; 12 Modeling and Simulation Support for NASA's Next-Generation Space Launch System; 13 Simulating Planetary Entry Environments for Space Exploration Vehicles; 14 NASA Center for Climate Simulation Highlights; 15 Ultrascale Climate Data Visualization and Analysis; 16 NASA Climate Simulations and Observations for the IPCC and Beyond; 17 Next-Generation Climate Data Services: MERRA Analytics; 18 Recent Advances in High-Resolution Global Atmospheric Modeling; 19 Causes and Consequences of Turbulence in the Earths Protective Shield; 20 NASA Earth Exchange (NEX): A Collaborative Supercomputing Platform; 21 Powering Deep Space Missions: Thermoelectric Properties of Complex Materials; 22 Meeting NASA's High-End Computing Goals Through Innovation; 23 Continuous Enhancements to the Pleiades Supercomputer for Maximum Uptime; 24 Live Demonstrations of 100-Gbps File Transfers Across LANs and WANs; 25 Untangling the Computing Landscape for Climate Simulations; 26 Simulating Galaxies and the Universe; 27 The Mysterious Origin of Stellar Masses; 28 Hot-Plasma Geysers on the Sun; 29 Turbulent Life of Kepler Stars; 30 Modeling Weather on the Sun; 31 Weather on Mars: The Meteorology of Gale Crater; 32 Enhancing Performance of NASAs High-End Computing Applications; 33 Designing Curiosity's Perfect Landing on Mars; 34 The Search Continues: Kepler's Quest for Habitable Earth-Sized Planets

NASA Technical Reports Server

Doctor of Philosophy

Author: Meng Qingyu
Publication venue: University of Utah
Publication date: 01/08/2014
Field of study

dissertationRecent trends in high performance computing present larger and more diverse computers using multicore nodes possibly with accelerators and/or coprocessors and reduced memory. These changes pose formidable challenges for applications code to attain scalability. Software frameworks that execute machine-independent applications code using a runtime system that shields users from architectural complexities oer a portable solution for easy programming. The Uintah framework, for example, solves a broad class of large-scale problems on structured adaptive grids using fluid-flow solvers coupled with particle-based solids methods. However, the original Uintah code had limited scalability as tasks were run in a predefined order based solely on static analysis of the task graph and used only message passing interface (MPI) for parallelism. By using a new hybrid multithread and MPI runtime system, this research has made it possible for Uintah to scale to 700K central processing unit (CPU) cores when solving challenging fluid-structure interaction problems. Those problems often involve moving objects with adaptive mesh refinement and thus with highly variable and unpredictable work patterns. This research has also demonstrated an ability to run capability jobs on the heterogeneous systems with Nvidia graphics processing unit (GPU) accelerators or Intel Xeon Phi coprocessors. The new runtime system for Uintah executes directed acyclic graphs of computational tasks with a scalable asynchronous and dynamic runtime system for multicore CPUs and/or accelerators/coprocessors on a node. Uintah's clear separation between application and runtime code has led to scalability increases without significant changes to application code. This research concludes that the adaptive directed acyclic graph (DAG)-based approach provides a very powerful abstraction for solving challenging multiscale multiphysics engineering problems. Excellent scalability with regard to the different processors and communications performance are achieved on some of the largest and most powerful computers available today

The University of Utah: J. Willard Marriott Digital Library

Modeling Energy Consumption of High-Performance Applications on Heterogeneous Computing Platforms

Author: Lawson Gary D., Jr.
Publication venue: ODU Digital Commons
Publication date: 01/10/2017
Field of study

Achieving Exascale computing is one of the current leading challenges in High Performance Computing (HPC). Obtaining this next level of performance will allow more complex simulations to be run on larger datasets and offer researchers better tools for data processing and analysis. In the dawn of Big Data, the need for supercomputers will only increase. However, these systems are costly to maintain because power is expensive. Thus, a better understanding of power and energy consumption is required such that future hardware can benefit. Available power models accurately capture the relationship to the number of cores and clock-rate, however the relationship between workload and power is less understood. Thus, investigation and analysis of power measurements has been a focal point in this work with the aim to improve the general understanding of energy consumption in the context of HPC. This dissertation investigates power and energy consumption of many different parallel applications on several hardware platforms while varying a number of execution characteristics. Multicore and manycore hardware devices are investigated in homogeneous and heterogeneous computing environments. Further, common techniques for reducing power and energy consumption are employed to each of these devices. Well-known power and performance models have been combined to form the Execution-Phase model, which may be used to quantify energy contributions based on execution phase and has been used to predict energy consumption to within 10%. However, due to limitations in the measurement procedure, a less intrusive approach is required. The Empirical Mode Decomposition (EMD) and Hilbert-Huang Transform analysis technique has been applied in innovative ways to model, analyze, and visualize power and energy measurements. EMD is widely used in other research areas, including earthquake, brain-wave, speech recognition, and sea-level rise analysis and this is the first it has been applied to power traces to analyze the complex interactions occurring within HPC systems. Probability distributions may be used to represent power and energy traces, thereby providing an alternative means of predicting energy consumption while retaining the fact that power is not constant over time. Further, these distributions may be used to define the cost of a workload for a given computing platform

Old Dominion University

Fast algorithm for real-time rings reconstruction

Author: Ammendola R.
Bauce Matteo
Biagioni A.
Capuani S.
Chiozzi Stefano
Cotta Ramusino Angelo
Di Domenico Giovanni
Fantechi R.
Fiorini Massimiliano
Giagu S.
Gianoli Alberto
Graverini E.
Lamanna Gianluca
Lonardo A.
Messina A.
Neri Ilaria
Palombo Marco
Pantaleo F.
Paolucci P.S.
Piandani R.
Pontisso L.
Rescigno M.
Simula F.
Sozzi Marco
Vicini P.
Publication venue: Verlag Deutsches Elektronen-Synchrotron
Publication date: 01/01/2015
Field of study

The GAP project is dedicated to study the application of GPU in several contexts in which real-time response is important to take decisions. The definition of real-time depends on the application under study, ranging from answer time of μs up to several hours in case of very computing intensive task. During this conference we presented our work in low level triggers [1] [2] and high level triggers [3] in high energy physics experiments, and specific application for nuclear magnetic resonance (NMR) [4] [5] and cone-beam CT [6]. Apart from the study of dedicated solution to decrease the latency due to data transport and preparation, the computing algorithms play an essential role in any GPU application. In this contribution, we show an original algorithm developed for triggers application, to accelerate the ring reconstruction in RICH detector when it is not possible to have seeds for reconstruction from external trackers

DESY Publication Database

DESY

Archivio istituzionale della ricerca - Università di Ferrara

Archivio della ricerca- Università di Roma La Sapienza

CERN Document Server

Accelerating Dynamical Density Response Code on Summit and Its Application for Computing the Density Response Function of Vanadium Sesquioxide

Author: Phan Wileam Y
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2021
Field of study

This thesis details the process of porting the Eguiluz group dynamical density response computational platform to the hybrid CPU+GPU environment at the Summit supercomputer at Oak Ridge National Laboratory (ORNL) Leadership Computing Center. The baseline CPU-only version is a Gordon Bell-winning platform within the formally-exact time-dependent density functional theory (TD-DFT) framework using the linearly augmented plane wave (LAPW) basis set. The code is accelerated using a combination of the OpenACC programming model and GPU libraries -- namely, the Matrix Algebra for GPU and Multicore Architectures (MAGMA) library -- as well as exploiting the sparsity pattern of the matrices involved in the matrix-matrix multiplication. Benchmarks show a 12.3x speedup compared to the CPU-only version. This performance boost should accelerate discovery in material and condensed matter physics through computational means. After the hybrid CPU+GPU code has been sufficiently optimized, it is used to study the dynamical density response function of vanadium sesquioxide, and the results are compared with spectroscopic data from non-resonant inelastic X-ray scattering {NIXS} experiments

University of Tennessee, Knoxville: Trace

Ubiquitous supercomputing : design and development of enabling technologies for multi-robot systems rethinking supercomputing

Author: Camargo Forero Leonardo
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2019
Field of study

Supercomputing, also known as High Performance Computing (HPC), is almost everywhere (ubiquitous), from the small widget in your phone telling you that today will be a sunny day, up to the next great contribution to the understanding of the origins of the universe.However, there is a field where supercomputing has been only slightly explored - robotics. Other than attempts to optimize complex robotics tasks, the two forces lack an effective alignment and a purposeful long-term contract. With advancements in miniaturization, communications and the appearance of powerful, energy and weight optimized embedded computing boards, a next logical transition corresponds to the creation of clusters of robots, a set of robotic entities that behave similarly as a supercomputer does. Yet, there is key aspect regarding our current understanding of what supercomputing means, or is useful for, that this work aims to redefine. For decades, supercomputing has been solely intended as a computing efficiency mechanism i.e. decreasing the computing time for complex tasks. While such train of thought have led to countless findings, supercomputing is more than that, because in order to provide the capacity of solving most problems quickly, another complete set of features must be provided, a set of features that can also be exploited in contexts such as robotics and that ultimately transform a set of independent entities into a cohesive unit.This thesis aims at rethinking what supercomputing means and to devise strategies to effectively set its inclusion within the robotics realm, contributing therefore to the ubiquity of supercomputing, the first main ideal of this work. With this in mind, a state of the art concerning previous attempts to mix robotics and HPC will be outlined, followed by the proposal of High Performance Robotic Computing (HPRC), a new concept mapping supercomputing to the nuances of multi-robot systems. HPRC can be thought as supercomputing in the edge and while this approach will provide all kind of advantages, in certain applications it might not be enough since interaction with external infrastructures will be required or desired. To facilitate such interaction, this thesis proposes the concept of ubiquitous supercomputing as the union of HPC, HPRC and two more type of entities, computing-less devices (e.g. sensor networks, etc.) and humans.The results of this thesis include the ubiquitous supercomputing ontology and an enabling technology depicted as The ARCHADE. The technology serves as a middleware between a mission and a supercomputing infrastructure and as a framework to facilitate the execution of any type of mission, i.e. precision agriculture, entertainment, inspection and monitoring, etc. Furthermore, the results of the execution of a set of missions are discussed.By integrating supercomputing and robotics, a second ideal is targeted, ubiquitous robotics, i.e. the use of robots in all kind of applications. Correspondingly, a review of existing ubiquitous robotics frameworks is presented and based upon its conclusions, The ARCHADE's design and development have followed the guidelines for current and future solutions. Furthermore, The ARCHADE is based on a rethought supercomputing where performance is not the only feature to be provided by ubiquitous supercomputing systems. However, performance indicators will be discussed, along with those related to other supercomputing features.Supercomputing has been an excellent ally for scientific exploration and not so long ago for commercial activities, leading to all kind of improvements in our lives, in our society and in our future. With the results of this thesis, the joining of two fields, two forces previously disconnected because of their philosophical approaches and their divergent backgrounds, holds enormous potential to open up our imagination for all kind of new applications and for a world where robotics and supercomputing are everywhere.La supercomputación, también conocida como Computación de Alto Rendimiento (HPC por sus siglas en inglés) puede encontrarse en casi cualquier lugar (ubicua), desde el widget en tu teléfono diciéndote que hoy será un día soleado, hasta la siguiente gran contribución al entendimiento de los orígenes del universo. Sin embargo, hay un campo en el que ha sido poco explorada - la robótica. Más allá de intentos de optimizar tareas robóticas complejas, las dos fuerzas carecen de un contrato a largo plazo. Dado los avances en miniaturización, comunicaciones y la aparición de potentes computadores embebidos, optimizados en peso y energía, la siguiente transición corresponde a la creación de un cluster de robots, un conjunto de robots que se comportan de manera similar a un supercomputador. No obstante, hay un aspecto clave, con respecto a la comprensión de la supercomputación, que esta tesis pretende redefinir. Durante décadas, la supercomputación ha sido entendida como un mecanismo de eficiencia computacional, es decir para reducir el tiempo de computación de ciertos problemas extremadamente complejos. Si bien este enfoque ha conducido a innumerables hallazgos, la supercomputación es más que eso, porque para proporcionar la capacidad de resolver todo tipo de problemas rápidamente, se debe proporcionar otro conjunto de características que también pueden ser explotadas en la robótica y que transforman un conjunto de robots en una unidad cohesiva. Esta tesis pretende repensar lo que significa la supercomputación y diseñar estrategias para establecer su inclusión dentro del mundo de la robótica, contribuyendo así a su ubicuidad, el principal ideal de este trabajo. Con esto en mente, se presentará un estado del arte relacionado con intentos anteriores de mezclar robótica y HPC, seguido de la propuesta de Computación Robótica de Alto Rendimiento (HPRC, por sus siglas en inglés), un nuevo concepto, que mapea la supercomputación a los matices específicos de los sistemas multi-robot. HPRC puede pensarse como supercomputación en el borde y si bien este enfoque proporcionará todo tipo de ventajas, ciertas aplicaciones requerirán una interacción con infraestructuras externas. Para facilitar dicha interacción, esta tesis propone el concepto de supercomputación ubicua como la unión de HPC, HPRC y dos tipos más de entidades, dispositivos sin computación embebida y seres humanos. Los resultados de esta tesis incluyen la ontología de la supercomputación ubicua y una tecnología llamada The ARCHADE. La tecnología actúa como middleware entre una misión y una infraestructura de supercomputación y como framework para facilitar la ejecución de cualquier tipo de misión, por ejemplo, agricultura de precisión, inspección y monitoreo, etc. Al integrar la supercomputación y la robótica, se busca un segundo ideal, robótica ubicua, es decir el uso de robots en todo tipo de aplicaciones. Correspondientemente, una revisión de frameworks existentes relacionados serán discutidos. El diseño y desarrollo de The ARCHADE ha seguido las pautas y sugerencias encontradas en dicha revisión. Además, The ARCHADE se basa en una supercomputación repensada donde la eficiencia computacional no es la única característica proporcionada a sistemas basados en la tecnología. Sin embargo, se analizarán indicadores de eficiencia computacional, junto con otros indicadores relacionados con otras características de la supercomputación. La supercomputación ha sido un excelente aliado para la exploración científica, conduciendo a todo tipo de mejoras en nuestras vidas, nuestra sociedad y nuestro futuro. Con los resultados de esta tesis, la unión de dos campos, dos fuerzas previamente desconectadas debido a sus enfoques filosóficos y sus antecedentes divergentes, tiene un enorme potencial para abrir nuestra imaginación hacia todo tipo de aplicaciones nuevas y para un mundo donde la robótica y la supercomputación estén en todos lado

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Ubiquitous supercomputing : design and development of enabling technologies for multi-robot systems rethinking supercomputing

Author: Camargo Forero Leonardo
Publication venue: Universitat Politècnica de Catalunya
Publication date: 25/10/2019
Field of study

UPCommons. Portal del coneixement obert de la UPC

High performance scientific computing in applications with direct finite element simulation

Author: Krishnasamy Ezhilmathi
Publication venue
Publication date: 16/07/2020
Field of study

xiii, 133 p.La predicción del flujo separado, incluida la pérdida de un avión completo mediantela dinámica de fluidos computacional (CFD) se considera uno de los grandes desaf¿¿os que seresolverán en 2030, según NASA. Las ecuaciones no lineales de Navier-Stokes proporcionan laformulación matemática para flujo de fluidos en espacios tridimensionales. Sin embargo, todaviafaltan soluciones clásicas, existencia y singularidad. Ya que el cálculo de la fuerza bruta esintratable para realizar simulación predictiva para un avión completo, uno puede usar la simulaciónnumérica directa (DNS); sin embargo, prohibitivamente caro ya que necesita resolver laturbulencia a escala de magnitud Re power (9/4). Considerando otros métodos como el estad¿¿sticopromedio Reynolds¿s Average Navier Stokes (RANS), spatial average Large Eddy Simulation(LES), y Hybrid Detached Eddy Simulation (DES), que requieren menos cantidad de grados delibertad. Todos estos métodos deben ajustarse a los problemas de referencia y, además, cerca las paredes, la malla tieneque ser muy fina para resolver las capas l¿¿mite (lo cual significa que el costo computacional es muycostoso). Por encima de todo, los resultados son sensibles a, por ejemplo, parámetros expl¿¿citos enel método, la malla, etc.Como una solución al desaf¿¿o, aqu¿¿ presentamos la adaptación Metodolog¿¿a de solución directa deFEM (DFS) con resolución numérica disparo, como una familia predictiva, libre de parámetros demétodos para flujo turbulento. Resolvimos el modelo de avión JAXA Standard Model (JSM) ennúmero realista de Reynolds, presentado como parte del High Lift Taller de predicción 3.Predijimos un aumento de Cl dentro de un error de 5 % vs experimento, arrastre Cd dentro de 10 %error y detenga 1 ¿ dentro del ángulo de ataque.El taller identificó un probable experimento error depedido 10 % para los resultados de arrastre. La simulación es 10 veces más rápido y más barato encomparación con CFD tradicional o existente enfoques. La eficiencia proviene principalmente dell¿¿mite de deslizamiento condición que permite mallas gruesas cerca de las paredes, orientada aobjetivos control de error adaptativo que refina la malla solo donde es necesario y grandes pasos detiempo utilizando un método de iteración de punto fijo tipo Schur, sin comprometer la precisión delos resultados de la simulación.También presentamos una generalización de DFS a densidad variable y validado contra el problemade referencia MARIN bien establecido. los Los resultados muestran un buen acuerdo con losresultados experimentales en forma de sensores de presión. Más tarde, usamos esta metodolog¿¿apara resolver dos aplicaciones en problemas de flujo multifásico. Uno tiene que ver con un flashtanque de almacenamiento de agua de lluvia (consorcio de agua de Bilbao), y el segundo es sobre eldiseño de una boquilla para impresión 3D. En el agua de lluvia tanque de almacenamiento,predijimos que la altura del agua en el tanque tiene un influencia significativa sobre cómo secomporta el flujo aguas abajo de la puerta del tanque (válvula). Para la impresión 3D,desarrollamos un diseño eficiente con El flujo de chorro enfocado para evitar la oxidación y elcalentamiento en la punta del boquilla durante un proceso de fusión.Finalmente, presentamos aqu¿¿ el paralelismo en múltiples GPU y el incrustado sistema dearquitectura Kalray. Casi todas las supercomputadoras de hoy tienen arquitecturas heterogéneas,1 See the UNESCO Internacional Standard nomenclature for fields of Science and Technologyacomo CPU+GPU u otros aceleradores, y, por lo tanto, es esencial desarrollar marcoscomputacionales para aprovecha de ellos. Como lo hemos visto antes, se comienza a desarrollar eseCFD más tarde en la década de 1060 cuando podemos tener poder computacional, por lo tanto, Esesencial utilizar y probar estos aceleradores para los cálculos de CFD. Las GPU tienen unaarquitectura diferente en comparación con las CPU tradicionales. Técnicamente, la GPU tienemuchos núcleos en comparación con las CPU que hacen de la GPU una buena opción para elcómputo paralelo.Para múltiples GPU, desarrollamos un cálculo de plantilla, aplicado a simulación depliegues geológicos. Exploramos la computación de halo y utilizamos Secuencias CUDA paraoptimizar el tiempo de computación y comunicación. La ganancia de rendimiento resultante fue de23 % para cuatro GPU con arquitectura Fermi, y la mejora correspondiente obtenida en cuatro LasGPU Kepler fueron de 47 %.This research was carried out at the Basque Center for Applied Mathematics (BCAM) within the CFD Computational Technology (CFDCT) and also at the School of Electrical Engineering and Computer Science(Royal Institue of Technology, Stockholm, Sweden). Which is suported by Fundacion Obra Social “la Caixa“, Severo Ochoa Excellence research centre 2014-2018 SEV-2013-0323, Severo Ochoa Excellence research centre 2018-2022 SEV-2017-0718, BERC program 2014-2017, BERC program 2018-2021, MSO4SC European project, Elkartek. This work has been performed using the computing infrastructure from SNIC (Swedish National Infrastructure for Computing)

Archivo Digital para la Docencia y la Investigación