29 research outputs found
Performance Analysis of Irregular Task-Based Applications on Hybrid Platforms: Structure Matters
Efficiently exploiting computational resources in heterogeneous platforms is a real challenge, which has motivated the adoption of the task-based programming paradigm, where resource usage is dynamic and adaptive. Unfortunately, classical performance visualization techniques used in routine performance analysis often fail to provide any insight in this new context, especially when the application structure is irregular. In this paper, we propose several performance visualization techniques and modeling strategies motivated by the analysis of task-based multifrontal sparse linear solvers, whose structure is particularly complex. We show that by building on both a performance model of irregular tasks and the structure of the application (in particular the elimination tree), we can detect and highlight anomalies and understand resource utilization from the application point of view. We validate these novel performance analysis techniques with the QR_mumps sparse parallel solver by describing a series of case studies where we identify and address nontrivial performance issues thanks to our visualization methodology.
Optimization of performance and energy efficiency in massively parallel systems
Heterogeneous systems are becoming increasingly relevant due to their performance and energy efficiency capabilities, and are present in all types of computing platforms, from embedded devices and servers to HPC nodes in large data centers. Because of their complexity, they are usually programmed under the task paradigm and the host-device programming model. This strongly penalizes accelerator utilization and system energy consumption, and makes applications difficult to adapt.
Co-execution allows all devices to simultaneously compute the same problem, cooperating to consume less time and energy. However, programmers must handle all device management, workload distribution and code portability between systems, significantly complicating their programming.
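The workload distribution mentioned above can be illustrated with a minimal sketch of a static split, in which each device receives a contiguous slice of the problem proportional to its relative throughput. This is not the API of EngineCL or CoexecutorRuntime; the function name `split_work`, the device names, and the throughput figures are all hypothetical, and a static proportional policy is only one assumed strategy among those a co-execution runtime could use.

```python
def split_work(total_items, device_speeds):
    """Return {device: (offset, count)} with counts proportional to speed."""
    total_speed = sum(device_speeds.values())
    names = list(device_speeds)
    chunks, offset = {}, 0
    for index, name in enumerate(names):
        if index == len(names) - 1:
            # The last device absorbs any rounding leftovers.
            count = total_items - offset
        else:
            count = int(total_items * device_speeds[name] / total_speed)
        chunks[name] = (offset, count)
        offset += count
    return chunks


# Hypothetical relative throughputs, e.g. taken from a profiling run.
speeds = {"cpu": 1.0, "igpu": 2.0, "gpu": 5.0}
chunks = split_work(1000, speeds)
print(chunks)
```

With these assumed speeds, the 1000 work-items are split into 125, 250, and 625 items. A dynamic strategy would instead hand out smaller packages at run time as devices become idle.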
This thesis offers contributions to improve performance and energy efficiency in these massively parallel systems. The proposals address the following generally conflicting objectives: usability and programmability are improved, while ensuring enhanced system abstraction and extensibility, and at the same time performance, scalability and energy efficiency are increased. To achieve this, two runtime systems with completely different approaches are proposed.
EngineCL, focused on OpenCL and with a high-level API, provides an extensible modular system and favors maximum compatibility between all types of devices. Its versatility allows it to be adapted to environments for which it was not originally designed, including applications with time-constrained executions or molecular dynamics HPC simulators, such as the one used in an international research center.
Considering industrial trends and emphasizing professional applicability, CoexecutorRuntime provides a flexible C++/SYCL-based system that provides co-execution support for oneAPI technology. This runtime brings programmers closer to the problem domain, enabling the exploitation of dynamic adaptive strategies that improve efficiency in all types of applications.
Funding: This PhD has been supported by the Spanish Ministry of Education (FPU16/03299 grant) and the Spanish Science and Technology Commission under contracts TIN2016-76635-C2-2-R and PID2019-105660RB-C22. This work has also been partially supported by the Mont-Blanc 3: European Scalable and Power Efficient HPC Platform based on Low-Power Embedded Technology project (G.A. No. 671697) from the European Union’s Horizon 2020 Research and Innovation Programme (H2020 Programme). Some activities have also been funded by the Spanish Science and Technology Commission under contract TIN2016-81840-REDT (CAPAP-H6 network). The Integration II: Hybrid programming models of Chapter 4 has been partially performed under the Project HPC-EUROPA3 (INFRAIA-2016-1-730897), with the support of the EC Research Innovation Action under the H2020 Programme. In particular, the author gratefully acknowledges the support of the SPMT Department of the High Performance Computing Center Stuttgart (HLRS).
XXIII Edition of the Workshop de Investigadores en Ciencias de la Computación: Proceedings
Compilation of the papers presented at the XXIII Workshop de Investigadores en Ciencias de la Computación (WICC), held in Chilecito (La Rioja) in April 2021.
Red de Universidades con Carreras en Informática
Green high-performance simulation of an agent-based model of the Aedes aegypti mosquito
The increase in temperature caused by climate change has resulted in the rapid spread of infectious diseases. Given the alert for the current situation, the World Health Organization (WHO) has declared a state of health emergency, highlighting the severity of the situation in some countries. For this reason, producing knowledge and tools that can help control and eradicate the vectors propagating these diseases is of the utmost importance. High-performance modeling and simulation can be used to produce knowledge and strategies that allow predicting infections, guiding actions and/or training health/civil protection agents. The model developed as part of this research work is aimed at assisting the decision-making process for disease prevention and control, as well as evaluating the reproduction and predicting the evolution of the Aedes aegypti mosquito, which is the transmitting vector of the dengue, Zika and chikungunya diseases. Decision-making based on these models requires a large number of simulations to achieve results with statistical variability.
The objective of this paper is to demonstrate that the GPU is a suitable platform for reducing the energy consumed by HPC simulations. It is also shown that it is possible to define energy prediction models that allow scientists to plan their experiments based on energy consumption and to select those that are representative for decision-making, thereby reducing energy consumption in HPC simulations. Because so many simulations are needed, a GPU was used, both for its computing power, which reduces execution time, and to reduce energy consumption. To this end, different scenarios and experiments are proposed to verify the benefits of the proposed architecture.
Facultad de Informática
XXV Congreso Argentino de Ciencias de la Computación - CACIC 2019: Proceedings
Papers presented at the XXV Congreso Argentino de Ciencias de la Computación (CACIC), held in the city of Río Cuarto from 14 to 18 October 2019 and organized by the Red de Universidades con Carreras en Informática (RedUNCI) and the Facultad de Ciencias Exactas, Físico-Químicas y Naturales of the Universidad Nacional de Río Cuarto.
Red de Universidades con Carreras en Informática
Exploiting Heterogeneous Parallelism on Hybrid Metaheuristics for Vector Autoregression Models
In recent years, the huge amount of data available in many disciplines has made mathematical modeling, and more specifically econometric modeling, a very important technique for explaining those data. One of the most widely used econometric techniques is the Vector Autoregression (VAR) model, a multi-equation model that linearly describes the interactions and behavior of a group of variables by using their past values. Traditionally, Ordinary Least Squares and maximum likelihood estimators have been used to estimate VAR models. These techniques are consistent and asymptotically efficient under ideal conditions of the data and the identification problem. Otherwise, they would yield inconsistent parameter estimates. This paper considers the estimation of a VAR model by minimizing the difference between the dependent variables at a given time and the expression of their own past and the exogenous variables of the model (in this case denoted as a VARX model). The solution of this optimization problem is approached through hybrid metaheuristics. The high computational cost due to the huge amount of data makes it necessary to exploit High-Performance Computing to accelerate the methods used to obtain the models. The parameterized, parallel implementation of the metaheuristics and the matrix formulation ease the simultaneous exploitation of parallelism for groups of hybrid metaheuristics. Multilevel and heterogeneous parallelism are exploited in multicore CPU plus multi-GPU nodes, with the optimum combination of the different parallelism parameters depending on the particular metaheuristic and the problem it is applied to.
This work was supported by the Spanish MICINN and AEI, as well as European Commission FEDER funds, under grant RTI2018-098156-B-C53 and grant TIN2016-80565-R.
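The estimation objective described in this abstract can be illustrated with a minimal sketch of the kind of fitness function a metaheuristic might evaluate for each candidate solution. Several assumptions are made here that the abstract does not state: a single lag, a model of the form y_t = A y_{t-1} + B x_t, and a sum of squared differences as the measure of "difference". The names `varx_residual` and `simulate` and the small data set are hypothetical.

```python
def varx_residual(y, x, A, B):
    """Sum of squared one-step errors of the assumed model y_t = A y_{t-1} + B x_t."""
    total = 0.0
    for t in range(1, len(y)):
        for i in range(len(A)):
            predicted = sum(A[i][j] * y[t - 1][j] for j in range(len(y[t - 1])))
            predicted += sum(B[i][k] * x[t][k] for k in range(len(x[t])))
            total += (y[t][i] - predicted) ** 2
    return total


def simulate(y0, x, A, B):
    """Generate a noise-free series from the same assumed model (test data only)."""
    y = [list(y0)]
    for t in range(1, len(x)):
        prev = y[-1]
        y.append([
            sum(A[i][j] * prev[j] for j in range(len(prev)))
            + sum(B[i][k] * x[t][k] for k in range(len(x[t])))
            for i in range(len(A))
        ])
    return y


# Hypothetical parameters: two dependent variables, one exogenous variable.
A_true = [[0.5, 0.0], [0.25, 0.5]]
B_true = [[1.0], [0.0]]
x = [[1.0], [2.0], [0.0], [1.0]]
y = simulate([1.0, 1.0], x, A_true, B_true)

fit_true = varx_residual(y, x, A_true, B_true)
fit_bad = varx_residual(y, x, [[0.4, 0.0], [0.25, 0.5]], B_true)
print(fit_true, fit_bad)
```

A metaheuristic would treat the entries of A and B as the candidate solution and search for the values that minimize this residual. Because each candidate is scored independently, many candidates can be evaluated in parallel.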
High-Performance Fast Iterative Methods for Eikonal Equations
Department of Computer Science and Engineering
The eikonal equation has a wide range of applications related to distances or travel time in space, such as geoscience, computer vision, image processing, path planning, and computer graphics. Recently, research on eikonal equation solvers has focused more on developing efficient parallel algorithms to leverage the computing power of parallel systems, such as multi-core CPUs and graphics processing units (GPUs). However, little literature exists on massively parallel eikonal equation solvers because of the complications related to data and work management.
In this dissertation research, I introduce several novel contributions that leverage high-performance, massively parallel computing platforms for a parallel eikonal equation solver. First, I introduce a novel adaptive domain decomposition method for an efficient multi-GPU implementation of the block-based fast iterative method (FIM). The proposed method expands the sub-domain to be processed by each GPU while maintaining fair load balancing as the iterative algorithm proceeds. It also provides a locality-aware clustering algorithm to minimize the communication overhead. With this, I solved the parallel performance problems that are often encountered in naive multi-GPU extensions that depend on regular domain decomposition, such as task load imbalance and high communication cost. In addition, it includes several optimization techniques, such as hiding the CPU cost using CUDA multi-streams and hiding the data transfer costs between multiple GPUs.
Second, I propose an efficient parallel implementation of FIM for a multi-core shared-memory system by using a lock-free local queue approach and provide an in-depth analysis of the parallel performance of the method. In addition, I propose a new parallel algorithm, Group-Ordered Fast Iterative Method (GO-FIM), that exploits the causality of grid blocks to reduce redundant computations, which was the main drawback of the original FIM. The proposed GO-FIM method uses the clustering of blocks based on the updating order where each cluster can be updated in parallel by using multi-core parallel architectures.
Third, I propose a novel algorithm called Causality-Ordered Fast Iterative Method (CO-FIM), which exploits the causality dependency at a node level to reduce redundant computations. Moreover, I propose a new parallel algorithm, Causality and Group-Ordered Fast Iterative Method (CGO-FIM), that integrates GO-FIM and CO-FIM. The proposed CGO-FIM determines the updating order at the block level while minimizing redundant calculations within each block through a node-level causality dependency. The CGO-FIM method has a condition for using both CO-FIM and FIM interchangeably within a block, and it is fully compatible with the lock-free local queue approach, so it can be efficiently implemented for multi-core parallel architectures.
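For orientation, the basic fast iterative method that these contributions build on can be sketched serially on a small 2D grid. This is a minimal illustration under stated assumptions, not the block-based, multi-GPU, GO-FIM, CO-FIM, or CGO-FIM algorithms of the dissertation: it assumes a uniform slowness, a first-order Godunov upwind update, and a plain Python set as the active list. The names `solve_local` and `fim` are hypothetical.

```python
import math


def solve_local(u, i, j, h, slowness):
    """First-order Godunov upwind solution of |grad u| = slowness at node (i, j)."""
    rows, cols = len(u), len(u[0])
    a = min(u[i - 1][j] if i > 0 else math.inf,
            u[i + 1][j] if i < rows - 1 else math.inf)
    b = min(u[i][j - 1] if j > 0 else math.inf,
            u[i][j + 1] if j < cols - 1 else math.inf)
    if math.isinf(a) and math.isinf(b):
        return math.inf  # no neighbor carries information yet
    step = slowness * h
    if abs(a - b) >= step:
        return min(a, b) + step  # one-sided update
    return 0.5 * (a + b + math.sqrt(2.0 * step * step - (a - b) ** 2))


def fim(rows, cols, sources, h=1.0, slowness=1.0, eps=1e-9):
    """Serial fast iterative method with an active list of grid nodes."""
    u = [[math.inf] * cols for _ in range(rows)]
    for (i, j) in sources:
        u[i][j] = 0.0

    def neighbors(i, j):
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < rows and 0 <= b < cols:
                yield a, b

    fixed = set(sources)
    active = {nb for s in sources for nb in neighbors(*s) if nb not in fixed}
    while active:
        next_active = set()
        for (i, j) in active:
            old = u[i][j]
            new = min(old, solve_local(u, i, j, h, slowness))
            u[i][j] = new
            if abs(old - new) < eps:
                # Converged: drop the node and wake neighbors it can improve.
                for (a, b) in neighbors(i, j):
                    if (a, b) in active or (a, b) in fixed:
                        continue
                    candidate = solve_local(u, a, b, h, slowness)
                    if candidate < u[a][b] - eps:
                        u[a][b] = candidate
                        next_active.add((a, b))
            else:
                next_active.add((i, j))  # not converged yet, keep iterating
        active = next_active
    return u


grid = fim(5, 5, [(0, 0)])
for row in grid:
    print(" ".join(f"{value:5.2f}" for value in row))
```

Every node in the active list can be updated independently of the others within a sweep, which is what makes the method attractive for parallel hardware. The price is that nodes may be updated more than once, the redundancy that the ordering schemes described above aim to reduce.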