36 research outputs found

    Understanding the Energy Consumption of HPC Scale Artificial Intelligence

    This paper contributes towards a better understanding of the energy consumption trade-offs of HPC scale Artificial Intelligence (AI), and more specifically Deep Learning (DL) algorithms. For this task we developed benchmark-tracker, a benchmark tool to evaluate the speed and energy consumption of DL algorithms in HPC environments. We exploited hardware counters and Python libraries to collect energy information through software, which enabled us to instrument a known AI benchmark tool and to evaluate the energy consumption of numerous DL algorithms and models. Through an experimental campaign, we show a case example of the potential of benchmark-tracker to measure the computing speed and the energy consumption of training and inference for DL algorithms, and also its potential to help better understand the energy behavior of DL algorithms on HPC platforms. This work is a step forward in better understanding the energy consumption of Deep Learning in HPC, and it also contributes a new tool to help HPC DL developers better balance the HPC infrastructure in terms of speed and energy consumption.
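
    The abstract does not show how the energy counters are read, but a minimal sketch of the software-based approach it describes, assuming a Linux host that exposes Intel RAPL counters under /sys/class/powercap, could look like the following (energy_uj is a wrapping microjoule counter, so one wraparound is handled explicitly):

        import time
        from pathlib import Path

        RAPL = Path("/sys/class/powercap/intel-rapl:0")  # CPU package 0 power domain

        def read_uj(name: str) -> int:
            return int((RAPL / name).read_text())

        def measure_energy(workload):
            """Run `workload` and return (elapsed seconds, joules) for package 0."""
            max_uj = read_uj("max_energy_range_uj")      # counter wraps at this value
            t0, e0 = time.perf_counter(), read_uj("energy_uj")
            workload()
            elapsed = time.perf_counter() - t0
            delta_uj = read_uj("energy_uj") - e0
            if delta_uj < 0:                             # tolerate one wraparound
                delta_uj += max_uj
            return elapsed, delta_uj / 1e6

        secs, joules = measure_energy(lambda: sum(i * i for i in range(10**7)))
        print(f"{secs:.2f} s, {joules:.1f} J, {joules / secs:.1f} W average")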

    Obtaining Dynamic Scheduling Policies with Simulation and Machine Learning

    Dynamic scheduling of tasks in large-scale HPC platforms is normally accomplished using ad-hoc heuristics, based on task characteristics, combined with some backfilling strategy. Defining heuristics that work efficiently in different scenarios is a difficult task, especially when considering the large variety of task types and platform architectures. In this work, we present a methodology based on simulation and machine learning to obtain dynamic scheduling policies. Using simulations and a workload generation model, we can determine the characteristics of tasks that lead to a reduction in the mean slowdown of tasks in an execution queue. Modeling these characteristics using a nonlinear function and applying this function to select the next task to execute in a queue dramatically improved the mean task slowdown in synthetic workloads. When applied to real workload traces from highly different machines, these functions still resulted in important performance improvements, attesting to the generalization capability of the obtained heuristics.
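
    The learned nonlinear function itself is not given in the abstract; the sketch below only illustrates the selection mechanism it describes, picking the queued task that minimizes a score computed from its characteristics (the features and coefficients here are invented placeholders, not the learned ones):

        import math
        from dataclasses import dataclass

        @dataclass
        class Task:
            procs: int        # requested processors
            estimate: float   # user-supplied runtime estimate (s)
            wait: float       # time already spent in the queue (s)

        def score(t: Task) -> float:
            # Hypothetical nonlinear scoring of task characteristics; the paper
            # learns such a function from simulated workloads.
            area = t.procs * t.estimate
            return math.log1p(area) - 0.5 * math.log1p(t.wait)

        def next_task(queue: list[Task]) -> Task:
            # Dynamic policy: run the queued task with the lowest score next.
            return min(queue, key=score)

        queue = [Task(64, 3600, 120), Task(4, 600, 900), Task(128, 7200, 30)]
        print(next_task(queue))  # here the small, long-waiting task is chosen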

    A 2.4 GHz Monolithic Voltage-Controlled LC Oscillator

    This communication presents the design of a monolithic 2.4 GHz VCO intended for integration in a Phase-Locked Loop (PLL). The designed oscillator is based on a cross-coupled differential pair (the active part). The circuit operates from a 2.8 V supply with a control voltage between 1.6 V and 1.8 V, producing a frequency variation between 2.4 GHz and 2.75 GHz.
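
    As a rough sanity check on those figures, an ideal LC tank oscillates at f = 1/(2*pi*sqrt(L*C)); the sketch below back-solves the varactor capacitance swing that a hypothetical 2 nH integrated inductor (the abstract gives no component values) would need in order to cover 2.4 GHz to 2.75 GHz:

        import math

        def lc_freq(L, C):
            """Resonant frequency of an ideal LC tank: f = 1 / (2*pi*sqrt(L*C))."""
            return 1.0 / (2.0 * math.pi * math.sqrt(L * C))

        L = 2e-9  # 2 nH tank inductor (assumed; not given in the abstract)
        C_max = 1.0 / ((2.0 * math.pi * 2.40e9) ** 2 * L)  # capacitance at 2.40 GHz
        C_min = 1.0 / ((2.0 * math.pi * 2.75e9) ** 2 * L)  # capacitance at 2.75 GHz

        print(f"required C swing: {C_min * 1e12:.2f} pF to {C_max * 1e12:.2f} pF")
        print(f"check: {lc_freq(L, C_max) / 1e9:.2f} GHz to {lc_freq(L, C_min) / 1e9:.2f} GHz")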

    Homogenization Method for 2-D Nanostructure Reinforced Epoxy

    Graphene flakes are being used as base resin additives in epoxy to improve the properties of the material for aerospace applications [1]. The concentrations of the flakes should be optimized to create material properties that meet the design and cost requirements of the components. A predictive modeling approach is needed to aid in the design of these composite materials for increased stiffness. Using a 2D representation, the mechanical properties of a representative area element of epoxy embedded with graphene flakes can be predicted.
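
    The homogenization scheme is not named in the abstract; as an illustration of the kind of stiffness prediction involved, the sketch below uses the classical Halpin-Tsai estimate for aligned platelet reinforcement, with representative moduli that are assumptions rather than values from the paper:

        def halpin_tsai(E_m, E_f, vol_frac, xi):
            """Halpin-Tsai estimate of composite modulus for aligned reinforcement.

            E_m, E_f : matrix and filler Young's moduli (same units)
            vol_frac : filler volume fraction (0..1)
            xi       : shape parameter, often ~2 * (length / thickness) for platelets
            """
            eta = (E_f / E_m - 1.0) / (E_f / E_m + xi)
            return E_m * (1.0 + xi * eta * vol_frac) / (1.0 - eta * vol_frac)

        # Representative values (assumptions, not from the paper):
        E_epoxy = 3.0e9      # ~3 GPa epoxy matrix
        E_graphene = 1.0e12  # ~1 TPa in-plane graphene modulus
        xi = 2 * 100         # flakes with aspect ratio ~100

        for vf in (0.001, 0.005, 0.01, 0.02):
            E_c = halpin_tsai(E_epoxy, E_graphene, vf, xi)
            print(f"V_f = {vf:.3f} -> E_c = {E_c / 1e9:.2f} GPa")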

    Short-Term Ambient Temperature Forecasting for Smart Heaters

    Maintaining Cloud data centers is a worrying challenge in terms of energy efficiency. This challenge leads to solutions such as deploying Edge nodes that operate inside buildings without massive cooling systems. Edge nodes can act as smart heaters by recycling their consumed energy to heat these buildings. We propose a novel technique to perform temperature forecasting for Edge Computing smart heater environments. Our approach uses time series algorithms to exploit historical air temperature data together with smart heaters' power consumption and heat-sink temperatures to create models that predict short-term ambient temperatures. We implemented our approach on top of Facebook's Prophet time series forecasting framework, and we used real-time logs from Qarnot Computing as a use case of a smart heater Edge platform. Our best trained model yields ambient temperature forecasts with less than 2.66% Mean Absolute Percentage Error, showing the feasibility of near real-time forecasting.
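
    The paper's exact features and horizon are not reproduced here, but feeding heater power and heat-sink temperature into Prophet as extra regressors, as described above, looks roughly like this (the file name, column names and one-hour horizon are assumptions):

        import pandas as pd
        from prophet import Prophet  # pip install prophet

        # Assumed log format (hypothetical file and column names):
        # timestamp, ambient_temp, power, heatsink_temp
        df = pd.read_csv("smart_heater_log.csv", parse_dates=["timestamp"])
        df = df.rename(columns={"timestamp": "ds", "ambient_temp": "y"})

        model = Prophet()
        model.add_regressor("power")           # heater power draw (W)
        model.add_regressor("heatsink_temp")   # heat-sink temperature (degC)
        model.fit(df)

        # Forecast one hour ahead at 1-minute resolution. Prophet needs regressor
        # values for every future row; we assume they hold at their last reading.
        future = model.make_future_dataframe(periods=60, freq="min")
        future = future.merge(df[["ds", "power", "heatsink_temp"]], on="ds", how="left")
        future[["power", "heatsink_temp"]] = future[["power", "heatsink_temp"]].ffill()

        forecast = model.predict(future)
        print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())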

    Learning about simple heuristics for online parallel job scheduling

    High-Performance Computing (HPC) platforms are growing in size and complexity. Paradoxically, the power demand of such platforms has grown rapidly as well, and current top supercomputers require power at the scale of an entire power plant. In an effort to make more responsible use of that power, researchers are devoting a great deal of effort to devising algorithms and techniques that improve different aspects of performance, such as scheduling and resource management. But HPC platform maintainers are still reluctant to deploy state-of-the-art scheduling methods, and most of them revert to simple heuristics such as EASY Backfilling, which is based on a naive First-Come-First-Served (FCFS) ordering. Newer methods are often complex and obscure, and the simplicity and transparency of EASY Backfilling are too important to sacrifice.
    In a first phase, we explored Machine Learning (ML) techniques to learn online parallel job scheduling heuristics. Using simulations and a workload generation model, we could determine the characteristics of HPC applications (jobs) that lead to a reduction in the mean slowdown of jobs in an execution queue. Modeling these characteristics using a nonlinear function and applying this function to select the next job to execute in a queue improved the mean job slowdown in synthetic workloads. When applied to real workload traces from highly different machines, these functions still resulted in performance improvements, attesting to the generalization capability of the obtained heuristics.
    In a second phase, using simulations and workload traces from several real HPC platforms, we performed a thorough analysis of the cumulative results of four simple scheduling heuristics (including EASY Backfilling). We also evaluated effects such as the relationship between job size and slowdown, the distribution of slowdown values, and the number of backfilled jobs, for each HPC platform and scheduling policy. We show experimental evidence that one can only gain by replacing EASY Backfilling with the Smallest estimated Area First (SAF) policy with backfilling, as it offers performance improvements of up to 80% in the slowdown metric while maintaining the simplicity and transparency of EASY. SAF reduces the number of jobs with large slowdowns, and the inclusion of a simple thresholding mechanism guarantees that no starvation occurs.
    Overall, we arrived at the following remarks: (i) simple and efficient scheduling heuristics in the form of a nonlinear function of the jobs' characteristics can be learned automatically, though whether the reasoning behind their scheduling decisions is clear or not remains open to argument; (ii) the area (processing time estimate multiplied by the number of processors) of the jobs seems to be quite an important property for good parallel job scheduling heuristics, since many of the heuristics (notably SAF) that achieved good performance take the job's area as input; (iii) the backfilling mechanism seems to always help in increasing performance, though it does not outperform a better sorting of the jobs waiting queue, such as the sorting performed by SAF.
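
    A minimal sketch of the SAF policy described above, with the thesis's thresholding idea against starvation and a greedy backfilling pass (the waiting-time threshold is an arbitrary placeholder, and the reservation bookkeeping of a faithful EASY-style backfill is omitted):

        from dataclasses import dataclass

        @dataclass
        class Job:
            procs: int         # processors requested
            estimate: float    # walltime estimate (s)
            wait: float = 0.0  # time already spent in the queue (s)

            @property
            def area(self) -> float:
                return self.procs * self.estimate

        MAX_WAIT = 24 * 3600.0  # starvation threshold (placeholder value)

        def saf_order(queue: list[Job]) -> list[Job]:
            """Smallest estimated Area First; jobs waiting past MAX_WAIT jump ahead."""
            starved = sorted((j for j in queue if j.wait > MAX_WAIT), key=lambda j: -j.wait)
            rest = sorted((j for j in queue if j.wait <= MAX_WAIT), key=lambda j: j.area)
            return starved + rest

        def schedule_step(queue: list[Job], free_procs: int) -> list[Job]:
            """Greedily start jobs in SAF order while they fit; skipped jobs are
            effectively backfilled over. (A faithful EASY-style pass would also
            reserve a start time for the first blocked job; omitted here.)"""
            started = []
            for job in saf_order(queue):
                if job.procs <= free_procs:
                    started.append(job)
                    free_procs -= job.procs
            return started

        queue = [Job(64, 7200), Job(8, 600), Job(16, 3600, wait=90000)]
        print([j.procs for j in schedule_step(queue, free_procs=32)])  # [16, 8]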

    Preparation and rheological characterization of nanocomposites of styrenic polymers.

    In this work, nanocomposites of styrenic polymers and organoclays were prepared. The polymers studied were polystyrene (PS), a polystyrene-b-polybutadiene-b-polystyrene triblock copolymer (SBS) and four polystyrene-b-poly(ethylene-co-butylene)-b-polystyrene triblock copolymers (SEBS), one of them modified with maleic anhydride. The nanocomposites were prepared using three different techniques: melt mixing, solution casting and a hybrid technique combining the former two. The materials obtained were characterized by X-ray diffraction (XRD), optical microscopy (OM), transmission electron microscopy (TEM) and small-angle X-ray scattering (SAXS), and rheological studies were performed through small amplitude oscillatory shear (SAOS) tests. The degree of clay dispersion was evaluated in some samples using a TEM image analysis technique. The results showed that in most cases intercalated nanocomposites were obtained, thanks to the PS phase present in each polymer. Samples prepared by solution had the highest degree of clay dispersion, and the maleated SEBS was the polymer that yielded the most exfoliated structure. The rheological studies proved very sensitive to the formation of clay particle networks within the nanocomposites, which made the materials behave in a more solid-like manner. The combination of SAXS techniques and rheology was very useful for studying the morphology of ordered phases in block copolymers, making it possible to identify and distinguish lamellar, cylindrical and spherical structures in each copolymer. It was possible to verify that the presence of clay disturbs the phase order of the copolymers and has different effects on the rheological properties of these materials.

    Run your HPC jobs in Eco-Mode: revealing the potential of user-assisted power capping in supercomputing systems

    The energy consumption of an exascale High-Performance Computing (HPC) supercomputer rivals the electricity demand of tens of thousands of people. Given the substantial energy footprint of exascale HPC systems and the increasing strain on power grids due to climate-related events, electricity providers are starting to impose power caps on their users during critical periods. In this context, it becomes crucial to implement strategies that manage the power consumption of supercomputers while simultaneously ensuring their uninterrupted operation.
    This paper investigates the proposition that HPC users can willingly sacrifice some processing performance to contribute to a global energy-saving initiative. With the objective of offering an efficient energy-saving strategy by involving users, we introduce a user-assisted supercomputer power-capping methodology. In this approach, users have the option to voluntarily permit their applications to operate in a power-capped mode, denoted as 'Eco-Mode', as necessary. Leveraging HPC simulations, along with energy traces and application metadata derived from a recent Top500 HPC supercomputer, we conducted an experimental campaign to quantify the effects of Eco-Mode on energy conservation and on user experience. Specifically, our study aimed to demonstrate that, with a sufficient number of users choosing Eco-Mode, the supercomputer maintains good performance within the specified power cap. Furthermore, we sought to determine the optimal conditions regarding the number of users embracing Eco-Mode and the magnitude of the power capping required for applications (i.e., the intensity of Eco-Mode). Our findings indicate that decreasing the speed of jobs can significantly decrease the number of jobs that must be killed. Moreover, as the adoption of Eco-Mode increases among users, the likelihood of any given job being killed also decreases.
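
    No simulation code appears in the abstract; the toy model below only illustrates the trade-off it reports, where jobs opted into Eco-Mode are throttled under a power cap instead of being killed (all constants are invented):

        import random

        random.seed(0)

        POWER_CAP = 1000.0   # platform budget during the critical period (kW), invented
        ECO_SLOWDOWN = 0.7   # Eco-Mode jobs run slower for ~70% power, assumed

        def simulate(num_jobs: int, eco_fraction: float) -> int:
            """Return how many jobs must be killed to get under the cap.

            Jobs opted into Eco-Mode are throttled first; if the cap is still
            exceeded, non-eco jobs are killed until the budget is met.
            """
            jobs = [{"power": random.uniform(5, 25),
                     "eco": random.random() < eco_fraction} for _ in range(num_jobs)]
            for j in jobs:
                if j["eco"]:
                    j["power"] *= ECO_SLOWDOWN      # throttle instead of killing
            killed = 0
            total = sum(j["power"] for j in jobs)
            for j in sorted(jobs, key=lambda j: j["eco"]):  # kill non-eco jobs first
                if total <= POWER_CAP:
                    break
                total -= j["power"]
                killed += 1
            return killed

        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            print(f"eco adoption {frac:>4.0%}: {simulate(80, frac)} jobs killed")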