Reclaiming the energy of a schedule: models and algorithms

Authors: Aupy Guillaume; Benoit Anne; Dufossé Fanny; Robert Yves
Publication venue: 'Wiley'
Publication date: 01/04/2011

We consider a task graph to be executed on a set of processors. We assume that the mapping is given, say by an ordered list of tasks to execute on each processor, and we aim to optimize the energy consumption while enforcing a prescribed bound on the execution time. While it is not possible to change the allocation of a task, it is possible to change its speed. Rather than using a local approach such as backfilling, we consider the problem as a whole and study the impact of several speed variation models on its complexity. For continuous speeds, we give a closed-form formula for trees and series-parallel graphs, and we cast the problem into a geometric programming problem for general directed acyclic graphs. We show that the classical dynamic voltage and frequency scaling (DVFS) model with discrete modes leads to an NP-complete problem, even if the modes are regularly distributed (an important particular case in practice, which we analyze as the incremental model). In contrast, the VDD-hopping model leads to a polynomial solution. Finally, we provide an approximation algorithm for the incremental model, which we extend to the general DVFS model. Comment: A two-page extended abstract of this work appeared as a short presentation in SPAA'2011, while the long version has been accepted for publication in "Concurrency and Computation: Practice and Experience".
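The continuous-speed case mentioned in this abstract can be illustrated with a small sketch. It assumes the common cubic power model (power proportional to speed cubed, so a task of weight w run at speed s takes time w/s and consumes energy w·s²); the abstract itself does not state the power model, and the function names below are illustrative only. Under that assumption, a simple convexity argument gives composition rules for series-parallel graphs: weights add in series, and parallel branches combine as the cube root of the sum of cubes. This is a sketch of the idea, not the paper's exact formulation.

```python
def series(*weights):
    """Sub-graphs executed one after another: equivalent weights add up."""
    return sum(weights)


def parallel(*weights):
    """Branches sharing one time window: cube root of the sum of cubes."""
    return sum(w ** 3 for w in weights) ** (1.0 / 3.0)


def min_energy(equivalent_weight, deadline):
    """Energy of a task of this weight run at constant speed W / D.

    Assumes power = speed**3, hence energy = weight * speed**2.
    """
    speed = equivalent_weight / deadline
    return equivalent_weight * speed ** 2


# Fork-join example: source (w=1), two parallel branches (w=3, w=4), sink (w=2).
w_eq = series(1.0, parallel(3.0, 4.0), 2.0)
print(w_eq, min_energy(w_eq, deadline=10.0))
```

For a plain chain, the rule reduces to running every task at the same speed, total work divided by the deadline.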

arXiv.org e-Print Archive

HAL-ENS-LYON

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Gestion mémoire dans une infrastructure répartie (Memory management in a distributed infrastructure)

Author: Gadafi Aeiman
Publication venue: École Doctorale Mathématiques, Informatique et Télécommunications (Toulouse)
Publication date: 01/10/2013

Nowadays, more and more organizations deploy hardware infrastructures such as clusters or grids. These are used to host common Internet services such as email, social networks, or e-commerce, or to run scientific applications such as nuclear simulations or weather prediction. The processing power and storage capacity needed to handle the workload of these applications can only be provided by such infrastructures. The operating systems running on these infrastructures can potentially cooperate to make the best use of the available resources. They allocate resources to applications according to the applications' needs, aiming to guarantee quality of service while managing resources optimally in order to limit costs, notably energy. The scientific community has taken an interest in the resource management problem: many approaches have been proposed and solutions have been implemented. Surveying the state of the art, we find that most of them focus on node management, with the goal of distributing computations so as to make optimal use of the CPU load. Global memory management in such infrastructures has not been studied sufficiently. Indeed, memory is often considered a resource with theoretically unlimited capacity thanks to swap mechanisms, but swapping has a significant impact on application performance and on the operating cost of the infrastructure. In this thesis, we study the design and implementation of a global memory management service in a hardware infrastructure.
This memory management service must avoid wasting memory and must not penalize the performance of the hosted applications. We propose a remote swap service that allows a machine, rather than swapping to its local disk, to swap to the remote memory of another machine that has memory available. Remote pages can be moved dynamically in order to balance the load between machines. This makes it possible to pool memory and save resources. A prototype has been implemented and evaluated.
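The remote swap idea described in this abstract can be pictured with a toy bookkeeping model. Everything in it is hypothetical: the "most free memory wins" placement policy, the `local-disk` fallback label, and the function names are assumptions made for illustration, since the abstract does not describe the prototype's actual policy or interfaces.

```python
def pick_remote_host(free_pages, requester):
    """Pick the other machine with the most free page slots (assumed policy)."""
    candidates = {h: f for h, f in free_pages.items() if h != requester and f > 0}
    if not candidates:
        return None  # nobody has spare memory
    return max(candidates, key=candidates.get)


def swap_out(free_pages, placement, requester, page_id):
    """Place a swapped-out page on a remote host, or on local disk as fallback."""
    host = pick_remote_host(free_pages, requester)
    if host is None:
        placement[page_id] = "local-disk"
    else:
        free_pages[host] -= 1
        placement[page_id] = host
    return placement[page_id]


def migrate(free_pages, placement, page_id, new_host):
    """Move a remote page to another host, e.g. to rebalance load."""
    old_host = placement[page_id]
    if old_host != "local-disk":
        free_pages[old_host] += 1
    free_pages[new_host] -= 1
    placement[page_id] = new_host


free = {"a": 0, "b": 2, "c": 5}
placement = {}
swap_out(free, placement, requester="a", page_id=1)   # lands on "c"
migrate(free, placement, page_id=1, new_host="b")     # rebalanced to "b"
```

A real service would also have to handle page transfer over the network, failures of the remote host, and reclaiming memory when the remote host itself comes under pressure; none of that is modeled here.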

Thèses en Ligne

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Institut National Polytechnique de Toulouse (Theses)

Fault Tolerant and Energy Efficient One-Sided Matrix Decompositions on Heterogeneous Systems with GPUs

Author: Chen Jieyang
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Heterogeneous computing systems with both CPUs and GPUs have become a widely used class of hardware architecture in supercomputers. As heterogeneous systems deliver higher computational performance, they are being built with an increasing number of complex components. It is anticipated that these systems will be more susceptible to hardware faults and will consume more power. Numerical linear algebra libraries are used in a wide spectrum of high-performance scientific applications. Among numerical linear algebra operations, one-sided matrix decompositions can sometimes take a large portion of execution time or even dominate the whole scientific application execution. Due to their computational characteristics, one-sided matrix decompositions are well suited to computation platforms such as heterogeneous systems with CPUs and GPUs. Much work has been done to implement and optimize one-sided matrix decompositions on heterogeneous systems with CPUs and GPUs. However, it is challenging to run stable, high-performance one-sided matrix decompositions on computing platforms that are unreliable and consume a lot of energy. In this thesis, we therefore aim to develop novel fault tolerance and energy efficiency optimizations for one-sided matrix decompositions on heterogeneous systems with CPUs and GPUs. To improve reliability and energy efficiency, extensive research has been done on developing and optimizing fault tolerance methods and energy-saving strategies for one-sided matrix decompositions.
However, current designs still have several limitations: (1) little has been done on developing and optimizing fault tolerance methods for one-sided matrix decompositions on heterogeneous systems with GPUs; (2) limited by their protection coverage and strength, existing fault tolerance works provide insufficient protection when applied to one-sided matrix decompositions on heterogeneous systems with GPUs; (3) lacking knowledge of the algorithms, existing system-level energy-saving solutions cannot achieve optimal energy savings when used with one-sided matrix decompositions, because the workload prediction they rely on is potentially inaccurate and costly; (4) it is challenging to apply both fault tolerance techniques and energy-saving strategies to one-sided matrix decompositions at the same time, given that their current designs are not naturally compatible with each other. To address the first problem, building on the original Algorithm-Based Fault Tolerance (ABFT), we develop the first ABFT for matrix decomposition on heterogeneous systems with GPUs, together with novel storage error protection and several optimization techniques specific to GPUs. As for the second problem, we design a novel checksum scheme for ABFT that allows data stored in matrices to be encoded in two dimensions. This stronger checksum encoding mechanism enables much stronger protection, including enhanced error propagation protection. In addition, we introduce a more efficient checking scheme. By prioritizing checksum verification according to the sensitivity of matrix operations to soft errors, with a checksum verification kernel optimized for GPUs, we achieve strong protection for matrix decompositions with comparable overhead. For the third problem, to improve energy efficiency for one-sided matrix decompositions, we introduce an algorithm-based energy-saving approach designed to maximize energy savings by utilizing algorithmic characteristics.
Our approach can predict program execution behavior much more accurately, which is difficult for system-level solutions when applications have variable execution characteristics. Experiments show that our approach can lead to much higher energy savings than existing works. Finally, for the fourth problem, we propose a novel energy-saving approach for one-sided matrix decompositions on heterogeneous systems with GPUs. It allows energy-saving strategies and fault tolerance techniques to be enabled at the same time without performance impact or extra energy cost.
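The idea of encoding matrix data with checksums in two dimensions, mentioned in this abstract, can be illustrated with the classic row-and-column checksum construction. The sketch below is a minimal CPU-only illustration using integer data; it is not the thesis's GPU scheme, and the function names are illustrative assumptions. A single corrupted data element makes exactly one row checksum and one column checksum disagree, which both locates the element and gives the correction.

```python
def encode(matrix):
    """Append a checksum column (row sums) and a checksum row (column sums)."""
    rows = [row + [sum(row)] for row in matrix]
    rows.append([sum(col) for col in zip(*rows)])
    return rows


def locate_and_correct(encoded):
    """Locate and fix a single corrupted data element in place.

    Returns the (row, column) that was fixed, or None if all checksums agree.
    """
    n_rows, n_cols = len(encoded) - 1, len(encoded[0]) - 1
    bad_rows = [i for i in range(n_rows)
                if sum(encoded[i][:n_cols]) != encoded[i][n_cols]]
    bad_cols = [j for j in range(n_cols)
                if sum(encoded[i][j] for i in range(n_rows)) != encoded[n_rows][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("not a single corrupted data element")
    i, j = bad_rows[0], bad_cols[0]
    # The row checksum mismatch equals the error added to element (i, j).
    delta = sum(encoded[i][:n_cols]) - encoded[i][n_cols]
    encoded[i][j] -= delta
    return (i, j)


enc = encode([[1, 2], [3, 4]])
enc[1][0] = 99                      # inject a soft error
fixed = locate_and_correct(enc)     # locates (1, 0) and restores the value 3
```

With floating-point data, the exact equality tests would need to be replaced by comparisons against a tolerance that accounts for rounding.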

Ezid

eScholarship - University of California

Енергоефективне обслуговування навантаження інформаційно-комунікаційної мережі (Energy-efficient processing of the information and communication network workload)

Author: Прокопець Наталія Андріївна
Publication venue: Kyiv
Publication date: 01/01/2022

N.A. Prokopets. Energy-efficient processing of the information and communication network workload. Qualifying scientific work, submitted as a manuscript. Thesis for the degree of Doctor of Philosophy in specialty 172, Telecommunications and Radio Engineering. Educational and Scientific Institute of Telecommunication Systems, Igor Sikorsky Kyiv Polytechnic Institute, Kyiv, 2022.
The thesis solves the important scientific and practical problem of increasing the energy efficiency and performance of workload processing in an information and communication network (ICN), while meeting the availability requirements of the workload processing system, through a comprehensive method of energy-efficient workload processing. The functioning of a modern ICN largely depends on the software that performs various network tasks. This is due to the development of a number of technologies and concepts, including SDN (Software-Defined Networking), NFV (Network Functions Virtualization), Network Slicing, Edge Computing and bDDN (Big data driven networking). The tasks solved within these concepts form a computing workload, whose processing requires building and maintaining distributed computing systems as an integral part of the ICN architecture. At the same time, the peculiarities of these workload types impose specific requirements on their processing. The analysis of the requirements of each of these workload types, conducted in accordance with the recommendations of the International Telecommunication Union, made it possible to determine the main performance indicators of the distributed workload processing system as part of the ICN, and of the server cluster as a unit of the distributed data center within the ICN in particular: the energy efficiency and performance indicators of workload processing, as well as the system availability factor. Based on these indicators, an optimality criterion for workload processing in the ICN was proposed. The analysis of existing approaches to increasing the energy efficiency of distributed workload processing revealed some shortcomings, namely: static approaches do not take into account the dynamic variability of the workload; dynamic approaches applied at the hardware level have high complexity and cost of implementation.
Among the known dynamic approaches used at the software level, the approaches to consolidation and scaling of computing resources do not take the system availability indicator into account and can negatively affect system performance, especially when the workload arrival rate changes dynamically. They also do not use the individual energy consumption characteristics of computing nodes, which leads to suboptimal use of computing resources. Among the approaches to energy-efficient workload scheduling, the Backfill workload scheduling algorithm was noted; its main advantage is that it minimizes the idle time of computing nodes by packing computing jobs densely. However, the effectiveness of this approach drops significantly when the input workload arrival rate is low, and it does not take into account the individual energy consumption and performance characteristics of computing nodes. A further, collective disadvantage of the existing approaches is that each of them addresses the energy efficiency of workload processing while considering only some aspects of the process and of its efficiency indicators, which made it necessary to systematize and formalize workload processing in the information and communication network. To that end, an ontological model of the distributed workload processing system was built, treating workload processing in the ICN as the object of research. This made it possible to describe qualitatively the complex relationships between the selected efficiency indicators of the process under study and the parameters affecting them. To obtain a quantitative assessment of these relationships, a mathematical model of the distributed workload processing system within the ICN was built as a queuing system (QS).
While building the model, a method was proposed for moving from a non-stationary, non-ordinary input request flow to a stationary, ordinary flow by discretizing the arrival rate curve of the input workload and by switching to sets of servers, which significantly simplified the calculations at an acceptable loss of model accuracy. To discretize the input workload arrival rate curve, quantization by levels was proposed, which made it possible to match the size of the discretization step to the rate of change of the input workload arrival rate. To determine the quantization step, a method was proposed for calculating threshold values of the input workload arrival rate as functions of the number of computing nodes in the system. Based on the constructed mathematical model, a method for calculating horizontal scaling patterns was proposed; it determines the optimal number of active computing nodes in the system for each time interval, where the intervals are determined by the rate of change of the input workload arrival rate. The methods of determining individual energy consumption models of computing nodes were analyzed, and the expediency of using them in workload processing in the ICN was substantiated. Two methods of determining energy consumption models were considered in detail: empirical and software-based. The first method is based on direct measurement of the energy consumption of the nodes, followed by interpolation of the obtained dependencies by a polynomial of a given degree in order to obtain analytical functions. The second method is based on software-based evaluation of energy consumption models with subsequent interpolation of the obtained functions. The empirical method of determining energy consumption models is recommended for new systems at the configuration stage.
When new nodes are introduced into the system, or during its reconfiguration, a software-based method for determining energy consumption models is recommended. The mathematical model of the system in the form of a QS and the considered methods of determining individual energy consumption models of computing nodes became the basis of a new comprehensive method of energy-efficient workload processing in the computing nodes of distributed data centers. The proposed comprehensive method differs from known ones in that it uses individual energy consumption models of computing nodes, combines the advantages of horizontal scaling and energy-efficient scheduling, and takes into account unpredictable dynamic changes in the input workload arrival rate. This made it possible to increase the energy efficiency of workload processing without loss of performance and while complying with system availability requirements. As part of the proposed comprehensive method, the existing approaches to horizontal scaling of the computing system were improved by using individual energy consumption models of computing nodes and a mechanism for predicting dynamic deviations of the input workload arrival rate, which made it possible to use the most energy-efficient equipment more intensively and to respond in time to unpredictable changes in the input workload arrival rate. On the basis of the proposed comprehensive method, software for managing computing resources was created. It makes it possible to increase the energy efficiency and performance of distributed workload processing while complying with system availability requirements, and it can be used to increase the energy efficiency and performance of workload processing in the edge and central cloud within the 5G network architecture.
The effectiveness of the proposed comprehensive method and of the software based on it was verified by laboratory experiment and simulation modeling. The laboratory experiment tested the method in a small server cluster with 4 computing nodes. The simulation model, whose adequacy was confirmed using Fisher's test, demonstrated the effectiveness of the proposed comprehensive method in a larger distributed system with 20 nodes. In terms of energy efficiency, the gain of the proposed comprehensive method over the known Backfill and Round Robin approaches was 9.953% and 26.382%, respectively. The performance gain was 5.593% and 49.458%, respectively. At the same time, the proposed comprehensive method fulfills the system availability requirements and, according to the proposed optimality criterion, gives a gain of 15.722% over Backfill and 88.887% over Round Robin, which demonstrates the practical value of the research results.
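The level quantization and horizontal scaling pattern described in this abstract can be sketched in a much simplified form. The thesis derives its thresholds from a queuing model that the abstract does not reproduce, so the threshold rule used here (k nodes can absorb an arrival rate of at most k × per-node service rate × a utilization cap) is an assumption, as are the function names and the default cap of 0.8.

```python
import math


def nodes_needed(arrival_rate, service_rate, max_utilization=0.8):
    """Smallest k with arrival_rate <= k * service_rate * max_utilization.

    Assumed threshold rule; at least one node always stays active.
    """
    if arrival_rate <= 0:
        return 1
    return max(1, math.ceil(arrival_rate / (service_rate * max_utilization)))


def scaling_pattern(samples, service_rate, max_utilization=0.8):
    """Quantize (time, arrival_rate) samples by level and merge equal levels.

    Returns (start_time, active_nodes) change points, so interval lengths
    follow how fast the arrival rate changes.
    """
    pattern = []
    for t, rate in samples:
        k = nodes_needed(rate, service_rate, max_utilization)
        if not pattern or pattern[-1][1] != k:
            pattern.append((t, k))
    return pattern


samples = [(0, 5), (1, 7), (2, 12), (3, 15), (4, 30), (5, 6)]
print(scaling_pattern(samples, service_rate=10))
```

Because consecutive samples that fall in the same level are merged, slowly varying load produces long intervals and few scaling actions, while rapid changes produce short ones.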

Electronic Archive of Kyiv Polytechnic Institute