167 research outputs found

    Capacity Planning for Green Data Center Sustainability

    Business demands for powerful computing resources are increasing the workload in data centers, raising the complexity of data center management. The environmental impact of data centers is now a particular concern. Although there are efforts to realize Green data centers by harnessing Green technologies, their sustainability remains an issue. Data center management includes capacity planning. However, capacity planning for data center efficiency has focused only on measurements of power and cooling (part of infrastructure operation), which are insufficient to ensure the sustainability of the data center; a more thorough set of components must be considered. This study therefore proposes a capacity planning framework to derive practical solutions for Green data center management and thus ensure the sustainability of the Green data center. To this end, the components for Green data center sustainability and capacity planning were determined using Content Analysis, and the findings were verified by Green data center experts.

    FedZero: Leveraging Renewable Excess Energy in Federated Learning

    Federated Learning (FL) is an emerging machine learning technique that enables distributed model training across data silos or edge devices without data sharing. Yet, FL inevitably introduces inefficiencies compared to centralized model training, which will further increase the already high energy usage and associated carbon emissions of machine learning in the future. Although the scheduling of workloads based on the availability of low-carbon energy has received considerable attention in recent years, it has not yet been investigated in the context of FL. However, FL is a highly promising use case for carbon-aware computing, as training jobs consist of energy-intensive batch processes scheduled in geo-distributed environments. We propose FedZero, an FL system that operates exclusively on renewable excess energy and spare capacity of compute infrastructure to effectively reduce the training's operational carbon emissions to zero. Based on energy and load forecasts, FedZero leverages the spatio-temporal availability of excess energy by cherry-picking clients for fast convergence and fair participation. Our evaluation, based on real solar and load traces, shows that FedZero converges considerably faster under these constraints than state-of-the-art approaches, is highly scalable, and is robust against forecasting errors.
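The client cherry-picking idea can be sketched as a simple greedy selection. This is not FedZero's actual algorithm (the paper uses forecasts and a fairness mechanism that are more sophisticated); the `Client` fields and the scoring function below are illustrative assumptions only.

```python
from dataclasses import dataclass

@dataclass
class Client:
    name: str
    excess_energy_kwh: float   # forecasted renewable excess energy for the round
    spare_capacity: float      # fraction of compute currently idle (0..1)
    rounds_participated: int   # used to encourage fair participation

def select_clients(clients, k):
    """Greedy sketch: rank clients by forecasted excess energy scaled by
    spare compute capacity, penalizing frequent past participants; only
    clients with both excess energy and spare capacity are eligible."""
    def score(c):
        return c.excess_energy_kwh * c.spare_capacity / (1 + c.rounds_participated)
    eligible = [c for c in clients if c.excess_energy_kwh > 0 and c.spare_capacity > 0]
    return sorted(eligible, key=score, reverse=True)[:k]
```

In this sketch a client running entirely on grid power (zero forecasted excess) is never selected, which mirrors the paper's goal of training only on excess energy.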

    Datacenter management for on-site intermittent and uncertain renewable energy sources

    In recent years, information and communication technologies (ICT) have become a major energy consumer, with the associated harmful ecological consequences. The emergence of Cloud computing and massive Internet companies has increased the importance and number of datacenters around the world. To mitigate their economic and ecological cost, powering datacenters with renewable energy sources (RES) has emerged as a sustainable solution. Some commonly used RES, such as solar and wind energy, depend directly on weather conditions; they are therefore both intermittent and partly uncertain. Batteries or other energy storage devices (ESD) are often considered to relieve these issues, but they introduce additional energy losses and are too costly to be used alone without further integration. The power consumption of a datacenter is closely tied to its computing resource usage, which in turn depends on its workload and on the algorithms that schedule it. To use RES as efficiently as possible while preserving the quality of service of a datacenter, coordinated management of computing resources, electrical sources, and storage is required. A wide variety of datacenters exists, each with different hardware, workloads, and purposes. Similarly, each electrical infrastructure is modeled and managed uniquely, depending on the kind of RES used, the ESD technologies, and the operating objectives (cost or environmental impact). Some existing works successfully address this problem for a specific pair of electrical and computing models; because of this combined diversity, however, the existing approaches cannot be extrapolated to other infrastructures. This thesis explores novel ways to deal with this coordination problem. A first contribution revisits the batch task scheduling problem by introducing an abstraction of the power sources. A scheduling algorithm is proposed that takes the preferences of the electrical sources into account while remaining independent of the type of sources and of the goal of the electrical infrastructure (cost, environmental impact, or a mix of both). A second contribution addresses the joint power planning coordination problem in a totally infrastructure-agnostic way. The datacenter computing resources and workload management are treated as a black box implementing a scheduling algorithm under a variable power constraint. The same goes for the electrical sources and storage management system, which acts as a source commitment optimization algorithm. A cooperative multiobjective power planning optimization, based on a multi-objective evolutionary algorithm (MOEA), interacts with the two black boxes to find the best trade-offs between electrical and computing objectives. Finally, a third contribution focuses on RES production uncertainties in a more specific infrastructure. Based on a Markov Decision Process (MDP) formulation, the structure of the underlying decision problem is studied. For several variants of the problem, tractable methods are proposed to find optimal policies or bounded approximations of them.
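The black-box coordination idea reduces to multi-objective optimization over candidate power plans, keeping only non-dominated trade-offs. A minimal sketch follows; the two `evaluate` objectives are toy stand-ins for the thesis's IT and electrical black boxes, not its actual models, and a real MOEA would search rather than enumerate candidates.

```python
def evaluate(power_cap):
    """Toy black boxes: the IT side reports QoS degradation under a power
    cap, the electrical side reports the cost of committing that power."""
    qos_degradation = max(0.0, 100.0 - power_cap)  # jobs delayed under a tight cap
    energy_cost = 0.2 * power_cap                  # grid energy committed for the cap
    return (energy_cost, qos_degradation)

def dominates(a, b):
    """a dominates b: no worse in every objective, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates, evaluate):
    """Keep the candidates whose objective vectors are not dominated."""
    points = {c: evaluate(c) for c in candidates}
    return [c for c in candidates
            if not any(dominates(points[d], points[c]) for d in candidates)]
```

The front returned to the operator contains only the meaningful electrical/computing trade-offs; dominated plans (same QoS at higher cost) are discarded.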

    Joint Computing and Electric Systems Optimization for Green Datacenters

    This chapter presents an optimization framework to manage green datacenters using multilevel energy reduction techniques in a joint approach. A green datacenter exploits renewable energy sources and active Uninterruptible Power Supply (UPS) units to reduce the energy intake from the grid while improving its Quality of Service (QoS). At the server level, a state-of-the-art correlation-aware Virtual Machine (VM) consolidation technique maximizes server energy efficiency. At the system level, heterogeneous Energy Storage Systems (ESS) replace standard UPSs, while a dedicated optimization strategy aims at maximizing the lifetime of the battery banks and reducing the energy bill, considering the load of the servers. Results demonstrate, under varying numbers of VMs in the system, up to 11.6% energy savings and a 10.4% improvement in QoS compared to existing correlation-aware VM allocation schemes for datacenters, and up to 96% electricity bill savings.

    Improved self-management of datacenter systems applying machine learning

    Autonomic Computing is a Computer Science research area that originated in the mid-2000s. It focuses on the optimization and improvement of complex distributed computing systems through self-control and self-management. As distributed computing systems grow in complexity, like multi-datacenter systems in cloud computing, system operators and architects need more help to understand, design, and optimize these systems manually, even more so when the systems are distributed around the world and belong to different entities and authorities. Self-management lets these distributed computing systems improve their resource and energy management, an important issue given the cost of obtaining, running, and maintaining resources. Here we propose to improve Autonomic Computing techniques for resource management by applying modeling and prediction methods from Machine Learning and Artificial Intelligence. Machine Learning methods can find accurate models of system behavior, often with intelligible explanations, and can predict and infer system states and values. Models obtained from automatic learning have the advantage of being easily updated on workload or configuration changes by re-collecting examples and re-training the predictors. By employing automatic modeling and predictive abilities, we can find new methods for making "intelligent" decisions and discovering new information and knowledge about systems. This thesis departs from the state of the art, where management is based on administrators' expertise, well-known data, ad-hoc algorithms and models, and elements studied from the point of view of a single computing machine, toward a novel state of the art where management is driven by models learned from the system itself, providing useful feedback and making up for incomplete, missing, or uncertain data, from the point of view of a global network of datacenters.
    - First, we cover the scenario where the decision maker works knowing all pieces of information about the system: how much each job will consume, what the desired quality of service is and will be, what the deadlines for the workload are, etc. All of this focuses on each component and policy of each element involved in executing these jobs.
    - Then we focus on the scenario where, instead of fixed oracles that provide information from an expert formula or set of conditions, machine learning is used to create these oracles. Here we look at components and specific details while some of the information is unknown and must be learned and predicted.
    - We reduce the problem of optimizing resource allocations and requirements for virtualized web services to a mathematical problem, indicating each factor, variable, and element involved, as well as all the constraints the scheduling process must respect. The scheduling problem can be modeled as a Mixed Integer Linear Program. Here we face the scenario of a full datacenter, and further introduce some information prediction.
    - We complement the model by expanding the predicted elements, studying the main resources (CPU, memory, and IO) that can suffer from noise, inaccuracy, or unavailability. Once learned predictors for certain components improve decision making, the system becomes more "expert-knowledge independent" and research can focus on a scenario where all the elements provide noisy, uncertain, or private information. We also introduce new factors into the management optimization, as context and costs may change for each datacenter, turning the model into a "multi-datacenter" one.
    - Finally, we review the cost of placing datacenters depending on green energy sources, and distribute the load according to green energy availability.
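The Mixed Integer Linear Program mentioned above can be illustrated on a toy VM-to-host assignment instance. The thesis's actual formulation covers many more factors; this sketch solves by exhaustive enumeration (a real deployment would call a MILP solver), and all the demand, capacity, and cost figures are invented for illustration.

```python
from itertools import product

def place_vms(vms, hosts):
    """Toy instance of the assignment MILP: minimize total power cost
    subject to per-host CPU capacity, each VM placed on exactly one host.
    vms: {name: cpu_demand}; hosts: {name: (cpu_capacity, cost_per_cpu_unit)}."""
    names = list(vms)
    best_cost, best_plan = float("inf"), None
    for assignment in product(hosts, repeat=len(names)):
        load = {h: 0 for h in hosts}
        for vm, h in zip(names, assignment):
            load[h] += vms[vm]
        if any(load[h] > hosts[h][0] for h in hosts):
            continue  # CPU capacity constraint violated: assignment infeasible
        cost = sum(vms[vm] * hosts[h][1] for vm, h in zip(names, assignment))
        if cost < best_cost:
            best_cost, best_plan = cost, dict(zip(names, assignment))
    return best_cost, best_plan
```

The feasibility filter plays the role of the MILP's capacity constraints, and the cost expression plays the role of its linear objective.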

    Matching Renewable Energy Supply and Demand in Green Datacenters

    In this paper, we propose GreenSlot, a scheduler for parallel batch jobs in a datacenter powered by a photovoltaic solar array and the electrical grid (as a backup). GreenSlot predicts the amount of solar energy that will be available in the near future, and schedules the workload to maximize green energy consumption while meeting the jobs' deadlines. If grid energy must be used to avoid deadline violations, the scheduler selects times when it is cheap. Our results for both scientific computing and data processing workloads demonstrate that GreenSlot can increase solar energy consumption by up to 117% and decrease energy cost by up to 39%, compared to conventional schedulers. Based on these positive results, we conclude that green datacenters and green-energy-aware scheduling can play a significant role in building a more sustainable IT ecosystem.
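The scheduling idea can be sketched as a greedy slot assignment. The published GreenSlot uses finer-grained slots, real forecasts, and multi-slot parallel jobs; here each job is assumed to run for one slot and draw 1 kWh, and the forecast and price values are made up for illustration.

```python
def schedule(jobs, solar_kwh, grid_price):
    """Greedy green-energy-aware sketch: process jobs earliest-deadline-first;
    for each, among free slots up to its deadline, prefer a slot fully
    covered by forecast solar (cost 0), else the cheapest grid slot."""
    taken, plan = set(), {}
    for name, deadline in sorted(jobs, key=lambda j: j[1]):
        free = [t for t in range(deadline + 1) if t not in taken]
        cost = lambda t: 0.0 if solar_kwh[t] >= 1.0 else grid_price[t]
        slot = min(free, key=cost)
        taken.add(slot)
        plan[name] = slot
    return plan
```

Deadlines act as hard constraints (only slots up to the deadline are considered), while solar availability and grid price only steer the choice among feasible slots, matching the priority order described in the abstract.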

    Economic Analysis of a Data Center Virtual Power Plant Participating in Demand Response

    Data centers consume a significant amount of energy from the grid, and the number of data centers is increasing at a high rate. As demand on the transmission system increases, network congestion reduces the economic efficiency of the grid and begins to risk failure. Data centers have underutilized energy resources, such as backup generators and battery storage, which can be used for demand response (DR) to benefit both the electric power system and the data center. Therefore, data center energy resources, including renewable energy, are aggregated and controlled by an energy management system (EMS) to operate as a virtual power plant (VPP). The data center as a VPP participates in a day-ahead DR program to relieve network congestion and improve market efficiency. Data centers mostly use lead-acid batteries as the energy reserve in Uninterruptible Power Supply (UPS) systems that ride through power fluctuations and short-term outages. These batteries are sized according to the power requirement of the data center and the backup duration required for reliable operation. Most of the time, these batteries remain on float charge, with infrequent charge and discharge cycles. Batteries have a limited float life, at the end of which the battery is assumed dead and requires replacement. Therefore, the unused energy of the battery can be utilized for DR by allocating a daily energy budget limit without affecting the battery's overall float life. This is incorporated as a soft constraint in the EMS model, and any use of battery energy over the daily budget limit accounts for the wear cost of the battery. A case study is conducted in which the data center is placed on a modified version of the IEEE 30-bus test system to evaluate the potential economic savings from participating in the DR program, coordinated by the Independent System Operator (ISO). We show that the savings of the data center operating as a VPP and participating in the DR program far outweigh the additional expense of operating its own generators and batteries.
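The soft constraint on the daily battery budget can be sketched as a simple cost model: discharge within the budget incurs no wear charge, while energy beyond it is penalized at the battery's wear cost. This is an illustrative simplification of the EMS model, and the prices below are assumptions, not figures from the study.

```python
def dr_net_savings(discharge_kwh, daily_budget_kwh, grid_price, wear_cost_per_kwh):
    """Net benefit of one DR event under the soft battery-budget constraint:
    grid energy displaced by the discharge minus the wear penalty applied
    only to the energy drawn beyond the daily budget."""
    over = max(0.0, discharge_kwh - daily_budget_kwh)
    grid_savings = discharge_kwh * grid_price      # grid energy displaced during DR
    wear_penalty = over * wear_cost_per_kwh        # only the excess wears the battery
    return grid_savings - wear_penalty
```

Because the penalty applies only above the budget, the EMS can dispatch the budgeted energy freely, and exceeding it is still allowed whenever the DR payment covers the wear cost, which is exactly what "soft constraint" means here.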

    Efficiency and Reliability Analysis of AC and 380V DC Data Centers

    The rapid growth of the Internet has resulted in a colossal increase in the number of data centers. A data center consumes a tremendous amount of electricity, resulting in high operating costs. Even a slight improvement in the power distribution system of a data center could save millions of dollars in electricity bills. Benchmarks for both AC and 380V DC data centers are developed, and efficiency analyses have been performed for an entire year. The efficiency of the power distribution system can be increased if the number of power conversion stages is reduced and more efficient converters are used. Wide band gap (WBG) converters will further improve overall system efficiency because of their high efficiency. The results show that 380V DC data centers are more efficient than AC data centers, both with and without PV integration. A 380V DC distribution system not only improves system efficiency but also saves millions of dollars by decreasing system downtime. Maintaining high availability at all times is critical for data centers. A distribution system with a higher number of series components is more likely to fail, resulting in increased downtime. This study compares the reliability of the AC and 380V DC architectures. Reliability assessment was done for both AC and DC systems complying with the Tier IV standard, for different levels of redundancy (e.g., N, N+1, N+2) in the UPS system. The Monte Carlo simulation method was used to perform the reliability calculations. The simulation results show that the 380V DC distribution system has a higher level of reliability than the AC distribution system in data centers, but only up to a certain level of redundancy in the UPS system. The reliability of the AC system approaches that of a DC system only when a very high level of UPS redundancy is considered, which increases the overall cost of the data center.
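The Monte Carlo treatment of UPS redundancy can be sketched for a single redundant bank. This is not the study's full Tier IV model (which covers whole AC and DC distribution chains); the per-module availability below is an assumed figure used only to illustrate how N versus N+1 redundancy changes the result.

```python
import random

def ups_bank_availability(module_avail, n_required, n_installed,
                          trials=100_000, seed=42):
    """Monte Carlo sketch: in each trial, draw the up/down state of every
    UPS module; the bank is 'up' when at least n_required of the
    n_installed modules are working. Returns the estimated availability."""
    rng = random.Random(seed)
    up_trials = sum(
        sum(rng.random() < module_avail for _ in range(n_installed)) >= n_required
        for _ in range(trials)
    )
    return up_trials / trials
```

With a per-module availability of 0.9 and two modules required, adding one spare (N+1) lifts the bank's availability from about 0.81 to about 0.97, which is the kind of redundancy effect the study quantifies for the full AC and DC architectures.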