
    MultiGreen: Cost-Minimizing Multi-source Datacenter Power Supply with Online Control

    Session 4: Data Center Energy Management. Full text of the conference paper: http://conferences.sigcomm.org/eenergy/2013/papers/p13.pdf
    Faced with soaring power costs, a large carbon emission footprint, and unpredictable power outages, more and more modern Cloud Service Providers (CSPs) are mitigating these challenges by equipping their Datacenter Power Supply System (DPSS) with multiple sources: (1) a smart grid with time-varying electricity prices, (2) an uninterruptible power supply (UPS) of finite capacity, and (3) intermittent green or renewable energy. It remains a significant challenge to operate multiple power supply sources in a complementary manner so as to deliver reliable energy to datacenter users over time while minimizing a CSP’s operational cost over the long run. This paper proposes an efficient online control algorithm for DPSS, called MultiGreen. MultiGreen is based on an innovative two-timescale Lyapunov optimization technique. Without requiring a priori knowledge of system statistics, MultiGreen allows CSPs to make online decisions on purchasing grid energy at two time scales (in the long-term market and in the real-time market), leveraging renewable energy, and opportunistically charging and discharging the UPS, in order to fully exploit the available green energy and periods of low electricity prices for minimum operational cost. Our detailed analysis and trace-driven simulations based on one month of real-world data demonstrate the optimality (in terms of the tradeoff between minimizing DPSS operational cost and satisfying datacenter availability) and stability (performance guarantees under fluctuating energy demand and supply) of MultiGreen.

    Thermal Energy Storage for Datacenters with Phase Change Materials

    Datacenters, vast warehouses containing millions of servers that run the internet and the cloud, have experienced double-digit growth for almost two decades. Datacenters cost hundreds of millions of dollars, with the largest now exceeding a billion dollars each, and consume enormous amounts of power: over 2% of all electricity in the US, projected to increase to as much as 10% by 2030. The impact of such high compute density, with thousands of individual compute nodes packed together in a small space, is heat: every watt of power used by servers must be removed from the datacenter. This requires active cooling. Air cooling is by far the most common approach: an air conditioner or other heat exchanger cools the air in the datacenter room, and the heat is then transported outside the facility to a heat exchanger or similar fixture. Such a system is simple, common, and functional, but inherently inefficient due to the nature of datacenter workloads. Datacenters primarily serve user-facing workloads: the user requests a search or sends an email, and their query prompts load in the datacenter. The query is handled locally, on a relative geographic scale, to provide a low response time and a positive user experience. This necessitates globally distributed datacenter capacity, but also creates a diurnal load pattern in which datacenters are most heavily loaded during the peak hours, when users in their region of service are awake and active online, and lightly loaded during the off hours, when users are offline or asleep and query requests are low. Because datacenter infrastructure must be provisioned for peak load, servers, power distribution, and cooling infrastructure are significantly underutilized most of the time. This dissertation investigates the cooling needs of datacenters and proposes to decouple the work and cooling needs. Specifically, we hypothesize that by storing thermal energy we can reshape the thermal profile of a datacenter to better balance cooling load throughout the day.
We call this technique Thermal Time Shifting (TTS). First, we discuss how phase change materials (PCMs) enable TTS and evaluate the potential use scenarios of placing a small amount of PCM inside servers for thermal energy storage. Next, we dive deeper into the potential of thermal energy storage and propose Virtual Melting Temperatures (VMT), a technique that uses active job placement to control the melting and cooling of PCM, enabling a much greater degree of control over the thermal profile. Finally, we propose and evaluate Thermal Gradient Transfer (TGT), a technique that uses direct water cooling to move heat straight from CPUs and GPUs to the wax for wider applicability and greater peak cooling load reduction.
PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/147726/1/skachm_1.pdf (restricted to UM users only)

    Peak shaving through battery storage for low-voltage enterprises with peak demand pricing

    The renewable energy transition has introduced new electricity tariff structures. With the increased penetration of photovoltaic and wind power systems, users are being charged more for their peak demand. Consequently, peak shaving has gained attention in recent years. In this paper, we investigate the potential of peak shaving through battery storage. The analyzed system comprises a battery, a load, and the grid, but no renewable energy sources. The study is based on 40 load profiles of low-voltage users located in Belgium, covering the period from 1 January 2014, 00:00 to 31 December 2016, 23:45, at 15 min resolution, with peak demand pricing. For each user, we studied the peak load reduction achievable by batteries of varying energy capacities (kWh), ranging from 0.1 to 10 times the mean power (kW). The results show that for 75% of the users, the peak reduction stays below 44% when the battery capacity is 10 times the mean power. Furthermore, for 75% of the users the battery remains idle for at least 80% of the time; consequently, the battery could provide other services as well if the peak occurrence is sufficiently predictable. From an economic perspective, peak shaving looks interesting for capacity-invoiced end users in Belgium under the current battery capex and electricity prices (without Time-of-Use (ToU) dependency).
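
To make the peak-shaving idea concrete, here is a minimal sketch of a greedy threshold controller: the battery discharges whenever the load exceeds a threshold and recharges from the headroom below it. This is not the paper's method. The function names, the fixed threshold, the lossless battery, the full initial charge, and the absence of a power limit are all assumptions made for illustration; only the 15 min interval (`dt_h=0.25`) comes from the abstract.

```python
def shave_peaks(load_kw, capacity_kwh, threshold_kw, dt_h=0.25):
    """Greedy threshold peak shaving; returns the grid draw (kW) per interval.

    Assumptions (not from the paper): lossless battery, no power limit,
    battery starts fully charged, fixed threshold.
    """
    soc_kwh = capacity_kwh  # state of charge
    grid_kw = []
    for p in load_kw:
        if p > threshold_kw:
            # Discharge to cover the excess, limited by the stored energy.
            discharge_kw = min(p - threshold_kw, soc_kwh / dt_h)
            soc_kwh -= discharge_kw * dt_h
            grid_kw.append(p - discharge_kw)
        else:
            # Recharge using the headroom below the threshold.
            charge_kw = min(threshold_kw - p, (capacity_kwh - soc_kwh) / dt_h)
            soc_kwh += charge_kw * dt_h
            grid_kw.append(p + charge_kw)
    return grid_kw


def peak_reduction(load_kw, grid_kw):
    """Relative reduction of the peak seen by the grid."""
    return 1.0 - max(grid_kw) / max(load_kw)


if __name__ == "__main__":
    load = [2, 2, 6, 6, 2, 2]  # kW, one value per 15 min interval
    grid = shave_peaks(load, capacity_kwh=1.0, threshold_kw=4)
    print(grid, round(peak_reduction(load, grid), 3))
```

In this toy trace a 1 kWh battery holds the grid draw at the 4 kW threshold through a two-interval 6 kW peak and then recharges. A study like the one above would additionally have to choose the threshold per user and per billing period.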

    Towards Power- and Energy-Efficient Datacenters

    As the Internet evolves, cloud computing is now a dominant form of computation in modern life. Warehouse-scale computers (WSCs), or datacenters, which form the foundation of this cloud-centric web, have been able to deliver satisfactory performance to both Internet companies and their customers. With the increased focus on and popularity of the cloud, however, datacenter loads are growing rapidly, and Internet companies need more computing capacity to serve this demand. Unfortunately, power and energy are often the major limiting factors prohibiting datacenter growth: it is often the case that no more servers can be added to a datacenter without surpassing the capacity of the existing power infrastructure. This dissertation investigates the issues of power and energy usage in a modern datacenter environment. We identify the sources of power and energy inefficiency at three levels in a modern datacenter environment and provide insights and solutions to address each of these problems, aiming to prepare datacenters for critical future growth. We start at the datacenter level and find that peak provisioning and improper service placement in multi-level power delivery infrastructures fragment the power budget inside production datacenters, degrading the compute capacity the existing infrastructure can support. We find that the heterogeneity among datacenter workloads is key to addressing this issue and design systematic methods to reduce the fragmentation and improve the utilization of the power budget. This dissertation then narrows the focus to examine the energy usage of individual servers running cloud workloads. In particular, we examine the power management mechanisms employed in these servers and find that the coarse time granularity of these mechanisms is one critical factor that leads to excessive energy consumption.
We propose an intelligent, low-overhead solution built on emerging finer-granularity voltage/frequency boosting circuits that effectively pinpoints and boosts queries that are likely to lengthen the tail of the latency distribution and can reap more benefit from the voltage/frequency boost, improving energy efficiency without sacrificing quality of service. The final focus of this dissertation takes a further step and investigates how a fundamentally more efficient computing substrate, field programmable gate arrays (FPGAs), benefits datacenter power and energy efficiency. Unlike other types of hardware accelerators, FPGAs can be reconfigured on the fly to provide fine-grained control over hardware resource allocation, and they present a unique set of challenges for optimal workload scheduling and resource allocation. We aim to design a set of coordinated algorithms to manage these two key factors simultaneously and fully explore the benefit of deploying FPGAs in the highly varying cloud environment.
PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144043/1/hsuch_1.pd

    Mass transport enhancement in redox flow batteries with corrugated fluidic networks

    We propose a facile, novel concept of mass transfer enhancement in flow batteries based on electrolyte guidance in rationally designed corrugated channel systems. The proposed fluidic networks employ periodic throttling of the flow to optimally deflect the electrolytes into the porous electrode, targeting enhancement of the electrolyte-electrode interaction. Theoretical analysis is conducted with channels in the form of trapezoidal waves, confirming and detailing the mass transport enhancement mechanism. In dilute concentration experiments with an alkaline quinone redox chemistry, a scaling of the limiting current with Re^0.74 is identified, which compares favourably against the Re^0.33 scaling typical of diffusion-limited laminar processes. Experimental IR-corrected polarization curves are presented for high concentration conditions, and a significant performance improvement is observed with the narrowing of the nozzles. The adverse effects of periodic throttling on the pumping power are compared with the benefits in terms of power density, and an improvement of up to 102% in net power density is obtained in comparison with the flow-by case employing straight parallel channels. The proposed novel concept of corrugated fluidic networks comes with facile fabrication and contributes to the improvement of the transport characteristics and overall performance of redox flow battery systems.
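
As a rough illustration of what the two exponents mean, assume a pure power law for the limiting current in the Reynolds number, with a prefactor that does not depend on Re. The symbol $I_{\mathrm{lim}}$ and the factors of 2 and 10 are introduced here for illustration and do not come from the paper.

```latex
I_{\mathrm{lim}} \propto \mathrm{Re}^{n}
\quad\Longrightarrow\quad
\frac{I_{\mathrm{lim}}(k\,\mathrm{Re})}{I_{\mathrm{lim}}(\mathrm{Re})} = k^{n}

% Doubling the Reynolds number (k = 2):
2^{0.74} \approx 1.67 \qquad \text{versus} \qquad 2^{0.33} \approx 1.26

% A tenfold increase (k = 10):
10^{0.74} \approx 5.5 \qquad \text{versus} \qquad 10^{0.33} \approx 2.1
```

Under this assumption, the same increase in flow rate buys considerably more limiting current with the reported exponent than with the diffusion-limited one.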

    Optimizing Resource Management in Cloud Analytics Services

    The fundamental challenge in the cloud today is how to build and optimize machine learning and data analytical services. Machine learning and data analytical platforms are changing computing infrastructure from expensive private data centers to easily accessible online services. These services pack user requests as jobs and run them on thousands of machines in parallel in geo-distributed clusters. The scale and the complexity of emerging jobs lead to increasing challenges for the clusters at all levels, from power infrastructure to system architecture and corresponding software framework design. These challenges come in many forms. Today's clusters are built on commodity hardware and hardware failures are unavoidable. Resource competition, network congestion, and mixed generations of hardware make the hardware environment complex and hard to model and predict. Such heterogeneity becomes a crucial roadblock for efficient parallelization on both the task level and job level. Another challenge comes from the increasing complexity of the applications. For example, machine learning services run jobs made up of multiple tasks with complex dependency structures. This complexity leads to difficulties in framework designs. The scale, especially when services span geo-distributed clusters, leads to another important hurdle for cluster design. Challenges also come from the power infrastructure. Power infrastructure is very expensive and accounts for more than 20% of the total costs to build a cluster. Power sharing optimization to maximize the facility utilization and smooth peak hour usages is another roadblock for cluster design. In this thesis, we focus on solutions for these challenges at the task level, on the job level, with respect to the geo-distributed data cloud design and for power management in colocation data centers. 
At the task level, a crucial hurdle to achieving predictable performance is stragglers, i.e., tasks that take significantly longer than expected to run. At this point, speculative execution has been widely adopted to mitigate the impact of stragglers in simple workloads. We apply straggler mitigation for approximation jobs for the first time. We present GRASS, which carefully uses speculation to mitigate the impact of stragglers in approximation jobs. GRASS's design is based on the analysis of a model we develop to capture the optimal speculation levels for approximation jobs. Evaluations with production workloads from Facebook and Microsoft Bing in an EC2 cluster of 200 nodes show that GRASS increases accuracy of deadline-bound jobs by 47% and speeds up error-bound jobs by 38%. Moving from task level to job level, task level speculation mechanisms are designed and operated independently of job scheduling when, in fact, scheduling a speculative copy of a task has a direct impact on the resources available for other jobs. Thus, we present Hopper, a job-level speculation-aware scheduler that integrates the tradeoffs associated with speculation into job scheduling decisions based on a model generalized from the task-level speculation model. We implement both centralized and decentralized prototypes of the Hopper scheduler and show that 50% (66%) improvements over state-of-the-art centralized (decentralized) schedulers and speculation strategies can be achieved through the coordination of scheduling and speculation. As computing resources move from local clusters to geo-distributed cloud services, we are expecting the same transformation for data storage. We study two crucial pieces of a geo-distributed data cloud system: data acquisition and data placement. Starting from developing the optimal algorithm for the case of a data cloud made up of a single data center, we propose a near-optimal, polynomial-time algorithm for a geo-distributed data cloud in general. 
We show, via a case study, that the resulting design, Datum, is near-optimal (within 1.6%) in practical settings. Efficient power management is a fundamental challenge for data centers when providing reliable services. Power oversubscription in data centers is very common and may occasionally trigger an emergency when the aggregate power demand exceeds the capacity. We study power capping solutions for handling such emergencies in a colocation data center, where the operator supplies power to multiple tenants. We propose a novel market mechanism based on supply function bidding, called COOP, to financially incentivize and coordinate tenants' power reduction for minimizing total performance loss while satisfying multiple power capping constraints. We demonstrate that COOP is "win-win", increasing the operator's profit (through oversubscription) and reducing tenants' costs (through financial compensation for their power reduction during emergencies).

    Leveraging Stored Energy for Handling Power Emergencies in Aggressively Provisioned Datacenters

    Datacenters spend $10-25 per watt on provisioning their power infrastructure, regardless of the watts actually consumed. Since peak power needs arise rarely, provisioning power infrastructure for them can be expensive. One can, thus, aggressively under-provision infrastructure on the assumption that simultaneous peak draw across all equipment will happen rarely. The resulting non-zero probability of emergency events in which power needs exceed provisioned capacity, however small, mandates graceful reaction mechanisms that cap the power draw instead of leaving it to disruptive circuit breakers or fuses. Existing strategies for power capping use temporal knobs local to a server that throttle the rate of execution (using power modes), and/or spatial knobs that redirect or migrate excess load to regions of the datacenter with more power headroom. We show that these mechanisms degrade performance, and propose an entirely orthogonal solution that leverages existing UPS batteries to temporarily augment the utility supply during emergencies. We build an experimental prototype to demonstrate such power capping on a cluster of 8 servers, each with an individual battery, and implement several online heuristics in the context of different datacenter workloads to evaluate their effectiveness in handling power emergencies. We show that our battery-based solution can (i) handle emergencies of short duration on its own, (ii) supplement existing reaction mechanisms to enhance their efficacy for longer emergencies, and (iii) even provide feasible options when other knobs do not suffice.
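
The battery-first reaction described above can be sketched as a simple online heuristic: during an emergency, cover the demand above the provisioned cap from the battery, and fall back to throttling only for whatever the battery cannot supply. This is an illustrative sketch, not one of the paper's heuristics. The function names, the single aggregate battery, the lossless discharge, and the lack of recharging are assumptions.

```python
def cap_step(demand_w, cap_w, stored_wh, dt_h):
    """One control interval: returns (battery_w, throttled_w, remaining_wh).

    Assumptions (not from the paper): one aggregate battery, lossless
    discharge, no recharging between emergencies.
    """
    excess_w = max(0.0, demand_w - cap_w)
    # Use the battery first, limited by the energy it still holds.
    battery_w = min(excess_w, stored_wh / dt_h)
    # Whatever the battery cannot cover must be shed by throttling.
    throttled_w = excess_w - battery_w
    return battery_w, throttled_w, stored_wh - battery_w * dt_h


def simulate(demand_trace_w, cap_w, stored_wh, dt_h):
    """Apply cap_step over a demand trace; returns (battery_w, throttled_w) pairs."""
    history = []
    for demand_w in demand_trace_w:
        battery_w, throttled_w, stored_wh = cap_step(demand_w, cap_w, stored_wh, dt_h)
        history.append((battery_w, throttled_w))
    return history


if __name__ == "__main__":
    # 1000 W cap, 50 Wh of stored energy, 15 min intervals (hypothetical values).
    for step in simulate([900, 1100, 1200, 1200], cap_w=1000, stored_wh=50, dt_h=0.25):
        print(step)
```

In this hypothetical trace the battery absorbs the first overload entirely, covers half of the second, and is empty by the third, at which point throttling has to take over. That mirrors the split between short emergencies handled by the battery alone and longer ones where it supplements other knobs.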