SHIELD: Sustainable Hybrid Evolutionary Learning Framework for Carbon, Wastewater, and Energy-Aware Data Center Management
Today's cloud data centers are often distributed geographically to provide
robust data services. But these geo-distributed data centers (GDDCs) have a
significant associated environmental impact due to their increasing carbon
emissions and water usage, which needs to be curtailed. Moreover, the energy
costs of operating these data centers continue to rise. This paper proposes a
novel framework to co-optimize carbon emissions, water footprint, and energy
costs of GDDCs, using a hybrid workload management framework called SHIELD that
integrates machine learning guided local search with a decomposition-based
evolutionary algorithm. Our framework considers geographical factors and
time-based differences in power generation/use, costs, and environmental
impacts to intelligently manage workload distribution across GDDCs and data
center operation. Experimental results show that SHIELD can realize a 34.4x
speedup and a 2.1x improvement in Pareto Hypervolume while reducing the carbon
footprint by up to 3.7x, the water footprint by up to 1.8x, and energy costs by
up to 1.3x, achieving a cumulative improvement across all objectives (carbon,
water, cost) of up to 4.8x compared to the state of the art.
MOSAIC: A Multi-Objective Optimization Framework for Sustainable Datacenter Management
In recent years, cloud service providers have been building and hosting
datacenters across multiple geographical locations to provide robust services.
However, the geographical distribution of datacenters introduces growing
pressure to both local and global environments, particularly when it comes to
water usage and carbon emissions. Unfortunately, efforts to reduce the
environmental impact of such datacenters often lead to an increase in the cost
of datacenter operations. To co-optimize the energy cost, carbon emissions, and
water footprint of datacenter operation from a global perspective, we propose a
novel framework for multi-objective sustainable datacenter management (MOSAIC)
that integrates adaptive local search with a collaborative decomposition-based
evolutionary algorithm to intelligently manage geographical workload
distribution and datacenter operations. Our framework sustainably allocates
workloads to datacenters while taking into account multiple geography- and
time-based factors including renewable energy sources, variable energy costs,
power usage efficiency, carbon factors, and water intensity in energy. Our
experimental results show that, compared to the best-known prior frameworks,
MOSAIC can achieve a 27.45x speedup and a 1.53x improvement in Pareto
Hypervolume while reducing the carbon footprint by up to 1.33x, the water
footprint by up to 3.09x, and energy costs by up to 1.40x. In the simultaneous
three-objective co-optimization scenario, MOSAIC achieves a cumulative
improvement across all objectives (carbon, water, cost) of up to 4.61x compared
to the state of the art.
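The geography- and time-based accounting the MOSAIC abstract lists (PUE, variable energy costs, carbon factors, water intensity) can be sketched as a toy evaluation of the three objectives for a candidate workload split. All datacenter names, parameter values, and the allocation below are illustrative assumptions, not values or logic from MOSAIC itself:

```python
# Toy three-objective footprint evaluation for a workload allocation.
# All parameters here are made-up illustrations, not MOSAIC's model.

def footprints(it_energy_kwh, pue, price_per_kwh, carbon_kg_per_kwh, water_l_per_kwh):
    """Return (cost, carbon_kg, water_l) for one datacenter's share of the load."""
    total_kwh = it_energy_kwh * pue  # facility energy including cooling overhead
    return (total_kwh * price_per_kwh,
            total_kwh * carbon_kg_per_kwh,
            total_kwh * water_l_per_kwh)

# Hypothetical sites: (PUE, $/kWh, kg CO2/kWh, L water/kWh)
dcs = {"dc_a": (1.2, 0.08, 0.30, 1.8),
       "dc_b": (1.5, 0.05, 0.50, 0.9)}

# Candidate allocation of 1000 kWh of IT load across the two sites.
alloc = {"dc_a": 600.0, "dc_b": 400.0}

totals = [0.0, 0.0, 0.0]
for name, kwh in alloc.items():
    for i, v in enumerate(footprints(kwh, *dcs[name])):
        totals[i] += v
cost, carbon, water = totals
```

A multi-objective optimizer of the kind the abstract describes would search over many such allocations and keep the Pareto-optimal ones rather than collapsing the three totals into a single score.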
HT-FED2004-56528 DEVELOPMENT AND EXPERIMENTAL VALIDATION OF AN EXERGY-BASED COMPUTATIONAL TOOL FOR DATA CENTER THERMAL MANAGEMENT
The recent miniaturization of electronic devices and compaction of computer systems will soon lead to data centers with power densities on the order of 300 W/ft². At these levels, traditional thermal management techniques are unlikely to suffice. To enable the dynamic smart cooling systems necessary for future data centers, an exergetic approach based on the second law of thermodynamics has recently been proposed. However, no experimental data related to this concept is currently available. This paper discusses the development and subsequent validation of an exergy-based computer model at an instrumented data center in Palo Alto, California. The study finds that when appropriately calibrated, such a computational tool can successfully predict information about local and global thermal performance that cannot be perceived intuitively from traditional design methods. Further development of the concept has promising potential for efficient data center thermal management.
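The exergetic quantity underlying such second-law analyses is textbook material: the exergy content of a heat flow Q at temperature T, relative to a dead-state (ambient) temperature T0, is Q(1 - T0/T). The sketch below computes this standard expression for an illustrative rack; it is not the paper's actual computational tool, and the numbers are assumptions:

```python
# Textbook exergy rate of a heat flow, relative to a dead state T0.
# Values are illustrative, not measurements from the paper.

def heat_exergy_w(q_w, t_hot_k, t0_k):
    """Exergy rate (W) of heat q_w available at t_hot_k with dead state t0_k."""
    return q_w * (1.0 - t0_k / t_hot_k)

# A 10 kW rack rejecting heat at 320 K into a 298 K ambient:
x = heat_exergy_w(10_000.0, 320.0, 298.0)
```

Only this exergetic fraction of the rejected heat represents lost work potential, which is why an exergy map of a data center can expose inefficiencies that a plain temperature map cannot.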
From Chip to Cooling Tower Data Center Modeling
The chiller-cooled data center environment consists of many interlinked elements that are usually treated as individual components. This chain of components and their influences on each other must be considered in determining the benefits of any data center design and operational strategies seeking to improve efficiency, such as temperature-controlled fan algorithms. Using the models previously developed by the authors, this paper extends the analysis to include the electronics within the rack by considering the processor heat sink temperature. This has allowed determination of the influence of various cooling strategies on the data center coefficient of performance. The strategy of increasing inlet aisle temperature is examined in some detail and found not to be a robust methodology for improving the overall energy performance of the data center, while tight temperature controls at the chip level consistently provide better performance, yielding more computing per watt of cooling power. These findings are of strong practical relevance for the design of fan control algorithms at the rack level and general operational strategies in data centers. Finally, the impact of heat sink thermal resistance is considered, and the potential data center efficiency gains from improved heat sink designs are discussed.
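The coefficient of performance compared across cooling strategies above is, in the usual data-center sense, the heat removed (the IT load) divided by the power spent removing it. A minimal sketch, with made-up loads that are not the paper's data:

```python
# Data-center COP: heat removed divided by cooling power consumed.
# The kW figures below are illustrative assumptions, not the paper's results.

def datacenter_cop(it_load_kw, cooling_power_kw):
    return it_load_kw / cooling_power_kw

cop_loose = datacenter_cop(500.0, 250.0)  # hypothetical looser aisle-level control
cop_tight = datacenter_cop(500.0, 200.0)  # hypothetical tighter chip-level control
```

A higher COP means more computing delivered per watt of cooling power, which is the metric the abstract's chip-level control claim rests on.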
DCDB Wintermute: Enabling Online and Holistic Operational Data Analytics on HPC Systems
As we approach the exascale era, the size and complexity of HPC systems
continues to increase, raising concerns about their manageability and
sustainability. For this reason, more and more HPC centers are experimenting
with fine-grained monitoring coupled with Operational Data Analytics (ODA) to
optimize efficiency and effectiveness of system operations. However, while
monitoring is a common reality in HPC, there is no well-stated and
comprehensive list of requirements, nor matching frameworks, to support
holistic and online ODA. This leads to insular ad-hoc solutions, each
addressing only specific aspects of the problem.
In this paper we propose Wintermute, a novel generic framework to enable
online ODA on large-scale HPC installations. Its design is based on the results
of a literature survey of common operational requirements. We implement
Wintermute on top of the holistic DCDB monitoring system, offering a large
variety of configuration options to accommodate the varying requirements of ODA
applications. Moreover, Wintermute is based on a set of logical abstractions to
ease the configuration of models at a large scale and maximize code re-use. We
highlight Wintermute's flexibility through a series of practical case studies,
each targeting a different aspect of the management of HPC systems, and then
demonstrate the small resource footprint of our implementation.
Comment: Accepted for publication at the 29th ACM International Symposium on
High-Performance Parallel and Distributed Computing (HPDC 2020).
Dimensionless Parameter for Evaluation of Thermo-Fluids Performance of Air Conditioning and Ventilation Systems
Data center workload placement for energy efficiency
Data center costs for computer power and cooling have been steadily increasing over the past decade. Much work has been done in recent years on understanding how to improve the delivery of cooling resources to IT equipment in data centers, but little attention has been paid to the optimization of heat production by considering the placement of application workload. Certain physical locations inside the data center are more efficient to cool than others, which suggests that allocating heavy computational workloads onto servers in more efficient places might bring substantial savings. This paper explores this issue by introducing a workload placement metric that considers the cooling efficiency of the environment. Additionally, results from a set of experiments that utilize this metric in a thermally isolated portion of a real data center are described. The results show that the potential savings are substantial and that further work in this area is needed to exploit the savings opportunity.
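One plausible reading of such a cooling-aware placement heuristic is to rank server locations by a cooling-efficiency score and assign the heaviest jobs to the best-cooled spots. The scores, rack names, and job sizes below are invented for illustration; the paper's actual metric is not reproduced here:

```python
# Hypothetical cooling-aware placement: heaviest jobs go to the locations
# that are cheapest to cool. All names and numbers are made up.

servers = {"rack1_top": 0.6, "rack1_mid": 0.9, "rack2_mid": 0.8}  # higher = cheaper to cool
jobs = [120, 40, 80]  # workload intensity, arbitrary units

placement = dict(zip(
    sorted(jobs, reverse=True),                      # heaviest job first
    sorted(servers, key=servers.get, reverse=True),  # best-cooled location first
))
```

Pairing sorted lists this way is a greedy stand-in for the optimization the abstract implies; a real system would also respect capacity and thermal-isolation constraints.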
Dynamic Characterization of Thermal Interface Material for Electronic Cooling
The principle of measuring thermal resistance of thermal interface material (TIM) by sandwiching the material between a hot block and a cold block is well known in the industry. TIM manufacturers usually use a variation of the industrial standard ASTM D5470 test method, and subsequently provide data that is difficult for the end user to effectively utilize for product development. This paper discusses the design and construction of an automated TIM test system based on the ASTM D5470 standard. This automated test vehicle provides an independent study of various TIMs. The instrument enables standardized testing and performance documentation of interface materials from a wide array of manufacturers, making it easier for end users to compare and select the appropriate material for various applications. The automated test method is faster and easier to use than previous methods. It requires minimal operator intervention during the test and can perform preconditioning, and non-uniform heating if required. Experimental results obtained from the instrument are discussed.
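The steady-state reduction behind an ASTM D5470-style measurement is simple: the thermal impedance of the TIM layer follows from the temperature drop across it, the contact area, and the heat flowing through the stack. A minimal sketch with illustrative numbers (not data from the instrument described above):

```python
# ASTM D5470-style data reduction: thermal impedance of a TIM layer from
# steady-state measurements. Input values are illustrative assumptions.

def tim_impedance(delta_t_k, area_m2, q_w):
    """Thermal impedance in K*m^2/W; divide by area for resistance in K/W."""
    return delta_t_k * area_m2 / q_w

# 4 K drop across a 40 mm x 40 mm coupon carrying 40 W:
theta = tim_impedance(delta_t_k=4.0, area_m2=0.0016, q_w=40.0)
```

Reporting impedance per unit area (rather than raw K/W) is what makes results from different coupon sizes and manufacturers comparable, which is the end-user problem the abstract identifies.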
