17 research outputs found

    Thermal Balancing of Liquid-Cooled 3D-MPSoCs Using Channel Modulation

    While possessing the potential to replace conventional air-cooled heat sinks, inter-tier microchannel liquid cooling of 3D ICs also creates the problem of increased thermal gradients from the fluid inlet to outlet ports [1, 2]. These cooling-induced thermal gradients can be high enough to create undesirable stress in the ICs, undermining their structural reliability and lifetimes. In this paper, we present a novel design-time solution for the thermal gradient problem in liquid-cooled 3D Multi-Processor System-on-Chip (MPSoC) architectures. The proposed method is based on channel width modulation and provides designers with an additional dimension in design-space exploration. We formulate channel width modulation as an optimal control design problem that minimizes the temperature gradients in the 3D IC while meeting the design constraints. The proposed thermal balancing technique uses an analytical model for forced convective heat transfer in microchannels, and has been applied to a two-tier 3D-MPSoC. The results show that the proposed approach can reduce thermal gradients by up to 31% when applied to realistic 3D-MPSoC architectures, while maintaining pressure drops in the microchannels well below their safe limits of operation.

    Temperature-Aware Design and Management for 3D Multi-Core Architectures

    Vertically-integrated 3D multiprocessor systems-on-chip (3D MPSoCs) provide the means to continue integrating more functionality within a unit area while enhancing manufacturing yields and runtime performance. However, 3D MPSoCs incur amplified thermal challenges that undermine the corresponding reliability. To address these issues, several advanced cooling technologies, alongside temperature-aware design-time optimizations and run-time management schemes, have been proposed. In this monograph, we provide an overall survey of the recent advances in temperature-aware 3D MPSoC considerations. We explore the recent advanced cooling strategies, thermal modeling frameworks, design-time optimizations and run-time thermal management schemes that are primarily targeted at 3D MPSoCs. Our aim in presenting this survey is to provide a global perspective, highlighting the advancements and drawbacks of the recent state-of-the-art.

    GreenCool: An Energy-Efficient Liquid Cooling Design Technique for 3-D MPSoCs Via Channel Width Modulation

    Liquid cooling using interlayer microchannels has emerged as a viable and scalable packaging technology for 3-D multiprocessor systems-on-chip (MPSoCs). Microchannel-based liquid cooling, however, can substantially increase the on-chip thermal gradients, which are undesirable for reliability, performance, and cooling efficiency. In this paper, we present GreenCool, an optimal design methodology for liquid-cooled 3-D MPSoCs. GreenCool minimizes the cooling energy for a given system while simultaneously maintaining thermal gradients and peak temperatures under safe limits. This is accomplished by tuning the heat transfer characteristics of the microchannels using channel width modulation. Channel width modulation is compatible with current process technologies and incurs minimal additional fabrication costs. Through an extensive set of experiments, we show that channel width modulation is capable of complementing and enhancing the benefits of temperature-aware floorplanning. We also experiment with a 16-core 3-D system with stacked dynamic random-access memory, for which GreenCool improves energy efficiency by up to 53% with respect to no channel modulation.

    A semi-analytical approach for optimized design of microchannel liquid-cooled ICs

    The development of embedded and interlayer liquid cooling in integrated circuits (ICs) using silicon microchannels has gained interest in recent years owing to the rise of on-chip heat fluxes that aggravate thermal reliability issues of the emerging 3D stacked ICs. Further development of such devices and their translation to commercial applications depend largely on the availability of tools and methodologies that can enable the "temperature-aware" design of liquid-cooled microprocessors and 2D/3D multiprocessor systems-on-chip (MPSoCs). Recently, two optimal design methods have been proposed for liquid-cooled microchannel ICs: one to minimize on-chip temperature gradients and the other, called GreenCool, to maximize energy efficiency in the coolant pumping effort. Both these methods rely upon the concept of channel width modulation to modify the thermal behaviour of a microchannel liquid-cooled heat sink. At the heart of both these methods is a new semi-analytical mathematical model for heat transfer in liquid-cooled ICs. Such a mathematical model enables the application of gradient descent approaches, such as non-linear programming, in the search for the best-performing channel design in a huge multi-dimensional design space. In this paper, we thoroughly quantify the impact and efficiency of the semi-analytical model, combined with non-linear programming, when compared against several numerical optimization mechanisms. Our experimental evaluation shows that non-linear programming, alongside the semi-analytical model, is up to 23x faster than conventional randomized/heuristic design approaches such as genetic algorithms and simulated annealing using fully-numerical thermal models.

    Towards Thermally-Aware Design of 3D MPSoCs with Inter-Tier Cooling

    New trends envisage 3D Multi-Processor System-On-Chip (MPSoC) design as a promising solution to keep increasing the performance of next-generation high-performance computing (HPC) systems. However, as the power density of HPC systems increases with the arrival of 3D MPSoCs, supplying electrical power to the computing equipment and constantly removing the generated heat is rapidly becoming the dominant cost in any HPC facility. Thus, both power and thermal/cooling implications play a major role in the design of new HPC systems, given the energy constraints in our society. Therefore, EPFL, IBM and ETHZ have been working within the CMOSAIC Nano-Tera.ch program project in the last three years on the development of a holistic thermally-aware design. This paper presents the exploration in CMOSAIC of novel cooling technologies, as well as suitable thermal modeling and system-level design methods, which are all necessary to develop 3D MPSoCs with inter-tier liquid cooling systems. As a result, we develop energy-efficient run-time thermal control strategies to achieve energy-efficient cooling mechanisms to compress almost 1 Tera nano-sized functional units into one cubic centimeter with a 10- to 100-fold higher connectivity than otherwise possible. The proposed thermally-aware design paradigm includes exploring the synergies of hardware-, software- and mechanical-based thermal control techniques as a fundamental step to design 3D MPSoCs for HPC systems. More precisely, we target the use of inter-tier coolants ranging from liquid water and two-phase refrigerants to novel engineered environmentally friendly nano-fluids, as well as using specifically designed micro-channel arrangements, in combination with the use of dynamic thermal management at system-level to tune the flow rate of the coolant in each micro-channel to achieve thermally-balanced 3D-ICs.
Our management strategy prevents the system from surpassing the given threshold temperature while achieving up to 67% reduction in cooling energy and up to 30% reduction in system-level energy in comparison to setting the flow rate at the maximum value to handle the worst-case temperature.

    Power-Thermal Modeling and Control of Energy-Efficient Servers and Datacenters

    Recently, energy-efficiency constraints have become the dominant limiting factor for datacenters due to the unprecedented growth in their size and electrical power demands. In this chapter we explain the power and thermal modeling and control solutions that can play a key role in reducing the power consumption of datacenters under time-varying workload characteristics while maintaining the performance requirements and the maximum temperature constraints. We first explain simple-yet-accurate power and temperature models for computing servers, and then extend the models to cover both the computing servers and the cooling infrastructure of datacenters. Second, we present power and thermal management solutions for servers that manipulate various control knobs such as the voltage and frequency of servers, workload allocation, and even cooling capability (especially the flow rate of liquid-cooled servers). Finally, we present a solution that minimizes the server clusters of datacenters by judiciously allocating virtual machines to servers considering their correlation, and then a joint optimization solution that minimizes the total energy consumption of datacenters with hybrid cooling architecture (including the computing servers and the cooling infrastructure of datacenters).

    Improving processor efficiency through thermal modeling and runtime management of hybrid cooling strategies

    One of the main challenges in building future high performance systems is the ability to maintain safe on-chip temperatures in the presence of high power densities. Handling such high power densities necessitates novel cooling solutions that are significantly more efficient than their existing counterparts. A number of advanced cooling methods have been proposed to address the temperature problem in processors. However, tradeoffs exist between performance, cost, and efficiency of those cooling methods, and these tradeoffs depend on the target system properties. Hence, a single cooling solution satisfying optimum conditions for any arbitrary system does not exist. This thesis claims that in order to reach exascale computing, a dramatic improvement in energy efficiency is needed, and achieving this improvement requires a temperature-centric co-design of the cooling and computing subsystems. Such co-design requires detailed system-level thermal modeling, design-time optimization, and runtime management techniques that are aware of the underlying processor architecture and application requirements. To this end, this thesis first proposes compact thermal modeling methods to characterize the complex thermal behavior of cutting-edge cooling solutions, mainly Phase Change Material (PCM)-based cooling, liquid cooling, and thermoelectric cooling (TEC), as well as hybrid designs involving a combination of these. The proposed models are modular and they enable fast and accurate exploration of a large design space. Comparisons against multi-physics simulations and measurements on testbeds validate the accuracy of our models (resulting in less than 1 °C error on average) and demonstrate significant reductions in simulation time (up to four orders of magnitude shorter simulation times). This thesis then introduces temperature-aware optimization techniques to maximize energy efficiency of a given system as a whole (including computing and cooling energy).
The proposed optimization techniques approach the temperature problem from various angles, tackling major sources of inefficiency. One important angle is to understand the application power and performance characteristics and to design management techniques to match them. For workloads that require short bursts of intense parallel computation, we propose using PCM-based cooling in cooperation with a novel Adaptive Sprinting technique. By tracking the PCM state and incorporating this information during runtime decisions, Adaptive Sprinting utilizes the PCM heat storage capability more efficiently, achieving 29% performance improvement compared to existing sprinting policies. In addition to the application characteristics, high heterogeneity in on-chip heat distribution is an important factor affecting efficiency. Hot spots occur at different locations of the chip with varying intensities; thus, designing a uniform cooling solution to handle worst-case hot spots significantly reduces the cooling efficiency. The hybrid cooling techniques proposed as part of this thesis address this issue by combining the strengths of different cooling methods and localizing the cooling effort over hot spots. Specifically, the thesis introduces LoCool, a cooling system optimizer that minimizes cooling power under temperature constraints for hybrid-cooled systems using TECs and liquid cooling. Finally, the scope of this work is not limited to existing advanced cooling solutions, but it also extends to emerging technologies and their potential benefits and tradeoffs. One such technology is integrated flow cell array, where fuel cells are pumped through microchannels, providing both cooling and on-chip power generation. This thesis explores a broad range of design parameters including maximum chip temperature, leakage power, and generated power for flow cell arrays in order to maximize the benefits of integrating this technology with computing systems.
Through thermal modeling and runtime management techniques, and by exploring the design space of emerging cooling solutions, this thesis provides significant improvements in processor energy efficiency.
2018-07-09T00:00:00

    Can Cooling Technology Save Many-Core Parallel Programming from Its Programming Woes?

    An abstract of this work will be presented at the Compiler, Architecture and Tools Conference (CATC), Intel Development Center, Haifa, Israel, November 23, 2015. This paper advances the following premise (henceforth, "vision"): that it is feasible to greatly enhance data movement in the short term, and do it in ways that would be both power efficient and pragmatic in the long term. The paper spells this premise out in greater detail: 1. it is feasible to build first generations of a variety of (power-inefficient) designs for which data movement will not be a restriction and begin application software development for them; 2. growing reliance on silicon compatible photonic technologies, and feasible advances in them with proper investment, will allow reduction of power consumption in these designs by several orders of magnitude; 3. successful high performance application software, the ease of programming demonstrated and growing adoption by customers, software vendors and programmers will incentivize (hardware vendor) investment in new application-software-compatible generations of these designs (a new "software spiral" a la former Intel CEO, Andy Grove) with further reduction of power consumption in each generation; 4. microfluidic cooling is instrumental for enabling item 1, as well as for midwifing this overall vision. The opening paragraph of the paper provides a preamble to that vision, the body of the paper supports it and the paragraph "Moore's-Law-type vision" summarizes it. The scope of the paper is a bit forward looking and it may not exactly fit any particular community. However, its new directions for interaction among architecture and programming may suggest new horizons for representing and exposing a greater variety of data and task parallelism. National Science Foundation

    Thermal designs, models and optimization for three-dimensional integrated circuits

    Three-dimensional integrated circuits (3D ICs), a novel packaging technology, are heavily studied to enable improved performance with denser packaging and reduced interconnects. Despite numerous advantages, thermal management is the biggest bottleneck to expanding the applications of this device stacking technology. In addition to implementing the thermal-aware designs of existing methodologies, it is necessary to implement new features to dissipate heat efficiently. This work presents two main aspects of thermal designs: on-chip level and package level. First, we propose a novel thermal-aware physical design on chip between devices. We aim to mitigate localized hotspots to ensure the functionality by adding thermal fin geometry to existing thermal through-silicon via (TTSV). We analyze design requirements of thermal fin for single TTSV as well as TTSV cluster designs with the goal of maximizing heat dissipation while minimizing the interference with routing and area consumption. An analytical model of the three-dimensional system and thermal resistance circuit is built for accurate and runtime-efficient thermal analysis. In terms of high-performance computing systems in 3D ICs, thermal bottlenecks are much more challenging with merely on-chip design solutions. Inter-tier liquid cooling microchannel layers have been introduced into 3D ICs as an integrated cooling mechanism to tackle the thermal degradation. Many existing research works optimize microchannel designs based on runtime-intensive numerical methods or inaccurate thermo-fluid models. Hence, we propose an accurate but compact closed-form model of tapered microchannel to capture the relationship between the channel geometry and heat transfer performance. To improve the accuracy, our correlations are based on the developing flow model and derived from numerical simulation data on a subset of multiple channel parameters.
Our model achieves 57% less error in Nusselt number and 45% less error in pressure drop for channels with inlet width 100-400 μm compared to a commonly used approximate model on fully developed flow. Next, we present the correlations for diverging channels as well as complete correlations that extend to any linearly tapering channel models, that include diverging shape, uniformly rectangular shape and converging shape. The complete models provide the flexibility to analyze and optimize any arbitrary geometry based on the piecewise linear channel wall assumption. Finally, we demonstrate the optimized channel designs using the derived correlations. Tapered channel models provided the flexibility to incorporate any arbitrary shapes and explore the advanced geometries during the optimization. The microchannel is divided into small segments in axial direction from inlet to outlet and piecewise optimized. The simulated annealing method is applied in our optimization, and channel width at one randomly chosen segment interface is altered to evaluate the design at each iteration. The objective is to minimize the overall thermal resistance while pressure drop is maintained less than a threshold value and channel widths have minimum and maximum boundaries. We compare the designs with the optimization based on fully developed flow models and verify the channel performance through numerical simulations. To guarantee optimality, accurate analysis is crucial. Our proposed models have significantly improved the accuracy by applying the appropriate flow assumption. However, many opportunities exist to increase the design flexibility and the accuracy. Fluid conditions, such as coolant material and varying volumetric flow rate, can also be part of the optimization parameters to expand the design scope.
Moreover, physical phenomena, such as reduced friction on the channel walls or a vortex created on abrupt angle changes, can be considered to improve the accuracy in the closed-form models.
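
The optimization loop described in this abstract (perturb the channel width at one randomly chosen segment interface per iteration, minimize thermal resistance, keep the pressure drop under a threshold and the widths within bounds) can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration: the bounds, the threshold, the per-segment heat loads in HEAT, and both surrogate functions are hypothetical stand-ins, not the closed-form correlations derived in the thesis.

```python
import math
import random

# Hypothetical constants, chosen only for illustration.
W_MIN, W_MAX = 100.0, 400.0       # channel width bounds (micrometres)
DP_MAX = 2.0                      # pressure-drop threshold (arbitrary units)
HEAT = [1.0] * 3 + [3.0] * 4 + [1.0] * 3   # assumed heat load per axial segment
N_INTERFACES = len(HEAT) + 1      # segment interfaces from inlet to outlet


def segment_means(widths):
    """Mean width of each segment between two consecutive interfaces."""
    return [(a + b) / 2.0 for a, b in zip(widths, widths[1:])]


def thermal_resistance(widths):
    """Toy surrogate: heat-weighted mean width, so narrower is better."""
    means = segment_means(widths)
    return sum(q * w / W_MAX for q, w in zip(HEAT, means)) / sum(HEAT)


def pressure_drop(widths):
    """Toy surrogate: each segment contributes a term growing as 1/w^3."""
    return sum((W_MIN / w) ** 3 for w in segment_means(widths))


def anneal(steps=5000, t_start=0.05, t_end=1e-4, seed=0):
    """Simulated annealing over interface widths with a hard constraint."""
    rng = random.Random(seed)
    widths = [W_MAX] * N_INTERFACES          # widest channel is feasible
    cost = thermal_resistance(widths)
    best, best_cost = list(widths), cost
    for k in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (k / steps)
        # Alter the width at one randomly chosen interface, clamped to bounds.
        i = rng.randrange(N_INTERFACES)
        candidate = list(widths)
        candidate[i] = min(W_MAX, max(W_MIN, widths[i] + rng.uniform(-20.0, 20.0)))
        if pressure_drop(candidate) > DP_MAX:
            continue                         # reject infeasible designs
        new_cost = thermal_resistance(candidate)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            widths, cost = candidate, new_cost
            if cost < best_cost:
                best, best_cost = list(widths), cost
    return best, best_cost


if __name__ == "__main__":
    design, resistance = anneal()
    print([round(w) for w in design], round(resistance, 3))
```

Rejecting infeasible candidates outright keeps every accepted design within the pressure-drop limit. A penalty term in the cost would be an alternative, and the abstract does not say which of the two the thesis uses.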

    Feasibility Study of Scaling an XMT Many-Core

    The reason for recent focus on communication avoidance is that high rates of data movement become infeasible due to excessive power dissipation. However, shifting the responsibility of minimizing data movement to the parallel algorithm designer comes at significant costs to programmer’s productivity, as well as: (i) reduced speedups and (ii) the risk of repelling application developers from adopting parallelism. The UMD Explicit Multi-Threading (XMT) framework has demonstrated advantages on ease of parallel programming through its support of PRAM-like programming, combined with strong, often unprecedented speedups. Such programming and speedups involve considerable data movement between processors and shared memory. Another reason that XMT is a good test case for a study of data movement is that XMT permits isolation and direct study of most of its data movement (and its power dissipation). Our new results demonstrate that an XMT single-chip many-core processor with tens of thousands of cores and a high throughput network on chip is thermally feasible, though at some cost. This leads to a perhaps game-changing outcome: instead of imposing upfront strict restrictions on data movement, as advocated in a recent report from the National Academies, opt for due diligence that accounts for the full impact on cost. For example, does the increased cost due to communication avoidance (including programmer’s productivity, reduced speedups and desertion risk) indeed offset the cost of the solution we present? More specifically, we investigate in this paper the design of an XMT many-core for 3D VLSI with microfluidic cooling. We used state-of-the-art simulation tools to model the power and thermal properties of such an architecture with 8k to 64k lightweight cores, requiring between 2 and 8 silicon layers. Inter-chip communication using silicon compatible photonics is also considered. 
We found that, with the use of microfluidic cooling, power dissipation becomes a cost issue rather than a feasibility constraint. Robustness of the results is also discussed. DARPA, NSF, NI