80 research outputs found

    An Artificial Neural Networks based Temperature Prediction Framework for Network-on-Chip based Multicore Platform

    Get PDF
    Continuous improvement in silicon process technologies has made possible the integration of hundreds of cores on a single chip. However, power and heat have become dominant constraints in designing these massive multicore chips causing issues with reliability, timing variations and reduced lifetime of the chips. Dynamic Thermal Management (DTM) is a solution to avoid high temperatures on the die. Typical DTM schemes only address core level thermal issues. However, the Network-on-chip (NoC) paradigm, which has emerged as an enabling methodology for integrating hundreds to thousands of cores on the same die can contribute significantly to the thermal issues. Moreover, the typical DTM is triggered reactively based on temperature measurements from on-chip thermal sensor requiring long reaction times whereas predictive DTM method estimates future temperature in advance, eliminating the chance of temperature overshoot. Artificial Neural Networks (ANNs) have been used in various domains for modeling and prediction with high accuracy due to its ability to learn and adapt. This thesis concentrates on designing an ANN prediction engine to predict the thermal profile of the cores and Network-on-Chip elements of the chip. This thermal profile of the chip is then used by the predictive DTM that combines both core level and network level DTM techniques. On-chip wireless interconnect which is recently envisioned to enable energy-efficient data exchange between cores in a multicore environment, will be used to provide a broadcast-capable medium to efficiently distribute thermal control messages to trigger and manage the DTM schemes

    A Multi-core Testbed on Desktop Computer for Research on Power/Thermal Aware Resource Management

    Get PDF
    Our goal is to develop a flexible, customizable, and practical multi-core testbed based on an Intel desktop computer that can be utilized to assist the theoretical research on power/thermal aware resource management in design of computer systems. By integrating different modules, i.e. thread mapping/scheduling, processor/core frequency and voltage variation, temperature/power measurement, and run-time performance collection, into a systematic and unified framework, our testbed can bridge the gap between the theoretical study and practical implementation. The effectiveness for our system was validated using appropriately selected benchmarks. The importance of this research is that it complements the current theoretical research by validating the theoretical results in practical scenarios, which are closer to that in the real world. In addition, by studying the discrepancies of results of theoretical study and their applications in real world, the research also aids in identifying new research problems and directions

    Thermal, Power Delivery and Reliability Management for 3D ICS

    Get PDF
    Three-dimensional (3D) integration technology is promising to continuously improve the performance of electronic devices by vertically stacking multiple active layers and connecting them with Through-Silicon-Vias (TSVs). Meanwhile, the thermal and power integrity problems are exacerbated since the power flux in 3D integrated circuits (3D ICs) increases linearly with the number of stacked layers. Moreover, the TSV structure in 3D ICs introduces new reliability problems since TSVs are vulnerable to various failure mechanisms (e.g. electromigration) and the failure of power-ground TSVs will cause voltage drop thereby significantly degrading the performance of 3D ICs. To make things worse, the high temperature, thermal gradient and power load in 3D ICs accelerate the failure of TSVs. Therefore, in order to push the 3D integration technology to full commercialization, the thermal, power integrity and reliability problem should be properly addressed in both design-time and run-time. In 3D ICs, the heat flux will easily exceed the capability of the traditional air cooling. Therefore, several aggressive cooling methods are applied to remove heat from the 3D IC, which include micro-fluidic cooling, the phase change material based cooling etc. These cooling schemes are usually implemented close to the heat source to gain high heat removal capability, thus causing more challenges to the design of 3D ICs. Unfortunately, physical design tools for 3D ICs with those aggressive cooling methods are lack. In this thesis, we will focus on 3D ICs with micro-fluidic (MF) cooling. The physical design for this kind of 3D ICs involves complex trade-offs between the circuit performance, power delivery noise, and temperature. For example, both TSVs and micro-cavities for MF cooling are fabricated in the substrate region. Therefore, they will compete in space: the allocation of signal TSVs should avoid micro-cavities to realize a feasible design, thus enforcing more constraints to the physical placement of 3D ICs. Moreover, power delivery networks (PDNs) in 3D ICs are enabled by power-ground (P/G) TSVs. The number and distribution of P/G TSVs are also constrained by micro-cavities which will influence the power integrity of the 3D IC. In addition, the capability of MF cooling degrades downstream the flow of coolant thereby causing large in-layer temperature gradient. The spatial temperature variance will affect the reliability of 3D ICs. in order to avoid it, the gate/modules in 3D ICs should be placed properly. In order to address the trade-offs 3D ICs with MF cooling, different design-time methods for application specific ICs (ASICs) and field programmable gate arrays (FPGAs) are proposed, respectively. For 3D ASICs, we propose a co-design method that integrates the design of MF cooling heat sink and P/G TSVs to the physical placement for 3D ICs. Experiments on publicly available benchmarks show that using our method, we can achieve better results compared to the traditional sequential design flow. The case for 3D FPGAs is more complicated than ASICs since the routing and logic resources are fixed and the chip power and temperature is hard to estimate until the circuit is routed. Therefore, in this thesis, we first build a design space exploration (DSE) framework to study how MF cooling affects the design of 3D FPGAs. Following this, we utilize an existing 3D FPGA placement and routing tool to develop a cooling-aware placement framework for 3D FPGAs to reduce the temperature gradient. Since the activity of 3D ICs cannot be completely estimated at the design stage, the run-time management, besides design-time methods, is required to address the thermal, power and reliability problems in 3D ICs. However, the vertically stacked structure makes the run-time management for 3D ICs more complicated than 2D ICs. The major reason of this is that the power supply noise and temperature can be coupled across layers in 3D ICs. This means the activity of one layer may affect the performance and reliability of other layers through voltage/temperature coupling. As a result, we cannot perform run-time management for each layer (perhaps implemented with dierent chips) of 3D ICs separately as in 2D systems. Therefore, the space of control nodes will become larger and more complicated. To make things worse, the existing run-time management techniques have various drawbacks (e.g. large off-line characterization overhead, poor scalability etc. ), which needs more eort to improve. In this thesis, we propose a phase-driven Q-learning based run-time management technique which can tune the activity of the processor to maximize the 3D CPU performance subject to the reliability constraint

    It is too hot in here! A performance, energy and heat aware scheduler for Asymmetric multiprocessing processors in embedded systems.

    Get PDF
    Modern architecture present in self-power devices such as mobiles or tablet computers proposes the use of asymmetric processors that allow either energy-efficient or performant computation on the same SoC. For energy efficiency and performance consideration, the asymmetry resides in differences in CPU micro-architecture design and results in diverging raw computing capability. Other components such as the processor memory subsystem also show differences resulting in different memory transaction timing. Moreover, based on a bus-snoop protocol, cache coherency between processors comes with a peculiarity in memory latency depending on the processors operating frequencies. All these differences come with challenging decisions on both application schedulability and processor operating frequencies. In addition, because of the small form factor of such embedded systems, these devices generally cannot afford active cooling systems. Therefore thermal mitigation relies on dynamic software solutions. Current operating systems for embedded systems such as Linux or Android do not consider all these particularities. As such, they often fail to satisfy user expectations of a powerful device with long battery life. To remedy this situation, this thesis proposes a unified approach to deliver high-performance and energy-efficiency computation in each of its flavours, considering the memory subsystem and all computation units available in the system. Performance is maximized even when the device is under heavy thermal constraints. The proposed unified solution is based on accurate models targeting both performance and thermal behaviour and resides at the operating systems kernel level to manage all running applications in a global manner. Particularly, the performance model considers both the computation part and also the memory subsystem of symmetric or asymmetric processors present in embedded devices. The thermal model relies on the accurate physical thermal properties of the device. Using these models, application schedulability and processor frequency scaling decisions to either maximize performance or energy efficiency within a thermal budget are extensively studied. To cover a large range of application behaviour, both models are built and designed using a generative workload that considers fine-grain details of the underlying microarchitecture of the SoC. Therefore, this approach can be derived and applied to multiple devices with little effort. Extended evaluation on real-world benchmarks for high performance and general computing, as well as common applications targeting the mobile and tablet market, show the accuracy and completeness of models used in this unified approach to deliver high performance and energy efficiency under high thermal constraints for embedded devices

    Online Thermal Control Methods for Multi-Processor Systems

    Get PDF
    With technological advances, the number of cores integrated on a chip is increasing. This, in turn is leading to thermal constraints and thermal design challenges. Temperature gradients and hot-spots not only affect the performance of the system, but also lead to unreliable circuit operation and affect the life-time of the chip. Meeting temperature constraints and reducing hot-spots are critical for achieving reliable and efficient operation of complex multi-core systems. In this article we analyze the use of four of the most promising families of online control techniques for thermal management of multi-processors system-on-chip (MPSoC). In particular, in our exploration we aim at achieving an online smooth thermal control action that minimizes the performance loss as well as the computational and hardware overhead of embedding a thermal management system inside the MPSoC. The definition of the optimization problem to tackle in this work considers the thermal profile of the system, its evolution over time and current time-varying workload requirements. Thus, this problem is formulated as a finite-horizon optimal control problem and we analyze the control features of different on-line thermal control approaches. In addition, we implemented the policies on an MPSoC hardware simulation platform and performed experiments on a cycle-accurate model of the 8-core Niagara multi-core architecture using benchmarks ranging from web-accessing to playing multimedia. Results show different trade-offs among the analyzed techniques regarding the thermal profile, the frequency setting, the power consumption and the implementation complexity

    Dynamic power management: from portable devices to high performance computing

    Get PDF
    Electronic applications are nowadays converging under the umbrella of the cloud computing vision. The future ecosystem of information and communication technology is going to integrate clouds of portable clients and embedded devices exchanging information, through the internet layer, with processing clusters of servers, data-centers and high performance computing systems. Even thus the whole society is waiting to embrace this revolution, there is a backside of the story. Portable devices require battery to work far from the power plugs and their storage capacity does not scale as the increasing power requirement does. At the other end processing clusters, such as data-centers and server farms, are build upon the integration of thousands multiprocessors. For each of them during the last decade the technology scaling has produced a dramatic increase in power density with significant spatial and temporal variability. This leads to power and temperature hot-spots, which may cause non-uniform ageing and accelerated chip failure. Nonetheless all the heat removed from the silicon translates in high cooling costs. Moreover trend in ICT carbon footprint shows that run-time power consumption of the all spectrum of devices accounts for a significant slice of entire world carbon emissions. This thesis work embrace the full ICT ecosystem and dynamic power consumption concerns by describing a set of new and promising system levels resource management techniques to reduce the power consumption and related issues for two corner cases: Mobile Devices and High Performance Computing

    Design of Thermal Management Control Policies for Multiprocessors Systems on Chip

    Get PDF
    The contribution of this thesis is a thorough study of thermal aware policy design for MPSoCs. The study includes the modelling of their thermal behavior as well as the improvement and the definition of new thermal management and balancing policies. The work is structured on three main specific disciplines. The areas of contributions are: modeling, algorithms and system design. This thesis extends the field of modeling by proposing new techniques to represent the thermal behavior of MPSoCs using a mathematical formalization. Heat transfer and modelling of physical properties of MPSoCs have been extensively studied. Special emphasis is given to the way the system cools down (i.e. micro-cooling, natural heat dissipation etc.) and the heat propagates inside the MPSoC. The second contribution of this work is related to policies, which manage MPSoC working frequencies and micro-cooling pumps to satisfy user requirements in the most effective possible way, while consuming the lowest possible amount of resources. Several families of thermal policies algorithms have been studied and analyzed in this work for both 2D and 3D MPSoCs including liquid cooling technologies. The discipline of system design has also been extended during the development of this thesis. Thermal management policies have been implemented in real emulation platforms and contributions in this area are related to the design and implementation of proposed innovations in real MPSoC platforms

    Power, Energy, and Thermal Management for Clustered Manycores

    Get PDF
    Efficient and effective system-level power, energy, and thermal management are very important issues in modern computing systems, for which clustered architectures with multiple voltage islands are an expected compromise between global and per-core DVFS. In this dissertation, we focus on two of the most relevant problems for such architectures, specifically, optimizing performance under power/thermal constraints, and minimizing energy under performance constraints
    • …
    corecore