534 research outputs found

    A survey of system level power management schemes in the dark-silicon era for many-core architectures

    Power consumption in Complementary Metal Oxide Semiconductor (CMOS) technology has escalated to the point that only a fraction of a many-core chip can be powered on at a time. Fortunately, this fraction can be increased at the expense of performance through the dark-silicon solution. However, with many-core integration heading towards thousands of cores, power consumption and temperature increase over time, meaning the number of active nodes must be reduced drastically. Therefore, optimized techniques are needed for continued advancement in technology. Existing efforts try to overcome this challenge by activating nodes from different parts of the chip at the expense of communication latency. Other efforts employ run-time power management techniques to manage the power performance of the cores, trading off performance for power. We found that, for a significant amount of power to be saved and high temperatures to be avoided, the focus should be on reducing the power consumption of all on-chip components, especially the memory hierarchy and the interconnect. Power consumption can be minimized by reducing the size of elements that dissipate high leakage power, turning off idle resources, and integrating power-saving materials.

    Bubble budgeting: throughput optimization for dynamic workloads by exploiting dark cores in many core systems

    All the cores of a many-core chip cannot be active at the same time, for reasons such as low CPU utilization in server systems and the limited power budget in the dark silicon era. These free cores (referred to as bubbles) can be placed near active cores for heat dissipation so that the active cores can run at a higher frequency level, boosting the performance of applications that run on active cores. Budgeting inactive cores (bubbles) to applications to boost performance poses three challenges. First, the number of bubbles varies due to open workloads. Second, communication distance increases when a bubble is inserted between two communicating tasks (a task is a thread or process of a parallel application), leading to performance degradation. Third, budgeting too many bubbles as coolers to running applications leaves insufficient cores for future applications. To address these challenges, this paper proposes a bubble budgeting scheme that budgets free cores to each application so as to optimize the throughput of the whole system. Throughput of the system depends on the execution time of each application and the waiting time incurred by newly arrived applications. Essentially, the proposed algorithm determines the number and locations of bubbles to optimize the performance and waiting time of each application, after which the tasks of each application are mapped to a core region. A Rollout algorithm is used to budget power to the cores as the last step. Experiments show that our approach achieves 50 percent higher throughput when compared to state-of-the-art thermal-aware runtime task mapping approaches. The runtime overhead of the proposed algorithm is on the order of 1M cycles, making it an efficient runtime task management method for large-scale many-core systems.

    Resource Management for Multicores to Optimize Performance under Temperature and Aging Constraints

    Thermal Management for Dependable On-Chip Systems

    This thesis addresses the dependability issues in on-chip systems from a thermal perspective. This includes an explanation and analysis of models to show the relationship between dependability and temperature. Additionally, multiple novel methods for on-chip thermal management are introduced, aiming to optimize thermal properties. The methods are analyzed through simulation and through infrared thermal camera measurements.

    Energy-Efficient and Reliable Computing in Dark Silicon Era

    Dark silicon denotes the phenomenon that, due to thermal and power constraints, the fraction of transistors that can operate at full frequency is decreasing in each technology generation. Moore's law and Dennard scaling had been backed and coupled appropriately for five decades to bring commensurate exponential performance via single-core and later multi-core design. However, recalculating Dennard scaling for recent small technology sizes shows that ongoing multi-core growth demands exponential thermal design power to achieve a linear performance increase. This process hits a power wall, which raises the amount of dark or dim silicon on future multi/many-core chips more and more. Furthermore, with the increasing number of transistors on a single chip and the susceptibility to internal defects alongside aging phenomena, which are also exacerbated by high chip thermal density, monitoring and managing chip reliability before and after its activation is becoming a necessity. The proposed approaches and experimental investigations in this thesis focus on two main tracks: 1) power awareness and 2) reliability awareness in the dark silicon era, which are later combined. In the first track, the main goal is to increase the returns in terms of important chip design features, such as performance and throughput, while the maximum power limit is honored. In fact, we show that by managing the power while having dark silicon, all the traditional benefits that could be achieved by following Moore's law can also be achieved in the dark silicon era, albeit to a lesser extent. In the track of reliability awareness in the dark silicon era, we show that dark silicon can be considered an opportunity to be exploited for different benefits, namely lifetime increase and online testing. We discuss how dark silicon can be exploited to guarantee that the system lifetime stays above a certain target value and, furthermore, how dark silicon can be exploited to apply low-cost non-intrusive online testing on the cores. After the demonstration of power and reliability awareness while having dark silicon, two approaches are discussed as case studies in which power and reliability awareness are combined. The first approach demonstrates how chip reliability can be used as a supplementary metric for power-reliability management, while the second approach provides a trade-off between workload performance and system reliability by simultaneously honoring the given power budget and target reliability.

    Dynamic Allocation/Reallocation of Dark Cores in Many-Core Systems for Improved System Performance

    A significant number of processing cores in any many-core system, now and likely in the future, have to be switched off or forced to be idle to become dark cores, in light of ever-increasing power density and chip temperature. Although these dark cores cannot make direct contributions to the chip's throughput, they can still be allocated to applications currently running in the system for the sole purpose of heat dissipation, enabled by the temperature gradient between the active and dark cores. However, allocating dark cores to applications tends to add extra waiting time to applications yet to be launched, which in turn can have adverse implications for the overall system performance. Another big issue related to dark core allocation stems from the fact that application characteristics are prone to undergo rapid changes at runtime, making a fixed dark core allocation scheme less desirable. In this paper, a runtime dark core allocation and dynamic adjustment scheme is thus proposed. Built upon a dynamic programming network (DPN) framework, the proposed scheme attempts to optimize the performance of currently running applications and simultaneously reduce waiting times of incoming applications by taking into account both thermal issues and geometric shapes of regions formed by the active/dark cores. The experimental results show that the proposed approach achieves an average of 61% higher throughput than the two state-of-the-art thermal-aware runtime task mapping approaches, making it the runtime resource management of choice in many-core systems.

    Power, Energy, and Thermal Management for Clustered Manycores

    Efficient and effective system-level power, energy, and thermal management are very important issues in modern computing systems, for which clustered architectures with multiple voltage islands are an expected compromise between global and per-core DVFS. In this dissertation, we focus on two of the most relevant problems for such architectures, specifically, optimizing performance under power/thermal constraints, and minimizing energy under performance constraints

    Adaptive Knobs for Resource Efficient Computing

    Performance demands of emerging domains such as artificial intelligence, machine learning and vision, Internet-of-things etc., continue to grow. Meeting such requirements on modern multi/many core systems with higher power densities, fixed power and energy budgets, and thermal constraints exacerbates the run-time management challenge. This leaves an open problem on extracting the required performance within the power and energy limits, while also ensuring thermal safety. Existing architectural solutions including asymmetric and heterogeneous cores and custom acceleration improve performance-per-watt in specific design time and static scenarios. However, satisfying applications' performance requirements under dynamic and unknown workload scenarios subject to varying system dynamics of power, temperature and energy requires intelligent run-time management. Adaptive strategies are necessary for maximizing resource efficiency, considering i) diverse requirements and characteristics of concurrent applications, ii) dynamic workload variation, iii) core-level heterogeneity and iv) power, thermal and energy constraints. This dissertation proposes such adaptive techniques for efficient run-time resource management to maximize performance within fixed budgets under unknown and dynamic workload scenarios. Resource management strategies proposed in this dissertation comprehensively consider application and workload characteristics and variable effect of power actuation on performance for pro-active and appropriate allocation decisions. Specific contributions include i) run-time mapping approach to improve power budgets for higher throughput, ii) thermal aware performance boosting for efficient utilization of power budget and higher performance, iii) approximation as a run-time knob exploiting accuracy performance trade-offs for maximizing performance under power caps at minimal loss of accuracy and iv) co-ordinated approximation for heterogeneous systems through joint actuation of dynamic approximation and power knobs for performance guarantees with minimal power consumption. The approaches presented in this dissertation focus on adapting existing mapping techniques, performance boosting strategies, software and dynamic approximations to meet the performance requirements, simultaneously considering system constraints. The proposed strategies are compared against relevant state-of-the-art run-time management frameworks to qualitatively evaluate their efficacy.

    AMA: An Ageing Task Migration Aware for High Performance Computing

    The Dark-Silicon challenge poses a design problem for future many-core systems. As a result, several techniques have been introduced to increase the number of processing elements that can be powered on. One widely employed technique is task migration. In this paper, An Ageing Task Migration Aware for High-Performance Computing (AMA) is proposed to improve the lifetime of nodes. The proposed method determines which clusters applications are mapped to and migrates high-demand tasks amongst nodes to improve the lifetime at every epoch. Experimental results show that the proposed method outperforms state-of-the-art techniques by more than 10%.