531 research outputs found

    Exploiting heterogeneity in Chip-Multiprocessor Design

    Over the past decade, semiconductor manufacturers have persistently built faster and smaller transistors to boost processor performance as projected by Moore’s Law. Recently, as we enter the deep submicron regime, sustaining the same pace of processor development has become increasingly difficult due to constraints on power, temperature, and the scalability of transistors. To overcome these challenges, researchers have proposed several innovations at both the architecture and device levels that partially solve the problems. This diversity in processor architecture and manufacturing materials offers a way to continue Moore’s Law by effectively exploiting heterogeneity; however, it also introduces a set of unprecedented challenges that have rarely been addressed in prior work. In this dissertation, we present a series of in-depth studies that comprehensively investigate the design and optimization of future multi-core and many-core platforms through exploiting heterogeneities. First, we explore a large design space of heterogeneous chip multiprocessors by exploiting architectural- and device-level heterogeneities, aiming to identify the optimal design patterns that lead to attractive energy- and cost-efficiencies in the pre-silicon stage. After this high-level study, we pay specific attention to architectural asymmetry, aiming to develop a heterogeneity-aware task scheduler that optimizes energy efficiency on a given single-ISA heterogeneous multi-processor. An advanced statistical tool is employed to facilitate the algorithm development. In the third study, we shift our focus to device-level heterogeneity and propose to leverage the advantages provided by different materials to address the increasingly important reliability issue for future processors

    A survey of system level power management schemes in the dark-silicon era for many-core architectures

    Power consumption in Complementary Metal Oxide Semiconductor (CMOS) technology has escalated to the point that only a fraction of a many-core chip can be powered on at a time. Fortunately, this fraction can be increased at the expense of performance through the dark-silicon solution. However, with many-core integration heading towards thousands of cores, power consumption and temperature increase over time, meaning the number of active nodes must be reduced drastically. Therefore, optimized techniques are needed for continued advancement in technology. Existing efforts try to overcome this challenge by activating nodes from different parts of the chip at the expense of communication latency. Other efforts employ run-time power management techniques to manage the power and performance of the cores, trading off performance for power. We found that, for a significant amount of power to be saved and high temperatures to be avoided, the focus should be on reducing the power consumption of all on-chip components, especially the memory hierarchy and the interconnect. Power consumption can be minimized by reducing the size of elements that dissipate high leakage power, turning off idle resources, and integrating power-saving materials

    Energy-Efficient and Reliable Computing in Dark Silicon Era

    Dark silicon denotes the phenomenon that, due to thermal and power constraints, the fraction of transistors that can operate at full frequency is decreasing in each technology generation. Moore’s law and Dennard scaling were coupled for five decades to deliver commensurate exponential performance gains, first via single-core and later via multi-core designs. However, recalculating Dennard scaling for recent small technology nodes shows that ongoing multi-core growth demands exponentially increasing thermal design power to achieve a linear performance increase. This process hits a power wall, which raises the amount of dark or dim silicon on future multi/many-core chips more and more. Furthermore, as the number of transistors on a single chip grows, so does susceptibility to internal defects and aging phenomena, which are exacerbated by high chip thermal density; monitoring and managing chip reliability before and after activation is therefore becoming a necessity. The proposed approaches and experimental investigations in this thesis focus on two main tracks: 1) power awareness and 2) reliability awareness in the dark silicon era, and the two tracks are later combined. In the first track, the main goal is to increase the returns on the most important features in chip design, such as performance and throughput, while honoring the maximum power limit. In fact, we show that by managing power under dark silicon, all the traditional benefits of following Moore’s law can also be achieved in the dark silicon era, albeit to a lesser degree. In the reliability-awareness track, we show that dark silicon can be treated as an opportunity to be exploited for different benefits, namely lifetime increase and online testing. 
We discuss how dark silicon can be exploited to guarantee that the system lifetime stays above a certain target value and, furthermore, how it can be exploited to apply low-cost, non-intrusive online testing to the cores. After demonstrating power and reliability awareness under dark silicon, two approaches are discussed as case studies in which power and reliability awareness are combined. The first approach demonstrates how chip reliability can be used as a supplementary metric for power-reliability management, while the second provides a trade-off between workload performance and system reliability by simultaneously honoring the given power budget and target reliability

    Adaptive Knobs for Resource Efficient Computing

    Performance demands of emerging domains such as artificial intelligence, machine learning and vision, Internet-of-things etc., continue to grow. Meeting such requirements on modern multi/many core systems with higher power densities, fixed power and energy budgets, and thermal constraints exacerbates the run-time management challenge. This leaves an open problem on extracting the required performance within the power and energy limits, while also ensuring thermal safety. Existing architectural solutions including asymmetric and heterogeneous cores and custom acceleration improve performance-per-watt in specific design time and static scenarios. However, satisfying applications’ performance requirements under dynamic and unknown workload scenarios subject to varying system dynamics of power, temperature and energy requires intelligent run-time management. Adaptive strategies are necessary for maximizing resource efficiency, considering i) diverse requirements and characteristics of concurrent applications, ii) dynamic workload variation, iii) core-level heterogeneity and iv) power, thermal and energy constraints. This dissertation proposes such adaptive techniques for efficient run-time resource management to maximize performance within fixed budgets under unknown and dynamic workload scenarios. Resource management strategies proposed in this dissertation comprehensively consider application and workload characteristics and variable effect of power actuation on performance for pro-active and appropriate allocation decisions. 
Specific contributions include i) run-time mapping approach to improve power budgets for higher throughput, ii) thermal aware performance boosting for efficient utilization of power budget and higher performance, iii) approximation as a run-time knob exploiting accuracy performance trade-offs for maximizing performance under power caps at minimal loss of accuracy and iv) co-ordinated approximation for heterogeneous systems through joint actuation of dynamic approximation and power knobs for performance guarantees with minimal power consumption. The approaches presented in this dissertation focus on adapting existing mapping techniques, performance boosting strategies, software and dynamic approximations to meet the performance requirements, simultaneously considering system constraints. The proposed strategies are compared against relevant state-of-the-art run-time management frameworks to qualitatively evaluate their efficacy

    An Ageing-Aware and Temperature Mapping Algorithm For Multi-Level Cache Nodes

    An increase in chip inactivity in the future threatens the performance of many-core systems, and therefore efficient techniques are required for the continued scaling of transistors. As a result of this challenge, future many-core system designs must consider the possibility of a 50% functioning chip per time as well as maintaining performance. Fortunately, this 50% inactivity can be increased by managing the temperature of active nodes and the placement of the dark nodes to achieve a balanced working chip whilst considering the lifetime of nodes. However, allocating dark nodes inefficiently can increase the temperature of the chip and increase the waiting time of applications. Consequently, due to stochastic application characteristics, a dynamic rescheduling technique is more desirable than a fixed design-time mapping. In this paper, we propose Ageing Before Temperature Electromigration-Aware, Negative Bias Temperature Instability (NBTI) & Time-dependent Dielectric Breakdown (TDDB) Neighbour Allocation (ABENA 2.0), a dynamic rescheduling management system which considers ageing and temperature before mapping applications. ABENA also considers the location of active and dark nodes and migrates tasks based on the characteristics of the nodes. Our proposed algorithm employs Dynamic Voltage Frequency Scaling (DVFS) to reduce the Voltage and Frequency (VF) of the nodes. Results show that our proposed methods improve the ageing of nodes compared to a conventional round-robin management system by 10% in temperature, and 10% agein

    Optimization-based power and thermal management for dark silicon aware 3D chip multiprocessors using heterogeneous cache hierarchy

    Management of a problem recently known as “dark silicon” is a new challenge in multicore designs. Prior innovative studies have addressed the dark silicon problem in the field of power-efficient core design. However, addressing dark silicon challenges in the design of uncore components, such as the cache hierarchy and on-chip interconnect, which account for a significant portion of on-chip power consumption, is largely unexplored. In this paper, for the first time, we propose an integrated approach which considers the power consumption of core and uncore components simultaneously to improve multi/many-core performance in the dark silicon era. The proposed approach dynamically (1) predicts the changing program behavior on each core and (2) re-determines the frequency/voltage, cache capacity, and technology in each level of the cache hierarchy based on the program's scalability in order to satisfy the power and temperature constraints. In the proposed architecture, for future chip-multiprocessors (CMPs), we exploit emerging technologies such as non-volatile memories (NVMs) and 3D techniques to combat dark silicon. Also, for the first time, we propose a detailed power model which is useful for power modeling of future dark silicon CMPs. Experimental results on SPEC 2000/2006 benchmarks show that the proposed method improves throughput by about 54.3% and energy-delay product by about 61% on average, respectively, in comparison with the conventional CMP architecture with a homogeneous cache system. (A preliminary short version of this work was presented in the 18th Euromicro Conference on Digital System Design (DSD), 2015.) © 2017 Elsevier B.V

    Model-Based Design, Analysis, and Implementations for Power and Energy-Efficient Computing Systems

    Modern computing systems are becoming increasingly complex. On one end of the spectrum, personal computers now commonly support multiple processing cores, and, on the other end, Internet services routinely employ thousands of servers in distributed locations to provide the desired service to its users. In such complex systems, concerns about energy usage and power consumption are increasingly important. Moreover, growing awareness of environmental issues has added to the overall complexity by introducing new variables to the problem. In this regard, the ability to abstractly focus on the relevant details allows model-based design to help significantly in the analysis and solution of such problems. In this dissertation, we explore and analyze model-based design for energy and power considerations in computing systems. Although the presented techniques are more generally applicable, we focus their application on large-scale Internet services operating in U.S. electricity markets. Internet services are becoming increasingly popular in the ICT ecosystem of today. The physical infrastructure to support such services is commonly based on a group of cooperative data centers (DCs) operating in tandem. These DCs are geographically distributed to provide security and timing guarantees for their customers. To provide services to millions of customers, DCs employ hundreds of thousands of servers. These servers consume a large amount of energy that is traditionally produced by burning coal and employing other environmentally hazardous methods, such as nuclear and gas power generation plants. This large energy consumption results in significant and fast-growing financial and environmental costs. 
Consequently, to protect local and global environments, governing bodies around the globe have begun to introduce legislation that encourages energy consumers, especially corporate entities, to increase the share of renewable energy (green energy) in their total energy consumption. However, in U.S. electricity markets, green energy is usually more expensive than energy generated from traditional sources like coal or petroleum. We model the overall problem in three sub-areas and explore different approaches aimed at reducing the environmental footprint and operating costs of multi-site Internet services, while honoring the Quality of Service (QoS) constraints as contracted in service level agreements (SLAs). Firstly, we model the load distribution among member DCs of a multi-site Internet service. The use of green energy is optimized considering different factors such as (a) geographically and temporally variable electricity prices, (b) the multitude of available energy sources to choose from at each DC, (c) the necessity to support more than one SLA, and (d) the requirement to offer more than one service at each DC. Various approaches are presented for solving this problem, and extensive simulations using Google’s setup in North America are used to evaluate them. Secondly, we explore shaving the peaks in the energy demand of large electricity consumers, such as DCs, by using a battery-based energy storage system. The electrical demand of DCs is typically peaky, following the usage cycle of their customers. The resulting peaks in electrical demand require the development and maintenance of a costlier energy delivery mechanism, and are often met using expensive gas or diesel generators, which often have a higher environmental impact. To shave the peak power demand, a battery can be used which is charged during low load and discharged during peak loads. 
Since the batteries are costly, we present a scheme to estimate the size of battery required for any variable electrical load. The electrical load is modeled using the concept of arrival curves from Network Calculus. Our analysis mechanism can help determine the appropriate battery size for a given load arrival curve to reduce the peak. Thirdly, we present techniques to employ intra-DC scheduling to regulate the peak power usage of each DC. The model we develop is equally applicable to an individual server with multi-/many-core chips as well as a complete DC with an intermix of homogeneous and heterogeneous servers. We evaluate these approaches on single-core and multi-core chip processors and present the results. Overall, our work demonstrates the value of model-based design for intelligent load distribution across DCs, storage integration, and per DC optimizations for efficient energy management to reduce operating costs and environmental footprint for multi-site Internet services
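
The peak-shaving idea described in the abstract above can be illustrated with a minimal sketch. It is not the dissertation's arrival-curve analysis from Network Calculus; it is a simple trace-based simulation under stated assumptions: an ideal, lossless battery that starts full, a fixed time step, and recharging from whatever headroom exists below the grid cap. The function name `battery_capacity_for_cap`, its parameters, and the sample load trace are all hypothetical.

```python
def battery_capacity_for_cap(load_kw, cap_kw, step_h=1.0):
    """Estimate the usable battery energy (kWh) needed so that grid draw
    never exceeds cap_kw for the given load trace.

    Assumptions (illustrative only): ideal lossless battery, battery starts
    full, recharges using any headroom below the cap, fixed step of step_h hours.
    """
    deficit_kwh = 0.0  # energy currently missing from a full battery
    worst_kwh = 0.0    # largest deficit seen so far = required capacity
    for p in load_kw:
        # Above the cap the battery discharges; below it, it recharges.
        deficit_kwh += (p - cap_kw) * step_h
        # A full battery cannot be charged further.
        deficit_kwh = max(deficit_kwh, 0.0)
        worst_kwh = max(worst_kwh, deficit_kwh)
    return worst_kwh


# Hypothetical hourly load trace in kW.
hourly_load = [2.0, 2.0, 6.0, 8.0, 3.0, 1.0]
required = battery_capacity_for_cap(hourly_load, cap_kw=4.0)
```

The running deficit behaves like the backlog of a queue served at a constant rate equal to the cap, which is why a maximum-backlog style bound is a natural fit for this kind of sizing question. A real sizing study would also need to account for charge and discharge rate limits, efficiency losses, and depth-of-discharge constraints.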

    Digital neural circuits: from ions to networks

    PhD Thesis. The biological neural computational mechanism has always fascinated human beings, since it shows several state-of-the-art characteristics: strong fault tolerance, high power efficiency and self-learning capability. These behaviours are driving the trend in designing the next-generation digital computation platform. Thus, investigating and understanding how neurons talk with each other is the key to replicating these computational features. In this work I emphasize using tailor-designed digital circuits for exactly implementing bio-realistic neural network behaviours, which can be considered a novel approach to cognitive neural computation. The first advance is that biological real-time computing performance allows the presented circuits to be readily adapted for real-time closed-loop in vitro or in vivo experiments, and the second is a transistor-based circuit that can be directly translated into an impalpable chip for high-level neurologic disorder rehabilitation. In terms of methodology, first I focus on designing a heterogeneous or multiple-layer-based architecture for reproducing the finest neuron activities in both voltage- and calcium-dependent ion channels. In particular, a digital optoelectronic neuron is developed as a case study. Second, I focus on designing a network-on-chip architecture for implementing a very large-scale neural network (e.g. more than 100,000) with human cognitive functions (e.g. timing control mechanism). Finally, I present a reliable hybrid bio-silicon closed-loop system for central pattern generator prosthetics, which can be considered as a framework for digital neural circuit-based neuro-prosthesis implications. At the end, I present the general digital neural circuit design principles and the long-term social impacts of the presented work