181 research outputs found

    Circuits and Systems Advances in Near Threshold Computing

    Get PDF
    Modern society is witnessing a sea change in ubiquitous computing, in which people have embraced computing systems as an indispensable part of day-to-day existence. Computation, storage, and communication abilities of smartphones, for example, have undergone monumental changes over the past decade. However, global emphasis on creating and sustaining green environments is leading to a rapid and ongoing proliferation of edge computing systems and applications. As a broad spectrum of healthcare, home, and transport applications shift to the edge of the network, near-threshold computing (NTC) is emerging as one of the promising low-power computing platforms. An NTC device sets its supply voltage close to its threshold voltage, dramatically reducing the energy consumption. Despite showing substantial promise in terms of energy efficiency, NTC is yet to see widescale commercial adoption. This is because circuits and systems operating with NTC suffer from several problems, including increased sensitivity to process variation, reliability problems, performance degradation, and security vulnerabilities, to name a few. To realize its potential, we need designs, techniques, and solutions to overcome these challenges associated with NTC circuits and systems. The readers of this book will be able to familiarize themselves with recent advances in electronics systems, focusing on near-threshold computing

    Multi-criteria optimization for energy-efficient multi-core systems-on-chip

    Get PDF
    The steady down-scaling of transistor dimensions has made possible the evolutionary progress leading to today’s high-performance multi-GHz microprocessors and core based System-on-Chip (SoC) that offer superior performance, dramatically reduced cost per function, and much-reduced physical size compared to their predecessors. On the negative side, this rapid scaling however also translates to high power densities, higher operating temperatures and reduced reliability making it imperative to address design issues that have cropped up in its wake. In particular, the aggressive physical miniaturization have increased CMOS fault sensitivity to the extent that many reliability constraints pose threat to the device normal operation and accelerate the onset of wearout-based failures. Among various wearout-based failure mechanisms, Negative biased temperature instability (NBTI) has been recognized as the most critical source of device aging. The urge of reliable, low-power circuits is driving the EDA community to develop new design techniques, circuit solutions, algorithms, and software, that can address these critical issues. Unfortunately, this challenge is complicated by the fact that power and reliability are known to be intrinsically conflicting metrics: traditional solutions to improve reliability such as redundancy, increase of voltage levels, and up-sizing of critical devices do contrast with traditional low-power solutions, which rely on compact architectures, scaled supply voltages, and small devices. This dissertation focuses on methodologies to bridge this gap and establishes an important link between low-power solutions and aging effects. More specifically, we proposed new architectural solutions based on power management strategies to enable the design of low-power, aging aware cache memories. Cache memories are one of the most critical components for warranting reliable and timely operation. However, they are also more susceptible to aging effects. Due to symmetric structure of a memory cell, aging occurs regardless of the fact that a cell (or word) is accessed or not. Moreover, aging is a worst-case matric and line with worst-case access pattern determines the aging of the entire cache. In order to stop the aging of a memory cell, it must be put into a proper idle state when a cell (or word) is not accessed which require proper management of the idleness of each atomic unit of power management. We have proposed several reliability management techniques based on the idea of cache partitioning to alleviate NBTI-induced aging and obtain joint energy and lifetime benefits. We introduce graceful degradation mechanism which allows different cache blocks into which a cache is partitioned to age at different rates. This implies that various sub-blocks become unreliable at different times, and the cache keeps functioning with reduced efficiency. We extended the capabilities of this architecture by integrating the concept of reconfigurable caches to maintain the performance of the cache throughout its lifetime. By this strategy, whenever a block becomes unreliable, the remaining cache is reconfigured to work as a smaller size cache with only a marginal degradation of performance. Many mission-critical applications require guaranteed lifetime of their operations and therefore the hardware implementing their functionality. Such constraints are usually enforced by means of various reliability enhancing solutions mostly based on redundancy which are not energy-friendly. In our work, we have proposed a novel cache architecture in which a smart use of cache partitions for redundancy allows us to obtain cache that meet a desired lifetime target with minimal energy consumption

    Optimal MTCMOS reactivation under power supply noise and performance constraints

    Full text link

    Circuit and System Level Design Optimization for Power Delivery And Management

    Get PDF
    As the VLSI technology scales to the nanometer scale, power consumption has become a critical design concern of VLSI circuits. Power gating and dynamic voltage and frequency scaling (DVFS) are two effective power management techniques that are widely utilized in modern chip designs. Various design challenges merge with these power management techniques in nanometer VLSI circuits. For example, power gating introduces unique power integrity issues and trade-offs between switching noise and rush current noise. Assuring power integrity and achieving power efficiency are two highly intertwined design challenges. In addition, these trade-offs significantly vary with the supply voltage. It is difficult to use conventional power-gated power delivery networks (PDNs) to fully meet the involved conflicting design constraints while maximizing power saving and minimizing supply noise. The DVFS controller and the DC-DC power converter are two highly intertwining enablers for DVFS-based systems. However, traditional DVFS techniques treat the design optimizations of the two as separate tasks, giving rise to sub-optimal designs. To address the above research challenges, we propose several circuit and system level design optimization techniques in this dissertation. For power-gated PDN designs, we propose systemic decoupling capacitor (decap) optimization strategies that optimally trade-off between power integrity and leakage saving. First, new global decap and re-routable decap design concepts are proposed to relax the tight interaction between power integrity and leakage power saving of power-gated PDN at a single supply voltage level. Furthermore, we propose to leverage re-routable decaps to provide flexible decap allocation structures to better suit multiple supply voltage levels. The proposed strategies are implemented in an automatic design flow for choosing optimal amount of local decaps, global decaps and re-routable decaps. The proposed techniques significantly increase leakage saving without jeopardizing power integrity. The flexible decap allocations enabled by re-routable decaps lead to optimal design trade-offs for PDNs operating with two supply voltage levels. To improve the effectiveness of DVFS, we analyze the drawbacks of circuit-level only and policy-level only optimizations and the promising opportunities resulted from the cross-layer co-optimization of the DC-DC converter and online learning based DVFS polices. We present a cross-layer approach that optimizes transition time, area, energy overhead of the DC-DC converter along with key parameters of an online learning DVFS controller. We systematically evaluate the benefits of the proposed co-optimization strategy based on several processor architectures, namely single and dual-core processors and processors with DVFS and power gating. Our results indicate that the co-optimization can introduce noticeable additional energy saving without significant performance degradation

    Chapter One – An Overview of Architecture-Level Power- and Energy-Efficient Design Techniques

    Get PDF
    Power dissipation and energy consumption became the primary design constraint for almost all computer systems in the last 15 years. Both computer architects and circuit designers intent to reduce power and energy (without a performance degradation) at all design levels, as it is currently the main obstacle to continue with further scaling according to Moore's law. The aim of this survey is to provide a comprehensive overview of power- and energy-efficient “state-of-the-art” techniques. We classify techniques by component where they apply to, which is the most natural way from a designer point of view. We further divide the techniques by the component of power/energy they optimize (static or dynamic), covering in that way complete low-power design flow at the architectural level. At the end, we conclude that only a holistic approach that assumes optimizations at all design levels can lead to significant savings.Peer ReviewedPostprint (published version

    Power Management for Deep Submicron Microprocessors

    Get PDF
    As VLSI technology scales, the enhanced performance of smaller transistors comes at the expense of increased power consumption. In addition to the dynamic power consumed by the circuits there is a tremendous increase in the leakage power consumption which is further exacerbated by the increasing operating temperatures. The total power consumption of modern processors is distributed between the processor core, memory and interconnects. In this research two novel power management techniques are presented targeting the functional units and the global interconnects. First, since most leakage control schemes for processor functional units are based on circuit level techniques, such schemes inherently lack information about the operational profile of higher-level components of the system. This is a barrier to the pivotal task of predicting standby time. Without this prediction, it is extremely difficult to assess the value of any leakage control scheme. Consequently, a methodology that can predict the standby time is highly beneficial in bridging the gap between the information available at the application level and the circuit implementations. In this work, a novel Dynamic Sleep Signal Generator (DSSG) is presented. It utilizes the usage traces extracted from cycle accurate simulations of benchmark programs to predict the long standby periods associated with the various functional units. The DSSG bases its decisions on the current and previous standby state of the functional units to accurately predict the length of the next standby period. The DSSG presents an alternative to Static Sleep Signal Generation (SSSG) based on static counters that trigger the generation of the sleep signal when the functional units idle for a prespecified number of cycles. The test results of the DSSG are obtained by the use of a modified RISC superscalar processor, implemented by SimpleScalar, the most widely accepted open source vehicle for architectural analysis. In addition, the results are further verified by a Simultaneous Multithreading simulator implemented by SMTSIM. Leakage saving results shows an increase of up to 146% in leakage savings using the DSSG versus the SSSG, with an accuracy of 60-80% for predicting long standby periods. Second, chip designers in their effort to achieve timing closure, have focused on achieving the lowest possible interconnect delay through buffer insertion and routing techniques. This approach, though, taxes the power budget of modern ICs, especially those intended for wireless applications. Also, in order to achieve more functionality, die sizes are constantly increasing. This trend is leading to an increase in the average global interconnect length which, in turn, requires more buffers to achieve timing closure. Unconstrained buffering is bound to adversely affect the overall chip performance, if the power consumption is added as a major performance metric. In fact, the number of global interconnect buffers is expected to reach hundreds of thousands to achieve an appropriate timing closure. To mitigate the impact of the power consumed by the interconnect buffers, a power-efficient multi-pin routing technique is proposed in this research. The problem is based on a graph representation of the routing possibilities, including buffer insertion and identifying the least power path between the interconnect source and set of sinks. The novel multi-pin routing technique is tested by applying it to the ISPD and IBM benchmarks to verify the accuracy, complexity, and solution quality. Results obtained indicate that an average power savings as high as 32% for the 130-nm technology is achieved with no impact on the maximum chip frequency

    Technology Mapping, Design for Testability, and Circuit Optimizations for NULL Convention Logic Based Architectures

    Get PDF
    Delay-insensitive asynchronous circuits have been the target of a renewed research effort because of the advantages they offer over traditional synchronous circuits. Minimal timing analysis, inherent robustness against power-supply, temperature, and process variations, reduced energy consumption, less noise and EMI emission, and easy design reuse are some of the benefits of these circuits. NULL Convention Logic (NCL) is one of the mainstream asynchronous logic design paradigms that has been shown to be a promising method for designing delay-insensitive asynchronous circuits. This dissertation investigates new areas in NCL design and test and is made of three sections. The first section discusses different CMOS implementations of NCL gates and proposes new circuit techniques to enhance their operation. The second section focuses on mapping multi-rail logic expressions to a standard NCL gate library, which is a form of technology mapping for a category of NCL design automation flows. Finally, the last section proposes design for testability techniques for a recently developed low-power variant of NCL called Sleep Convention Logic (SCL)
    • …
    corecore