111 research outputs found

    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Get PDF
    The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

    Improving robustness of dynamic logic based pipelines

    Get PDF
    Domino dynamic circuits are widely used in critical parts of high performance systems. In this paper we show that, in addition to the functional limitation associated to the noninverting behavior of Domino gates, there are also robustness disadvantages when compared to inverting dynamic gates. We analyze and compare the tolerance to parameter and operating conditions variations of gate-level pipelines implemented with Domino and with DOE, an inverting dynamic gate we have recently proposed. Our experiments confirm that DOE pipelines are more robust and that improvements are due to its noninverting feature.Ministerio de Economía y Competitividad FEDER TEC2013-40670-

    Improving robustness of dynamic logic based pipelines

    Get PDF
    Domino dynamic circuits are widely used in critical parts of high performance systems. In this paper we show that, in addition to the functional limitation associated to the noninverting behavior of Domino gates, there are also robustness disadvantages when compared to inverting dynamic gates. We analyze and compare the tolerance to parameter and operating conditions variations of gate-level pipelines implemented with Domino and with DOE, an inverting dynamic gate we have recently proposed. Our experiments confirm that DOE pipelines are more robust and that improvements are due to its noninverting feature.Peer reviewe

    Chapter One – An Overview of Architecture-Level Power- and Energy-Efficient Design Techniques

    Get PDF
    Power dissipation and energy consumption became the primary design constraint for almost all computer systems in the last 15 years. Both computer architects and circuit designers intent to reduce power and energy (without a performance degradation) at all design levels, as it is currently the main obstacle to continue with further scaling according to Moore's law. The aim of this survey is to provide a comprehensive overview of power- and energy-efficient “state-of-the-art” techniques. We classify techniques by component where they apply to, which is the most natural way from a designer point of view. We further divide the techniques by the component of power/energy they optimize (static or dynamic), covering in that way complete low-power design flow at the architectural level. At the end, we conclude that only a holistic approach that assumes optimizations at all design levels can lead to significant savings.Peer ReviewedPostprint (published version

    低電力非同期回路の面積高効率化設計

    Get PDF
    Tohoku University亀山充隆課

    Asynchronous techniques for system-on-chip design

    Get PDF
    SoC design will require asynchronous techniques as the large parameter variations across the chip will make it impossible to control delays in clock networks and other global signals efficiently. Initially, SoCs will be globally asynchronous and locally synchronous (GALS). But the complexity of the numerous asynchronous/synchronous interfaces required in a GALS will eventually lead to entirely asynchronous solutions. This paper introduces the main design principles, methods, and building blocks for asynchronous VLSI systems, with an emphasis on communication and synchronization. Asynchronous circuits with the only delay assumption of isochronic forks are called quasi-delay-insensitive (QDI). QDI is used in the paper as the basis for asynchronous logic. The paper discusses asynchronous handshake protocols for communication and the notion of validity/neutrality tests, and completion tree. Basic building blocks for sequencing, storage, function evaluation, and buses are described, and two alternative methods for the implementation of an arbitrary computation are explained. Issues of arbitration, and synchronization play an important role in complex distributed systems and especially in GALS. The two main asynchronous/synchronous interfaces needed in GALS-one based on synchronizer, the other on stoppable clock-are described and analyzed

    A Structured Design Methodology for High Performance VLSI Arrays

    Get PDF
    abstract: The geometric growth in the integrated circuit technology due to transistor scaling also with system-on-chip design strategy, the complexity of the integrated circuit has increased manifold. Short time to market with high reliability and performance is one of the most competitive challenges. Both custom and ASIC design methodologies have evolved over the time to cope with this but the high manual labor in custom and statistic design in ASIC are still causes of concern. This work proposes a new circuit design strategy that focuses mostly on arrayed structures like TLB, RF, Cache, IPCAM etc. that reduces the manual effort to a great extent and also makes the design regular, repetitive still achieving high performance. The method proposes making the complete design custom schematic but using the standard cells. This requires adding some custom cells to the already exhaustive library to optimize the design for performance. Once schematic is finalized, the designer places these standard cells in a spreadsheet, placing closely the cells in the critical paths. A Perl script then generates Cadence Encounter compatible placement file. The design is then routed in Encounter. Since designer is the best judge of the circuit architecture, placement by the designer will allow achieve most optimal design. Several designs like IPCAM, issue logic, TLB, RF and Cache designs were carried out and the performance were compared against the fully custom and ASIC flow. The TLB, RF and Cache were the part of the HEMES microprocessor.Dissertation/ThesisPh.D. Electrical Engineering 201

    A New Ultra Low-Power and Noise Tolerant Circuit Technique for CMOS Domino Logic

    Get PDF
    Dynamic logic style is used in high performance circuit design because of its fast speed and less transistors requirement as compared to CMOS logic style. But it is not widely accepted for all types of circuit implementations due to its less noise tolerance and charge sharing problems. A small noise at the input of the dynamic logic can change the desired output. Domino logic uses one static CMOS inverter at the output of dynamic node which is more noise immune and consuming very less power as compared to other proposed circuit. In this paper we have proposed a novel circuit for domino logic which has less noise at the output node and has very less power-delay product (PDP) as compared to previous reported articles. Low PDP is achieved by using semi-dynamic logic buffer and also reducing leakage current when PDN is not conducting

    Optimal digital system design in deep submicron technology

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.Includes bibliographical references (p. 165-174).The optimization of a digital system in deep submicron technology should be done with two basic principles: energy waste reduction and energy-delay tradeoff. Increased energy resources obtained through energy waste reduction are utilized through energy-delay tradeoffs. The previous practice of obliviously pursuing performance has led to the rapid increase in energy consumption. While energy waste due to unnecessary switching could be reduced with small increases in logic complexity, leakage energy waste still remains as a major design challenge. We find that fine-grain dynamic leakage reduction (FG-DLR), turning off small subblocks for short idle intervals, is the key for successful leakage energy saving. We introduce an FG-DLR circuit technique, Leakage Biasing, which uses leakage currents themselves to bias the circuit into the minimum leakage state, and apply it to primary SRAM arrays for bitline leakage reduction (Leakage-Biased Bitlines) and to domino logic (Leakage-Biased Domino). We also introduce another FG-DLR circuit technique, Dynamic Resizing, which dynamically downsizes transistors on idle paths while maintaining the performance along active critical paths, and apply it to static CMOS circuits.(cont.) We show that significant energy reduction can be achieved at the same computation throughput and communication bandwidth by pipelining logic gates and wires. We find that energy saved by pipelining datapaths is eventually limited by latch energy overhead, leading to a power-optimal pipelining. Structuring global wires into on-chip networks provides a better environment for pipelining and leakage energy saving. We show that the energy-efficiency increase through replacement with dynamically packet-routed networks is bounded by router energy overhead. Finally, we provide a way of relaxing the peak power constraint. We evaluate the use of Activity Migration (AM) for hot spot removal. AM spreads heat by transporting computation to a different location on the die. We show that AM can be used either to increase the power that can be dissipated by a given package, or to lower the operating temperature and hence the operating energy.by Seongmoo Heo.Ph.D

    Managing Static Leakage Energy in Microprocessor Functional Units

    Get PDF
    Static energy due to subthreshold leakage current is projected to become a major component of the total energy in high performance microprocessors. Many studies so far have examined and proposed techniques to reduce leakage in on-chip storage structures. In this study, static energy is reduced in the integer functional units by leveraging the unique qualities of dual threshold voltage domino logic. Domino logic has desirable properties that greatly reduce leakage current while providing fast propagation times. However, due to the energy cost of entering the low leakage current state (sleep mode), domino logic has thus far been used only for leakage reduction in the longterm standby mode. We examine the utility of the sleep mode (while considering the aforementioned costs) when idle times are relatively short, one to a few hundred cycles, as is often the case for functional units. Using an analytical energy model suitable for architecture-level analysis, we explore the interaction of the application and technology, and the effect on energy and performance as the underlying parameters are varied, on a set of benchmarks. Our results show that if the leakage approaches the magnitude as projected in the literature, even for short idle intervals as few as ten cycles, an aggressive policy of activating the sleep mode at every idle period performs well and a more complex control strategy may not be warranted. We also propose a simple design, called Gradual Sleep, to reduce the energy impact of using the sleep mode for smaller idle periods
    corecore