1,093 research outputs found
Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review
The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
An advanced Framework for efficient IC optimization based on analytical models engine
En base als reptes sorgits a conseqüència de l'escalat de la tecnologia, la present tesis desenvolupa i analitza un conjunt d'eines orientades a avaluar la sensibilitat a la propagaciĂł d'esdeveniments SET en circuits microelectrònics. S'han proposant varies mètriques de propagaciĂł de SETs considerant l'impacto dels emmascaraments lògic, elèctric i combinat lògic-elèctric. Aquestes mètriques proporcionen una via d'anĂ lisi per quantificar tant les regions mĂ©s susceptibles a propagar SETs com les sortides mĂ©s susceptibles de rebre'ls. S'ha desenvolupat un conjunt d'algorismes de cerca de camins sensibilitzables altament adaptables a mĂşltiples aplicacions, un sistema lògic especific i diverses tècniques de simplificaciĂł de circuits. S'ha demostrat que el retard d'un camĂ donat depèn dels vectors de sensibilitzaciĂł aplicats a les portes que formen part del mateix, essent aquesta variaciĂł de retard comparable a la atribuĂŻble a les variacions paramètriques del proces.En base a los desafĂos surgidos a consecuencia del escalado de la tecnologĂa, la presente tesis desarrolla y analiza un conjunto de herramientas orientadas a evaluar la sensibilidad a la propagaciĂłn de eventos SET en circuitos microelectrĂłnicos. Se han propuesto varias mĂ©tricas de propagaciĂłn de SETs considerando el impacto de los enmascaramientos lĂłgico, elĂ©ctrico y combinado lĂłgico-elĂ©ctrico. Estas mĂ©tricas proporcionan una vĂa de análisis para cuantificar tanto las regiones más susceptibles a propagar eventos SET como las salidas más susceptibles a recibirlos. Ha sido desarrollado un conjunto de algoritmos de bĂşsqueda de caminos sensibilizables altamente adaptables a mĂşltiples aplicaciones, un sistema lĂłgico especifico y diversas tĂ©cnicas de simplificaciĂłn de circuitos. Se ha demostrado que el retardo de un camino dado depende de los vectores de sensibilizaciĂłn aplicados a las puertas que forman parte del mismo, siendo esta variaciĂłn de retardo comparable a la atribuible a las variaciones paramĂ©tricas del proceso.Based on the challenges arising as a result of technology scaling, this thesis develops and evaluates a complete framework for SET propagation sensitivity. The framework comprises a number of processing tools capable of handling circuits with high complexity in an efficient way. Various SET propagation metrics have been proposed considering the impact of logic, electric and combined logic-electric masking. Such metrics provide a valuable vehicle to grade either in-circuit regions being more susceptible of propagating SETs toward the circuit outputs or circuit outputs more susceptible to produce SET. A quite efficient and customizable true path finding algorithm with a specific logic system has been constructed and its efficacy demonstrated on large benchmark circuits. It has been shown that the delay of a path depends on the sensitization vectors applied to the gates within the path. In some cases, this variation is comparable to the one caused by process parameters variation
Redundant Logic Insertion and Fault Tolerance Improvement in Combinational Circuits
This paper presents a novel method to identify and insert redundant logic
into a combinational circuit to improve its fault tolerance without having to
replicate the entire circuit as is the case with conventional redundancy
techniques. In this context, it is discussed how to estimate the fault masking
capability of a combinational circuit using the truth-cum-fault enumeration
table, and then it is shown how to identify the logic that can introduced to
add redundancy into the original circuit without affecting its native
functionality and with the aim of improving its fault tolerance though this
would involve some trade-off in the design metrics. However, care should be
taken while introducing redundant logic since redundant logic insertion may
give rise to new internal nodes and faults on those may impact the fault
tolerance of the resulting circuit. The combinational circuit that is
considered and its redundant counterparts are all implemented in semi-custom
design style using a 32/28nm CMOS digital cell library and their respective
design metrics and fault tolerances are compared
Desynchronization: Synthesis of asynchronous circuits from synchronous specifications
Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architectur
Performance Comparison of Static CMOS and Domino Logic Style in VLSI Design: A Review
Of late, there is a steep rise in the usage of handheld gadgets and high speed applications. VLSI designers often choose static CMOS logic style for low power applications. This logic style provides low power dissipation and is free from signal noise integrity issues. However, designs based on this logic style often are slow and cannot be used in high performance circuits. On the other hand designs based on Domino logic style yield high performance and occupy less area. Yet, they have more power dissipation compared to their static CMOS counterparts. As a practice, designers during circuit synthesis, mix more than one logic style judiciously to obtain the advantages of each logic style. Carefully designing a mixed static Domino CMOS circuit can tap the advantages of both static and Domino logic styles overcoming their own short comings
Recommended from our members
On Co-Optimization Of Constrained Satisfiability Problems For Hardware Software Applications
Manufacturing technology has permitted an exponential growth in transistor count and density. However, making efficient use of the available transistors in the design has become exceedingly difficult. Standard design flow involves synthesis, verification, placement and routing followed by final tape out of the design. Due to the presence of various undesirable effects like capacitive crosstalk, supply noise, high temperatures, etc., verification/validation of the design has become a challenging problem. Therefore, having a good design convergence may not be possible within the target time, due to a need for a large number of design iterations.
Capacitive crosstalk is one of the major causes of design convergence problems in deep sub-micron era. With scaling, the number of crosstalk violations has been increasing because of reduced inter-wire distances. Consequently only the most severe crosstalk faults are fixed pre-silicon while the rest are tested post-silicon. Testing for capacitive crosstalk involves generation of input patterns which can be applied post-silicon to the integrated circuit and comparison of the output response. These patterns are generated at the gate/ Register Transfer Level (RTL) of abstraction using Automatic Test Pattern Generation (ATPG) tools. In this dissertation, anInteger Linear Programming (ILP) based ATPG technique for maximizing crosstalk induced delay increase at the victim net, for multiple aggressor crosstalk faults, is presented. Moreover, various solutions for pattern generation considering both zero as well as unit delay models is also proposed.
With voltage scaling, power supply switching noise has become one of the leading causes of signal integrity related failures in deep sub-micron designs. Hence, during power supply network design and analysis of power supply switching noise, computation of peak supply current is an essential step. Traditional peak current estimation approaches involve addition of peak current associated with all the CMOS gates which are switching in a combinational circuit. Consequently, this approach does not take the Boolean and temporal relationships of the circuit into account. This work presents an ILP based technique for generation of an input pattern pair which maximizes switching supply currents for a combinational circuit in the presence of integer gate delays. The input pattern pair generated using the above approach can be applied post-silicon for power droop testing.
With high level of integration, Multi-Processor Systems on Chip (MPSoC) feature multiple processor cores and accelerators on the same die, so as to exploit the instruction level parallelism in the application. For hardware-software co-design, application programming model is based on a Task Graph, which represents task dependencies and execution/transfer times for various threads and processes within an application. Mapping an application to an MPSoC traditionally involves representing it in the form of a task graph and employing static scheduling in order to minimize the schedule length. Dynamic system behavior is not taken into consideration during static scheduling, while dynamic scheduling requires the knowledge of task graph at runtime. A run-time task graph extraction heuristic to facilitate dynamic scheduling is also presented here. A novel game theory based approach uses this extracted task graph to perform run-time scheduling in order to minimize total schedule length.
With increase in transistor density, power density has gone up substantially. This has lead to generation of regions with very high temperature called Hotspots. Hotspots lead to reliability and performance issues and affect design convergence. In current generation Integrated Circuits (ICs) temperature is controlled by reducing power dissipation using Dynamic Thermal Management (DTM) techniques like frequency and/or voltage scaling. These techniques are reactive in nature and have detrimental effects on performance. Here, a look-ahead based task migration technique is proposed, in order to utilize the multitude of cores available in an MPSoC to eliminate thermal emergencies. Our technique is based on temperature prediction, leveraging upon a novel wavelet based thermal modeling approach.
Hence, this work addresses several optimization problems that can be reduced to constrained max-satisfiability, involving integer as well as Boolean constraints in hardware and software domains. Moreover, it provides domain specific heuristic solutions for each of them
Pipeline-Based Power Reduction in FPGA Applications
This paper shows how temporal parallelism has an important role in the power dissipation reduction in the FPGA field. Glitches propagation is blocked by the flip-flops or registers in the pipeline. Several multiplication structures are implemented over modern FPGAs, StratixII and Virtex4, comparing their results with and without pipeline and hardware duplication
- …