5 research outputs found

Power Management for Deep Submicron Microprocessors

Author: Youssef Ahmed
Publication venue: 'University of Waterloo'
Publication date: 07/07/2008

As VLSI technology scales, the enhanced performance of smaller transistors comes at the expense of increased power consumption. In addition to the dynamic power consumed by the circuits, there is a tremendous increase in leakage power consumption, which is further exacerbated by increasing operating temperatures. The total power consumption of modern processors is distributed between the processor core, memory, and interconnects. In this research, two novel power management techniques are presented, targeting the functional units and the global interconnects. First, since most leakage control schemes for processor functional units are based on circuit-level techniques, such schemes inherently lack information about the operational profile of higher-level components of the system. This is a barrier to the pivotal task of predicting standby time. Without this prediction, it is extremely difficult to assess the value of any leakage control scheme. Consequently, a methodology that can predict the standby time is highly beneficial in bridging the gap between the information available at the application level and the circuit implementations. In this work, a novel Dynamic Sleep Signal Generator (DSSG) is presented. It utilizes usage traces extracted from cycle-accurate simulations of benchmark programs to predict the long standby periods associated with the various functional units. The DSSG bases its decisions on the current and previous standby states of the functional units to accurately predict the length of the next standby period. The DSSG presents an alternative to Static Sleep Signal Generation (SSSG), which is based on static counters that trigger the sleep signal when the functional units idle for a prespecified number of cycles. The test results of the DSSG are obtained using a modified RISC superscalar processor, implemented in SimpleScalar, the most widely accepted open-source vehicle for architectural analysis. In addition, the results are further verified with a Simultaneous Multithreading simulator implemented in SMTSIM. The results show an increase of up to 146% in leakage savings using the DSSG versus the SSSG, with an accuracy of 60-80% for predicting long standby periods. Second, chip designers, in their effort to achieve timing closure, have focused on achieving the lowest possible interconnect delay through buffer insertion and routing techniques. This approach, though, taxes the power budget of modern ICs, especially those intended for wireless applications. Also, in order to achieve more functionality, die sizes are constantly increasing. This trend is leading to an increase in the average global interconnect length, which, in turn, requires more buffers to achieve timing closure. Unconstrained buffering is bound to adversely affect the overall chip performance if power consumption is counted as a major performance metric. In fact, the number of global interconnect buffers is expected to reach hundreds of thousands to achieve an appropriate timing closure. To mitigate the impact of the power consumed by the interconnect buffers, a power-efficient multi-pin routing technique is proposed in this research. The problem is formulated on a graph representation of the routing possibilities, including buffer insertion, and is solved by identifying the least-power path between the interconnect source and the set of sinks. The novel multi-pin routing technique is tested by applying it to the ISPD and IBM benchmarks to verify its accuracy, complexity, and solution quality. The results indicate that average power savings as high as 32% are achieved for the 130-nm technology with no impact on the maximum chip frequency.
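The difference between a static idle counter and a history-based sleep decision can be illustrated with a minimal sketch. It is not the DSSG from the thesis: the last-value predictor, the function names, the trace, and the `threshold` and `long_cutoff` values are all assumptions made for illustration, and wake-up overhead is ignored.

```python
from itertools import groupby


def idle_periods(trace):
    """Return the lengths of consecutive idle runs in a 0/1 usage trace (1 = busy)."""
    return [len(list(run)) for busy, run in groupby(trace) if busy == 0]


def static_sleep_cycles(periods, threshold):
    """Static counter: the unit sleeps only after `threshold` idle cycles have elapsed."""
    return sum(max(0, p - threshold) for p in periods)


def history_sleep_cycles(periods, long_cutoff, threshold):
    """Hypothetical last-value predictor (an assumption, not the thesis's DSSG).

    If the previous idle period was long, sleep from the first idle cycle of
    the next one; otherwise fall back to the static counter.
    """
    slept = 0
    prev_long = False
    for p in periods:
        if prev_long:
            slept += p  # predicted long: sleep immediately
        else:
            slept += max(0, p - threshold)  # no prediction: wait for the counter
        prev_long = p >= long_cutoff
    return slept


# Made-up usage trace of one functional unit: 1 = busy cycle, 0 = idle cycle.
trace = [1, 1] + [0] * 10 + [1] + [0] * 12 + [1, 1] + [0] * 2 + [1] + [0] * 9 + [1]
periods = idle_periods(trace)
print(periods)
print(static_sleep_cycles(periods, threshold=4))
print(history_sleep_cycles(periods, long_cutoff=8, threshold=4))
```

In this toy trace the history-based policy sleeps for more cycles than the static counter, but it also sleeps through a short idle period it mispredicted as long. A realistic evaluation would charge a wake-up penalty for that case.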

University of Waterloo's Institutional Repository

Multilayer Modeling and Design of Energy Managed Microsystems

Author: Zarrabi Houman
Publication date: 14/04/2011

Aggressive energy reduction is one of the key technological challenges that all segments of the semiconductor industry have encountered in the past few years. In addition, the notion of environmental awareness and designing “green” products is yet another major driver for ultra-low-energy design of electronic systems. Energy management is one of the unique solutions that can address the simultaneous requirements of high performance, (ultra) low energy, and greenness in many classes of computing systems, including high-performance, embedded, and wireless systems. These considerations motivate the focus of this dissertation on the energy efficiency improvement of Energy Managed Microsystems (EMM or EM2). The aim is to maximize the energy efficiency and/or the operational lifetime of these systems. In this thesis we propose solutions that are applicable to many classes of computing systems, including high-performance and mobile computing systems. These solutions contribute to making such technologies “greener”. The proposed solutions are multilayer, since they belong to, and may be applicable to, multiple design abstraction layers. The proposed solutions are orthogonal to each other, and if deployed simultaneously in a vertical system integration approach, when possible, the net benefit may be as large as the product of the individual benefits. At a high level, this thesis initially focuses on the modeling and design of interconnections for EM2. For this purpose, a design flow has been proposed for interconnections in EM2. This flow allows designing interconnects with minimum energy requirements that meet all the considered performance objectives in all specified system operating states. Later, models for energy performance estimation of EM2 are proposed. By energy performance, we refer to the improvement in energy savings of the computing platforms obtained when some enhancements are applied to those platforms. These models are based on the components of the application profile. The adopted method is inspired by Amdahl’s law and is driven by the fact that ‘energy’ is ‘additive’, as ‘time’ is ‘additive’. These models can be used for the design space exploration of EM2. The proposed models are high-level and therefore easy to use, and they show fair accuracy, 9.1% error on average, when compared to the results of the implemented benchmarks. Finally, models to estimate the energy consumption of EM2 according to their “activity” are proposed. By “activity” we mean the rate at which EM2 perform a set of predefined application functions. Good estimations of energy requirements are very useful when designing and managing the EM2 activity in order to extend their battery lifetime. The study of the proposed models on some Wireless Sensor Network (WSN) application benchmarks confirms a fair accuracy for the energy estimation models, 3% error on average on the considered benchmarks.
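Because energy is additive, the Amdahl-style reasoning mentioned in the abstract can be sketched in a few lines. This is a generic analogue of Amdahl's law applied to energy, not the thesis's actual model; the function name, the component fractions, and the savings factors are illustrative assumptions.

```python
def energy_improvement(fractions, savings_factors):
    """Overall energy improvement factor, in the style of Amdahl's law.

    fractions: share of baseline energy used by each component (must sum to 1).
    savings_factors: factor by which each component's energy is reduced
                     (1.0 means the component is unchanged).
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    # Energy is additive, so the remaining energy is the sum of the scaled shares.
    remaining = sum(f / s for f, s in zip(fractions, savings_factors))
    return 1.0 / remaining


# Assumed profile: 60% of the energy is untouched, 40% is halved by an enhancement.
gain = energy_improvement([0.6, 0.4], [1.0, 2.0])
print(f"{gain:.2f}")  # prints 1.25
```

As with Amdahl's law for time, the unimproved share bounds the overall gain: here it cannot exceed 1/0.6, however much the enhanced component saves.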

Concordia University Research Repository

Power Optimal MTCMOS Repeater Insertion for Global Buses

Author: Hanif Fatemi

This paper addresses the problem of power-optimal repeater insertion for global buses in the presence of crosstalk noise. The MTCMOS technique is used, inserting high-Vth sleep transistors to reduce the leakage power consumption in idle mode. We simultaneously calculate the repeater sizes, repeater distances, and the size of the sleep transistors to minimize the power dissipation. The effect of crosstalk coupling capacitance on propagation delay and on (switching and short-circuit) power dissipation is considered. Experimental results show that, depending on the activity factor of the circuit, the proposed technique can significantly reduce the power consumption of the global bus interconnects.

CiteSeerX

Power Optimal MTCMOS Repeater Insertion for Global Buses


This paper addresses the problem of power-optimal repeater insertion for global buses in the presence of crosstalk noise. We use the MTCMOS technique, inserting high-Vth sleep transistors to reduce the leakage power consumption in idle mode. We simultaneously calculate the repeater sizes, repeater distances, and the size of the sleep transistors to minimize the power dissipation. The effect of crosstalk coupling capacitance on propagation delay and on (switching and short-circuit) power dissipation is considered. Experimental results show that, depending on the activity factor of the circuit, the proposed technique can significantly reduce the power consumption of the global bus interconnects.

CiteSeerX

Predicting power scalability in a reconfigurable platform

Author: Beckett P
Publication venue: RMIT University
Publication date: 01/01/2007

This thesis focuses on the evolution of digital hardware systems. A reconfigurable platform is proposed and analysed based on thin-body, fully-depleted silicon-on-insulator Schottky-barrier transistors with metal gates and silicide source/drain (TBFDSBSOI). These offer the potential for simplified processing that will allow them to reach ultimate nanoscale gate dimensions. Technology CAD (TCAD) was used to show that the threshold voltage in TBFDSBSOI devices will be controllable by gate potentials that scale down with the channel dimensions while remaining within appropriate gate reliability limits. SPICE simulations determined that the magnitude of the threshold shift predicted by the TCAD software would be sufficient to control the logic configuration of a simple, regular array of these TBFDSBSOI transistors as well as to constrain its overall subthreshold power growth. Using these devices, a reconfigurable platform is proposed based on a regular 6-input, 6-output NOR LUT block in which the logic and configuration functions of the array are mapped onto separate gates of the double-gate device. A new analytic model of the relationship between power (P), area (A) and performance (T) has been developed based on a simple VLSI complexity metric of the form AT^σ = constant. As σ defines the performance “return” gained as a result of an increase in area, it also represents a bound on the architectural options available in power-scalable digital systems. This analytic model was used to determine that simple computing functions mapped to the reconfigurable platform will exhibit continuous power-area-performance scaling behavior. A number of simple arithmetic circuits were mapped to the array and their delay and subthreshold leakage analysed over a representative range of supply and threshold voltages, thus determining a worst-case range for the device/circuit-level parameters of the model. Finally, an architectural simulation was built in VHDL-AMS. The frequency scaling described by σ, combined with the device/circuit-level parameters, predicts the overall power and performance scaling of parallel architectures mapped to the array.
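A short worked derivation shows how σ sets the performance return on area, using only the complexity metric of the form A T^σ = constant. The scaling factor α and the numerical value σ = 2 are assumptions chosen for illustration and are not taken from the thesis.

```latex
% Complexity metric: area A, execution time T, exponent \sigma, constant k.
A\,T^{\sigma} = k
\quad\Longrightarrow\quad
T = \left(\frac{k}{A}\right)^{1/\sigma}

% Scaling the area by a factor \alpha (A' = \alpha A) rescales the time as
\frac{T'}{T} = \alpha^{-1/\sigma}

% Illustrative values (assumed): \sigma = 2,\ \alpha = 2
\frac{T'}{T} = 2^{-1/2} \approx 0.707
```

Under these assumed values, doubling the area shortens the execution time by about 29%. A smaller σ gives a larger return for the same increase in area.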

RMIT Research Repository