72 research outputs found

    ILP-based Supply and Threshold Voltage Assignment For Total Power Minimization

    Get PDF
    In this paper we present an ILP-based method to simultaneously assign supply and threshold voltages to individual gates for dynamic and leakage power minimization. In our three-step approach, low power min-flipflop (FF) retiming is first performed to reduce the clock period while taking the FF delay/power into consideration. Next, the subsequent voltage assignment formulated in ILP makes the best possible supply/threshold voltage assignment under the given clock period constraint set by the retiming. Finally, a post-process further refines the voltage assignment solution by exploiting the remaining timing slack in the circuit. Related experiments show that the min-FF retiming plus simultaneous Vdd/Vth assignment approach outperforms the existing max-FF retiming plus Vdd-only assignment approach

    Enhancing Power Efficient Design Techniques in Deep Submicron Era

    Get PDF
    Excessive power dissipation has been one of the major bottlenecks for design and manufacture in the past couple of decades. Power efficient design has become more and more challenging when technology scales down to the deep submicron era that features the dominance of leakage, the manufacture variation, the on-chip temperature variation and higher reliability requirements, among others. Most of the computer aided design (CAD) tools and algorithms currently used in industry were developed in the pre deep submicron era and did not consider the new features explicitly and adequately. Recent research advances in deep submicron design, such as the mechanisms of leakage, the source and characterization of manufacture variation, the cause and models of on-chip temperature variation, provide us the opportunity to incorporate these important issues in power efficient design. We explore this opportunity in this dissertation by demonstrating that significant power reduction can be achieved with only minor modification to the existing CAD tools and algorithms. First, we consider peak current, which has become critical for circuit's reliability in deep submicron design. Traditional low power design techniques focus on the reduction of average power. We propose to reduce peak current while keeping the overhead on average power as small as possible. Second, dual Vt technique and gate sizing have been used simultaneously for leakage savings. However, this approach becomes less effective in deep submicron design. We propose to use the newly developed process-induced mechanical stress to enhance its performance. Finally, in deep submicron design, the impact of on-chip temperature variation on leakage and performance becomes more and more significant. We propose a temperature-aware dual Vt approach to alleviate hot spots and achieve further leakage reduction. We also consider this leakage-temperature dependency in the dynamic voltage scaling approach and discover that a commonly accepted result is incorrect for the current technology. We conduct extensive experiments with popular design benchmarks, using the latest industry CAD tools and design libraries. The results show that our proposed enhancements are promising in power saving and are practical to solve the low power design challenges in deep submicron era

    Design Space Re-Engineering for Power Minimization in Modern Embedded Systems

    Get PDF
    Power minimization is a critical challenge for modern embedded system design. Recently, due to the rapid increase of system's complexity and the power density, there is a growing need for power control techniques at various design levels. Meanwhile, due to technology scaling, leakage power has become a significant part of power dissipation in the CMOS circuits and new techniques are needed to reduce leakage power. As a result, many new power minimization techniques have been proposed such as voltage island, gate sizing, multiple supply and threshold voltage, power gating and input vector control, etc. These design options further enlarge the design space and make it prohibitively expensive to explore for the most energy efficient design solution. Consequently, heuristic algorithms and randomized algorithms are frequently used to explore the design space, seeking sub-optimal solutions to meet the time-to-market requirements. These algorithms are based on the idea of truncating the design space and restricting the search in a subset of the original design space. While this approach can effectively reduce the runtime of searching, it may also exclude high-quality design solutions and cause design quality degradation. When the solution to one problem is used as the base for another problem, such solution quality degradation will accumulate. In modern electronics system design, when several such algorithms are used in series to solve problems in different design levels, the final solution can be far off the optimal one. In my Ph.D. work, I develop a {\em re-engineering} methodology to facilitate exploring the design space of power efficient embedded systems design. The direct goal is to enhance the performance of existing low power techniques. The methodology is based on the idea that design quality can be improved via iterative ``re-shaping'' the design space based on the ``bad'' structure in the obtained design solutions; the searching run-time can be reduced by the guidance from previous exploration. This approach can be described in three phases: (1) apply the existing techniques to obtain a sub-optimal solution; (2) analyze the solution and expand the design space accordingly; and (3) re-apply the technique to re-explore the enlarged design space. We apply this methodology at different levels of embedded system design to minimize power: (i) switching power reduction in sequential logic synthesis; (ii) gate-level static leakage current reduction; (iii) dual threshold voltage CMOS circuits design; and (iv) system-level energy-efficient detection scheme for wireless sensor networks. An extensive amount of experiments have been conducted and the results have shown that this methodology can effectively enhance the power efficiency of the existing embedded system design flows with very little overhead

    IMPACT OF DYNAMIC VOLTAGE SCALING (DVS) ON CIRCUIT OPTIMIZATION

    Get PDF
    Circuit designers perform optimization procedures targeting speed and power during the design of a circuit. Gate sizing can be applied to optimize for speed, while Dual-VT and Dynamic Voltage Scaling (DVS) can be applied to optimize for leakage and dynamic power, respectively. Both gate sizing and Dual-VT are design-time techniques, which are applied to the circuit at a fixed voltage. On the other hand, DVS is a run-time technique and implies that the circuit will be operating at a different voltage than that used during the optimization phase at design-time. After some analysis, the risk of non-critical paths becoming critical paths at run-time is detected under these circumstances. The following questions arise: 1) should we take DVS into account during the optimization phase? 2) Does DVS impose any restrictions while performing design-time circuit optimizations?. This thesis is a case study of applying DVS to a circuit that has been optimized for speed and power, and aims at answering the previous two questions. We used a 45-nm CMOS design kit and flow. Synthesis, placement and routing, and timing analysis were applied to the benchmark circuit ISCAS?85 c432. Logical Effort and Dual-VT algorithms were implemented and applied to the circuit to optimize for speed and leakage power, respectively. Optimizations were run for the circuit operating at different voltages. Finally, the impact of DVS on circuit optimization was studied based on HSPICE simulations sweeping the supply voltage for each optimization. The results showed that DVS had no impact on gate sizing optimizations, but it did on Dual-VT optimizations. It is shown that we should not optimize at an arbitrary voltage. Moreover, simulations showed that Dual-VT optimizations should be performed at the lowest voltage that DVS is intended to operate, otherwise non-critical paths will become critical paths at run-time

    Lagrangian relaxation-based multi-threaded discrete gate sizer

    Get PDF
    In integrated circuit design gate sizing is one of the key optimization techniques which is repeatedly invoked to trade-off delays for area and/or power of the gates during logic design and physical design stages. With increasing design sizes of a million gates and larger, discrete gate sizes and non-convex delay models the gate sizing algorithms that were designed for continuous sizes and convex delay models are slow and timing inaccurate. Of the several published discrete gate sizing algorithms, recent works have shown that Lagrangian relaxation based gate sizers have produced designs with the lowest power on average with high timing accuracy. But they are also very slow due to a large number of expensive timing updates spread across hundreds of iterations of solving the Lagrangian sub-problem. In this thesis we present a Lagrangian relaxation based multi-threaded discrete gate sizer for fast timing and power reduction by swapping the gate sizes and the threshold voltages. We developed two parallelization enabling techniques to reduce the runtime of Lagrangian sub-problem solver, namely, mutual exclusion edge (MEE) assignment and directed acyclic graph (DAG) based netlist traversal. MEEs are dummy edges assigned to reduce computational dependencies among gates sharing one or more common fan-ins. DAG based netlist traversal facilitates simultaneous resizing of gates belonging to different topological levels. We designed a Lagrange multiplier update framework that enables rapid convergence of the timing recovery and power recovery algorithms. To reduce the runtime of timing updates, we proposed a simple and fast-to-compute effective capacitance model and several mechanisms to calibrate the timing models to improve their accuracy. Compared to the state-of-the-art gate sizer, our proposed gate sizer is on average 15x faster and the optimized designs have only 1.7\% higher power. In digital synchronous designs simultaneous gate sizing and clock skew scheduling provides significantly more power saving. We extend the gate sizer to simultaneously schedule the clock skew. It can achieve an average of 18.8\% more reduction in power with only 20\% increase in the runtime

    Simultaneous slack budgeting and retiming for synchronous circuits optimization

    Full text link
    Abstract- With the challenges of growing functionality and scaling chip size, the possible performance improvements should be considered in the earlier IC design stages, which gives more freedom to the later optimization. Potential slack as an effective metric of possible performance improvements is considered in this work which, as far as we known, is the first work that maximizes the potential slack by retiming for synchronous sequential circuit. A simultaneous slack budgeting and incremental retiming algorithm is proposed for maximizing potential slack. The overall slack budget is optimized by relocating the FFs iteratively with the MIS-based slack estimation. Compared with the potential slack of a well-known min-period retiming, our algorithm improves potential slack averagely 19.6 % without degrading the circuit performance in reasonable runtime. Furthermore, at the expense of a small amount of timing performance, 0.52 % and 2.08%, the potential slack is increased averagely by 19.89 % and 28.16 % separately, which give a hint of the tradeoff between the timing performance and the slack budget.

    Voltage Set-up Problem on Embedded Systems with Multiple Voltages

    Get PDF
    Dynamic voltage scaling (DVS), arguably the most effective energy reduction technique, can be enabled by having multiple voltages physically implemented on the chip and allowing the operating system to decide which voltage to use at run-time. Indeed, this is predicted as the future low-power system by International Technology Roadmap for Semiconductors (ITRS). There still exist many important unsolved problems on how to reduce the system's dynamic and/or total power by DVS. One of such problems, which we refer to as the voltage set-up problem, is "how many levels and at which values should voltages be implemented for the system to achieve the maximum energy saving". It challenges whether DVS technique's full potential in energy saving can be reached on multiple-voltage systems. In this paper, (1) we derive analytical solutions for dual-voltage system. (2) For the general case that does not have analytic solutions, we develop efficient numerical methods that can take the overhead of voltage switch and leakage into account. (3) We demonstrate how to apply the proposed algorithms on system design. (4) Interestingly, the experimental results, on both real life DSP applications and random created applications, suggest that multiple-voltage DVS systems with only a couple levels of voltages, when set up properly, can be very close to DVS technique's full potential in energy saving. Parts of this report were published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 13, No. 7, pp. 869-872, July 2005

    Design methodologies for variation-aware integrated circuits

    Get PDF
    The scaling of VLSI technology has spurred a rapid growth in the semiconductor industry. With the CMOS device dimension scaling to and beyond 90nm technology, it is possible to achieve higher performance and to pack more complex functionalities on a single chip. However, the scaling trend has introduced drastic variation of process and design parameters, leading to severe variability of chip performance in nanometer regime. Also, the manufacturing community projects CMOS will scale for three to four more generations. Since the uncertainties due to variations are expected to increase in each generation, it will significantly impact the performance of design and consequently the yield. Another challenging issue in the nanometer IC design is the high power consumption due to the greater packing density, higher frequency of operation and excessive leakage power. Moreover, the circuits are usually over-designed to compensate for uncertainties due to variations. The over-designed circuits not only make timing closure difficult but also cause excessive power consumption. For portable electronics, excessive power consumption may reduce battery life; for non-portable systems it may impose great difficulties in cooling and packaging. The objective of my research has been to develop design methodologies to address variations and power dissipation for reliable circuit operation. The proposed work has been divided into three parts: the first part addresses the issues related with power/ground noise induced by clock distribution network and proposes techniques to reduce power/ground noise considering the effects of process variations. The second part proposes an elastic pipeline scheme for random circuits with feedback loops. The proposed scheme provides a low-power solution that has the same variation tolerance as the conventional approaches. The third section deals with discrete buffer and wire sizing for link-based non-tree clock network, which is an energy efficient structure for skew tolerance to variations. For the power/ground noise problem, our approach could reduce the peak current and the delay variations by 50% and 51% respectively. Compared to conventional approach, the elastic timing scheme reduces power dissipation by 20% − 27%. The sizing method achieves clock skew reduction of 45% with a small increase in power dissipation

    An Ultra-Low-Energy, Variation-Tolerant FPGA Architecture Using Component-Specific Mapping

    Get PDF
    As feature sizes scale toward atomic limits, parameter variation continues to increase, leading to increased margins in both delay and energy. Parameter variation both slows down devices and causes devices to fail. For applications that require high performance, the possibility of very slow devices on critical paths forces designers to reduce clock speed in order to meet timing. For an important and emerging class of applications that target energy-minimal operation at the cost of delay, the impact of variation-induced defects at very low voltages mandates the sizing up of transistors and operation at higher voltages to maintain functionality. With post-fabrication configurability, FPGAs have the opportunity to self-measure the impact of variation, determining the speed and functionality of each individual resource. Given that information, a delay-aware router can use slow devices on non-critical paths, fast devices on critical paths, and avoid known defects. By mapping each component individually and customizing designs to a component's unique physical characteristics, we demonstrate that we can eliminate delay margins and reduce energy margins caused by variation. To quantify the potential benefit we might gain from component-specific mapping, we first measure the margins associated with parameter variation, and then focus primarily on the energy benefits of FPGA delay-aware routing over a wide range of predictive technologies (45 nm--12 nm) for the Toronto20 benchmark set. We show that relative to delay-oblivious routing, delay-aware routing without any significant optimizations can reduce minimum energy/operation by 1.72x at 22 nm. We demonstrate how to construct an FPGA architecture specifically tailored to further increase the minimum energy savings of component-specific mapping by using the following techniques: power gating, gate sizing, interconnect sparing, and LUT remapping. With all optimizations considered we show a minimum energy/operation savings of 2.66x at 22 nm, or 1.68--2.95x when considered across 45--12 nm. As there are many challenges to measuring resource delays and mapping per chip, we discuss methods that may make component-specific mapping more practical. We demonstrate that a simpler, defect-aware routing achieves 70% of the energy savings of delay-aware routing. Finally, we show that without variation tolerance, scaling from 16 nm to 12 nm results in a net increase in minimum energy/operation; component-specific mapping, however, can extend minimum energy/operation scaling to 12 nm and possibly beyond.</p
    • …
    corecore