8 research outputs found

    Performance-Driven Energy-Efficient VLSI.

    Full text link
    Today, there are two prevalent platforms in VLSI systems: high-performance and ultra-low power. High-speed designs, usually operating at GHz level, provide the required computation abilities to systems but also consume a large amount of power; microprocessors and signal processing units are examples of this type of designs. For ultra-low power designs, voltage scaling methods are usually used to reduce power consumption and extend battery life. However, circuit delay in ultra-low power designs increases exponentially, as voltage is scaled below Vth, and subthreshold leakage energy also increases in a near-exponential fashion. Many methods have been proposed to address key design challenges on these two platforms, energy consumption in high-performance designs, and performance/reliability in ultra-low power designs. In this thesis, charge-recovery design is explored as a solution targeting both platforms to achieve increased energy efficiency over conventional CMOS designs without compromising performance or reliability. To improve performance while still achieving high energy efficiency for ultra-low power designs, we propose Subthreshold Boost Logic (SBL), a new circuit family that relies on charge-recovery design techniques to achieve order-of-magnitude improvements in operating frequencies, and achieve high energy efficiency compared to conventional subthreshold designs. To demonstrate the performance and energy efficiency of SBL, we present a 14-tap 8-bit finite-impulse response (FIR) filter test-chip fabricated in a 0.13µm process. With a single 0.27V supply, the test-chip achieves its most energy efficient operating point at 20MHz, consuming 15.57pJ per cycle with a recovery rate of 89% and a FoM equal to 17.37 nW/Tap/MHz/InBit/CoeffBit. To reduce energy consumption at multi-GHz level frequencies, we explore the application of resonant-clocking to the design of a 5-bit non-interleaved resonant-clock ash ADC with a sampling rate of 7GS/s. The ADC has been designed in a 65nm bulk CMOS process. An integrated 0.77nH inductor is used to resonate the entire clock distribution network to achieve energy efficient operation. Operating at 5.5GHz, the ADC consumes 28mW, yielding 396fJ per conversion step. The clock network accounts for 10.7% of total power and consumes 54% less energy over CV^2. By comparison, in a typical ash ADC design, 30% of total power is clock-related.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/89779/1/wsma_1.pd

    Energy-Efficient Digital Signal Processing Hardware Design.

    Full text link
    As CMOS technology has developed considerably in the last few decades, many SoCs have been implemented across different application areas due to reduced area and power consumption. Digital signal processing (DSP) algorithms are frequently employed in these systems to achieve more accurate operation or faster computation. However, CMOS technology scaling started to slow down recently and relatively large systems consume too much power to rely only on the scaling effect while system power budget such as battery capacity improves slowly. In addition, there exist increasing needs for miniaturized computing systems including sensor nodes that can accomplish similar operations with significantly smaller power budget. Voltage scaling is one of the most promising power saving techniques due to quadratic switching power reduction effect, making it necessary feature for even high-end processors. However, in order to achieve maximum possible energy efficiency, systems should operate in near or sub-threshold regimes where leakage takes significant portion of power. In this dissertation, a few key energy-aware design approaches are described. Considering prominent leakage and larger PVT variability in low operating voltages, multi-level energy saving techniques to be described are applied to key building blocks in DSP applications: architecture study, algorithm-architecture co-optimization, and robust yet low-power memory design. Finally, described approaches are applied to design examples including a visual navigation accelerator, ultra-low power biomedical SoC and face detection/recognition processor, resulting in 2~100 times power savings than state-of-the-art.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/110496/1/djeon_1.pd

    ULTRALOW-POWER, LOW-VOLTAGE DIGITAL CIRCUITS FOR BIOMEDICAL SENSOR NODES

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Standard cell library design for sub-threshold operation

    Get PDF

    Near-Threshold Computing: Past, Present, and Future.

    Full text link
    Transistor threshold voltages have stagnated in recent years, deviating from constant-voltage scaling theory and directly limiting supply voltage scaling. To overcome the resulting energy and power dissipation barriers, energy efficiency can be improved through aggressive voltage scaling, and there has been increased interest in operating at near-threshold computing (NTC) supply voltages. In this region sizable energy gains are achieved with moderate performance loss, some of which can be regained through parallelism. This thesis first provides a methodical definition of how near to threshold is "near threshold" and continues with an in-depth examination of NTC across past, present, and future CMOS technologies. By systematically defining near-threshold, the trends and tradeoffs are analyzed, lending insight in how best to design and optimize near-threshold systems. NTC works best for technologies that feature good circuit delay scalability, therefore technologies without strong short-channel effects. Early planar technologies (prior to 90nm or so) featured good circuit scalability (8x energy gains), but lacked area in which to add cores for parallelization. Recent planar nodes (32nm – 20nm) feature more area for cores but suffer from poor delay scalability, and so are not well-suited for NTC (4x energy gains). The switch to FinFET CMOS technology allows for a return to strong voltage scalability (8x gain), reversing trends seen in planar technologies, while dark silicon has created an opportunity to add cores for parallelization. Improved FinFET voltage scalability even allows for latency reduction of a single task, as long as the task is sufficiently parallelizable (< 10% serial code). Finally, we will look at a technique for fast voltage boosting, called Shortstop, in which a core's operating voltage is raised in 10s of cycles. Shortstop can be used to quickly respond to single-threaded performance demands of a near-threshold system by leveraging the innate parasitic inductance of a dedicated dirty supply rail, further improving energy efficiency. The technique is demonstrated in a wirebond implementation and is able to boost a core up to 1.8x faster than a header-based approach, while reducing supply droop by 2-7x. An improved flip-chip architecture is also proposed.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/113600/1/npfet_1.pd

    Timing-Error Tolerance Techniques for Low-Power DSP: Filters and Transforms

    Get PDF
    Low-power Digital Signal Processing (DSP) circuits are critical to commercial System-on-Chip design for battery powered devices. Dynamic Voltage Scaling (DVS) of digital circuits can reclaim worst-case supply voltage margins for delay variation, reducing power consumption. However, removing static margins without compromising robustness is tremendously challenging, especially in an era of escalating reliability concerns due to continued process scaling. The Razor DVS scheme addresses these concerns, by ensuring robustness using explicit timing-error detection and correction circuits. Nonetheless, the design of low-complexity and low-power error correction is often challenging. In this thesis, the Razor framework is applied to fixed-precision DSP filters and transforms. The inherent error tolerance of many DSP algorithms is exploited to achieve very low-overhead error correction. Novel error correction schemes for DSP datapaths are proposed, with very low-overhead circuit realisations. Two new approximate error correction approaches are proposed. The first is based on an adapted sum-of-products form that prevents errors in intermediate results reaching the output, while the second approach forces errors to occur only in less significant bits of each result by shaping the critical path distribution. A third approach is described that achieves exact error correction using time borrowing techniques on critical paths. Unlike previously published approaches, all three proposed are suitable for high clock frequency implementations, as demonstrated with fully placed and routed FIR, FFT and DCT implementations in 90nm and 32nm CMOS. Design issues and theoretical modelling are presented for each approach, along with SPICE simulation results demonstrating power savings of 21 – 29%. Finally, the design of a baseband transmitter in 32nm CMOS for the Spectrally Efficient FDM (SEFDM) system is presented. SEFDM systems offer bandwidth savings compared to Orthogonal FDM (OFDM), at the cost of increased complexity and power consumption, which is quantified with the first VLSI architecture
    corecore