10 research outputs found

    Low-Power Design of Digital VLSI Circuits around the Point of First Failure

    Get PDF
    As an increase of intelligent and self-powered devices is forecasted for our future everyday life, the implementation of energy-autonomous devices that can wirelessly communicate data from sensors is crucial. Even though techniques such as voltage scaling proved to effectively reduce the energy consumption of digital circuits, additional energy savings are still required for a longer battery life. One of the main limitations of essentially any low-energy technique is the potential degradation of the quality of service (QoS). Thus, a thorough understanding of how circuits behave when operated around the point of first failure (PoFF) is key for the effective application of conventional energy-efficient methods as well as for the development of future low-energy techniques. In this thesis, a variety of circuits, techniques, and tools is described to reduce the energy consumption in digital systems when operated either in the safe and conservative exact region, close to the PoFF, or even inside the inexact region. A straightforward approach to reduce the power consumed by clock distribution while safely operating in the exact region is dual-edge-triggered (DET) clocking. However, the DET approach is rarely taken, primarily due to the perceived complexity of its integration. In this thesis, a fully automated design flow is introduced for applying DET clocking to a conventional single-edge-triggered (SET) design. In addition, the first static true-single-phase-clock DET flip-flop (DET-FF) that completely avoids clock-overlap hazards of DET registers is proposed. Even though the correct timing of synchronous circuits is ensured in worst-case conditions, the critical path might not always be excited. Thus, dynamic clock adjustment (DCA) has been proposed to trim any available dynamic timing margin by changing the operating clock frequency at runtime. This thesis describes a dynamically-adjustable clock generator (DCG) capable of modifying the period of the produced clock signal on a cycle-by-cycle basis that enables the DCA technique. In addition, a timing-monitoring sequential (TMS) that detects input transitions on either one of the clock phases to enable the selection of the best timing-monitoring strategy at runtime is proposed. Energy-quality scaling techniques aimat trading lower energy consumption for a small degradation on the QoS whenever approximations can be tolerated. In this thesis, a low-power methodology for the perturbation of baseline coefficients in reconfigurable finite impulse response (FIR) filters is proposed. The baseline coefficients are optimized to reduce the switching activity of the multipliers in the FIR filter, enabling the possibility of scaling the power consumption of the filter at runtime. The area as well as the leakage power of many system-on-chips is often dominated by embedded memories. Gain-cell embedded DRAM (GC-eDRAM) is a compact, low-power and CMOS-compatible alternative to the conventional static random-access memory (SRAM) when a higher memory density is desired. However, due to GC-eDRAMs relying on many interdependent variables, the adaptation of existing memories and the design of future GCeDRAMs prove to be highly complex tasks. Thus, the first modeling tool that estimates timing, memory availability, bandwidth, and area of GC-eDRAMs for a fast exploration of their design space is proposed in this thesis

    Microarchitectural Low-Power Design Techniques for Embedded Microprocessors

    Get PDF
    With the omnipresence of embedded processing in all forms of electronics today, there is a strong trend towards wireless, battery-powered, portable embedded systems which have to operate under stringent energy constraints. Consequently, low power consumption and high energy efficiency have emerged as the two key criteria for embedded microprocessor design. In this thesis we present a range of microarchitectural low-power design techniques which enable the increase of performance for embedded microprocessors and/or the reduction of energy consumption, e.g., through voltage scaling. In the context of cryptographic applications, we explore the effectiveness of instruction set extensions (ISEs) for a range of different cryptographic hash functions (SHA-3 candidates) on a 16-bit microcontroller architecture (PIC24). Specifically, we demonstrate the effectiveness of light-weight ISEs based on lookup table integration and microcoded instructions using finite state machines for operand and address generation. On-node processing in autonomous wireless sensor node devices requires deeply embedded cores with extremely low power consumption. To address this need, we present TamaRISC, a custom-designed ISA with a corresponding ultra-low-power microarchitecture implementation. The TamaRISC architecture is employed in conjunction with an ISE and standard cell memories to design a sub-threshold capable processor system targeted at compressed sensing applications. We furthermore employ TamaRISC in a hybrid SIMD/MIMD multi-core architecture targeted at moderate to high processing requirements (> 1 MOPS). A range of different microarchitectural techniques for efficient memory organization are presented. Specifically, we introduce a configurable data memory mapping technique for private and shared access, as well as instruction broadcast together with synchronized code execution based on checkpointing. We then study an inherent suboptimality due to the worst-case design principle in synchronous circuits, and introduce the concept of dynamic timing margins. We show that dynamic timing margins exist in microprocessor circuits, and that these margins are to a large extent state-dependent and that they are correlated to the sequences of instruction types which are executed within the processor pipeline. To perform this analysis we propose a circuit/processor characterization flow and tool called dynamic timing analysis. Moreover, this flow is employed in order to devise a high-level instruction set simulation environment for impact-evaluation of timing errors on application performance. The presented approach improves the state of the art significantly in terms of simulation accuracy through the use of statistical fault injection. The dynamic timing margins in microprocessors are then systematically exploited for throughput improvements or energy reductions via our proposed instruction-based dynamic clock adjustment (DCA) technique. To this end, we introduce a 6-stage 32-bit microprocessor with cycle-by-cycle DCA. Besides a comprehensive design flow and simulation environment for evaluation of the DCA approach, we additionally present a silicon prototype of a DCA-enabled OpenRISC microarchitecture fabricated in 28 nm FD-SOI CMOS. The test chip includes a suitable clock generation unit which allows for cycle-by-cycle DCA over a wide range with fine granularity at frequencies exceeding 1 GHz. Measurement results of speedups and power reductions are provided

    Voltage stacking for near/sub-threshold operation

    Get PDF

    Digital Intensive Mixed Signal Circuits with In-situ Performance Monitors

    Get PDF
    University of Minnesota Ph.D. dissertation.November 2016. Major: Electrical/Computer Engineering. Advisor: Chris Kim. 1 computer file (PDF); x, 137 pages.Digital intensive circuit design techniques of different mixed-signal systems such as data converters, clock generators, voltage regulators etc. are gaining attention for the implementation of modern microprocessors and system-on-chips (SoCs) in order to fully utilize the benefits of CMOS technology scaling. Moreover different performance improvement schemes, for example, noise reduction, spur cancellation, linearity improvement etc. can be easily performed in digital domain. In addition to that, increasing speed and complexity of modern SoCs necessitate the requirement of in-situ measurement schemes, primarily for high volume testing. In-situ measurements not only obviate the need for expensive measurement equipments and probing techniques, but also reduce the test time significantly when a large number of chips are required to be tested. Several digital intensive circuit design techniques are proposed in this dissertation along with different in-situ performance monitors for a variety of mixed signal systems. First, a novel beat frequency quantization technique is proposed in a two-step VCO quantizer based ADC implementation for direct digital conversion of low amplitude bio- potential signals. By direct conversion, it alleviates the requirement of the area and power consuming analog-frontend (AFE) used in a conventional ADC designs. This prototype design is realized in a 65nm CMOS technology. Measured SNDR is 44.5dB from a 10mVpp, 300Hz signal and power consumption is only 38μW. Next, three different clock generation circuits, a phase-locked loop (PLL), a multiplying delay-locked loop (MDLL) and a frequency-locked loop (FLL) are presented. First a 0.4-to-1.6GHz sub-sampling fractional-N all digital PLL architecture is discussed that utilizes a D-flip-flop as a digital sub-sampler. Measurement results from a 65nm CMOS test-chip shows 5dB lower phase noise at 100KHz offset frequency, compared to a conventional architecture. The Digital PLL (DPLL) architecture is further extended for a digital MDLL implementation in order to suppress the VCO phase noise beyond the DPLL bandwidth. A zero-offset aperture phase detector (APD) and a digital- to-time converter (DTC) are employed for static phase-offset (SPO) cancellation. A unique in-situ detection circuitry achieves a high resolution SPO measurement in time domain. A 65nm test-chip shows 0.2-to-1.45GHz output frequency range while reducing the phase-noise by 9dB compared to a DPLL. Next, a frequency-to-current converter (FTC) based fractional FLL is proposed for a low accuracy clock generation in an extremely low area for IoT application. High density deep-trench capacitors are used for area reduction. The test-chip is fabricated in a 32nm SOI technology that takes only 0.0054mm2 active area. A high-resolution in-situ period jitter measurement block is also incorporated in this design. Finally, a time based digital low dropout (DLDO) regulator architecture is proposed for fine grain power delivery over a wide load current dynamic range and input/output voltage in order to facilitate dynamic voltage and frequency scaling (DVFS). High- resolution beat frequency detector dynamically adjusts the loop sampling frequency for ripple and settling time reduction due to load transients. A fixed steady-state voltage offset provides inherent active voltage positioning (AVP) for ripple reduction. Circuit simulations in a 65nm technology show more than 90% current efficiency for 100X load current variation, while it can operate for an input voltage range of 0.6V – 1.2V

    Circuits and Systems Advances in Near Threshold Computing

    Get PDF
    Modern society is witnessing a sea change in ubiquitous computing, in which people have embraced computing systems as an indispensable part of day-to-day existence. Computation, storage, and communication abilities of smartphones, for example, have undergone monumental changes over the past decade. However, global emphasis on creating and sustaining green environments is leading to a rapid and ongoing proliferation of edge computing systems and applications. As a broad spectrum of healthcare, home, and transport applications shift to the edge of the network, near-threshold computing (NTC) is emerging as one of the promising low-power computing platforms. An NTC device sets its supply voltage close to its threshold voltage, dramatically reducing the energy consumption. Despite showing substantial promise in terms of energy efficiency, NTC is yet to see widescale commercial adoption. This is because circuits and systems operating with NTC suffer from several problems, including increased sensitivity to process variation, reliability problems, performance degradation, and security vulnerabilities, to name a few. To realize its potential, we need designs, techniques, and solutions to overcome these challenges associated with NTC circuits and systems. The readers of this book will be able to familiarize themselves with recent advances in electronics systems, focusing on near-threshold computing

    Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems

    Full text link
    With the increasing digital services demand, performance and power-efficiency become vital requirements for digital circuits and systems. However, the enabling CMOS technology scaling has been facing significant challenges of device uncertainties, such as process, voltage, and temperature variations. To ensure system reliability, worst-case corner assumptions are usually made in each design level. However, the over-pessimistic worst-case margin leads to unnecessary power waste and performance loss as high as 2.2x. Since optimizations are traditionally confined to each specific level, those safe margins can hardly be properly exploited. To tackle the challenge, it is therefore advised in this Ph.D. thesis to perform a cross-layer optimization for digital signal processing circuits and systems, to achieve a global balance of power consumption and output quality. To conclude, the traditional over-pessimistic worst-case approach leads to huge power waste. In contrast, the adaptive voltage scaling approach saves power (25% for the CORDIC application) by providing a just-needed supply voltage. The power saving is maximized (46% for CORDIC) when a more aggressive voltage over-scaling scheme is applied. These sparsely occurred circuit errors produced by aggressive voltage over-scaling are mitigated by higher level error resilient designs. For functions like FFT and CORDIC, smart error mitigation schemes were proposed to enhance reliability (soft-errors and timing-errors, respectively). Applications like Massive MIMO systems are robust against lower level errors, thanks to the intrinsically redundant antennas. This property makes it applicable to embrace digital hardware that trades quality for power savings.Comment: 190 page

    A 28nm FD-SOI Standard Cell 0.6-1.2V Open-Loop Frequency Multiplier for Low Power SoC Clocking

    No full text
    IEEE International Symposium on Circuits and Systems (ISCAS), Montreal, CANADA, MAY 22-25, 2016International audienceThis paper presents a new design for SoC clocking based on open-loop frequency multiplication. The architecture, fully implemented in 28nm FD-SOI standard cells, achieves frequency tracking within one input reference period making it a promising candidate for Dynamic Voltage and Frequency Scaling (DVFS) schemes. A calibration scheme is implemented for wide voltage range (0.6-1.2V) operation and to offset P&R and variability induced mismatch. Measurements at 0.6V (0.8/0.4V FBB) show a Fmax of 93MHz with a power of 2.91mW/MHz and a jitter of 2.7 % UI

    A 28nm FD-SOI Standard Cell 0.6-1.2V Open-Loop Frequency Multiplier for Low Power SoC Clocking

    No full text
    IEEE International Symposium on Circuits and Systems (ISCAS), Montreal, CANADA, MAY 22-25, 2016International audienceThis paper presents a new design for SoC clocking based on open-loop frequency multiplication. The architecture, fully implemented in 28nm FD-SOI standard cells, achieves frequency tracking within one input reference period making it a promising candidate for Dynamic Voltage and Frequency Scaling (DVFS) schemes. A calibration scheme is implemented for wide voltage range (0.6-1.2V) operation and to offset P&R and variability induced mismatch. Measurements at 0.6V (0.8/0.4V FBB) show a Fmax of 93MHz with a power of 2.91mW/MHz and a jitter of 2.7 % UI

    Cutting Edge Nanotechnology

    Get PDF
    The main purpose of this book is to describe important issues in various types of devices ranging from conventional transistors (opening chapters of the book) to molecular electronic devices whose fabrication and operation is discussed in the last few chapters of the book. As such, this book can serve as a guide for identifications of important areas of research in micro, nano and molecular electronics. We deeply acknowledge valuable contributions that each of the authors made in writing these excellent chapters
    corecore