1,867 research outputs found

    A Charge-Recycling Scheme and Ultra Low Voltage Self-Startup Charge Pump for Highly Energy Efficient Mixed Signal Systems-On-A-Chip

    Get PDF
    The advent of battery operated sensor-based electronic systems has provided a pressing need to design energy-efficient, ultra-low power integrated circuits as a means to improve the battery lifetime. This dissertation describes a scheme to lower the power requirement of a digital circuit through the use of charge-recycling and dynamic supply-voltage scaling techniques. The novel charge-recycling scheme proposed in this research demonstrates the feasibility of operating digital circuits using the charge scavenged from the leakage and dynamic load currents inherent to digital design. The proposed scheme efficiently gathers the “ground-bound” charge into storage capacitor banks. This reclaimed charge is then subsequently recycled to power the source digital circuit. The charge-recycling methodology has been implemented on a 12-bit Gray-code counter operating at frequencies of less than 50 MHz. The circuit has been designed in a 90-nm process and measurement results reveal more than 41% reduction in the average energy consumption of the counter. The total energy savings including the power consumed for the generation of control signals aggregates to an average of 23%. The proposed methodology can be applied to an existing digital path without any design change to the circuit but with only small loss to the performance. Potential applications of this scheme are described, specifically in wide-temperature dynamic power reduction and as a source for energy harvesters. The second part of this dissertation deals with the design and development of a self-starting, ultra-low voltage, switched-capacitor (SC) DC-DC converter that is essential to an energy harvesting system. The proposed charge-pump based SC-converter operates from 125-mV input and thus enables battery-less operation in ultra-low voltage energy harvesters. The charge pump does not require any external components or expensive post-fabrication processing to enable low-voltage operation. This design has been implemented in a 130-nm CMOS process. While the proposed charge pump provides significant efficiency enhancement in energy harvesters, it can also be incorporated within charge recycling systems to facilitate adaptable charge-recycling levels. In total, this dissertation provides key components needed for highly energy-efficient mixed signal systems-on-a-chip

    Low energy digital circuits in advanced nanometer technologies

    Get PDF
    The demand for portable devices and the continuing trend towards the Internet ofThings (IoT) have made of energy consumption one of the main concerns in the industry and researchers. The most efficient way of reducing the energy consump-tion of digital circuits is decreasing the supply voltage (Vdd) since the dynamicenergy quadratically depends onVdd. Several works have shown that an optimumsupply voltage exists that minimizes the energy consumption of digital circuits. This optimum supply voltage is usually around 200 mV and 400 mV dependingon the circuit and technology used. To obtain these low supply voltages, on-chipdc-dc converters with high efficiency are needed.This thesis focuses on the study of subthreshold digital systems in advancednanometer technologies. These systems usually can be divided into a Power Man-agement Unit (PMU) and a digital circuit operating at the subthreshold regime.In particular, while considering the PMU, one of the key circuits is the dc-dcconverter. This block converts the voltage from the power source (battery, supercapacitor or wireless power transfer link) to a voltage between 200 mV and 400mV in order to power the digital circuit. In this thesis, we developed two chargerecycling techniques in order to improve the efficiency of switched capacitors dc-dcconverters. The first one is based on a technique used in adiabatic circuits calledstepwise charging. This technique was used in circuits and applications wherethe switching consumption of a big capacitance is very important. We analyzedthe possibility of using this technique in switched capacitor dc-dc converters withintegrated capacitors. We showed through measurements that a 29% reductionin the gate drive losses can be obtained with this technique. The second one isa simplification of stepwise charging which can be applied in some architecturesof switched capacitors dc-dc converters. We also fabricated and tested a dc-dcconverter with this technique and obtained a 25% energy reduction in the drivingof the switches that implement the converter.Furthermore, we studied the digital circuit working in the subthreshold regime,in particular, operating at the minimum energy point. We studied different modelsfor circuits working in these conditions and improved them by considering thedifferences between the NMOS and PMOS transistors. We obtained an optimumNMOS/PMOS leakage current imbalance that minimizes the total leakage energy per operation. This optimum depends on the architecture of the digital circuitand the input data. However, we also showed that important energy reductionscan be obtained by operating at a mean optimum imbalance. We proposed two techniques to achieve the optimum imbalance. We used aFully Depleted Silicon on Insulator (FD-SOI) 28 nm technology for most of the simulations, but we also show that these techniques can be applied in traditionalbulk CMOS technologies. The first one consists in using the back plane voltage of the transistors (or bulk voltage in traditional CMOS) to adjust independently theleakage current of the NMOS and PMOS transistor to work under the optimum NMOS/PMOS leakage current imbalance. We called this approach the OptimumBack Plane Biasing (OBB). A second technique consists of using the length of the transistors to adjust this leakage current imbalance. In the subthreshold regimeand in advanced nanometer technologies a moderate increase in the length has little impact in the output capacitance of the gates and thus in the dynamic energy.We called this approach an Asymmetric Length Biasing (ALB). Finally, we use these techniques in some basic circuits such as adders. We show that around 50% energy reduction can be obtained, in a wide range of frequency while working near the minimum energy point and using these techniques. The main contributions of this thesis are: • Analysis of the stepwise charging technique in small capacitances. •Implementation of stepwise charging technique as a charge recycling tech-nique for efficiency improvement in switched capacitor dc-dc converters. • Development of a charge sharing technique for efficiency improvement inswitched capacitor dc-dc converters. • Analysis of minimum operating voltage of digital circuits due to intrinsicnoise and the impact of technology scaling in this minimum. • Improvement in the modeling of the minimum energy point while considering NMOS and PMOS transistors difference. • Demonstration of the existence of an optimum leakage current imbalance be-tween the NMOS and PMOS transistors that minimizes energy consumptionin the subthreshold regiion. • Development of a back plane (bulk) voltage strategy for working in this optimum.• Development of a sizing strategy for working in the aforementioned optimum. • Analysis of the impact of architecture and input data on the optimum im-balance. The thesis is based on the publications [1–8]. During the Ph.D. program, other publications were generated [9–16] that are partially related with the thesis butwere not included in it.La constante demanda de dispositivos portables y los avances hacia la Internet de las Cosas han hecho del consumo de energía uno de los mayores desafíos y preocupación en la industria y la academia. La forma más eficiente de reducir el consumo de energía de los circuitos digitales es reduciendo su voltaje de alimentación ya que la energía dinámica depende de manera cuadrática con dicho voltaje. Varios trabajos demostraron que existe un voltaje de alimentación óptimo, que minimiza la energía consumida para realizar cierta operación en un circuito digital, llamado punto de mínima energía. Este óptimo voltaje se encuentra usualmente entre 200 mV y 400 mV dependiendo del circuito y de la tecnología utilizada. Para obtener estos voltajes de alimentación de la fuente de energía, se necesitan conversores dc-dc integrados con alta eficiencia. Esta tesis se concentra en el estudio de sistemas digitales trabajando en la región sub umbral diseñados en tecnologías nanométricas avanzadas (28 nm). Estos sistemas se pueden dividir usualmente en dos bloques, uno llamado bloque de manejo de potencia, y el segundo, el circuito digital operando en la region sub umbral. En particular, en lo que corresponde al bloque de manejo de potencia, el circuito más crítico es en general el conversor dc-dc. Este circuito convierte el voltaje de una batería (o super capacitor o enlace de transferencia inalámbrica de energía o unidad de cosechado de energía) en un voltaje entre 200 mV y 400 mV para alimentar el circuito digital en su voltaje óptimo. En esta tesis desarrollamos dos técnicas que, mediante el reciclado de carga, mejoran la eficiencia de los conversores dc-dc a capacitores conmutados. La primera es basada en una técnica utilizada en circuitos adiabáticos que se llama carga gradual o a pasos. Esta técnica se ha utilizado en circuitos y aplicaciones en donde el consumo por la carga y descarga de una capacidad grande es dominante. Nosotros analizamos la posibilidad de utilizar esta técnica en conversores dc-dc a capacitores conmutados con capacitores integrados. Se demostró a través de medidas que se puede reducir en un 29% el consumo debido al encendido y apagado de las llaves que implementan el conversor dc-dc. La segunda técnica, es una simplificación de la primera, la cual puede ser aplicada en ciertas arquitecturas de conversores dc-dc a capacitores conmutados. También se fabricó y midió un conversor con esta técnica y se obtuvo una reducción del 25% en la energía consumida por el manejo de las llaves del conversor. Por otro lado, estudiamos los circuitos digitales operando en la región sub umbral y en particular cerca del punto de mínima energía. Estudiamos diferentes modelos para circuitos operando en estas condiciones y los mejoramos considerando las diferencias entre los transistores NMOS y PMOS. Mediante este modelo demostramos que existe un óptimo en la relación entre las corrientes de fuga de ambos transistores que minimiza la energía de fuga consumida por operación. Este óptimo depende de la arquitectura del circuito digital y ademas de los datos de entrada del circuito. Sin embargo, demostramos que se puede reducir el consumo de manera considerable al operar en un óptimo promedio. Propusimos dos técnicas para alcanzar la relación óptima. Utilizamos una tecnología FD-SOI de 28nm para la mayoría de las simulaciones, pero también mostramos que estas técnicas pueden ser utilizadas en tecnologías bulk convencionales. La primer técnica, consiste en utilizar el voltaje de la puerta trasera (o sustrato en CMOS convencional) para ajustar de manera independiente las corrientes del NMOS y PMOS para que el circuito trabaje en el óptimo de la relación de corrientes. Esta técnica la llamamos polarización de voltaje de puerta trasera óptimo. La segunda técnica, consiste en utilizar los largos de los transistores para ajustar las corrientes de fugas de cada transistor y obtener la relación óptima. Trabajando en la región sub umbral y en tecnologías avanzadas, incrementar moderadamente el largo del transistor tiene poco impacto en la energía dinámica y es por eso que se puede utilizar. Finalmente, utilizamos estas técnicas en circuitos básicos como sumadores y mostramos que se puede obtener una reducción de la energía consumida de aproximadamente 50%, en un amplio rango de frecuencias, mientras estos circuitos trabajan cerca del punto de energía mínima. Las principales contribuciones de la tesis son: • Análisis de la técnica de carga gradual o a pasos en capacidades pequeñas. • Implementación de la técnica de carga gradual para la mejora de eficiencia de conversores dc-dc a capacitores conmutados. • Simplificación de la técnica de carga gradual para mejora de la eficiencia en algunas arquitecturas de conversores dc-dc de capacitores conmutados. • Análisis del mínimo voltaje de operación en circuitos digitales debido al ruido intrínseco del dispositivo y el impacto del escalado de las tecnologías en el mismo. • Mejoras en el modelado del punto de energía mínima de operación de un circuito digital en el cual se consideran las diferencias entre el transistor PMOS y NMOS. • Demostración de la existencia de un óptimo en la relación entre las corrientes de fuga entre el NMOS y PMOS que minimiza la energía de fugas consumida en la región sub umbral. • Desarrollo de una estrategia de polarización del voltaje de puerta trasera para que el circuito digital trabaje en el óptimo antes mencionado. • Desarrollo de una estrategia para el dimensionado de los transistores que componen las compuertas digitales que permite al circuito digital operar en el óptimo antes mencionado. • Análisis del impacto de la arquitectura del circuito y de los datos de entrada del mismo en el óptimo antes mencionado

    Power Reductions with Energy Recovery Using Resonant Topologies

    Get PDF
    The problem of power densities in system-on-chips (SoCs) and processors has become more exacerbated recently, resulting in high cooling costs and reliability issues. One of the largest components of power consumption is the low skew clock distribution network (CDN), driving large load capacitance. This can consume as much as 70% of the total dynamic power that is lost as heat, needing elaborate sensing and cooling mechanisms. To mitigate this, resonant clocking has been utilized in several applications over the past decade. An improved energy recovering reconfigurable generalized series resonance (GSR) solution with all the critical support circuitry is developed in this work. This LC resonant clock driver is shown to save about 50% driver power (\u3e40% overall), on a 22nm process node and has 50% less skew than a non-resonant driver at 2GHz. It can operate down to 0.2GHz to support other energy savings techniques like dynamic voltage and frequency scaling (DVFS). As an example, GSR can be configured for the simpler pulse series resonance (PSR) operation to enable further power saving for double data rate (DDR) applications, by using de-skewing latches instead of flip-flop banks. A PSR based subsystem for 40% savings in clocking power with 40% driver active area reduction xii is demonstrated. This new resonant driver generates tracking pulses at each transition of clock for dual edge operation across DVFS. PSR clocking is designed to drive explicit-pulsed latches with negative setup time. Simulations using 45nm IBM/PTM device and interconnect technology models, clocking 1024 flip-flops show the reductions, compared to non-resonant clocking. DVFS range from 2GHz/1.3V to 200MHz/0.5V is obtained. The PSR frequency is set \u3e3× the clock rate, needing only 1/10th the inductance of prior-art LC resonance schemes. The skew reductions are achieved without needing to increase the interconnect widths owing to negative set-up times. Applications in data circuits are shown as well with a 90nm example. Parallel resonant and split-driver non-resonant configurations as well are derived from GSR. Tradeoffs in timing performance versus power, based on theoretical analysis, are compared for the first time and verified. This enables synthesis of an optimal topology for a given application from the GSR

    Standard cell library design for sub-threshold operation

    Get PDF

    Radiation Hardened by Design Methodologies for Soft-Error Mitigated Digital Architectures

    Get PDF
    abstract: Digital architectures for data encryption, processing, clock synthesis, data transfer, etc. are susceptible to radiation induced soft errors due to charge collection in complementary metal oxide semiconductor (CMOS) integrated circuits (ICs). Radiation hardening by design (RHBD) techniques such as double modular redundancy (DMR) and triple modular redundancy (TMR) are used for error detection and correction respectively in such architectures. Multiple node charge collection (MNCC) causes domain crossing errors (DCE) which can render the redundancy ineffectual. This dissertation describes techniques to ensure DCE mitigation with statistical confidence for various designs. Both sequential and combinatorial logic are separated using these custom and computer aided design (CAD) methodologies. Radiation vulnerability and design overhead are studied on VLSI sub-systems including an advanced encryption standard (AES) which is DCE mitigated using module level coarse separation on a 90-nm process with 99.999% DCE mitigation. A radiation hardened microprocessor (HERMES2) is implemented in both 90-nm and 55-nm technologies with an interleaved separation methodology with 99.99% DCE mitigation while achieving 4.9% increased cell density, 28.5 % reduced routing and 5.6% reduced power dissipation over the module fences implementation. A DMR register-file (RF) is implemented in 55 nm process and used in the HERMES2 microprocessor. The RF array custom design and the decoders APR designed are explored with a focus on design cycle time. Quality of results (QOR) is studied from power, performance, area and reliability (PPAR) perspective to ascertain the improvement over other design techniques. A radiation hardened all-digital multiplying pulsed digital delay line (DDL) is designed for double data rate (DDR2/3) applications for data eye centering during high speed off-chip data transfer. The effect of noise, radiation particle strikes and statistical variation on the designed DDL are studied in detail. The design achieves the best in class 22.4 ps peak-to-peak jitter, 100-850 MHz range at 14 pJ/cycle energy consumption. Vulnerability of the non-hardened design is characterized and portions of the redundant DDL are separated in custom and auto-place and route (APR). Thus, a range of designs for mission critical applications are implemented using methodologies proposed in this work and their potential PPAR benefits explored in detail.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Embedding Logic and Non-volatile Devices in CMOS Digital Circuits for Improving Energy Efficiency

    Get PDF
    abstract: Static CMOS logic has remained the dominant design style of digital systems for more than four decades due to its robustness and near zero standby current. Static CMOS logic circuits consist of a network of combinational logic cells and clocked sequential elements, such as latches and flip-flops that are used for sequencing computations over time. The majority of the digital design techniques to reduce power, area, and leakage over the past four decades have focused almost entirely on optimizing the combinational logic. This work explores alternate architectures for the flip-flops for improving the overall circuit performance, power and area. It consists of three main sections. First, is the design of a multi-input configurable flip-flop structure with embedded logic. A conventional D-type flip-flop may be viewed as realizing an identity function, in which the output is simply the value of the input sampled at the clock edge. In contrast, the proposed multi-input flip-flop, named PNAND, can be configured to realize one of a family of Boolean functions called threshold functions. In essence, the PNAND is a circuit implementation of the well-known binary perceptron. Unlike other reconfigurable circuits, a PNAND can be configured by simply changing the assignment of signals to its inputs. Using a standard cell library of such gates, a technology mapping algorithm can be applied to transform a given netlist into one with an optimal mixture of conventional logic gates and threshold gates. This approach was used to fabricate a 32-bit Wallace Tree multiplier and a 32-bit booth multiplier in 65nm LP technology. Simulation and chip measurements show more than 30% improvement in dynamic power and more than 20% reduction in core area. The functional yield of the PNAND reduces with geometry and voltage scaling. The second part of this research investigates the use of two mechanisms to improve the robustness of the PNAND circuit architecture. One is the use of forward and reverse body biases to change the device threshold and the other is the use of RRAM devices for low voltage operation. The third part of this research focused on the design of flip-flops with non-volatile storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated with both conventional D-flipflop and the PNAND circuits to implement non-volatile logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of system locally when a power interruption occurs. However, manufacturing variations in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading to an overly pessimistic design and consequently, higher energy consumption. A detailed analysis of the design trade-offs in the driver circuitry for performing backup and restore, and a novel method to design the energy optimal driver for a given yield is presented. Efficient designs of two nonvolatile flip-flop (NVFF) circuits are presented, in which the backup time is determined on a per-chip basis, resulting in minimizing the energy wastage and satisfying the yield constraint. To achieve a yield of 98%, the conventional approach would have to expend nearly 5X more energy than the minimum required, whereas the proposed tunable approach expends only 26% more energy than the minimum. A non-volatile threshold gate architecture NV-TLFF are designed with the same backup and restore circuitry in 65nm technology. The embedded logic in NV-TLFF compensates performance overhead of NVL. This leads to the possibility of zero-overhead non-volatile datapath circuits. An 8-bit multiply-and- accumulate (MAC) unit is designed to demonstrate the performance benefits of the proposed architecture. Based on the results of HSPICE simulations, the MAC circuit with the proposed NV-TLFF cells is shown to consume at least 20% less power and area as compared to the circuit designed with conventional DFFs, without sacrificing any performance.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Performance and power optimization in VLSI physical design

    Get PDF
    As VLSI technology enters the nanoscale regime, a great amount of efforts have been made to reduce interconnect delay. Among them, buffer insertion stands out as an effective technique for timing optimization. A dramatic rise in on-chip buffer density has been witnessed. For example, in two recent IBM ASIC designs, 25% gates are buffers. In this thesis, three buffer insertion algorithms are presented for the procedure of performance and power optimization. The second chapter focuses on improving circuit performance under inductance effect. The new algorithm works under the dynamic programming framework and runs in provably linear time for multiple buffer types due to two novel techniques: restrictive cost bucketing and efficient delay update. The experimental results demonstrate that our linear time algorithm consistently outperforms all known RLC buffering algorithms in terms of both solution quality and runtime. That is, the new algorithm uses fewer buffers, runs in shorter time and the buffered tree has better timing. The third chapter presents a method to guarantee a high fidelity signal transmission in global bus. It proposes a new redundant via insertion technique to reduce via variation and signal distortion in twisted differential line. In addition, a new buffer insertion technique is proposed to synchronize the transmitted signals, thus further improving the effectiveness of the twisted differential line. Experimental results demonstrate a 6GHz signal can be transmitted with high fidelity using the new approaches. In contrast, only a 100MHz signal can be reliably transmitted using a single-end bus with power/ground shielding. Compared to conventional twisted differential line structure, our new techniques can reduce the magnitude of noise by 45% as witnessed in our simulation. The fourth chapter proposes a buffer insertion and gate sizing algorithm for million plus gates. The algorithm takes a combinational circuit as input instead of individual nets and greatly reduces the buffer and gate cost of the entire circuit. The algorithm has two main features: 1) A circuit partition technique based on the criticality of the primary inputs, which provides the scalability for the algorithm, and 2) A linear programming formulation of non-linear delay versus cost tradeoff, which formulates the simultaneous buffer insertion and gate sizing into linear programming problem. Experimental results on ISCAS85 circuits show that even without the circuit partition technique, the new algorithm achieves 17X speedup compared with path based algorithm. In the meantime, the new algorithm saves 16.0% buffer cost, 4.9% gate cost, 5.8% total cost and results in less circuit delay
    corecore