422 research outputs found

    Scalable Energy-Recovery Architectures.

    Energy efficiency is a critical challenge for today's integrated circuits, especially for high-end digital signal processing and communications that require both high throughput and low energy dissipation for extended battery life. Charge-recovery logic recovers and reuses charge using inductive elements and has the potential to achieve order-of-magnitude improvement in energy efficiency while maintaining high performance. However, the lack of large-scale high-speed silicon demonstrations and inductor area overheads are two major concerns. This dissertation focuses on scalable charge-recovery designs. We present a semi-automated design flow to enable the design of large-scale charge-recovery chips. We also present a new architecture that uses in-package inductors, eliminating the area overheads caused by the use of integrated inductors in high-performance charge-recovery chips. To demonstrate our semi-automated flow, which uses custom-designed standard-cell-like dynamic cells, we have designed a 576-bit charge-recovery low-density parity-check (LDPC) decoder chip. Functioning correctly at clock speeds above 1 GHz, this prototype is the first-ever demonstration of a GHz-speed charge-recovery chip of significant complexity. In terms of energy consumption, this chip improves over recent state-of-the-art LDPCs by at least 1.3 times with comparable or better area efficiency. To demonstrate our architecture for eliminating inductor overheads, we have designed a charge-recovery LDPC decoder chip with in-package inductors. This test-chip has been fabricated in a 65nm CMOS flip-chip process. A custom 6-layer FC-BGA package substrate has been designed with 16 inductors embedded in the fifth layer of the package substrate, yielding higher Q and significantly improving area efficiency and energy efficiency compared to their on-chip counterparts. 
From measurements, this chip achieves at least 2.3 times lower energy consumption with better area efficiency over state-of-the-art published designs. PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/116653/1/terryou_1.pd

    DESIGNING and ANALYSIS of GATES BASED on ADIABATIC LOGIC

    Owing to its many advantages, CMOS is widely used in the design of LSI (Large Scale Integration) and VLSI (Very Large Scale Integration) circuits. However, CMOS circuits dissipate power through several mechanisms, which can be categorized as dynamic power consumption, short-circuit currents, and leakage current. To reduce this power consumption, another CMOS logic family, adiabatic switching logic, is based on the adiabatic switching principle. The adiabatic logic structure significantly decreases power dissipation. The switching technique reuses the energy stored in the load capacitors, whereas the conventional method discharges the capacitors to ground and wastes this energy. The present paper discusses the principles of adiabatic logic, its structure, and the classification of different adiabatic logic circuits. It also attempts to modify the 2PASCL (Two Phase Adiabatic Static CMOS Logic) circuit by replacing the MOS diode with a simple PN diode, which decreases the impact of capacitances at high clock frequencies.
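
The energy argument in this abstract can be made concrete with the textbook first-order model of adiabatic charging. The sketch below is not taken from the paper: it assumes a load capacitance charged through a fixed on-resistance by a linear ramp, and all component values are hypothetical.

```python
def conventional_energy(c_load, v_dd):
    """Energy dissipated when a load is charged abruptly from a fixed supply: (1/2) C V^2."""
    return 0.5 * c_load * v_dd ** 2


def adiabatic_energy(r_on, c_load, v_dd, ramp_time):
    """First-order energy dissipated when the load is charged by a ramp of
    duration ramp_time: (RC / T) C V^2. Only meaningful for ramp_time >> RC."""
    return (r_on * c_load / ramp_time) * c_load * v_dd ** 2


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    r_on, c_load, v_dd = 1e3, 10e-15, 1.0  # 1 kOhm, 10 fF, 1 V
    tau = r_on * c_load
    e_cv = conventional_energy(c_load, v_dd)
    for multiple in (10, 100, 1000):
        e_ad = adiabatic_energy(r_on, c_load, v_dd, multiple * tau)
        print(f"T = {multiple:>4} RC: adiabatic / conventional = {e_ad / e_cv:.4f}")
```

Under this model the dissipation falls in proportion to the ramp time, which is why adiabatic circuits trade switching speed for energy.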

    Techniques of Energy-Efficient VLSI Chip Design for High-Performance Computing

    How to implement quality computing within a limited power budget is the key factor in moving very large scale integration (VLSI) chip design forward. This work introduces various low-power VLSI design techniques used for state-of-the-art computing. From the viewpoint of power supply, conventional on-chip voltage regulators based on analog blocks impose a large power and area overhead on computational chips. Motivated by this, a digital switchable-pin method that dynamically regulates power at low circuit cost is proposed so that computation can be executed with a stable supply voltage. For the multiplier, one of the most widely used and time-consuming arithmetic units, operation in the logarithmic domain offers advantages over the binary domain in computation latency, power, and area. However, the conversion error it introduces reduces the reliability of subsequent computation (e.g. multiplication and division). In this work, a fast calibration method that suppresses the conversion error, together with its VLSI implementation, is proposed. The proposed logarithmic converter can be supplied by DC power to achieve fast conversion or by clocked power to reduce the power dissipated during conversion. Moving beyond traditional computation methods and widely used static logic, a neuron-like cell is also studied in this work. Using logic based on multiple-input floating-gate (MIFG) metal-oxide-semiconductor field-effect transistors (MOSFETs), a 32-bit, 16-operation arithmetic logic unit (ALU) with zipped decoding and a feedback loop is designed. Compared to a static-logic ALU, the proposed ALU reduces switching power and has a strong driven-in capability due to its coupling capacitors. In addition, recent neural computations pose serious challenges to digital VLSI implementation because of heavy matrix multiplication workloads and non-linear functions. 
An analog VLSI design that is compatible with an external digital environment is proposed for long short-term memory (LSTM) networks. The entire analog network computes much faster and has higher energy efficiency than its digital counterpart.
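
To illustrate where a logarithmic-domain multiplier's conversion error comes from, here is a minimal sketch using the classic piecewise-linear (Mitchell-style) approximation of log2. This is a swapped-in textbook technique, not the converter or calibration method proposed in the thesis, and the function names are hypothetical.

```python
import math


def approx_log2(n):
    """Piecewise-linear log2 of a positive integer:
    position of the leading one plus the remaining bits read as a linear fraction."""
    k = n.bit_length() - 1
    return k + (n - (1 << k)) / (1 << k)


def approx_antilog2(y):
    """Inverse of approx_log2: 2^floor(y) * (1 + fractional part of y)."""
    k = math.floor(y)
    return (2 ** k) * (1 + (y - k))


def approx_multiply(a, b):
    """Multiply by adding approximate logarithms, then converting back."""
    return approx_antilog2(approx_log2(a) + approx_log2(b))


def worst_relative_error(limit):
    """Largest relative error of approx_multiply over all operand pairs in 1..limit."""
    worst = 0.0
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            worst = max(worst, abs(a * b - approx_multiply(a, b)) / (a * b))
    return worst


if __name__ == "__main__":
    print(approx_multiply(4, 8))     # powers of two convert exactly
    print(approx_multiply(3, 3))     # conversion error appears for other operands
    print(worst_relative_error(64))
```

Without correction, the error of this approximation is bounded by 1/9 of the true product, which is the kind of error a calibration step would aim to suppress.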

    Designing Novel Hardware Security Primitives for Smart Computing Devices

    Smart computing devices are miniaturized electronic devices that can sense their surroundings, communicate, and share information autonomously with other devices to work cohesively. Smart devices have played a major role in improving the quality of life and boosting the global economy. They are ubiquitous in smart homes, smart cities, smart grids, industry, healthcare, hazardous-environment control, and the military. However, we have witnessed an exponential rise in potential threat vectors and physical attacks in recent years. Conventional software-based security approaches are not suitable for smart computing devices; therefore, hardware-enabled security solutions have emerged as an attractive choice. Developing hardware security primitives, such as a True Random Number Generator (TRNG) and a Physically Unclonable Function (PUF), from the electrical properties of the sensor could be a novel research direction. Secondly, the Lightweight Cryptographic (LWC) ciphers used in smart computing devices are vulnerable to Correlation Power Analysis (CPA) attacks. CPA performs statistical analysis of the power consumption of the cryptographic core and reveals the encryption key. Countermeasures against CPA increase energy consumption and are therefore not suitable for battery-operated smart computing devices. The primary goal of this dissertation is to develop novel hardware security primitives from existing sensors and an energy-efficient LWC circuit implementation with CPA resilience. To achieve these, we focus on developing a TRNG and a PUF from existing photoresistor and photovoltaic solar cell sensors in smart devices. Further, we explore the energy recovery computing (also known as adiabatic computing) circuit design technique, which reduces energy consumption compared to a baseline CMOS logic design while at the same time increasing CPA resilience in low-frequency applications, e.g. 
wearable fitness gadgets, hearing aids, and biomedical instruments. The first contribution of this dissertation is to develop a TRNG prototype from the uncertainty present in photoresistor sensors. Existing sensor-based TRNGs suffer from a low random bit generation rate and are therefore not suitable for real-time applications. The proposed prototype has an average random bit generation rate of 8 kbps, 32 times higher than the existing sensor-based TRNG. The proposed lightweight scrambling method results in random bit entropy close to the ideal value of 1. The proposed TRNG prototype passes all 15 statistical tests of the National Institute of Standards and Technology (NIST) Statistical Test Suite with quality performance. The second contribution of this dissertation is to develop an integrated TRNG-PUF designed using photovoltaic solar cell sensors. The TRNG and PUF are mutually independent in the way they are designed; therefore, integrating them as one architecture can be beneficial in resource-constrained computing devices. We propose a novel histogram-based technique to segregate photovoltaic solar cell sensor responses suitable for the TRNG and the PUF respectively. The proposed prototype achieves approximately 34% improvement in TRNG output. The proposed prototype achieves an average of 92.13% reliability and 50.91% uniformity in the PUF response. The proposed sensor-based hardware security primitives do not require additional interfacing hardware. Therefore, they can be ported as a software update to existing photoresistor and photovoltaic sensor-based devices. Furthermore, the sensor-based design approach can identify physically tampered and faulty sensor nodes during authentication, as their response bits differ. The third contribution is the development of a novel 2-phase sinusoidal clocking implementation, 2-SPGAL, for the existing Symmetric Pass Gate Adiabatic Logic (SPGAL). 
The proposed 2-SPGAL logic-based LWC cipher PRESENT shows an average of 49.34% energy saving compared to the baseline CMOS logic implementation. Furthermore, the 2-SPGAL prototype has an average of 22.76% better energy saving compared to 2-EE-SPFAL (2-phase Energy-Efficient-Secure Positive Feedback Adiabatic Logic). The proposed 2-SPGAL was tested for energy-efficiency performance over the frequency range of 50 kHz to 250 kHz, used in healthcare gadgets and biomedical instruments. The proposed 2-SPGAL based design saves 16.78% in transistor count compared to its 2-EE-SPFAL counterpart. The final contribution is to explore Clocked CMOS Adiabatic Logic (CCAL) to design a cryptographic circuit. The previously proposed 2-SPGAL and 2-EE-SPFAL use two complementary pairs of the transistor evaluation network, resulting in a higher transistor count compared to the CMOS counterpart. The CCAL structure is very similar to CMOS and, unlike 2-SPGAL and 2-EE-SPFAL, does not require discharge circuitry to improve security performance. The case-study implementation of the LWC cipher PRESENT S-Box using CCAL results in 45.74% and 34.88% transistor count savings compared to its 2-EE-SPFAL and 2-SPGAL counterparts. Furthermore, the case-study implementation using CCAL shows more than 95% energy saving compared to CMOS logic in the frequency range 50 kHz to 125 kHz, and approximately 60% energy saving at 250 kHz. The case study also shows 32.67% and 11.21% more energy saving compared to 2-EE-SPFAL and 2-SPGAL respectively at 250 kHz. We also show that a 200 fF tank capacitor in the clock generator circuit results in optimum energy and security performance in CCAL.
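
The CPA attack described in this abstract can be sketched on a toy model. The example below is an illustration only, not the dissertation's experimental setup: it assumes that power leakage equals the Hamming weight of the PRESENT S-box output plus Gaussian noise, attacks a single 4-bit key nibble, and uses hypothetical function names throughout.

```python
import random

# PRESENT 4-bit S-box, as given in the published cipher specification.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(x):
    return bin(x).count("1")


def simulate_traces(key, plaintexts, noise_sigma, rng):
    """Toy leakage model (an assumption): HW(SBOX[p ^ key]) plus Gaussian noise."""
    return [hamming_weight(SBOX[p ^ key]) + rng.gauss(0.0, noise_sigma)
            for p in plaintexts]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def cpa_recover(plaintexts, traces):
    """Return the key-nibble guess whose predicted leakage correlates best with the traces."""
    scores = []
    for guess in range(16):
        hypothesis = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        scores.append(pearson(hypothesis, traces))
    return max(range(16), key=lambda g: scores[g])


if __name__ == "__main__":
    rng = random.Random(1)
    secret = 0xA  # hypothetical key nibble
    plaintexts = [rng.randrange(16) for _ in range(500)]
    traces = simulate_traces(secret, plaintexts, noise_sigma=1.0, rng=rng)
    print("recovered key nibble:", hex(cpa_recover(plaintexts, traces)))
```

Logic styles with data-independent power draw aim to flatten exactly this correlation, which is the motivation the abstract gives for adiabatic implementations.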

    Comparative Study on Performance and Variation Tolerance of Low Power Circuit

    The demand for low-power electronic devices is increasing rapidly in current VLSI technology. As alternatives to conventional CMOS circuits operating at nominal supply voltage, several kinds of circuits have been proposed with the goal of reducing power consumption. This research focuses on evaluating the performance, power, and variation tolerance of near/sub-threshold computing and adiabatic logic circuits. Arithmetic logic units (ALUs) are designed in a 15nm FinFET process technology for these circuit styles. The evaluation is carried out by simulations on these ALU designs. The variation model considers ambient temperature variations and power supply fluctuations that emulate wireless sensor node applications. The results show that a conventional static CMOS circuit operating in the near-threshold region exhibits power efficiency similar to that of an adiabatic logic circuit operating in the same region, while at the same time offering better temperature and voltage variation tolerance in most cases. The study results provide helpful guidance for low-power electronic system design.

    Optimizing the integration and energy efficiency of through silicon via-based 3D interconnects

    The aggressive scaling of CMOS process technology has been driving the rapid growth of the semiconductor industry for more than three decades. In recent years, the performance gains enabled by CMOS scaling have been increasingly challenged by highly parasitic on-chip interconnects, as wire parasitics do not scale at the same pace. Emerging 3D integration technologies based on vertical through-silicon vias (TSVs) promise a solution to the interconnect performance bottleneck, along with reduced fabrication cost and heterogeneous integration. As TSVs are a relatively recent interconnect technology, innovative test structures are required to evaluate and optimise the process, as well as to extract parameters for the generation of design rules and models. From the circuit designer’s perspective, the critical TSV characteristics are parasitic capacitance and thermomechanical stress distribution. This work proposes new test structures for extracting these characteristics. The structures were fabricated on a 65nm 3D process and used for the evaluation of that technology. Furthermore, as TSVs are implemented in large, densely interconnected 3D-system-on-chips (SoCs), the TSV parasitic capacitance may become an important source of energy dissipation. Typical low-power techniques based on voltage scaling can be used, though this represents a technical challenge in modern technology nodes. In this work, a novel TSV interconnection scheme is proposed based on reversible computing, which shows frequency-dependent energy dissipation. The scheme is analysed using theoretical modelling, while a demonstrator IC was designed based on the developed theory and fabricated on a 130nm 3D process. EThOS - Electronic Theses Online Service. Engineering and Physical Science Research Council (EPSRC). GB. United Kingdom

    Power Reductions with Energy Recovery Using Resonant Topologies

    The problem of power density in system-on-chips (SoCs) and processors has worsened recently, resulting in high cooling costs and reliability issues. One of the largest components of power consumption is the low-skew clock distribution network (CDN), which drives a large load capacitance. It can consume as much as 70% of the total dynamic power, which is lost as heat and requires elaborate sensing and cooling mechanisms. To mitigate this, resonant clocking has been utilized in several applications over the past decade. An improved energy-recovering reconfigurable generalized series resonance (GSR) solution with all the critical support circuitry is developed in this work. This LC resonant clock driver is shown to save about 50% of driver power (>40% overall) on a 22nm process node and has 50% less skew than a non-resonant driver at 2GHz. It can operate down to 0.2GHz to support other energy-saving techniques such as dynamic voltage and frequency scaling (DVFS). As an example, GSR can be configured for the simpler pulse series resonance (PSR) operation to enable further power saving for double data rate (DDR) applications, by using de-skewing latches instead of flip-flop banks. A PSR-based subsystem with 40% savings in clocking power and a 40% reduction in driver active area is demonstrated. This new resonant driver generates tracking pulses at each transition of the clock for dual-edge operation across DVFS. PSR clocking is designed to drive explicit-pulsed latches with negative setup time. Simulations using 45nm IBM/PTM device and interconnect technology models, clocking 1024 flip-flops, show the reductions compared to non-resonant clocking. A DVFS range from 2GHz/1.3V to 200MHz/0.5V is obtained. The PSR frequency is set to >3× the clock rate, needing only 1/10th the inductance of prior-art LC resonance schemes. The skew reductions are achieved without needing to increase the interconnect widths, owing to negative setup times. 
Applications in data circuits are shown as well, with a 90nm example. Parallel resonant and split-driver non-resonant configurations are also derived from GSR. Tradeoffs in timing performance versus power, based on theoretical analysis, are compared for the first time and verified. This enables synthesis of an optimal topology for a given application from the GSR.
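
The inductance claim in this abstract is consistent with the standard LC resonance relation f = 1 / (2π√(LC)). The sketch below is a back-of-the-envelope check, not the thesis's design procedure; it assumes the load capacitance is unchanged between the two schemes, and the capacitance value is hypothetical.

```python
import math


def resonant_frequency(inductance, capacitance):
    """Resonant frequency of an ideal LC tank: 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))


def required_inductance(frequency, capacitance):
    """Inductance that resonates a given capacitance at a given frequency."""
    return 1.0 / ((2.0 * math.pi * frequency) ** 2 * capacitance)


if __name__ == "__main__":
    c_clock = 1e-12  # hypothetical clock load capacitance (1 pF)
    f_clock = 2e9    # 2 GHz clock rate, as quoted in the abstract
    l_at_clock = required_inductance(f_clock, c_clock)
    l_at_3x = required_inductance(3 * f_clock, c_clock)
    print(f"L at clock rate:    {l_at_clock * 1e9:.3f} nH")
    print(f"L at 3x clock rate: {l_at_3x * 1e9:.3f} nH")
    print(f"ratio: {l_at_3x / l_at_clock:.4f}")
```

Because the required inductance scales as 1/f², resonating at three times the clock rate needs one ninth of the inductance, and slightly more than three times gives roughly the one tenth quoted.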

    Adiabatic quasi-static CMOS

    Mak Wing-sum.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 2000.
    Includes bibliographical references (leaf [68]).
    Abstracts in English and Chinese.
    List of Figures
    List of Tables
    Acknowledgments
    Abstract
    Chapter I: Introduction
    1.1 Introduction - Low Power
    1.2 Power Consumption in CMOS Circuit
    1.2.1 Static Power Dissipation
    1.2.2 Dynamic Power Dissipation
    1.2.3 Short Circuit Power Dissipation
    1.3 Total Power Consumption of a CMOS Circuit
    1.4 Objective of the Project
    Chapter II: Background: Low Power Electronic - Adiabatic Logic
    2.1 Low Power Design
    2.2 Adiabatic Switching
    2.3 Adiabatic Logic
    2.4 History of Adiabatic Logic
    Chapter III: Adiabatic Quasi-Static CMOS Inverter
    3.1 Building Block of AqsCMOS Logic
    3.2.1 AqsCMOS Inverter
    3.2.2 Diodes of AqsCMOS Inverter
    3.3 Pipeline Clocking of AqsCMOS Inverter Chain
    Chapter IV: Power Clock Generator
    4.1 Inductor - Capacitor Oscillator
    4.2 Power Clock Generator
    4.2.1 Structure of Power Clock Generator
    4.2.2 Power Consumption of Power Clock Generator
    Chapter V: Adiabatic Quasi-Static CMOS Multiplier
    5.1 Baugh - Wooley Multiplier
    5.2 Structure of Multiplier
    Chapter VI: Simulations
    6.1 AqsCMOS Inverter
    6.1.1 Logic Alignment of AqsCMOS Inverter
    6.1.2 Practical Implementation of AqsCMOS Inverter
    6.1.3 Pipeline Clocking of AqsCMOS Inverter Chain
    6.2 Power Clock Generator
    6.3 AqsCMOS Pipeline Multiplier
    6.3.1 Power Estimation of Multiplier
    Chapter VII: Evaluations
    7.1 Testing Modules of AqsCMOS Inverter Chain
    7.2 Evaluation of AqsCMOS Multiplier Testing Modulus
    7.2.1 Multiplier Chips Implementation
    7.2.2 AqsCMOS vs CMOS Multiplier
    7.2.3 Input Current Measurement
    7.3 Power Measurement
    Chapter VIII: Conclusions and Further Developments
    8.1 Conclusions
    8.1.1 AqsCMOS Inverter
    8.1.2 Power Clock Generator
    8.1.3 AqsCMOS Multiplier
    8.2 Further Development
    Appendix I: Micro-photography of AqsCMOS multiplier
    Appendix II: Micro-photography of CMOS multiplier
    Appendix III: Micro-photography of AqsCMOS inverter chain testing modules
    Appendix IV: Power-meter simulation approach
    Appendix V: Measurement Setting of AqsCMOS & CMOS Multipliers
    References

    Performance-Driven Energy-Efficient VLSI.

    Today, there are two prevalent platforms in VLSI systems: high-performance and ultra-low power. High-speed designs, usually operating at the GHz level, provide the required computation abilities to systems but also consume a large amount of power; microprocessors and signal processing units are examples of this type of design. For ultra-low power designs, voltage scaling methods are usually used to reduce power consumption and extend battery life. However, circuit delay in ultra-low power designs increases exponentially as voltage is scaled below Vth, and subthreshold leakage energy also increases in a near-exponential fashion. Many methods have been proposed to address the key design challenges on these two platforms: energy consumption in high-performance designs, and performance/reliability in ultra-low power designs. In this thesis, charge-recovery design is explored as a solution targeting both platforms to achieve increased energy efficiency over conventional CMOS designs without compromising performance or reliability. To improve performance while still achieving high energy efficiency for ultra-low power designs, we propose Subthreshold Boost Logic (SBL), a new circuit family that relies on charge-recovery design techniques to achieve order-of-magnitude improvements in operating frequencies and high energy efficiency compared to conventional subthreshold designs. To demonstrate the performance and energy efficiency of SBL, we present a 14-tap 8-bit finite-impulse response (FIR) filter test-chip fabricated in a 0.13µm process. With a single 0.27V supply, the test-chip achieves its most energy-efficient operating point at 20MHz, consuming 15.57pJ per cycle with a recovery rate of 89% and a FoM equal to 17.37 nW/Tap/MHz/InBit/CoeffBit. To reduce energy consumption at multi-GHz frequencies, we explore the application of resonant clocking to the design of a 5-bit non-interleaved resonant-clock flash ADC with a sampling rate of 7GS/s. 
The ADC has been designed in a 65nm bulk CMOS process. An integrated 0.77nH inductor is used to resonate the entire clock distribution network to achieve energy-efficient operation. Operating at 5.5GHz, the ADC consumes 28mW, yielding 396fJ per conversion step. The clock network accounts for 10.7% of total power and consumes 54% less energy than CV^2. By comparison, in a typical flash ADC design, 30% of total power is clock-related. Ph.D., Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/89779/1/wsma_1.pd
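
The FIR filter figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the figure of merit is power divided by taps, clock frequency in MHz, input bits, and coefficient bits (inferred from its units), and that the coefficients are 8 bits wide; the abstract states neither explicitly.

```python
def fir_power_watts(energy_per_cycle_j, clock_hz):
    """Average power from energy per clock cycle and clock frequency."""
    return energy_per_cycle_j * clock_hz


def fir_fom_nw(power_w, taps, clock_hz, input_bits, coeff_bits):
    """Figure of merit in nW/Tap/MHz/InBit/CoeffBit (definition assumed from the units)."""
    return (power_w * 1e9) / (taps * (clock_hz / 1e6) * input_bits * coeff_bits)


if __name__ == "__main__":
    power = fir_power_watts(15.57e-12, 20e6)  # 15.57 pJ per cycle at 20 MHz
    # coeff_bits=8 is an assumption; the abstract only says "8-bit" filter.
    fom = fir_fom_nw(power, taps=14, clock_hz=20e6, input_bits=8, coeff_bits=8)
    print(f"power = {power * 1e6:.1f} uW")
    print(f"FoM   = {fom:.2f} nW/Tap/MHz/InBit/CoeffBit")
```

Under these assumptions the computed figure agrees with the reported 17.37 to within rounding.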

    Low power predictable memory and processing architectures

    The great demand for power-optimized devices shows promising economic potential and draws considerable attention in industry and research. Due to continuously shrinking CMOS processes, not only dynamic power but also static power has emerged as a major concern in power reduction. Besides power optimization, average-case power estimation is significant for power budget allocation but also challenging in terms of time and effort. In this thesis, we introduce a methodology to support modular quantitative analysis in order to estimate the average power of circuits, on the basis of two concepts named Random Bag Preserving and Linear Compositionality. It can shorten simulation time while sustaining high accuracy, increasing the feasibility of power estimation for large systems. For power saving, we first take advantage of the low-power characteristics of adiabatic logic and asynchronous logic to achieve ultra-low dynamic and static power. We propose two memory cells that can run in adiabatic and non-adiabatic modes. About 90% of dynamic power can be saved in adiabatic mode compared to other up-to-date designs, and about 90% of leakage power is saved. Secondly, a novel logic, named Asynchronous Charge Sharing Logic (ACSL), is introduced. The realization of completion detection is simplified considerably. Beyond the power reduction, ACSL brings another promising feature for average power estimation, called data independence, which makes power estimation effortless and is meaningful for modular quantitative average-case analysis. Finally, a new asynchronous Arithmetic Logic Unit (ALU) is presented, with a ripple-carry adder implemented using the logically reversible/bidirectional characteristic and exhibiting ultra-low power dissipation at a sub-threshold operating point. The proposed adder is able to perform multiple functions.