139 research outputs found

    Techniques of Energy-Efficient VLSI Chip Design for High-Performance Computing

    Get PDF
    How to implement quality computing with the limited power budget is the key factor to move very large scale integration (VLSI) chip design forward. This work introduces various techniques of low power VLSI design used for state of art computing. From the viewpoint of power supply, conventional in-chip voltage regulators based on analog blocks bring the large overhead of both power and area to computational chips. Motivated by this, a digital based switchable pin method to dynamically regulate power at low circuit cost has been proposed to make computing to be executed with a stable voltage supply. For one of the widely used and time consuming arithmetic units, multiplier, its operation in logarithmic domain shows an advantageous performance compared to that in binary domain considering computation latency, power and area. However, the introduced conversion error reduces the reliability of the following computation (e.g. multiplication and division.). In this work, a fast calibration method suppressing the conversion error and its VLSI implementation are proposed. The proposed logarithmic converter can be supplied by dc power to achieve fast conversion and clocked power to reduce the power dissipated during conversion. Going out of traditional computation methods and widely used static logic, neuron-like cell is also studied in this work. Using multiple input floating gate (MIFG) metal-oxide semiconductor field-effect transistor (MOSFET) based logic, a 32-bit, 16-operation arithmetic logic unit (ALU) with zipped decoding and a feedback loop is designed. The proposed ALU can reduce the switching power and has a strong driven-in capability due to coupling capacitors compared to static logic based ALU. Besides, recent neural computations bring serious challenges to digital VLSI implementation due to overload matrix multiplications and non-linear functions. An analog VLSI design which is compatible to external digital environment is proposed for the network of long short-term memory (LSTM). The entire analog based network computes much faster and has higher energy efficiency than the digital one

    IDPAL – A Partially-Adiabatic Energy-Efficient Logic Family: Theory and Applications to Secure Computing

    Get PDF
    Low-power circuits and issues associated with them have gained a significant amount of attention in recent years due to the boom in portable electronic devices. Historically, low-power operation relied heavily on technology scaling and reduced operating voltage, however this trend has been slowing down recently due to the increased power density on chips. This dissertation introduces a new very-low power partially-adiabatic logic family called Input-Decoupled Partially-Adiabatic Logic (IDPAL) with applications in low-power circuits. Experimental results show that IDPAL reduces energy usage by 79% compared to equivalent CMOS implementations and by 25% when compared to the best adiabatic implementation. Experiments ranging from a simple buffer/inverter up to a 32-bit multiplier are explored and result in consistent energy savings, showing that IDPAL could be a viable candidate for a low-power circuit implementation. This work also shows an application of IDPAL to secure low-power circuits against power analysis attacks. It is often assumed that encryption algorithms are perfectly secure against attacks, however, most times attacks using side channels on the hardware implementation of an encryption operation are not investigated. Power analysis attacks are a subset of side channel attacks and can be implemented by measuring the power used by a circuit during an encryption operation in order to obtain secret information from the circuit under attack. Most of the previously proposed solutions for power analysis attacks use a large amount of power and are unsuitable for a low-power application. The almost-equal energy consumption for any given input in an IDPAL circuit suggests that this logic family is a good candidate for securing low-power circuits again power analysis attacks. Experimental results ranging from small circuits to large multipliers are performed and the power-analysis attack resistance of IDPAL is investigated. Results show that IDPAL circuits are not only low-power but also the most secure against power analysis attacks when compared to other adiabatic low-power circuits. Finally, a hybrid adiabatic-CMOS microprocessor design is presented. The proposed microprocessor uses IDPAL for the implementation of circuits with high switching activity (e.g. ALU) and CMOS logic for other circuits (e.g. memory, controller). An adiabatic-CMOS interface for transforming adiabatic signals to square-wave signals is presented and issues associated with a hybrid implementation and their solutions are also discussed

    Scalable Energy-Recovery Architectures.

    Full text link
    Energy efficiency is a critical challenge for today's integrated circuits, especially for high-end digital signal processing and communications that require both high throughput and low energy dissipation for extended battery life. Charge-recovery logic recovers and reuses charge using inductive elements and has the potential to achieve order-of-magnitude improvement in energy efficiency while maintaining high performance. However, the lack of large-scale high-speed silicon demonstrations and inductor area overheads are two major concerns. This dissertation focuses on scalable charge-recovery designs. We present a semi-automated design flow to enable the design of large-scale charge-recovery chips. We also present a new architecture that uses in-package inductors, eliminating the area overheads caused by the use of integrated inductors in high-performance charge-recovery chips. To demonstrate our semi-automated flow, which uses custom-designed standard-cell-like dynamic cells, we have designed a 576-bit charge-recovery low-density parity-check (LDPC) decoder chip. Functioning correctly at clock speeds above 1 GHz, this prototype is the first-ever demonstration of a GHz-speed charge-recovery chip of significant complexity. In terms of energy consumption, this chip improves over recent state-of-the-art LDPCs by at least 1.3 times with comparable or better area efficiency. To demonstrate our architecture for eliminating inductor overheads, we have designed a charge-recovery LDPC decoder chip with in-package inductors. This test-chip has been fabricated in a 65nm CMOS flip-chip process. A custom 6-layer FC-BGA package substrate has been designed with 16 inductors embedded in the fifth layer of the package substrate, yielding higher Q and significantly improving area efficiency and energy efficiency compared to their on-chip counterparts. From measurements, this chip achieves at least 2.3 times lower energy consumption with better area efficiency over state-of-the-art published designs.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/116653/1/terryou_1.pd

    Digital logic circuit design using adiabatic approach

    Get PDF
    A major challenge for the circuit designers nowadays is to meet the demand for low power, especially those used in portable and wearable devices which have limited energy power supply. The reasons of designing low power consumption circuit are to reduce energy usage and minimize dissipation of heat. Adiabatic technique is an attractive approach to obtain power optimization where some of the charge in capacitance can be recycled instead of being dissipated as heat. In this thesis, a methodology for designing sequential adiabatic circuits employing a single-phase power clock was investigated. Initially, methods to simulate dynamic power were analysed by identifying a better and reliable method to simulate adiabatic dynamic power. In addition, a method to validate the output voltage swing was presented. The relationship between voltage swing and power dissipation was analysed. Then, several adiabatic sequential D flip flops (DFF) designs which make use of combinational adiabatic circuit design based on quasi-adiabatic were proposed and suitable types of alternating current power supply which influence dynamic power were analysed and selected. The functionality and performance of the proposed circuits were compared against other adiabatic and traditional Complimentary Metal-Oxide Semiconductor (CMOS) circuits and verified to function up to 1 GHz operating region. Besides the circuits, the layout of the proposed sequential adiabatic design was also produced. All simulations were carried out using 0.25 ^m CMOS technology parameters using Tanner Electronic Design Aided and HSPICE tools. The findings showed that the proposed combinational circuit had less transistor count, lower power dissipation with lower voltage swing as compared to reference adiabatic circuits. Furthermore, the proposed sequential DFF circuit showed 25% less power dissipation compared to traditional CMOS

    An adiabatic charge pump based charge recycling design style

    Get PDF
    A typical CMOS gate draws charge equal to C[subscript L]Vdd2 from the power supply (Vdd) where C[subscript L] is the load capacitance. Half of the energy is dissipated in the pull-up p-type network, and the other half is dissipated in the pull-down n-type network. Adiabatic CMOS circuit reduces the dissipated energy by providing the charge at a rate significantly lower than the inherent RC delay of the gate. The charge can also be recovered with an RLC oscillator based power supply. However, the two main problems with adiabatic design style are the design of a high frequency RLC oscillator for the power supply, and the need to slow down the rate of charge supply for lower energy. This reduction in speed of operation renders this adiabatic technique inapplicable in certain situations. A new approach incorporating an adiabatic charge pump that moves the slower adiabatic components away from the critical path of the logic is proposed in this work. The adiabatic delays of a charge pump are overlapped with the computing path logic delays. Hence, the proposed charge pump based recycling technique is especially effective for pipelined datapath computations (digital signal processing, DSP, is such a domain) where timing considerations are important. Also the proposed design style does not interfere with the critical path of the system, and hence the delay introduced by this scheme does not reduce the overall computational speed. In this work, we propose one implementation schema that involves tapping the ground-bound charge in a capacitor (virtual ground) and using an adiabatic charge-pump circuit to feed internal virtual power supplies. As the design relies on leakage charge to generate virtual power supplies, it is most effective in large circuits that undergo considerable switching activity resulting in substantial charge tapping by the proposed scheme. The proposed method has been implemented in DSP applications like FIR filter, DCT/IDCT filters and FFT filters. Simulations results in SPICE indicate that the proposed scheme reduces energy consumption in these DSP circuits by as much as 18% with no loss in performance, paving way for a new approach towards conserving energy in complex digital systems

    Energy-Efficient Neural Network Architectures

    Full text link
    Emerging systems for artificial intelligence (AI) are expected to rely on deep neural networks (DNNs) to achieve high accuracy for a broad variety of applications, including computer vision, robotics, and speech recognition. Due to the rapid growth of network size and depth, however, DNNs typically result in high computational costs and introduce considerable power and performance overheads. Dedicated chip architectures that implement DNNs with high energy efficiency are essential for adding intelligence to interactive edge devices, enabling them to complete increasingly sophisticated tasks by extending battery lie. They are also vital for improving performance in cloud servers that support demanding AI computations. This dissertation focuses on architectures and circuit technologies for designing energy-efficient neural network accelerators. First, a deep-learning processor is presented for achieving ultra-low power operation. Using a heterogeneous architecture that includes a low-power always-on front-end and a selectively-enabled high-performance back-end, the processor dynamically adjusts computational resources at runtime to support conditional execution in neural networks and meet performance targets with increased energy efficiency. Featuring a reconfigurable datapath and a memory architecture optimized for energy efficiency, the processor supports multilevel dynamic activation of neural network segments, performing object detection tasks with 5.3x lower energy consumption in comparison with a static execution baseline. Fabricated in 40nm CMOS, the processor test-chip dissipates 0.23mW at 5.3 fps. It demonstrates energy scalability up to 28.6 TOPS/W and can be configured to run a variety of workloads, including severely power-constrained ones such as always-on monitoring in mobile applications. To further improve the energy efficiency of the proposed heterogeneous architecture, a new charge-recovery logic family, called zero-short-circuit current (ZSCC) logic, is proposed to decrease the power consumption of the always-on front-end. By relying on dedicated circuit topologies and a four-phase clocking scheme, ZSCC operates with significantly reduced short-circuit currents, realizing order-of-magnitude power savings at relatively low clock frequencies (in the order of a few MHz). The efficiency and applicability of ZSCC is demonstrated through an ANSI S1.11 1/3 octave filter bank chip for binaural hearing aids with two microphones per ear. Fabricated in a 65nm CMOS process, this charge-recovery chip consumes 13.8µW with a 1.75MHz clock frequency, achieving 9.7x power reduction per input in comparison with a 40nm monophonic single-input chip that represents the published state of the art. The ability of ZSCC to further increase the energy efficiency of the heterogeneous neural network architecture is demonstrated through the design and evaluation of a ZSCC-based front-end. Simulation results show 17x power reduction compared with a conventional static CMOS implementation of the same architecture.PHDElectrical and Computer EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/147614/1/hsiwu_1.pd

    Adiabatic quasi-static CMOS

    Get PDF
    Mak Wing-sum.Thesis (M.Phil.)--Chinese University of Hong Kong, 2000.Includes bibliographical references (leaf [68]).Abstracts in English and Chinese.List of Figures --- p.IList of Tables --- p.IIIACKNOWLEDGMENTSABSTRACTChapter Chapter I --- IntroductionChapter 1.1 --- Introduction - Low Power --- p.I-1Chapter 1.2 --- Power Consumption in cmos Circuit --- p.I-1Chapter 1.2.1 --- Static Power Dissipation --- p.I-2Chapter 1.2.2 --- Dynamic Power Dissipation --- p.I-5Chapter 1.2.3 --- Short Circuit Power Dissipation --- p.I-8Chapter 1.3 --- Total Power Consumption of a CMOS Circuit --- p.I-10Chapter 1.4 --- Objective of the Project --- p.I-10Chapter CHAPTER II --- Background : Low Power Electronic - Adiabatic LogicChapter 2.1 --- Low Power Design --- p.II-12Chapter 2.2 --- Adiabatic Switching --- p.II-12Chapter 2.3 --- Adiabatic Logic --- p.II-14Chapter 2.4 --- History of Adiabatic Logic --- p.II-17Chapter CHAPTER III --- Adiabatic Quasi-Static CMOS InverterChapter 3.1 --- Building Block of AqsCMOS Logic --- p.III -18Chapter 3.2.1 --- AqsCMOS Inverter --- p.III -20Chapter 3.2.2 --- Diodes of AqsCMOS Inverter --- p.III -22Chapter 3.3 --- Pipeline Clocking of AqsCMOS Inverter Chain --- p.III -23Chapter Chapter IV --- Power Clock GeneratorChapter 4.1 --- Inductor - Capacitor Oscillator --- p.IV -24Chapter 4.2 --- Power Clock GeneratorChapter 4.2.1 --- Structure of Power Clock Generator --- p.IVChapter 4.2.2 --- power Consumption of Power Clock Generator --- p.IV -27Chapter Chapter V --- Adiabatic QuasI-Static CMOS MultiplierChapter 5.1 --- Baugh - Wooley Multiplier --- p.V-32Chapter 5.2 --- Structure of Multiplier --- p.V-34Chapter Chapter VI --- SimulationsChapter 6.1 --- AqsCMOS InverterChapter 6.1.1 --- Logic Alignment of AqsCMOS Inverter --- p.VI -38Chapter 6.1.2 --- Practical Implementation of AqsCMOS Inverter --- p.VI -39Chapter 6.1.3 --- Pipeline Clocking of AqsCMOS Inverter Chain --- p.VIChapter 6.2 --- Power Clock Generator --- p.VI -42Chapter 6.3 --- AqsCMOS Pipeline Multiplier --- p.VI -45Chapter 6.3.1 --- power estimation of multiplier --- p.VI -46Chapter ChapterVII --- evaluationsChapter 7.1 --- Testing Modules of AqsCMOS Inverter Chain --- p.VII -51Chapter 7.2 --- Evaluation of AqsCMOS Multiplier Testing ModulusChapter 7.2.1 --- Multiplier Chips Implementation --- p.VII -54Chapter 7.2.2 --- AQSCMOS Vs CMOS MULTIPLIER --- p.VII -55Chapter 7.2.3 --- Input Current Measurement --- p.VII -58Chapter 7.3 --- Power Measurement --- p.VII -63Chapter Chapter VIII --- Conclusions and Fiirthfr DevelopmentsChapter 8.1 --- Conclusions --- p.VIII -65Chapter 8.1.1 --- AqsCMOS Inverter --- p.VIII -65Chapter 8.1.2 --- Power Clock Generator --- p.VIII -65Chapter 8.1.3 --- AQSCMOS MULTIPLIER --- p.VIII -66Chapter 8.2 --- Further Development --- p.VIII -66Appendix I micro-photography of aqscmos multiplierAppendix II micro-Photography of CMOS multiplierAppendix III micro-photography of AqsCMOS inverter chain testing modulesAppendix IV power - meter simulation approachAppendix V Measurement Setting of AqsCMOS & CMOS MultipliersReferenc

    ENERGY-EFFICIENT AND SECURE HARDWARE FOR INTERNET OF THINGS (IoT) DEVICES

    Get PDF
    Internet of Things (IoT) is a network of devices that are connected through the Internet to exchange the data for intelligent applications. Though IoT devices provide several advantages to improve the quality of life, they also present challenges related to security. The security issues related to IoT devices include leakage of information through Differential Power Analysis (DPA) based side channel attacks, authentication, piracy, etc. DPA is a type of side-channel attack where the attacker monitors the power consumption of the device to guess the secret key stored in it. There are several countermeasures to overcome DPA attacks. However, most of the existing countermeasures consume high power which makes them not suitable to implement in power constraint devices. IoT devices are battery operated, hence it is important to investigate the methods to design energy-efficient and secure IoT devices not susceptible to DPA attacks. In this research, we have explored the usefulness of a novel computing platform called adiabatic logic, low-leakage FinFET devices and Magnetic Tunnel Junction (MTJ) Logic-in-Memory (LiM) architecture to design energy-efficient and DPA secure hardware. Further, we have also explored the usefulness of adiabatic logic in the design of energy-efficient and reliable Physically Unclonable Function (PUF) circuits to overcome the authentication and piracy issues in IoT devices. Adiabatic logic is a low-power circuit design technique to design energy-efficient hardware. Adiabatic logic has reduced dynamic switching energy loss due to the recycling of charge to the power clock. As the first contribution of this dissertation, we have proposed a novel DPA-resistant adiabatic logic family called Energy-Efficient Secure Positive Feedback Adiabatic Logic (EE-SPFAL). EE-SPFAL based circuits are energy-efficient compared to the conventional CMOS based design because of recycling the charge after every clock cycle. Further, EE-SPFAL based circuits consume uniform power irrespective of input data transition which makes them resilience against DPA attacks. Scaling of CMOS transistors have served the industry for more than 50 years in providing integrated circuits that are denser, and cheaper along with its high performance, and low power. However, scaling of the transistors leads to increase in leakage current. Increase in leakage current reduces the energy-efficiency of the computing circuits,and increases their vulnerability to DPA attack. Hence, it is important to investigate the crypto circuits in low leakage devices such as FinFET to make them energy-efficient and DPA resistant. In this dissertation, we have proposed a novel FinFET based Secure Adiabatic Logic (FinSAL) family. FinSAL based designs utilize the low-leakage FinFET device along with adiabatic logic principles to improve energy-efficiency along with its resistance against DPA attack. Recently, Magnetic Tunnel Junction (MTJ)/CMOS based Logic-in-Memory (LiM) circuits have been explored to design low-power non-volatile hardware. Some of the advantages of MTJ device include non-volatility, near-zero leakage power, high integration density and easy compatibility with CMOS devices. However, the differences in power consumption between the switching of MTJ devices increase the vulnerability of Differential Power Analysis (DPA) based side-channel attack. Further, the MTJ/CMOS hybrid logic circuits which require frequent switching of MTJs are not very energy-efficient due to the significant energy required to switch the MTJ devices. In the third contribution of this dissertation, we have investigated a novel approach of building cryptographic hardware in MTJ/CMOS circuits using Look-Up Table (LUT) based method where the data stored in MTJs are constant during the entire encryption/decryption operation. Currently, high supply voltage is required in both writing and sensing operations of hybrid MTJ/CMOS based LiM circuits which consumes a considerable amount of energy. In order to meet the power budget in low-power devices, it is important to investigate the novel design techniques to design ultra-low-power MTJ/CMOS circuits. In the fourth contribution of this dissertation, we have proposed a novel energy-efficient Secure MTJ/CMOS Logic (SMCL) family. The proposed SMCL logic family consumes uniform power irrespective of data transition in MTJ and more energy-efficient compared to the state-of-art MTJ/ CMOS designs by using charge sharing technique. The other important contribution of this dissertation is the design of reliable Physical Unclonable Function (PUF). Physically Unclonable Function (PUF) are circuits which are used to generate secret keys to avoid the piracy and device authentication problems. However, existing PUFs consume high power and they suffer from the problem of generating unreliable bits. This dissertation have addressed this issue in PUFs by designing a novel adiabatic logic based PUF. The time ramp voltages in adiabatic PUF is utilized to improve the reliability of the PUF along with its energy-efficiency. Reliability of the adiabatic logic based PUF proposed in this dissertation is tested through simulation based temperature variations and supply voltage variations

    Adiabatic Circuits for Power-Constrained Cryptographic Computations

    Get PDF
    This thesis tackles the need for ultra-low power operation in power-constrained cryptographic computations. An example of such an application could be smartcards. One of the techniques which has proven to have the potential of rendering ultra-low power operation is ‘Adiabatic Logic Technique’. However, the adiabatic circuits has associated challenges due to high energy dissipation of the Power-Clock Generator (PCG) and complexity of the multi-phase power-clocking scheme. Energy efficiency of the adiabatic system is often degraded due to the high energy dissipation of the PCG. In this thesis, nstep charging strategy using tank capacitors is considered for the power-clock generation and several design rules and trade-offs between the circuit complexity and energy efficiency of the PCG using n-step charging circuits have been proposed. Since pipelining is inherent in adiabatic logic design, careful selection of architecture is essential, as otherwise overhead in terms of area and energy due to synchronization buffers is induced specifically, in the case of adiabatic designs using 4-phase power-clocking scheme. Several architectures for the Montgomery multiplier using adiabatic logic technique are implemented and compared. An architecture which constitutes an appropriate trade-off between energy efficiency and throughput is proposed along with its methodology. Also, a strategy to reduce the overhead due to synchronization buffers is proposed. A modification in the Montgomery multiplication algorithm is proposed. Furthermore, a problem due to the application of power-clock gating in cascade stages of adiabatic logic is identified. The problem degrades the energy savings that would otherwise be obtained by the application of power-clock gating. A solution to this problem is proposed. Cryptographic implementations also present an obvious target for Power Analysis Attacks (PAA). There are several existing secure adiabatic logic designs which are proposed as a countermeasure against PAA. Shortcomings of the existing logic designs are identified, and two novel secure adiabatic logic designs are proposed as the countermeasures against PAA and improvement over the existing logic designs
    corecore