High Performance Logic for Arithmetic Circuits

Author: Das Neeharika
Publication date: 14/05/2012

The objective of this project is to design high-performance arithmetic circuits that are faster and consume less power, using a new dynamic CMOS logic family, and to analyze the family's performance in sequential circuits and its behavior upon cascading. This new dynamic logic family is known as feedthrough logic. It has two basic structures: high speed (HS0) and low power (LP0). It allows evaluation to commence in a computational block before that block's evaluation phase begins, and it quickly performs the final evaluation as soon as the inputs are valid. The family is best suited to arithmetic circuits because their critical path consists of a long chain of cascaded inverting gates, and the main advantage of the logic, higher speed, appears upon cascading. We compare a set of 4-bit and 16-bit ripple carry adders in domino logic against versions built with the two basic structures. Experimental results show that the low-power structure yields a smaller power-delay product than domino logic. Certain modifications to the logic style are proposed to optimize performance when it is applied to single-ended or double-ended flip-flops. The effects of cascading are analyzed using a 4-bit register. Because delay does not propagate through a register or any other synchronous sequential circuit (the circuit being edge triggered), the speed advantage seen upon cascading cannot be observed in sequential circuits. So even though such circuits can be optimized with feedthrough logic, the logic is not preferred for them. Finally, we carried out the tapeout of a 16-bit adder in LP0 using the 180 nm UMC CMOS process flow.
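The abstract's remark that the critical path of an adder is a long chain of cascaded gates can be illustrated with a bit-level model of a ripple carry adder. The following is a minimal Python sketch, not code from the thesis: the function names `full_adder` and `ripple_carry_add` are illustrative assumptions, and the model captures only the logic function (each stage waits on the previous carry), not the timing of feedthrough or domino gates.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, width):
    """Add two unsigned integers bit by bit, LSB first.

    The carry produced by stage i feeds stage i + 1, so the carry
    chain passes through all `width` stages in the worst case.
    Returns (sum modulo 2**width, final carry-out).
    """
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result, carry
```

For example, `ripple_carry_add(0xFFFF, 1, 16)` makes the carry ripple through all 16 stages, which is the worst-case path that a faster cascaded logic style would aim to shorten.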

ethesis@nitr

Architecting SkyBridge-CMOS

Author: Li Mingyu
Publication venue: ScholarWorks@UMass Amherst
Publication date: 18/03/2015

As the scaling of CMOS approaches fundamental limits, revolutionary technology beyond the end of the CMOS roadmap is essential to continue the progress and miniaturization of integrated circuits. Recent research efforts in 3-D circuit integration explore pathways for continued scaling by co-designing for device, circuit, connectivity, heat and manufacturing challenges in a 3-D fabric-centric manner. SkyBridge fabric is one such approach: it addresses fine-grained integration in 3-D, achieves orders-of-magnitude benefits over projected scaled 2-D CMOS, and provides a pathway for continued scaling beyond 2-D CMOS. However, SkyBridge fabric uses only a single transistor type in order to reduce manufacturing complexity, which limits its circuit implementation to dynamic logic. This design choice introduces multiple challenges for SkyBridge, such as high switching power consumption, susceptibility to noise, and increased clocking complexity. In this thesis we propose a new 3-D fabric, similar in mindset to SkyBridge but with a static logic circuit implementation, in order to mitigate the aforementioned challenges. We present an integrated framework for realizing static circuits with vertical nanowires and co-design it across all layers, from fundamental fabric structures to large circuits. The new fabric, named SkyBridge-CMOS, introduces new technology, structures and circuit designs to meet the additional requirements of implementing static circuits. One of the critical challenges addressed here is integrating both n-type and p-type nanowires. A molecular bonding process allows precise control between different doping regions, and novel fabric components are proposed to achieve 3-D routing between the various doping regions. Core fabric components are designed, optimized and modeled with their physical-level information taken into account.
Based on these basic structures, we design and evaluate various logic gates, arithmetic circuits and SRAM in terms of power, area footprint and delay. A comprehensive evaluation methodology spanning the material/device level to the circuit level is followed. Benchmarking against 16 nm 2-D CMOS shows significant improvements of up to 50X in area footprint and 9.3X in total power efficiency for low-power applications, and 3X in throughput for high-performance applications. Better noise resilience and better power efficiency can also be guaranteed when compared with the original SkyBridge fabric.

ScholarWorks@UMass Amherst

Design of Adiabatic MTJ-CMOS Hybrid Circuits

Author: Badawy Abdel-Hameed
Saifullah Z. M.
Sharifi Fazel
Publication date: 25/08/2017

Low-power designs are a necessity given the increasing demand for battery-operated portable devices. In many such devices, operational speed is not as important as battery life. Logic-in-memory structures using nano-devices and adiabatic designs are two methods for reducing static and dynamic power consumption, respectively. The magnetic tunnel junction (MTJ) is an emerging technology that has many advantages when used in logic-in-memory structures in conjunction with CMOS. In this paper, we introduce a novel adiabatic hybrid MTJ/CMOS structure, which is used to design AND/NAND, XOR/XNOR and 1-bit full adder circuits. We simulate the designs using HSPICE with 32 nm CMOS technology and compare them with non-adiabatic hybrid MTJ/CMOS circuits. The proposed adiabatic MTJ/CMOS full adder design consumes more than 7 times less power than the previous MTJ/CMOS full adder.

arXiv.org e-Print Archive

Crossref

High-Performance, Energy-Efficient CMOS Arithmetic Circuits

Author: Chuang Pierce I-Jen
Publication venue: University of Waterloo
Publication date: 01/01/2014

In a modern microprocessor, datapath/arithmetic circuits have always been an important building block in delivering high-performance, energy-efficient computing, because arithmetic operations such as addition and binary number comparison are two of the most commonly used computing instructions. Besides the CMOS manufacturing process, the two most critical design considerations for arithmetic circuits are the logic style and the micro-architecture. In this thesis, a constant-delay (CD) logic style is proposed, targeting full-custom high-speed applications. The constant-delay characteristic of this logic style (regardless of the logic type) makes it suitable for implementing complicated logic expressions such as addition. CD logic exhibits a unique characteristic in which the output is pre-evaluated before the inputs from the preceding stage are ready. This feature enables a performance advantage over static and dynamic domino logic styles in a single-cycle, multi-stage circuit block. Several design considerations, including timing window width adjustment and clock distribution, are discussed. Using a 65-nm general-purpose CMOS technology, the proposed logic style demonstrates an average speedup of 94% and 56% over static and dynamic domino logic, respectively, across five different logic gates. Simulation results for 8-bit ripple carry adders show that CD logic is 39% and 23% faster than the static and dynamic-based adders, respectively. CD logic also demonstrates a 39% speedup and a 64% (22%) energy-delay product reduction relative to static logic at 100% (10%) data activity in 32-bit carry lookahead adders. To confirm CD logic's potential, a 148 ps, single-cycle 64-bit adder with CD logic implemented in the critical path is fabricated in a 65-nm, 1-V CMOS process. A new 64-bit Ling adder micro-architecture, which utilizes both inversion and absorption properties to minimize the amount of CD logic and the number of logic stages in the critical path, is also proposed.
At 1-V supply, this adder's measured worst-case power and leakage power are 135 mW and 0.22 mW, respectively. A single-cycle 64-bit binary comparator utilizing a radix-2 tree structure is also proposed. This comparator architecture is specifically designed for static logic to achieve both low-power and high-performance operation, especially in low input data activity environments. At 65-nm technology with 25% (10%) data activity, the proposed design demonstrates 2.3x (3.5x) and 3.7x (5.8x) power and energy-delay product efficiency, respectively. This comparator is also 2.7x faster at iso-energy (80 fJ) or 3.3x more energy-efficient at iso-delay (200 ps) than existing designs. An improved comparator, in which CD logic is used in the critical path to achieve high performance without sacrificing overall energy efficiency, is also realized in a 65-nm 1-V CMOS process. At 1-V supply, the proposed comparator's measured delay is 167 ps, and its average power and leakage power are 2.34 mW and 0.06 mW, respectively. At a 0.3-pJ iso-energy or 250-ps iso-delay budget, the proposed comparator with CD logic is 20% faster or 17% more energy-efficient than a comparator implemented with static logic alone.

University of Waterloo's Institutional Repository

Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

Author: Gujarathi Hemal S
McDonald-Maier Klaus D
Qadri Muhammad Yasir
Publication venue: Academy Publisher
Publication date: 01/01/2009

Technological evolution has significantly increased the number of transistors per unit die area and raised switching speeds from a few MHz to the GHz range. This simultaneous decline in size and boost in performance demands a lower supply voltage and effective management of power dissipation in chips with millions of transistors. This has triggered a substantial amount of research into power reduction techniques for almost every aspect of the chip, particularly the processor cores it contains. This paper presents an overview of techniques for achieving power efficiency, mainly at the processor core level, but also visits related domains such as buses and memories. Various processor parameters and features, such as supply voltage, clock frequency, cache and pipelining, can be optimized to reduce the processor's power consumption. This paper discusses various ways in which these parameters can be optimized. Emerging power-efficient processor architectures are also reviewed, and research activities are discussed that should help the reader identify how these factors contribute to a processor's power consumption. Some of these concepts are already established, whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

University of Essex Research Repository

CiteSeerX

Crossref

Comparison of Various Pipelined and Non-Pipelined SCL 8051 ALUs

Author: Zhao Jingyi
Publication venue: ScholarWorks@UARK
Publication date: 01/08/2012

This paper describes the development of an 8-bit SCL 8051 ALU in two versions: an SCL 8051 ALU with both nsleep and sleep signals, and an SCL 8051 ALU without nsleep. Both versions have combinational logic (C/L), registers, and completion components, all of which use slept gates. Both three-stage pipelined and non-pipelined designs were examined for each version. The four designs were compared in terms of area, speed, leakage power, average power and energy per operation. The SCL 8051 ALU without nsleep is smaller and faster, but it has greater leakage power. It also has lower average power and lower energy consumption than the SCL 8051 ALU with both nsleep and sleep signals. The pipelined SCL 8051 ALU is bigger and slower, and has larger leakage power, average power and energy consumption, than the non-pipelined SCL 8051 ALU.

ScholarWorks@UARK

UARK (University of Arkansas)

Motion estimation and CABAC VLSI co-processors for real-time high-quality H.264/AVC video coding

Author: Casula M.
Fanucci L.
Martina Maurizio
Masera Guido
Saponara S.
Publication venue: Elsevier
Publication date: 01/01/2010

Real-time, high-quality video coding is attracting wide interest in the research and industrial communities for different applications. H.264/AVC, a recent standard for high-performance video coding, can be successfully exploited in several scenarios, including digital video broadcasting, high-definition TV and DVD-based systems, which must sustain up to tens of Mbits/s. To that end, this paper proposes optimized architectures for the most critical H.264/AVC tasks: motion estimation and context-adaptive binary arithmetic coding (CABAC). Post-synthesis results on sub-micron CMOS standard-cell technologies show that the proposed architectures can process 720 × 480 video sequences at 30 frames/s in real time and deliver more than 50 Mbits/s. The achieved circuit complexity and power consumption budgets are suitable for integration in complex VLSI multimedia systems based either on an AHB bus-centric on-chip communication system or on novel Network-on-Chip (NoC) infrastructures for MPSoC (Multi-Processor System on Chip).
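To put the quoted operating point in perspective, the figures in the abstract (720 × 480 at 30 frames/s, more than 50 Mbits/s) can be converted into per-second workloads. The sketch below is a back-of-the-envelope calculation, not part of the paper: the function name `stream_rates` is an illustrative assumption, the 16 × 16 macroblock size is the standard H.264/AVC luma macroblock, only luma pixels are counted, and 50 Mbits/s is treated as a lower bound.

```python
def stream_rates(width, height, fps, bitrate_bps, mb_size=16):
    """Return (pixels/s, macroblocks/s, bits per pixel) for a video stream.

    Counts luma pixels only and assumes the frame dimensions are
    multiples of the macroblock size (true for 720 x 480 with 16 x 16).
    """
    pixels_per_s = width * height * fps
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    mbs_per_s = mbs_per_frame * fps
    bits_per_pixel = bitrate_bps / pixels_per_s
    return pixels_per_s, mbs_per_s, bits_per_pixel


pixels, macroblocks, bpp = stream_rates(720, 480, 30, 50_000_000)
print(pixels, macroblocks, round(bpp, 2))
```

Under these assumptions, the motion estimation engine must handle 40,500 macroblocks per second, and the entropy coder must emit roughly 4.8 bits per luma pixel at the 50 Mbits/s floor.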

Archivio della Ricerca - Università di Pisa

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino