3 research outputs found

An Ultra-Low-Power 75mV 64-Bit Current-Mode Majority-Function Adder

Author: Ebrahimi Manuchehr
Publication venue: 'University of Waterloo'
Publication date: 18/05/2012

Ultra-low-power circuits are increasingly desirable as portable-device markets grow, and nanometer scaling and improvements in CMOS reliability are making them applicable in biomedical, pharmaceutical, and sensor-network applications. This thesis presents three main achievements in ultra-low-power adders. First, a new majority-function algorithm for carry and sum generation is presented. Second, using this algorithm and the new architecture it implies, a circuit operating from a 75 mV supply voltage was achieved. Third, a 64-bit current-mode majority-function adder based on the new architecture and algorithm was successfully tested at a 75 mV supply voltage. The circuit consumed 4.5 nW or 3.8 pJ under one of the worst-case conditions.
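The abstract does not spell out the thesis's new majority-function algorithm, so the sketch below shows only the well-known textbook way of building a full adder from 3-input majority gates: the carry is the majority of the three inputs, and the sum can be written as a majority of the inverted carry, the carry-in, and a second majority term. The function names (`maj`, `full_adder_maj`, `ripple_add`) and the ripple-carry arrangement are assumptions made for illustration, not the author's circuit.

```python
def maj(a: int, b: int, c: int) -> int:
    """3-input majority gate: returns 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def full_adder_maj(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder built only from majority gates and inverters.

    Textbook identities (not taken from the thesis):
      carry = MAJ(a, b, cin)
      sum   = MAJ(NOT carry, cin, MAJ(a, b, NOT cin))
    """
    carry = maj(a, b, cin)
    total = maj(carry ^ 1, cin, maj(a, b, cin ^ 1))
    return total, carry


def ripple_add(x: int, y: int, width: int = 64) -> tuple[int, int]:
    """Add two unsigned integers bit by bit; returns (sum mod 2**width, carry out).

    The ripple-carry chaining here is a hypothetical arrangement chosen
    for simplicity, not the architecture described in the thesis.
    """
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder_maj((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry
```

Calling `ripple_add(5, 3, 8)` returns the 8-bit sum together with the carry out. A hardware design would evaluate the majority gates in parallel rather than in a Python loop.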

University of Waterloo's Institutional Repository

A full-custom digital-signal-processing unit for real-time cortical blood flow monitoring

Author: HONG ZHIQIAN
Publication date: 24/12/2009

Master's (MASTER OF ENGINEERING)

ScholarBank@NUS

Design and Analysis of High-Performance CMOS Logic Circuits and Power-Aware Arithmetic Circuit Architectures in Deep-Submicron Process Technology

Author: 鄭舜文

Doctoral

Power dissipation is always a major design consideration for mobile and portable systems as well as for high-performance workstation systems. As continuing progress in process technology has led into the deep-submicron era, decreasing the power consumption while increasing the operating speed of each logic circuit has become a common goal in almost all CMOS chip designs. The dissertation first deals with different aspects of low-power, high-speed logic circuits for deep-submicron electronics.

A hybrid logic synthesis procedure for arbitrary logic functions, the Prioritized Prime Implicant Patterns Puzzle (PPIPP), is proposed; following it yields a new hybrid logic circuit family. PPIPP produces a mixed logic with minimum transistor count that has full-swing signals at all nodes and high robustness against transistor downsizing and voltage scaling. For differential-end digital applications, a Low-power Current-Sensing Complementary Pass-Transistor Logic (LCSCPTL) is introduced. Under a small voltage swing, the current-sensing scheme senses faster than the voltage-sensing scheme; hence LCSCPTL operates faster than conventional complementary pass-transistor logic (CPL). The first part ends with a discussion of two improved dynamic circuit techniques for True Single-Phase Clocking (TSPC) logic, called Non-full Swing TSPC (NSTSPC) and All-N-TSPC (ANTSPC). ANTSPC replaces PMOS transistors with NMOS transistors; the output loading of the Φ-section is therefore reduced and a higher layout density of the ΦB-section is obtained. By using NSTSPC and ANTSPC alternately, a high-speed TSPC pipelined structure without PMOS logic blocks was successfully constructed.

The second part presents the proposed architecture and circuit techniques for power-aware arithmetic applications. Conventional low-power design refers to minimizing power, whereas power-aware design refers to maximizing some other performance metric subject to a power quota. Power efficiency is becoming an important index of digital signal processing (DSP), and DSP performance is predominantly determined by its adder. The Conditional Sum Adder (CSA) has been shown to outperform other adders in high-speed arithmetic applications. This investigation proposes a modified CSA called the Conditional Carry Adder (CCA). The conditional carry architecture can also be used in subtractor, integer comparator, and sorter designs. The architectural modification of the CCA lowers the number of multiplexers and internal nodes, which effectively decreases power dissipation and raises operating speed. Based on the proposed scheme, 32-bit CSAs and CCAs implemented in CMOS, CPL, and LCSCPTL were carefully compared and analyzed.

Until now, reducing dynamic switching power has been the primary focus of many proposed low-power circuit techniques, and off-state leakage power was treated as negligible compared to dynamic power. However, as technology scales into the deep-submicron age, the increase in leakage power can no longer be ignored. Multi-Threshold voltage CMOS (MTCMOS) technology has therefore become an increasingly popular technique for restraining escalating leakage power while maintaining high performance. Based on the TSMC 0.25 µm Single-layer Salicide 5-layer Metal (1P5M+) MTCMOS process technology, six 64-bit hybrid dual-threshold CCAs for power-aware applications are presented systematically. Components on critical paths use a low threshold voltage to speed up operation, and other components use the normal threshold voltage to save power. This feature is very useful in implementing power-aware arithmetic systems. One of the proposed circuits has the lowest power-delay product and energy-delay product.
The hybrid circuit represents a fine compromise between power and performance; its power efficiency is better than that of the single threshold voltage circuit designs.

Table of contents:

Chapter 1 Introduction
  1.1 Evolution of the VLSI
  1.2 Power Dissipation in CMOS Digital Circuits
    1.2.1 Dynamic Power Dissipation
    1.2.2 Short-Circuit Power Dissipation
    1.2.3 Static Power Dissipation
      1.2.3.1 DC Power Dissipation
      1.2.3.2 Leakage Power Dissipation
  1.3 Various CMOS Logic Circuits
    1.3.1 Single-end logic: CMOS vs. pass-transistor logic
    1.3.2 Differential-end logic: complementary pass-transistor logic
    1.3.3 Dynamic logics
  1.4 Conventional Adder Schemes
    1.4.1 Ripple Carry Adder
    1.4.2 Carry Lookahead Adder and Manchester Carry-Chain
    1.4.3 Carry Select Adder
    1.4.4 Parallel Prefix Adders
      1.4.4.1 Brent-Kung Adder
      1.4.4.2 Ladner-Fischer Adder
      1.4.4.3 Kogge-Stone Adder
      1.4.4.4 Han-Carlson Adder
    1.4.5 Conditional Sum Adder
  1.5 Thesis Organization
Chapter 2 Prioritized Prime Implicant Patterns Puzzle for Logic Synthesis
  2.1 Motivation of Logic Minimization
  2.2 Basic Circling Concepts
    2.2.1 Square of the Karnaugh map (K-map)
    2.2.2 Modified K-map
    2.2.3 Loop Circling for Simplification
    2.2.4 Selected Set of Control Variables
    2.2.5 Implicate Loop
    2.2.6 Circuit Implementation Methods
  2.3 Prioritized Prime Implicant Patterns Puzzle (PPIPP)
  2.4 Comparisons of Various Logics
  2.5 Summary of the Hybrid Logic by PPIPP
Chapter 3 Low-Power Current-Sensing Complementary Pass-Transistor Logic
  3.1 Basic Concept of Differential-end Digital Logic
  3.2 Circuit Structure and Operational Principle
    3.2.1 The CSCPTL Circuit
    3.2.2 The LCSCPTL Circuit
  3.3 Performance Comparisons of CPL, CSCPL and LCSCPL
  3.4 Chapter Summary
Chapter 4 ALL-N-Transistor TSPC Logics
  4.1 Introduction to True-Single Phase Clocking Logic (TSPC)
  4.2 Circuit Structures and Operational Principles
    4.2.1 Non-full Voltage Swing TSPC (NSTSPC)
    4.2.2 All-N-Block TSPC (ANTSPC)
  4.3 Performance Evaluations and Comparisons
    4.3.1 Stacks of the MOS Transistors
    4.3.2 Maximum Operation Frequency
  4.4 64-Bit Hierarchical Pipeline Adder Circuit Implementation
  4.5 Summary of the All-N TSPC Logics
Chapter 5 Mechanisms of Conditional Carry
  5.1 Conditional Sum Adder (CSA)
  5.2 Conditional Carry Architecture
    5.2.1 Conditional Carry Addition Rules
    5.2.2 Construction of Conditional Carry Adder (CCA)
  5.3 Comparisons of CSA and CCA
Chapter 6 Implementation and Analysis of Conditional Carry Adder (CCA)
  6.1 Circuit Implementations of 32-Bit Conditional Carry Adder by CMOS, CPL and LCSCPTL
  6.2 64-bit Dual-threshold Voltage Conditional Carry Adder Designs
    6.2.1 Multi-Vth Design Concept
    6.2.2 Construction of Hybrid Architectures
    6.2.3 Simulation Results and Comparisons
    6.2.4 Summary of the Power-Aware Design
  6.3 High-Speed 64-bit Integer Comparator using Conditional Carry Mechanism
    6.3.1 Introduction to Integer Comparator and Hardware Sorter
    6.3.2 Modified 1's Complement for Comparator Design
    6.3.3 Proposed Comparator Architecture
    6.3.4 Summary of the Comparator Design
Chapter 7 Concluding Remarks
Bibliography
Appendix A Related Publication List

List of Tables

Table 2.1 Various circuit comparison results of the full swing 2-input XOR function
Table 4.1 Power dissipation and maximum frequency of the 64-bit adder under various MOS models
Table 5.1 Leading control carry of the CSA and the CCA
Table 5.2 Comparisons of 2-to-1 multiplexer numbers of the CSA and the proposed CCA
Table 6.1 Layout area comparisons of 32-bit CSAs and CCAs
Table 6.2 Three threshold voltage types of TSMC 0.25µm 1p5m+ CMOS Process Technology
Table 6.3 Comparisons of multiplexer number of 64-bit power-aware CCAs
Table 6.4 Comparisons of 64-bit Conditional Sum Adder and hybrid Vth scheme Conditional Carry Adders (worst delay case)
Table 6.5 Another comparisons of 64-bit CSA and hybrid Vth scheme Conditional Carry Adders (worst power dissipation case)
Table 6.6 Transistor count comparison of various bit-length comparators

List of Figures

Figure 1.1 Evolution of VLSI
Figure 1.2 Propagation delay and power dissipation vs. supply voltage Vdd
Figure 1.3 CMOS inverter and its transfer curve
Figure 1.4 Short-circuit current of a CMOS inverter during input transition
Figure 1.5 Short channel transistor leakage current mechanisms
Figure 1.6 Estimated leakage power over generations
Figure 1.7 Single-end full-swing logic: (a) CMOS, (b) PTL
Figure 1.8 CPL Circuit Diagram
Figure 1.9 Schematics of NORA
Figure 1.10 Schematics of TSPC
Figure 1.11 Ripple carry addition rule
Figure 1.12 Basic carry lookahead adder scheme
Figure 1.13 4-bit static Manchester carry adder module
Figure 1.14 16-bit carry select adder
Figure 1.15 Structure of the 16-bit Brent-Kung adder
Figure 1.16 Brent-Kung carry-lookahead adder can get a regular layout. Each output wire is a bundle of gi+1 + pi and pi pi+1
Figure 1.17 Structure of the 16-bit Ladner-Fischer adder
Figure 1.18 Structure of the 16-bit Kogge-Stone adder
Figure 1.19 Structure of the 16-bit Han-Carlson parallel prefix adder
Figure 1.20 Schematic of a 4-bit conditional sum adder
Figure 2.1 Compare CMOS with PTL, a question was raised in the author's mind: "Does any rule exist that contains all good?"
Figure 2.2 (a) K-map of the XOR function. (b) Modified K-map of the XOR Function
Figure 2.3 (a),(b) The original circling procedures of the 2-input XOR modified K-map
Figure 2.4 Priorities of prime implicant patterns are deduced from electrical characteristics
Figure 2.5 2-input variables prime implicant patterns' priority
Figure 2.6 The proposed PPIPP successfully gets a mixed logic with minimum gate count
Figure 2.7 Using bit field structure to reduce the memory requirement effectively
Figure 2.8 Full swing 2-input XOR functions. (a) The proposed logic style. (b) The DPL structure. (c) The DVL structure. (d) The static CMOS structure
Figure 2.9 Function F = Not(A) + B * Not(C) by PPIPP
Figure 3.1 (a) Comparisons of the subthreshold conduction currents of low Vt devices. (b) Low Vt devices with the switch-source impedance (SSI) circuit
Figure 3.2 Schematic diagram of the latched CPL circuit
Figure 3.3 Schematic diagram of the DPL circuit
Figure 3.4 Block diagram of the CSCPTL circuit
Figure 3.5 Schematic diagram of the CSCPTL circuit
Figure 3.6 HSPICE-simulated timing diagram of the CSCPTL circuit
Figure 3.7 Schematic diagram of the current-sensing buffer of the LCSCPTL circuit
Figure 3.8 HSPICE-simulated timing diagrams of the LCSCPTL and CSCPTL circuits
Figure 3.9 Logic trees of the LCSCPTL circuit
Figure 3.10 Layout of the CPL and the LCSCPTL circuits for a two-input AND gate
Figure 3.11 Gates speed comparisons of the CPL, the CSCPTL, the LCSCPTL and the static CMOS circuits
Figure 3.12 Power dissipation comparisons of the CPL, the CSCPTL, the LCSCPTL and the static CMOS circuits
Figure 3.13 Energy consumed for the active mode and idle mode
Figure 4.1 Pipeline system
Figure 4.2 Schematic of TSPC Φ-Sec
Figure 4.3 Schematic of TSPC ΦB-Sec
Figure 4.4 Schematic of the NSTSPC
Figure 4.5 Schematic of the ANTSPC
Figure 4.6 Number of stacked MOS's versus the output delay time
Figure 4.7 Model of the number of stacked MOS's versus Maximum frequency
Figure 4.8 Max. frequency and power-freq ratio versus number of stacked MOS transistors. The loading is one minimum size inverter
Figure 4.9 Max. frequency and power-freq ratio versus number of stacked MOS transistors. The loading is four minimum size inverters
Figure 4.10 Max. frequency and power-freq ratio versus supply voltage
Figure 4.11 8-bit pipeline CLA architecture. (●: operator, ○: latch)
Figure 4.12 (a) Circuit by NSTSPC for Φ-Sec. (b) Circuit by ANTSPC for ΦB-Sec
Figure 4.13 Two-level hierarchical 64-bit CLA architecture
Figure 4.14 Test scheme for post-layout simulation
Figure 4.15 2.5V 1.25GHz carry-lookahead adder simulation results
Figure 5.1 Conditional sum addition rules
Figure 5.2 Schematic and critical delay path of 8-bit conditional sum adder
Figure 5.3 The Ware's 8-bit conditional carry adder uses no multiplexer
Figure 5.4 Conditional carry addition rules
Figure 5.5 After remove the multiplexers for sum output signals in the CSA, the rest multiplexer network only generate C0, C1, C3, and C7
Figure 5.6 Schematic and critical delay path of 8-bit conditional carry adder
Figure 6.1 Layout of the CPL and the LCSCPTL circuits for a two-input AND gate
Figure 6.2 Layout of 32-bit Conditional Sum Adder (CSA) using static CMOS (Single-end) logic. [AREA: 920µm × 550µm]
Figure 6.3 Layout of 32-bit Conditional Carry Adder (CCA) using static CMOS (Single-end) logic. [AREA: 920µm × 460µm]
Figure 6.4 Layout of 32-bit Conditional Sum Adder (CSA) using CPL logic. (Differential-end) [AREA: 905µm × 720µm]
Figure 6.5 Layout of 32-bit Conditional Carry Adder (CCA) using CPL logic. (Differential-end) [AREA: 905µm × 550µm]
Figure 6.6 Chip photographs of the 32-b Conditional Carry Adder (CCA) using LCSCPL logic. (Differential-end) [AREA: 905µm × 550µm]
Figure 6.7 Propagation delay simulation comparison of 32-bit static CMOS CSAs and CCAs
Figure 6.8 Power-delay product simulation comparison of 32-bit static CMOS CSAs and CCAs
Figure 6.9 Propagation delay measurement comparison of 32-bit CSAs and CCAs
Figure 6.10 Power-delay product measurement comparison of 32-bit CSAs and CCAs
Figure 6.11 Propagation delay comparisons of 32-bit CCAs by CPL and LCSCPTL
Figure 6.12 8-bit CCA example by pure normal threshold voltage architecture
Figure 6.13 8-bit CCA example by normal-medium hybrid 1 architecture
Figure 6.14 8-bit CCA example by normal-medium hybrid 2 architecture
Figure 6.15 8-bit CCA example by normal-medium hybrid 3 architecture
Figure 6.16 8-bit CCA example by normal-medium hybrid 4 architecture
Figure 6.17 8-bit CCA example by normal-medium hybrid 5 architecture
Figure 6.18 8-bit CCA example by normal-medium hybrid 6 architecture
Figure 6.19 8-bit CCA example by pure medium threshold voltage architecture
Figure 6.20 Chip implementation of the 64-bit static CMOS CCA by NM5 hybrid scheme using TSMC 0.25 µm 1p5m+ multi-threshold CMOS process. (Core area: 350µm × 400µm, die size: 539µm × 636µm)
Figure 6.21 Compare and swap elements are vital for sorting
Figure 6.22 Three-level bitonic sorter
Figure 6.23 Classical circuits of integer comparator
Figure 6.24 Modified 1's complement for comparator design
Figure 6.25 8-bit brief example of the proposed comparator architecture

Note: Student ID: 686390096, academic year: 9
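The Conditional Sum Adder (CSA) that the dissertation takes as its starting point can be sketched in software. The code below is a minimal bit-level model of the classic conditional-sum scheme, not of the proposed CCA, whose construction the abstract does not detail. Each block precomputes its sum and carry for both possible carry-ins, and pairs of blocks are merged by selecting the upper block's result with the lower block's carry, which is the role a multiplexer plays in hardware. The function names and the tuple layout are assumptions made for illustration.

```python
def merge(lo, hi):
    """Merge two adjacent blocks, each given as (sum0, carry0, sum1, carry1, bits).

    sum0/carry0 assume a carry-in of 0, sum1/carry1 a carry-in of 1.
    The lower block's carry selects which precomputed result of the
    upper block is used (the multiplexer in a hardware CSA).
    """
    lo_s0, lo_c0, lo_s1, lo_c1, lo_bits = lo
    hi_s0, hi_c0, hi_s1, hi_c1, hi_bits = hi

    def select(lo_sum, lo_carry):
        hi_sum, hi_carry = (hi_s1, hi_c1) if lo_carry else (hi_s0, hi_c0)
        return lo_sum | (hi_sum << lo_bits), hi_carry

    s0, c0 = select(lo_s0, lo_c0)
    s1, c1 = select(lo_s1, lo_c1)
    return (s0, c0, s1, c1, lo_bits + hi_bits)


def conditional_sum_add(x: int, y: int, width: int = 8) -> tuple[int, int]:
    """Return (sum mod 2**width, carry out) using conditional-sum addition.

    Assumption for this sketch: width is a power of two, so blocks
    always pair up evenly at every merge level.
    """
    if width <= 0 or width & (width - 1):
        raise ValueError("width must be a positive power of two")

    # Level 0: one-bit blocks with results for carry-in 0 and carry-in 1.
    blocks = []
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        blocks.append((a ^ b, a & b, (a ^ b) ^ 1, a | b, 1))

    # Each level doubles the block size, giving log2(width) merge levels.
    while len(blocks) > 1:
        blocks = [merge(lo, hi) for lo, hi in zip(blocks[0::2], blocks[1::2])]

    total, carry, _, _, _ = blocks[0]  # the real carry-in is 0
    return total, carry
```

For example, `conditional_sum_add(200, 100, 8)` returns the low 8 bits of the sum and the carry out. The abstract attributes the CCA's savings to having fewer multiplexers and internal nodes than this scheme.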

Tamkang University Institutional Repository