41 research outputs found

    High-Performance Design of a 4-Bit Carry Look-Ahead Adder in Static CMOS Logic

    Get PDF
    Design of a 4-bit Carry Look-Ahead (CLA) process in static CMOS logic has been presented. CLA architecture proposed in this work computes carry-out terms without using carry-propagate and carry-generate signals which are used in conventional static CMOS (C-CMOS) 4-bit CLA adder. Performance parameters of the proposed 4-bit CLA architecture have been simulated and validated by comparing with the conventional design using Cadence design toolset in 45 nm technology. The designs were compared in terms of average power consumption, propagation delay and power delay product (PDP). The proposed 4-bit CLA topology obtained 34.53 % improvement in speed, 4.84 % improvement in power consumption and 37.696 % improvement in PDP while the source voltage was 1.0 V. Hence, as per acquired simulation results, the proposed 4-bit CLA structure is proven to be an excellent alternative to the conventional design for data-path design in modern high-performance processors.Design of a 4-bit Carry Look-Ahead (CLA) process in static CMOS logic has been presented. CLA architecture proposed in this work computes carry-out terms without using carry-propagate and carry-generate signals which are used in conventional static CMOS (C-CMOS) 4-bit CLA adder. Performance parameters of the proposed 4-bit CLA architecture has been simulated and validated by comparing with the conventional design using Cadence design toolset in 45 nm technology. The designs were compared in terms of average power consumption, propagation delay and power delay product (PDP). The proposed 4-bit CLA topology obtained 26.67 % improvement in speed, 5.966 % improvement in power consumption and 31.06 % improvement in PDP while the source voltage was 1.0 V. Hence, as per acquired simulation results, the proposed 4-bit CLA structure is proven to be an excellent alternative to the conventional design for data-path design in modern high-performance processors

    High Speed and Low Power Consumption Carry Skip Adder using Binary to Excess-One Converter

    Get PDF
    Arithmetic and Logic Unit (ALU) is a vital component of any CPU. In ALU, adders play a major role not only in addition but also in performing many other basic arithmetic operations like subtraction, multiplication, etc. Thus realizing an efficient adder is required for better performance of an ALU and therefore the processor. For the optimization of speed in adders, the most important factor is carry generation. For the implementation of a fast adder, the generated carry should be driven to the output as fast as possible, thereby reducing the worst path delay which determines the ultimate speed of the digital structure. In conventional carry skip adder the multiplexer is used as a skip logic that provides a better performance and performs an efficient operation with the minimum circuitry. Even though, it affords a significant advantages there may be a large critical path delay revealed by the multiplexer that leads to increase of area usage and power consumption. The basic idea of this paper is to use Binary to Excess-1 Converters (BEC) to achieve lower area and power consumption

    Power Efficient and High Speed Carry Skip Adder using Binary to Excess One Converter

    Get PDF
    The design of high-speed and low-power VLSI architectures need efficient arithmetic processing units, which are optimized for the performance parameters, namely, speed and power consumption. Adders are the key components in general purpose microprocessors and digital signal processors. As a result, it is very pertinent that its performance augers well for their speed performance. Additionally, the area is an essential factor which is to be taken into account in the design of fast adders. Towards this end, high-speed, low power and area efficient addition and multiplication have always been a fundamental requirement of high-performance processors and systems. The major speed limitation of adders arises from the huge carry propagation delay encountered in the conventional adder circuits, such as ripple carry adder and carry save adder. Observing that a carry may skip any addition stages on certain addend and augend bit values, researchers developed the carry-skip technique to speed up addition in the carry-ripple adder. Using a multilevel structure, carry-skip logic determines whether a carry entering one block may skip the next group of blocks. Because multilevel skip logic introduces longer delays, Therefore, in this paper we examine The basic idea of this work is to use Binary to Excess- 1 converter (BEC) instead of RCA with Cin=1 in conventional CSkA in order to reduce the area and power. BEC uses less number of logic gates than N-bit full adder

    High-Performance, Energy-Efficient CMOS Arithmetic Circuits

    Get PDF
    In a modern microprocessor, datapath/arithmetic circuits have always been an important building block in delivering high-performance, energy-efficient computing, because arithmetic operations such as addition and binary number comparison are two of the most commonly used computing instructions. Besides the manufacturing CMOS process, the two most critical design considerations for arithmetic circuits are the logic style and micro-architecture. In this thesis, a constant-delay (CD) logic style is proposed targeting full-custom high-speed applications. The constant delay characteristic of this logic style (regardless of the logic type) makes it suitable for implementing complicated logic expressions such as addition. CD logic exhibits a unique characteristic where the output is pre-evaluated before the inputs from the preceding stage are ready. This feature enables a performance advantage over static and dynamic domino logic styles in a single cycle, multi-stage circuit block. Several design considerations including timing window width adjustment and clock distribution are discussed. Using a 65-nm general-purpose CMOS technology, the proposed logic style demonstrates an average speedup of 94% and 56% over static and dynamic domino logic, respectively, in five different logic gates. Simulation results of 8-bit ripple carry adders conclude that CD logic is 39% and 23% faster than the static and dynamic-based adders, respectively. CD logic also demonstrates 39% speedup and 64% (22%) energy-delay product reduction from static logic at 100% (10%) data activity in 32-bit carry lookahead adders. To confirm CD logic's potential, a 148 ps, single-cycle 64-bit adder with CD logic implemented in the critical path is fabricated in a 65-nm, 1-V CMOS process. A new 64-bit Ling adder micro-architecture, which utilizes both inversion and absorption properties to minimize the number of CD logic and the number of logic stage in the critical path, is also proposed. At 1-V supply, this adder's measured worst-case power and leakage power are 135 mW and 0.22 mW, respectively. A single-cycle 64-bit binary comparator utilizing a radix-2 tree structure is also proposed. This comparator architecture is specifically designed for static logic to achieve both low-power and high-performance operation, especially in low input data activity environments. At 65-nm technology with 25% (10%) data activity, the proposed design demonstrates 2.3x (3.5x) and 3.7x (5.8x) power and energy-delay product efficiency, respectively. This comparator is also 2.7x faster at iso-energy (80 fJ) or 3.3x more energy-efficient at iso-delay (200 ps) than existing designs. An improved comparator, where CD logic is utilized in the critical path to achieve high performance without sacrificing the overall energy efficiency, is also realized in a 65-nm 1-V CMOS process. At 1-V supply, the proposed comparator's measured delay is 167 ps, and has an average power and a leakage power of 2.34 mW and 0.06 mW, respectively. At 0.3-pJ iso-energy or 250-ps iso-delay budget, the proposed comparator with CD logic is 20% faster or 17% more energy-efficient compared to a comparator implemented with just the static logic

    Demonstration of Inexact Computing Implemented in the JPEG Compression Algorithm using Probabilistic Boolean Logic applied to CMOS Components

    Get PDF
    Probabilistic computing offers potential improvements in energy, performance, and area compared with traditional digital design. This dissertation quantifies energy and energy-delay tradeoffs in digital adders, multipliers, and the JPEG image compression algorithm. This research shows that energy demand can be cut in half with noisesusceptible16-bit Kogge-Stone adders that deviate from the correct value by an average of 3 in 14 nanometer CMOS FinFET technology, while the energy-delay product (EDP) is reduced by 38 . This is achieved by reducing the power supply voltage which drives the noisy transistors. If a 19 average error is allowed, the adders are 13 times more energy-efficient and the EDP is reduced by 35 . This research demonstrates that 92 of the color space transform and discrete cosine transform circuits within the JPEG algorithm can be built from inexact components, and still produce readable images. Given the case in which each binary logic gate has a 1 error probability, the color space transformation has an average pixel error of 5.4 and a 55 energy reduction compared to the error-free circuit, and the discrete cosine transformation has a 55 energy reduction with an average pixel error of 20

    ARITHMETIC LOGIC UNIT ARCHITECTURES WITH DYNAMICALLY DEFINED PRECISION

    Get PDF
    Modern central processing units (CPUs) employ arithmetic logic units (ALUs) that support statically defined precisions, often adhering to industry standards. Although CPU manufacturers highly optimize their ALUs, industry standard precisions embody accuracy and performance compromises for general purpose deployment. Hence, optimizing ALU precision holds great potential for improving speed and energy efficiency. Previous research on multiple precision ALUs focused on predefined, static precisions. Little previous work addressed ALU architectures with customized, dynamically defined precision. This dissertation presents approaches for developing dynamic precision ALU architectures for both fixed-point and floating-point to enable better performance, energy efficiency, and numeric accuracy. These new architectures enable dynamically defined precision, including support for vectorization. The new architectures also prevent performance and energy loss due to applying unnecessarily high precision on computations, which often happens with statically defined standard precisions. The new ALU architectures support different precisions through the use of configurable sub-blocks, with this dissertation including demonstration implementations for floating point adder, multiply, and fused multiply-add (FMA) circuits with 4-bit sub-blocks. For these circuits, the dynamic precision ALU speed is nearly the same as traditional ALU approaches, although the dynamic precision ALU is nearly twice as large

    A Constant Delay Logic Style - An Alternative Way of Logic Design

    Get PDF
    High performance, energy efficient logic style has always been a popular research topic in the field of very large scale integrated (VLSI) circuits because of the continuous demands of ever increasing circuit operating frequency. The invention of the dynamic logic in the 80s is one of the answers to this request as it allows designers to implement high performance circuit block, i.e., arithmetic logic unit (ALU), at an operating frequency that traditional static and pass transistor CMOS logic styles are difficult to achieve. However, the performance enhancement comes with several costs, including reduced noise margin,charge-sharing noise, and higher power dissipation due to higher data activity. Furthermore, dynamic logic has gradually lost its performance advantage over static logic due to the increased self-loading ratio in deep-submicron technology (65nm and below) because of the additional NMOS CLK footer transistor. Because of dynamic logic's limitations and diminished speed reward, a slowly rising need has emerged in the past decade to explore new logic style that goes beyond dynamic logic. In this thesis a constant delay (CD) logic style is proposed. The constant delay characteristic of this logic style regardless of the logic expression makes it suitable in implementing complicated logic expression such as addition. Moreover, CD logic exhibits a unique characteristic where the output is pre-evaluated before the inputs from the preceding stage is ready. This feature enables performance advantage over static and dynamic logic styles in a single cycle, multi-stage circuit block. Several design considerations including appropriate timing window width adjustment to reduce power consumption and maintain sufficient noise margin to ensure robust operations are discussed and analyzed. Using 65nm general purpose CMOS technology, the proposed logic demonstrates an average speed up of 94% and 56% over static and dynamic logic respectively in five different logic expressions. Post layout simulation results of 8-bit ripple carry adders conclude that CD-based design is 39% and 23% faster than the static and dynamic-based adders respectively. For ultra-high speed applications, CD-based design exhibits improved energy, power-delay product, and energy-delay product efficiency compared to static and dynamic counterparts

    An Ultra-Low-Power 75mV 64-Bit Current-Mode Majority-Function Adder

    Get PDF
    Ultra-low-power circuits are becoming more desirable due to growing portable device markets and they are also becoming more interesting and applicable today in biomedical, pharmacy and sensor networking applications because of the nano-metric scaling and CMOS reliability improvements. In this thesis, three main achievements are presented in ultra-low-power adders. First, a new majority function algorithm for carry and the sum generation is presented. Then with this algorithm and implied new architecture, we achieved a circuit with 75mV supply voltage operation. Last but not least, a 64 bit current-mode majority-function adder based on the new architecture and algorithm is successfully tested at 75mV supply voltage. The circuit consumed 4.5nW or 3.8pJ in one of the worst conditions
    corecore