Abstract
Skip Adder (CSkA) [5] [6], carry increment [7] [8] and carry select [9] [10] adders have O(n) area and O(√n) delay, which provides a good compromise in terms of area and delay, along with a simple and regular layout. Carry save adders have O(n) area and O(log n) delay. CLA adders can be realized in two gate levels provided there is no limit on fan-in/fan-out. Carry select adders (CSelA) reduce the computation time by pre-computing the sum for both possible carry bit values (i.e., '0' and '1'). After the carry becomes available, the correct sum is selected using a multiplexer. Carry select adders are in the class of fast adders, but they suffer from a fan-out limitation since the number of multiplexers that need to be driven by the carry signal increases exponentially. In the worst case, a carry signal is used to select n/2 multiplexers in an n-bit adder. When three or more operands are to be added simultaneously using two-operand adders, the time-consuming carry propagation must be repeated several times. If the number of operands is 'k', then the carries have to propagate (k-1) times. The existing adder topology is presented in Figure (1).
In the present work, the design of 8-bit adder topologies, namely the ripple carry adder, carry look-ahead adder, carry skip adder, carry select adder, carry increment adder, carry save adder and carry bypass adder, is presented. The functionality and performance analysis are done using Microwind. Microwind integrates the traditionally separate front-end and back-end chip design into a single flow, accelerating the design cycle and reducing design complexity. It tightly integrates mixed-signal implementation with digital implementation, circuit simulation, transistor-level extraction and verification. Performance issues such as area, power dissipation and propagation delay are analyzed for all the adders in 0.12µm 6-metal-layer CMOS technology using the Microwind tool.
The remainder of this paper is organized as follows. Section II explains the topology details of the 8-bit adders. Section III presents the performance analysis. Section IV presents the simulation results implemented in 0.12µm CMOS technology. Section V summarizes the comparison, and the final section presents the conclusion.
II. ADDER TOPOLOGIES
This section presents the design of adder topology. In this work the following adder structures are used:
• Ripple Carry Adder 
A. Ripple Carry Adder (RCA)
The ripple carry adder is constructed by cascading full adder (FA) blocks in series. One full adder is responsible for the addition of two binary digits at any stage of the ripple carry chain. The carry-out of one stage is fed directly to the carry-in of the next stage. Even though this is a simple adder that can be used to add numbers of unrestricted bit length, it is not very efficient when numbers with many bits are used. One of the most serious drawbacks of this adder is that the delay increases linearly with the bit length. The worst-case delay of the RCA occurs when a carry signal transition ripples through all stages of the adder chain from the least significant bit to the most significant bit, which is approximated by:
$$t = (n - 1)\,t_c + t_s \qquad \text{Eq (1)}$$
where $t_c$ is the delay through the carry stage of a full adder, and $t_s$ is the delay to compute the sum of the last stage. The delay of the ripple carry adder is linearly proportional to n, the number of bits; therefore the performance of the RCA is limited as n grows. The advantages of the RCA are lower power consumption as well as a compact layout giving smaller chip area. The design schematic of the RCA is shown in Figure (2). The simulation result is shown in Figure (3a).
B. Carry Save Adder (CSA)
The carry-save adder [11] [12] reduces the addition of 3 numbers to the addition of 2 numbers. The propagation delay is 3 gates regardless of the number of bits. The carry-save unit consists of n full adders, each of which computes a single sum bit and a single carry bit based solely on the corresponding bits of the three input numbers. The entire sum can then be computed by shifting the carry sequence left by one place, appending a 0 to the front (most significant bit) of the partial sum sequence, and adding the two sequences with an RCA, which produces the resulting n + 1-bit value. This process can be continued indefinitely, adding an input for each stage of full adders, without any intermediate carry propagation. These stages can be arranged in a binary tree structure, with a cumulative delay that is logarithmic in the number of inputs to be added and independent of the number of bits per input. The carry-save algorithm, well known from multiplier architectures, is used for efficient CMOS implementation of a much wider variety of algorithms for high-speed digital signal processing. A CSA applied in the partial product lines of an array multiplier speeds up the carry propagation in the array. The design schematic of Carry Save Adder is shown in Figure ( 
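The ripple-carry and carry-save behaviour described above can be illustrated with a small bit-level model. This is a minimal sketch, not the Microwind design used in this work: the function names (`full_adder`, `ripple_carry_add`, `carry_save_add`, `rca_worst_case_delay`) are hypothetical, and the final ripple-carry pass is assumed to be n + 2 bits wide so that the sum of three full-scale n-bit operands cannot overflow.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, n=8, cin=0):
    """Add two n-bit integers by chaining n full adders, LSB first."""
    carry = cin
    result = 0
    for i in range(n):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry  # n-bit sum and final carry-out


def rca_worst_case_delay(n, t_c, t_s):
    """Eq (1): (n - 1) carry-stage delays plus one sum delay."""
    return (n - 1) * t_c + t_s


def carry_save_step(x, y, z, n=8):
    """Reduce three n-bit operands to a partial-sum word and a carry word.

    Each bit position uses its own full adder; no carry moves between
    positions, so the step delay does not depend on n.
    """
    partial_sum = 0
    carries = 0
    for i in range(n):
        s, c = full_adder((x >> i) & 1, (y >> i) & 1, (z >> i) & 1)
        partial_sum |= s << i
        carries |= c << i
    return partial_sum, carries


def carry_save_add(x, y, z, n=8):
    """Three-operand addition: one carry-save step, then one RCA pass."""
    partial_sum, carries = carry_save_step(x, y, z, n)
    # Shift the carry word left by one place, then resolve it with an RCA.
    # Assumption: n + 2 bits are used so three full-scale operands fit.
    total, _ = ripple_carry_add(partial_sum, carries << 1, n + 2)
    return total
```

Only the last call to `ripple_carry_add` propagates a carry across bit positions, which is why adding more operands through further carry-save steps does not add further carry-propagation delay.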
C. Carry Look-Ahead Adder
The carry look-ahead adder is designed to overcome the latency introduced by the rippling of the carry bits. The propagation delay incurred in parallel adders can be eliminated by the carry look-ahead adder. This adder is based on the principle of examining the lower-order bits of the augend and addend to determine whether a higher-order carry is generated. It reduces the carry delay by reducing the number of gates through which a carry signal must propagate. Carry look-ahead depends on two things: calculating, for each digit position, whether that position will propagate a carry if one comes in from the right; and combining these calculated values to deduce quickly whether, for each group of digits, that group will propagate a carry that comes in from the right. The net effect is that the carries start by propagating slowly through each 4-bit group, just as in a ripple-carry system, but then move 4 times faster, leaping from one look-ahead carry unit to the next. Finally, within each group that receives a carry, the carry propagates slowly through the digits of that group.
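The look-ahead principle can be sketched in a few lines. The sketch below assumes the common textbook definitions of the per-bit signals, generate g_i = a_i AND b_i and propagate p_i = a_i XOR b_i, and a single flat 8-bit look-ahead level rather than the 4-bit grouping discussed above; the function name `carry_lookahead_add` is hypothetical.

```python
def carry_lookahead_add(x, y, n=8, cin=0):
    """Behavioural model of a flat n-bit carry look-ahead adder."""
    # Assumed textbook definitions of generate and propagate.
    g = [((x >> i) & 1) & ((y >> i) & 1) for i in range(n)]
    p = [((x >> i) & 1) ^ ((y >> i) & 1) for i in range(n)]

    carries = [cin]
    for i in range(n):
        # c[i+1] = g[i] | p[i]g[i-1] | ... | p[i]...p[1]g[0] | p[i]...p[0]cin
        # Every carry is written in terms of the inputs only, never in
        # terms of the previous carry, which is what removes the ripple.
        c = 0
        prefix = 1  # running product p[i] & p[i-1] & ... & p[j+1]
        for j in range(i, -1, -1):
            c |= prefix & g[j]
            prefix &= p[j]
        c |= prefix & cin
        carries.append(c)

    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i  # sum bit = p_i XOR c_i
    return total, carries[n]
```

In hardware, the inner loop corresponds to one wide AND-OR expression per carry, which is why the two-gate-level realization requires unrestricted fan-in.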
This adder consists of three stages: a propagate/generate block, a sum generator and a carry generator. The generate block can be realized using the expression
Eq (2)
Similarly, the propagate block can be realized using the expression

An 8-bit carry increment adder includes two ripple carry adders (RCA) of four bits each. The first ripple carry adder adds the lower 4 bits of the inputs, generating a partitioned sum and a partitioned carry. The carry-out of this first RCA block is applied to the CIN input of the conditional increment block, and the lower four sum bits are taken directly from the first RCA output. The second RCA block performs its addition regardless of the first RCA output, and its results are fed to the conditional increment block. The CIN input of the first RCA block is always held low. The conditional increment block consists of half adders. Depending on the value of the carry-out of the first RCA block, the half adders in the carry increment block increment the output of the second RCA. Hence the upper sum bits are taken from the second RCA through the carry increment block. The design schematic of Carry Increment Adder is shown in Figure ( 
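The carry increment scheme described above can be modelled as follows. This is a behavioural sketch under the assumption that the 8-bit operands are split into two 4-bit halves exactly as in the text; the helper names (`ripple_add`, `conditional_increment`, `carry_increment_add8`) are hypothetical.

```python
def ripple_add(x, y, n, cin=0):
    """n-bit ripple carry adder: returns (sum, carry_out)."""
    carry, result = cin, 0
    for i in range(n):
        a, b = (x >> i) & 1, (y >> i) & 1
        result |= (a ^ b ^ carry) << i
        carry = (a & b) | (carry & (a ^ b))
    return result, carry


def half_adder(a, b):
    """One-bit half adder: returns (sum, carry_out)."""
    return a ^ b, a & b


def conditional_increment(value, inc, n):
    """Add the single bit `inc` to an n-bit value with a half-adder chain."""
    carry, result = inc, 0
    for i in range(n):
        s, carry = half_adder((value >> i) & 1, carry)
        result |= s << i
    return result, carry


def carry_increment_add8(x, y):
    """8-bit carry increment adder built from two 4-bit RCAs."""
    # Both RCA blocks start with carry-in 0 and can work in parallel.
    lo_sum, lo_cout = ripple_add(x & 0xF, y & 0xF, 4, cin=0)
    hi_sum, hi_cout = ripple_add((x >> 4) & 0xF, (y >> 4) & 0xF, 4, cin=0)
    # The upper result is incremented only if the lower block carried out.
    hi_inc, inc_cout = conditional_increment(hi_sum, lo_cout, 4)
    # The upper RCA and the incrementer cannot both overflow, so OR suffices.
    return (hi_inc << 4) | lo_sum, hi_cout | inc_cout
```

Compared with a carry select adder, the second copy of the upper RCA is replaced by the cheaper half-adder chain.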
D. Carry Skip Adder (CSkA)
A carry-skip adder consists of a simple ripple carry adder with a special speed-up carry chain called a skip chain. The carry skip adder is faster than the ripple carry adder when numbers with many bits are added; it has O(√n) delay, which provides a good compromise in terms of delay, along with a simple and regular layout. The skip chain defines the distribution of the ripple carry blocks that compose the skip adder. A carry-skip adder is designed to speed up a wide adder by aiding the propagation of a carry bit around a portion of the entire adder. In fact, the ripple carry adder is faster for small values of N. However, current industrial demands, with most desktop computers and multimedia processors using word lengths of 32 bits, make the carry-skip structure more attractive. The crossover point between the ripple carry adder and the carry skip adder depends on technology considerations and normally lies between 4 and 8 bits. The carry-skip circuitry consists of two logic gates. The AND gate accepts the carry-in bit and compares it to the group propagate signal 
E. Carry Bypass Adder (CByA)
In a ripple-carry adder, every full adder cell has to wait for the incoming carry before an outgoing carry can be generated. This dependency can be eliminated by introducing an additional bypass (skip) path to speed up the operation of the adder. An incoming carry Ci,0 = 1 propagates through the complete adder chain and causes an outgoing carry Co,7 = 1 provided that all propagate signals are 1. This information can be used to speed up the operation of the adder, as shown in Figure (8). When BP = P0P1P2P3P4P5P6P7 = 1, the incoming carry is forwarded immediately to the next block through the bypass; otherwise, the carry is obtained via the normal route. That is, if P0P1P2P3P4P5P6P7 = 1 then Co,7 = Ci,0; otherwise either a Delete or a Generate has occurred. Hence, in a CByA the full adders are divided into groups, each of which is "bypassed" by a multiplexer if all of its full adders are in propagate mode. The simulation result is shown in Figure (3f).
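The bypass condition can be illustrated with a short behavioural model. The sketch assumes a single 8-bit bypass group and the textbook propagate definition p_i = a_i XOR b_i; the function name `carry_bypass_add8` is hypothetical, and the third return value is included only to expose the BP signal.

```python
def carry_bypass_add8(x, y, cin=0):
    """8-bit adder with one bypass group: returns (sum, carry_out, bp)."""
    p = [((x >> i) & 1) ^ ((y >> i) & 1) for i in range(8)]  # propagate
    g = [((x >> i) & 1) & ((y >> i) & 1) for i in range(8)]  # generate

    # Normal route: the carry ripples through all eight stages.
    carry, total = cin, 0
    for i in range(8):
        total |= (p[i] ^ carry) << i
        carry = g[i] | (p[i] & carry)

    # Bypass route: BP = P0 P1 ... P7. When BP is 1 the multiplexer
    # forwards the incoming carry without waiting for the ripple chain.
    bp = int(all(p))
    cout = cin if bp else carry
    return total, cout, bp
```

Both routes give the same logical carry; the benefit of the multiplexer lies in timing, since the worst-case ripple path is no longer exercised when every stage propagates.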
F. Carry Select Adder (CSelA)
A carry-select adder is divided into sectors, each of which (except for the least significant) performs two additions in parallel, one assuming a carry-in of zero, the other a carry-in of one. A four-bit carry select adder generally consists of two ripple carry adders and a multiplexer. The carry-select adder is simple but rather fast, having a gate-level depth of O(√n). Adding two n-bit numbers with a carry select adder is done with two adders (two ripple carry adders) in order to perform the calculation twice, once assuming a carry of zero and once assuming a carry of one. After the two results are calculated, the correct sum, as well as the correct carry, is selected with the multiplexer once the actual carry is known. The design schematic of the Carry Select Adder is shown in Figure (9). A carry-select adder is 40% to 90% faster than an RCA because it performs additions in parallel and reduces the maximum carry path. The simulation result is shown in Figure (3g).
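The selection mechanism can be sketched as a behavioural model. This assumes an 8-bit adder split into two 4-bit sectors, with only the upper sector duplicated; the names `ripple_add` and `carry_select_add8` are hypothetical.

```python
def ripple_add(x, y, n, cin=0):
    """n-bit ripple carry adder: returns (sum, carry_out)."""
    carry, result = cin, 0
    for i in range(n):
        a, b = (x >> i) & 1, (y >> i) & 1
        result |= (a ^ b ^ carry) << i
        carry = (a & b) | (carry & (a ^ b))
    return result, carry


def carry_select_add8(x, y, cin=0):
    """8-bit carry select adder with two 4-bit sectors."""
    x_lo, y_lo = x & 0xF, y & 0xF
    x_hi, y_hi = (x >> 4) & 0xF, (y >> 4) & 0xF

    # The least significant sector is a plain RCA with the real carry-in.
    lo_sum, lo_cout = ripple_add(x_lo, y_lo, 4, cin)

    # The upper sector is computed twice, once for each possible carry-in.
    result_if_0 = ripple_add(x_hi, y_hi, 4, cin=0)
    result_if_1 = ripple_add(x_hi, y_hi, 4, cin=1)

    # Multiplexer: the carry-out of the lower sector selects the result.
    hi_sum, cout = result_if_1 if lo_cout else result_if_0
    return (hi_sum << 4) | lo_sum, cout
```

The three `ripple_add` calls are independent of one another, so in hardware they run in parallel and the critical path is one 4-bit ripple plus one multiplexer delay.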
III. PERFORMANCE ANALYSIS
To evaluate performance, the adder structures discussed in this paper were designed in 0.12µm CMOS technology using Microwind. The Microwind tool integrates the traditionally separate front-end and back-end chip design into a single flow, accelerating the design cycle and reducing design complexity. It tightly integrates mixed-signal implementation with digital implementation, circuit simulation, transistor-level extraction and verification. All simulations are carried out at nominal conditions: VDD=1. Table 1 presents the performance analysis of the different adder topologies. Table 2 presents the AT, AT² and PD values of the adders. Tables 3 and 4 present the energy-delay and parasitic extraction values. All the adders are simulated at multiple design corners (TT, FF, FS, and SS) to verify operation across variations in device characteristics and environment. To establish an unbiased testing environment, the simulations have been carried out using a comprehensive input signal pattern that covers every possible transition for all the adders. The frequencies have been chosen in the range from 10 to 500MHz, and the input and output capacitances are set to 10pF.
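The composite metrics can be computed from the raw measurements as sketched below. This assumes the usual meanings of the abbreviations, which the text does not spell out: AT as the area-delay product, AT² as area times delay squared, and PD as the power-delay product. The function name and the numbers in the usage note are hypothetical and are not results from this work.

```python
def figures_of_merit(area, delay, power):
    """Composite metrics from area, delay and power of one adder.

    Assumed definitions: AT = area * delay, AT2 = area * delay**2,
    PD = power * delay. Units follow whatever units are passed in.
    """
    return {
        "AT": area * delay,
        "AT2": area * delay ** 2,
        "PD": power * delay,
    }
```

For example, `figures_of_merit(2, 3, 5)` returns `{"AT": 6, "AT2": 18, "PD": 15}`; a lower value indicates a better trade-off for each metric.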
IV. SIMULATION RESULT
This section presents the simulated results of the adder topologies. The above adder topologies are simulated using Microwind DSCH 3.1. Functional testing and timing analysis were carried out for all the adder modules used in this work. The Microwind software is dedicated to training in sub-micron CMOS VLSI design and consists of a layout editor, an electrical circuit extractor and a fast on-line analog simulator. The technology library used in this work is a 6-metal-layer 0.12µm CMOS technology; consequently lambda is 0.06µm (60nm). The Microwind suite provides two environments, DSCH and MW, which are used to validate the logic design simulation with delay analysis and the physical circuit extraction. All the adders used in this work are simulated using DSCH. First, the simulation is performed using schematic entry, the corresponding test patterns are generated, and the functionality is verified. After verification, the schematic file is converted to a Verilog file. Second, in the MW environment, the Verilog file is imported using the command "compile verilog file" so that the schematic of the logic design is converted into a physical layout. Using this physical layout, parasitic values such as resistance and capacitance, as well as node voltages and currents, can be estimated. When the design is converted into a physical layout, the MW tool automatically generates the SPICE netlist, which provides information regarding the transistor model used, its temperature condition and the transistor second-order values. An extracted SPICE netlist for a full adder is shown in Figure 14. The simulation results of the adder topologies are shown in Figure (3).
V. SUMMARY
In this work, the performance of the adder topologies is tested for robustness against area, delay and power dissipation. These topologies are selected because they are commonly used in many applications. Addition is an indispensable operation for any high-speed digital system, digital signal processing system or control system. Therefore a pertinent choice of adder topology is of essential importance in the design of VLSI integrated circuits for high-speed and high-performance CMOS circuits. The operating frequency of the adder topologies is set at 500MHz, and their power dissipation and delay are observed. The graph in Figure (10a) shows the distribution of power dissipation values across the different adder topologies. Figures (10b, c, d) represent the area distribution, transistor count and delay distribution of the adders.
From the power distribution graph it is observed that the maximum power dissipation occurs for the carry select adder, followed by the carry save adder. The least power dissipation occurs for the ripple carry and carry increment adders. From the area distribution and gate count, the carry select and carry save adders have the largest area and gate count, while the ripple carry and carry increment adders have the smallest. From the delay comparison it is observed that the maximum delay occurs for the ripple carry adder. The minimum delay occurs for the carry select, carry increment and carry save adders. The overall comparison presents the tradeoff between area, power dissipation and delay. Figure (12) shows the automated layout generated using Microwind MW03. All data for area, delay and power dissipation are obtained with the Microwind tool, and simulations are performed in the 0.12µm technology with power calculated using the Predictive Technology Model (PTM). The granularity of transistor size is set to a minimum width of 1.02µm and a minimum length of 0.12µm for NMOS, and a minimum width of 1.98µm and a minimum length of 0.12µm for PMOS. The simulated results for the maximum and average drain currents, IDDMAX and IDDAVG, are shown in Figure 13.
VI. CONCLUSION
In this work, an exhaustive analysis of adder topologies in 0.12µm CMOS technology has been carried out. The comparison has been performed in terms of area, delay and power dissipation. The impact of layout parasitics has been included in the transistor-level design phase. The performance analysis, simulation results and comparison are reported in Sections III, IV and V. According to the presented results, the adder topologies with the best compromise between area, delay and power dissipation are the carry look-ahead and carry increment adders, which are suitable for high-performance and low-power circuits. The fastest adders are the carry select and carry save adders, at the penalty of area. The simplest adder topologies, suitable for low-power applications, are the ripple carry, carry skip and carry bypass adders, which have the least gate count and the maximum delay.
