

# International Journal of VLSI System Design and Communication Systems

ISSN 2322-0929 Vol.04, Issue.04, April-2016, Pages:0252-0255

# Low-Power and Area-Efficient Carry Select Adder

BSSV RAMESH BABU<sup>1</sup>, M. UDAY KUMAR<sup>2</sup>, K. VENUGOPAL<sup>3</sup>, B. BABJI<sup>4</sup>

Dept of ECE, Raghu Institute of Technology, Visakhapatnam, AP, India.

**Abstract:** The Carry Select Adder (CSLA) is one of the fastest adders used in many data-processing processors to perform fast arithmetic functions. From the structure of the CSLA, it is clear that there is scope for reducing its area and power consumption. This work uses a simple and efficient gate-level modification to significantly reduce the area and power of the CSLA. Based on this modification, 8-, 16-, 32-, and 64-b square-root CSLA (SQRT CSLA) architectures have been developed and compared with the regular SQRT CSLA architecture. The proposed design has reduced area and power compared with the regular SQRT CSLA, with only a slight increase in the delay. This work evaluates the performance of the proposed designs in terms of delay, area, power, and their products, both by hand with logical effort and through custom design and layout in CMOS process technology. The results analysis shows that the proposed CSLA structure takes only 30.385 ns, which is better than the regular SQRT CSLA.

Keywords: Application-Specific Integrated Circuit (ASIC), Area-Efficient, CSLA, Low Power.

#### I. INTRODUCTION

Design of area- and power-efficient high-speed data path logic systems is one of the most substantial areas of research in VLSI system design. In digital adders, the speed of addition is limited by the time required to propagate a carry through the adder<sup>[1]</sup>. The sum for each bit position in an elementary adder is generated sequentially, only after the previous bit position has been summed and a carry propagated into the next position<sup>[2]</sup>. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum. However, the CSLA is not area efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate the partial sum and carry for each possible carry input, after which the final sum and carry are selected by multiplexers (mux) [3].

The basic idea of this work is to use a Binary to Excess-1 Converter (BEC) instead of an RCA within the regular CSLA to achieve lower area and power consumption [4]. The main advantage of the BEC logic is that it needs fewer logic gates than the n-bit Full Adder (FA) structure. The SQRT CSLA has been chosen for comparison with the proposed design because it has a more balanced delay and requires lower power and area [5],[6]. This brief is structured as follows. Section II deals with the delay and area evaluation methodology of the existing technology. Section III presents the detailed structure and function of the BEC logic. The area evaluation methodology of the proposed SQRT CSLA is presented in Section IV<sup>[7]</sup>. The comparison of the existing RCA-based and proposed BEC-based adders in terms of delay and power is shown in Section V<sup>[8]</sup>. Finally, the work is concluded in Section VI<sup>[9]</sup>.

#### II. EXISTING TECHNOLOGY

#### A. Regular Carry Select Adder

1. The structure of the 32-bit Carry Select Adder has five groups of Ripple Carry Adders of different sizes. The delay and area evaluation of each group is shown in Fig. 2, in which the numerals within [ ] specify the delay values. For example, group2, shown in Fig. 2(a), has two sets of 2-b RCA and consists of three FAs, one HA, and one 6:3 mux. The 6:3 mux contains 12 gates (a combination of 4:2 and 2:1 muxes).

$$\text{Gate count} = \text{FA} + \text{HA} + \text{6:3 mux} = (3 \times 13) + (1 \times 6) + (1 \times 12) = 57 \qquad (1)$$

2. Consider some example delay values. The arrival time of the selection input $c1\,[\text{time}(t) = 7]$ of the 6:3 mux is earlier than $s3\,[t = 8]$ and later than $s2\,[t = 6]$. Thus, $sum3\,[t = 11]$ is the summation of $s3$ and the mux delay $[t = 3]$, and $sum2\,[t = 10]$ is the summation of $c1$ and the mux delay. Except for group2, the arrival time of the mux selection input is always greater than the arrival time of the data outputs from the RCAs. Thus, the delays of group3 to group5 are determined, respectively, as follows.

$$
\begin{aligned}
\{c6,\ sum[6:4]\} &= c3\,[t = 10] + mux \\
\{c10,\ sum[10:7]\} &= c6\,[t = 13] + mux \\
\{Cout,\ sum[15:11]\} &= c10\,[t = 16] + mux
\end{aligned}
\qquad (2)
$$

3. One set of 2-b RCA in group2 has 2 FAs for Cin = 1, and the other set has 1 FA and 1 HA for Cin = 0. Based on the area count, the total number of gates in group2 is determined as in (1).






Fig 1: 32-bit Regular CSLA Architecture.





Fig 2: Delay and area evaluation of regular SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. F is a Full Adder.

Similarly, the estimated maximum delay and area of the other groups in the regular SQRT CSLA are evaluated and listed in Table I.

TABLE I: EVALUATED SQRT CSLA

| Group  | Delay | Area |
|--------|-------|------|
| Group2 | 11    | 57   |
| Group3 | 13    | 87   |
| Group4 | 16    | 117  |
| Group5 | 19    | 147  |
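The entries of Table I can be reproduced with a short Python sketch. It assumes group widths of 2, 3, 4, and 5 bits for group2 to group5 (inferred from the sum ranges in (2)), takes FA = 13 gates, HA = 6 gates, and a mux delay of 3 units from the text, and assumes a 2:1 mux costs 4 gates (so that the 6:3 mux costs 12). The function names are illustrative only.

```python
# Unit counts quoted in the text: FA = 13 gates, HA = 6 gates, mux delay = 3.
# ASSUMPTION: one 2:1 mux costs 4 AOI gates (three of them give the 12-gate 6:3 mux).
FA_AREA, HA_AREA, MUX2_AREA = 13, 6, 4
MUX_DELAY = 3


def regular_group_area(width: int) -> int:
    """Gate count of one regular CSLA group built from two width-bit RCAs."""
    fa_count = (width - 1) + width   # Cin = 0 RCA (one HA replaces an FA) plus Cin = 1 RCA
    ha_count = 1                     # least significant bit of the Cin = 0 RCA
    mux_count = width + 1            # one 2:1 mux per sum bit plus one for the carry-out
    return fa_count * FA_AREA + ha_count * HA_AREA + mux_count * MUX2_AREA


def regular_group_delays(n_groups: int = 4, c1_arrival: int = 7, s3_arrival: int = 8):
    """Delays of group2 onward, using the example arrival times from the text."""
    # group2: the RCA output s3 arrives after the select c1, so s3 sets the delay
    delays = [max(c1_arrival, s3_arrival) + MUX_DELAY]
    carry = c1_arrival + MUX_DELAY   # c3, the select input of group3
    for _ in range(n_groups - 1):
        carry += MUX_DELAY           # the select arrives last, then one mux delay follows
        delays.append(carry)
    return delays


areas = [regular_group_area(w) for w in (2, 3, 4, 5)]
delays = regular_group_delays()
for name, d, a in zip(("Group2", "Group3", "Group4", "Group5"), delays, areas):
    print(name, d, a)
```

Under these assumptions each additional bit of group width adds two FAs and one 2:1 mux, which is why the area column grows by a constant 30 gates per group.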

#### III. PROPOSED TECHNOLOGY

Design of high-speed data path logic systems is one of the most substantial research areas in VLSI system design. High-speed addition and multiplication have always been fundamental requirements of high-performance processors and systems. The major speed limitation in any adder is in the production of carries, and many authors have considered the addition problem. The basic idea of the proposed work is to use n-bit Binary to Excess-1 Converters (BEC) to improve the speed of addition. This logic can be implemented within the Carry Select Adder to achieve low power and area efficiency. The proposed 32-bit Carry Select Adder is compared with the Carry Skip Adder (CSKA) and the regular 32-bit Carry Select Adder.

The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum. However, the CSLA is not area efficient because it uses multiple pairs of Ripple Carry Adders (RCA) to generate the partial sum and carry for carry inputs Cin = 0 and Cin = 1, after which the final sum and carry are selected by multiplexers (mux). This work uses a Binary to Excess-1 Converter (BEC) instead of the RCA with Cin = 1 in the regular CSLA to achieve lower power consumption. The main advantage of the BEC logic is that it needs fewer logic gates than the n-bit Ripple Carry Adder (RCA). The structure of the 4-bit BEC and its functional table are shown in Fig. 3 and Table II, respectively.



Fig 3: 4-bit Binary to Excess-1 Converter (BEC).

**TABLE II: FUNCTIONAL TABLE OF 4-BIT BEC** 

| B[3:0] | X[3:0] |
|--------|--------|
| 0000   | 0001   |
| 0001   | 0010   |
| :      | :      |
| :      | :      |
| 1110   | 1111   |
| 1111   | 0000   |

Fig. 4 illustrates how fast addition is achieved using the BEC together with a multiplexer (mux). One input of the 8:4 mux receives the direct inputs (B3, B2, B1, and B0), and the other input receives the BEC output. This produces the two possible partial results in parallel, and the mux selects either the BEC output or the direct inputs according to the control signal Cin. The Boolean expressions of the 4-bit BEC, which follow from Table II, are listed below (functional symbols: ~ NOT, & AND, ^ XOR).

X0 = ~B0

X1 = B0 ^ B1

X2 = B2 ^ (B0 & B1)

X3 = B3 ^ (B0 & B1 & B2)
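As a minimal sketch, the 4-bit BEC and its 8:4 mux can be modeled in Python as follows. The gate-level expressions used here are the standard increment-by-one equations implied by Table II, and the function names `bec4` and `bec_mux` are illustrative, not taken from the paper.

```python
def bec4(b: int) -> int:
    """4-bit Binary to Excess-1 Converter built from gate-level expressions."""
    b0, b1, b2, b3 = [(b >> i) & 1 for i in range(4)]
    x0 = b0 ^ 1                  # X0 = ~B0
    x1 = b0 ^ b1                 # X1 = B0 ^ B1
    x2 = b2 ^ (b0 & b1)          # X2 = B2 ^ (B0 & B1)
    x3 = b3 ^ (b0 & b1 & b2)     # X3 = B3 ^ (B0 & B1 & B2)
    return x0 | (x1 << 1) | (x2 << 2) | (x3 << 3)


def bec_mux(b: int, cin: int) -> int:
    """8:4 mux: pass B through when Cin = 0, select the BEC output when Cin = 1."""
    return bec4(b) if cin else b


# Print the full functional table (Table II lists only its first and last rows)
for b in range(16):
    print(f"{b:04b} -> {bec4(b):04b}")
```

The mux makes the BEC a drop-in replacement for the Cin = 1 adder: the Cin = 0 result is computed once, and the Cin = 1 result is obtained by adding one to it.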

The AND, OR, and Inverter (AOI) implementation of the XOR gate, the 2:1 mux, and the FA plays the main role in determining the delay contributed by each block. The delay and area evaluation methodology considers all gates to be made up of AND, OR, and Inverter gates, each having a delay equal to 1 unit and an area equal to 1 unit. The delay is obtained by adding up the number of gates in the longest path of a logic block. The area is obtained by counting the total number of AOI gates required for each logic block.
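The counting rule can be illustrated with a small Python sketch. The decompositions below are assumed textbook AOI forms, not taken from the paper's figures; they are chosen because they reproduce the FA = 13 and HA = 6 gate counts and the mux delay of 3 units used in Section II. The helper names `bec_area` and `rca_area` are hypothetical.

```python
# Every AND, OR, and inverter counts as 1 unit of area and 1 unit of delay.
# ASSUMPTION: textbook AOI decompositions, e.g. a ^ b = (a & ~b) | (~a & b).
XOR_AREA = 2 + 2 + 1              # 2 inverters, 2 ANDs, 1 OR
XOR_DELAY = 3                     # longest path: inverter -> AND -> OR
MUX2_AREA = 1 + 2 + 1             # 1 inverter, 2 ANDs, 1 OR
MUX2_DELAY = 3                    # longest path: inverter -> AND -> OR
HA_AREA = XOR_AREA + 1            # sum XOR plus carry AND
FA_AREA = 2 * XOR_AREA + 2 + 1    # two XORs, two ANDs, one OR


def bec_area(n: int) -> int:
    """Gate count of an n-bit BEC (n >= 2): 1 inverter, n-1 XORs, n-2 shared ANDs."""
    return 1 + (n - 1) * XOR_AREA + (n - 2)


def rca_area(n: int) -> int:
    """Gate count of an n-bit ripple carry adder built only from full adders."""
    return n * FA_AREA


for n in (3, 4, 5, 6):
    print(f"{n}-bit: BEC = {bec_area(n)} gates, RCA = {rca_area(n)} gates")
```

Under this counting the BEC grows by 6 gates per added bit while the RCA grows by 13, which is the source of the area saving claimed for the BEC.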



Fig 4: 4-b BEC with 8:4 mux.

#### IV. AREA EVALUATION METHODOLOGY OF MODIFIED 16-B SQRT CSLA

The structure of the proposed 16-b SQRT CSLA, which uses a BEC in place of the RCA with Cin = 1 to optimize area and power, is shown in Fig 5.



Fig 5: BEC Converter.
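A behavioral Python sketch of such a 16-b adder is given below. It is a functional model only, not the authors' Verilog: the group widths (2, 2, 3, 4, 5) are inferred from the sum ranges in (2), the BEC is modeled as an increment on the (width + 1)-bit value formed by the Cin = 0 carry and sum, and the names `ripple_add` and `csla_bec_add` are illustrative.

```python
def ripple_add(a: int, b: int, cin: int, width: int):
    """Behavioral width-bit ripple carry addition; returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def csla_bec_add(a: int, b: int, cin: int = 0, widths=(2, 2, 3, 4, 5)):
    """Carry select addition in which a BEC supplies each group's Cin = 1 result."""
    result, carry, shift = 0, cin, 0
    for i, width in enumerate(widths):
        mask = (1 << width) - 1
        ga, gb = (a >> shift) & mask, (b >> shift) & mask
        if i == 0:
            s, c = ripple_add(ga, gb, carry, width)   # group1 is a plain RCA
        else:
            s0, c0 = ripple_add(ga, gb, 0, width)     # single RCA with Cin = 0
            # (width + 1)-bit BEC: add 1 to {c0, s0} to obtain the Cin = 1 result
            plus1 = ((c0 << width) | s0) + 1
            s1, c1 = plus1 & mask, (plus1 >> width) & 1
            s, c = (s1, c1) if carry else (s0, c0)    # mux selects on the incoming carry
        result |= s << shift
        carry = c
        shift += width
    return result, carry


total, cout = csla_bec_add(0xABCD, 0x1234)
print(hex(total), cout)
```

Each group after the first needs only one RCA, because the BEC derives the Cin = 1 sum and carry from the Cin = 0 outputs instead of recomputing them with a second adder.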


#### V. RESULT



Fig 6: Result Diagram.

**TABLE III:** COMPARISON TABLE BETWEEN CSLA ADDERS WITH RCA AND BEC IN TERMS OF TIMING (DELAY) AND POWER

| Design        | Timing report (delay) | Power |
|---------------|-----------------------|-------|
| CSLA with BEC | 30.385 ns             | 7 mW  |
| CSLA with RCA | 51.536 ns             | 27 mW |
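The relative savings and the power-delay product implied by Table III can be worked out with a few lines of Python. This sketch uses the Table III values exactly as printed; the helper names are illustrative.

```python
# (delay in ns, power in mW), copied from Table III as printed
designs = {
    "CSLA with BEC": (30.385, 7.0),
    "CSLA with RCA": (51.536, 27.0),
}


def power_delay_product_pj(delay_ns: float, power_mw: float) -> float:
    """Power-delay product; ns multiplied by mW gives picojoules."""
    return delay_ns * power_mw


def reduction_percent(new: float, old: float) -> float:
    """Percentage by which `new` is smaller than `old`."""
    return 100.0 * (old - new) / old


bec_delay, bec_power = designs["CSLA with BEC"]
rca_delay, rca_power = designs["CSLA with RCA"]

print("delay reduction (%):", round(reduction_percent(bec_delay, rca_delay), 2))
print("power reduction (%):", round(reduction_percent(bec_power, rca_power), 2))
for name, (delay, power) in designs.items():
    print(name, "PDP (pJ):", round(power_delay_product_pj(delay, power), 3))
```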

#### VI. CONCLUSION

Addition is the most common and frequently used arithmetic operation in microprocessors, digital signal processors, and digital computers in general. It also serves as a building block for the synthesis of all other arithmetic operations. Therefore, for the efficient implementation of an arithmetic logic unit, the adder structure becomes a very critical hardware unit. Any book on computer arithmetic shows that there exist a large number of adders with different circuit architectures and performance characteristics that are widely used in practice. Although much research dealing with adder structures has been done, studies based on their comparative performance analysis are only a few.

Digital adders are the core block of DSP processors. The final carry propagation adder (CPA) structure of many adders incurs a high carry propagation delay, and this delay reduces the overall performance of the DSP processor. In this project, qualitative evaluations of the CSLA architectures with and without BEC are given. Among the large number of adders, we wrote Verilog (Hardware Description Language) code for carry skip and carry select adders to emphasize the common performance properties belonging to their classes. With respect to delay and power consumption, we can conclude that the implementation of the CSLA with BEC is efficient. The main advantage of the BEC logic is that it needs fewer logic gates than the n-bit Full Adder (FA) structure.

Nowadays, the Carry Select Adder (CSLA) is used in many data-processing processors to perform fast arithmetic functions. That is why we have designed a configurable adder that is power efficient and has minimal delay overhead. The CSLA with RCA can be replaced by the CSLA with BEC where speed and power are the major constraints. The proposed CSLA with BEC consumes only 17 mW, which is much less than the 37 mW consumed by the existing CSLA with RCA.

#### VII. REFERENCES

- [1] B. Ramkumar, H. M. Kittur, and P. M. Kannan, "ASIC implementation of modified faster carry save adder," Eur. J. Sci. Res., vol. 42, no. 1, pp. 53–58, 2010.
- [2] D. Radhakrishnan, "Low-voltage low power CMOS full adder," in Proc. IEEE Circuits Devices Syst., vol. 148, Feb. 2001.
- [3] T. Y. Chang and M. J. Hsiao, "Carry select adder using single ripple carry adder," Electron. Lett., vol. 34, no. 22, pp. 2101–2103, Oct. 1998.
- [4] E. Abu-Shama and M. Bayoumi, "A new cell for low power adders," in Proc. Int. Midwest Symp. Circuits and Systems, 1995, pp. 1014–1017.
- [5] J. I. Acha, "Computational structures for fast implementation of L-path and L-block digital filters," IEEE Trans. Circuit Syst., vol. 36, no. 6, pp. 805–812, Jun. 1989.
- [6] C. Cheng and K. K. Parhi, "Hardware efficient fast parallel FIR filter structures based on iterated short convolution," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 51, no. 8, pp. 1492–1500, Aug. 2004.
- [7] Y. He, C. H. Chang, and J. Gu, "An area efficient 64-bit square root carry-select adder for low power applications," in Proc. IEEE Int. Symp. Circuits Syst., 2005, vol. 4, pp. 4082–4085.
- [8] Basant Kumar and Sujit Kumar Patel, "Area-delay-power efficient carry select adder," IEEE Trans. Circuits Syst. II, 2013.
- [9] C. Nagendra, M. J. Irwin, and R. M. Owens, "Area-time-power tradeoffs in parallel adders," IEEE Trans. CAS-II, vol. 43, no. 10, pp. 689–702.