I. Introduction
Due to the rapid growth of portable electronic components, low-power arithmetic circuits have become very important in the VLSI industry. The Multiplier-Accumulator (MAC) unit is the main building block of a DSP processor. The Full Adder, as part of the MAC unit, can significantly affect the efficiency of the whole system. Hence, reducing the power consumption of the Full Adder circuit is necessary for low-power applications. Carry Select Adders are used in high-speed applications because they reduce the propagation delay.
The basic operation of the Carry Select Adder (CSLA) is parallel computation. The CSLA generates many carries and partial sums [1]. The final sum and carry are selected by multiplexers (mux). Multiple pairs of Ripple Carry Adders (RCA) are used in the CSLA structure; hence, the CSLA is not area efficient. In this paper, we propose a modified CSLA architecture.
The proposed method uses a Binary to Excess-1 Converter (BEC) instead of the RCA with Cin = 1 in the regular CSLA. The main goal of the BEC logic is to use fewer logic gates than the n-bit Full Adder, so that the modified CSLA architecture has lower area and power consumption [2] - [4]. The details of the BEC logic are discussed in Section III. This paper is organized as follows. Section II presents the delay and area evaluation methodology of the basic adder blocks. The structure and function of the BEC logic are described in Section III. The SQRT CSLA has been chosen for comparison with the proposed design, as it has a more balanced delay and needs lower power [5] - [6]. The delay evaluation methodologies of the regular and modified SQRT CSLA are presented in Sections IV and V, respectively. Section VI reviews the results obtained from the simulations, and Section VII concludes this work.
II. Delay And Area Evaluation Methodology Of The Basic Adder Blocks
The AND, OR, and Inverter (AOI) implementation of an XOR gate is shown in Fig. 1. The gates between the dotted lines operate in parallel, and the numeric label on each gate indicates the delay contributed by that gate. For the delay and area evaluation, every gate is assumed to have 1 unit of delay and 1 unit of area. The maximum delay of a logic block is found by adding up the gates along its longest path, and its area by counting its AOI gates. Based on this approach, the CSLA adder blocks, namely the 2:1 mux, Half Adder (HA), and Full Adder (FA), are evaluated and listed in Table I.
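The unit-gate evaluation described above can be sketched in a few lines of Python. This is only an illustration: the netlists below are one common AOI decomposition of each block and are assumptions of this sketch (they are not taken from Fig. 1 or Table I), as are all names such as `xor_gates`, `delay`, and `area`.

```python
from functools import lru_cache

def xor_gates(prefix, a, b, out):
    """AOI netlist of a XOR b (assumed form: (a AND NOT b) OR (NOT a AND b))."""
    return {
        f"{prefix}_na": ("NOT", [a]),
        f"{prefix}_nb": ("NOT", [b]),
        f"{prefix}_t1": ("AND", [a, f"{prefix}_nb"]),
        f"{prefix}_t2": ("AND", [f"{prefix}_na", b]),
        out: ("OR", [f"{prefix}_t1", f"{prefix}_t2"]),
    }

# Each netlist maps a node name to (gate type, input names).
# Names that never appear as keys are primary inputs (arrival time 0).
XOR = xor_gates("x", "a", "b", "y")
MUX = {  # 2:1 mux: y = (d0 AND NOT s) OR (d1 AND s)
    "ns": ("NOT", ["s"]),
    "t0": ("AND", ["d0", "ns"]),
    "t1": ("AND", ["d1", "s"]),
    "y": ("OR", ["t0", "t1"]),
}
HA = {**xor_gates("x", "a", "b", "sum"), "carry": ("AND", ["a", "b"])}
FA = {
    **xor_gates("x1", "a", "b", "p"),
    **xor_gates("x2", "p", "cin", "sum"),
    "g": ("AND", ["a", "b"]),
    "pc": ("AND", ["p", "cin"]),
    "cout": ("OR", ["g", "pc"]),
}

def area(netlist):
    """Area = number of AOI gates, each counted as 1 unit."""
    return len(netlist)

def delay(netlist):
    """Delay = number of gates on the longest input-to-output path."""
    @lru_cache(maxsize=None)
    def arrival(node):
        if node not in netlist:          # primary input
            return 0
        _, inputs = netlist[node]
        return 1 + max(arrival(i) for i in inputs)
    return max(arrival(node) for node in netlist)

for name, block in [("XOR", XOR), ("2:1 mux", MUX), ("HA", HA), ("FA", FA)]:
    print(f"{name}: delay = {delay(block)}, area = {area(block)}")
```

Under these assumed decompositions the sketch reports a delay of 3 and an area of 5 for the XOR gate, and a delay of 6 and an area of 13 for the FA. A different gate-level decomposition would give different counts.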
III. BEC Logic Gate
The proposed method uses BEC logic. The regular CSLA structure consists of two Ripple Carry Adders (RCA): one operates with initial carry Cin = 0 and the other with Cin = 1. The BEC is used instead of the RCA with Cin = 1 in order to reduce the area and power consumption of the regular CSLA. To replace an n-bit RCA, an (n+1)-bit BEC is required. The structure of a 4-bit BEC is shown in Fig. 2, and Table II lists the corresponding Boolean expressions.

Similarly, the maximum delay and area of the other groups in the regular SQRT CSLA are evaluated and listed in Table III.

TABLE III: DELAY AND AREA COUNT OF REGULAR CSLA

Group    Delay  Area
Group2   11     57
Group3   13     87
Group4   16     117
Group5   19     147
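As a consistency check, the area column of Table III can be reproduced with a short Python sketch. It rests on assumptions that are not stated in this section: unit-gate areas of 13 for an FA, 6 for an HA, and 4 for a 2:1 mux (one common AOI decomposition), and group widths of 2, 3, 4, and 5 bits for Group2 to Group5. The names `regular_group_area` and `GROUP_WIDTH` are hypothetical.

```python
# Assumed unit-gate areas per block (AOI gate counts); not taken from Table I.
AREA_FA = 13
AREA_HA = 6
AREA_MUX = 4

# Assumed bit width of each group in the 16-bit SQRT CSLA.
GROUP_WIDTH = {"Group2": 2, "Group3": 3, "Group4": 4, "Group5": 5}

# Area values as listed in Table III.
TABLE_III_AREA = {"Group2": 57, "Group3": 87, "Group4": 117, "Group5": 147}

def regular_group_area(n_bits):
    """Area of one regular CSLA group: two n-bit RCAs plus the output mux."""
    rca_cin0 = (n_bits - 1) * AREA_FA + AREA_HA  # LSB needs only a half adder
    rca_cin1 = n_bits * AREA_FA                  # all stages are full adders
    mux = (n_bits + 1) * AREA_MUX                # n sum bits plus the carry out
    return rca_cin0 + rca_cin1 + mux

for group, width in GROUP_WIDTH.items():
    computed = regular_group_area(width)
    print(f"{group}: computed area = {computed}, Table III = {TABLE_III_AREA[group]}")
```

With these assumptions the computed areas agree with the four entries of Table III. The delay column depends on carry arrival times between groups and is not modelled here.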
V. Delay and Area Evaluation Methodology of Proposed 16-bit SQRT CSLA
The modified 16-bit SQRT CSLA is shown in Fig. 6. The RCA with Cin = 1 is replaced by BEC logic. There are again five groups. Fig. 7 provides the delay and area estimation of each group.
a) Group2 [see Fig. 7(a)] has one 2-bit RCA, consisting of 1 FA and 1 HA, for Cin = 0. A 3-bit BEC is used in place of the second 2-bit RCA with Cin = 1. The 3-bit BEC adds one to the output of the 2-bit RCA. Based on the delay values of Table I, the arrival time of the selection input c1 [time (t) = 7] of the 6:3 mux is earlier than s3 [t = 9] and c3 [t = 10] and later than s2 [t = 4]. Thus, sum3 depends on s3 and the mux, and the final c3 (output of the mux) depends on the partial c3 (input to the mux) and the mux. The sum2 output depends on c1 and the mux.
b) For the remaining groups, the arrival time of the mux selection input is always greater than the arrival time of the data inputs from the BECs. Hence, the delay of the remaining groups depends on the arrival time of the mux selection input and the mux delay.
c) The area count of group2 is obtained by summing the gate counts of its blocks: the 2-bit RCA (1 FA and 1 HA), the 3-bit BEC, and the 6:3 mux.
d) Similarly, the maximum delay and the area of the other groups of the modified SQRT CSLA are evaluated and listed in Table IV.
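The behaviour and gate count of group2 can be illustrated with a small Python sketch. It is a sketch under stated assumptions, not the authors' implementation: the BEC expressions used are the usual ones (X0 = NOT B0, Xi = Bi XOR (B0 AND ... AND B(i-1))), the unit-gate areas (FA = 13, HA = 6, XOR = 5, 2:1 mux = 4) come from one common AOI decomposition, and all function names are hypothetical.

```python
def bec(bits):
    """Binary to Excess-1 conversion; bits is a list of 0/1 values, LSB first."""
    out = []
    lower_all_ones = 1                       # AND of all lower-order input bits
    for i, b in enumerate(bits):
        if i == 0:
            out.append(1 - b)                # X0 = NOT B0
        else:
            out.append(b ^ lower_all_ones)   # Xi = Bi XOR (B0 AND ... AND B(i-1))
        lower_all_ones &= b
    return out

def modified_group(a, b, cin, n_bits):
    """Behavioural model of one modified CSLA group: RCA (Cin = 0), BEC, mux."""
    partial = a + b                                       # RCA result for Cin = 0
    bits = [(partial >> i) & 1 for i in range(n_bits + 1)]  # n sum bits + carry
    plus_one = bec(bits)                                  # candidate for Cin = 1
    chosen = plus_one if cin else bits                    # mux selected by Cin
    return sum(bit << i for i, bit in enumerate(chosen))

def bec_area(n_bits, area_xor=5):
    """Gate count of the expressions in bec(): 1 NOT, n-1 XORs, n-2 ANDs."""
    return 1 + (n_bits - 1) * area_xor + (n_bits - 2)

# Assumed unit-gate areas (AOI gate counts); not taken from Table I.
AREA_FA, AREA_HA, AREA_MUX = 13, 6, 4

def modified_group2_area():
    rca = AREA_FA + AREA_HA      # 2-bit RCA with Cin = 0: 1 FA + 1 HA
    mux = 3 * AREA_MUX           # 6:3 mux built from three 2:1 muxes
    return rca + bec_area(3) + mux

# Exhaustive functional check of a 2-bit group.
ok = all(modified_group(a, b, c, 2) == a + b + c
         for a in range(4) for b in range(4) for c in range(2))
print("group2 adds correctly:", ok)
print("3-bit BEC area:", bec_area(3))
print("modified group2 area:", modified_group2_area())
```

Under these assumptions the sketch counts 12 gates for the 3-bit BEC and 43 gates for the modified group2, compared with the 57 gates listed for the regular group2 in Table III. These counts are outputs of the sketch and may differ from Table IV if the blocks are decomposed differently.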
VI. Simulation Results
The proposed 40-bit SQRT CSLA has been implemented using a TSMC 0.13-µm CMOS process technology and compared with a TSMC 0.18-µm CMOS process technology.
VII. Conclusion
In this paper, a modified 40-bit SQRT CSLA has been proposed for data-path circuits (such as the MAC unit) in low-power DSP applications. Table V and Table VI show that the modified CSLA reduces power compared with the regular CSLA, with a slight increase in delay. The reduction in the number of gates offers a great advantage in terms of area and power. The comparison also shows that the modified SQRT CSLA has a lower power-delay product (PDP). Hence, the proposed CSLA architecture is better in terms of PDP, which leads to better utilization of the DSP processor.
