Abstract In digital systems such as digital signal processors, FIR filters, and microprocessors, the multiplier is one of the main hardware blocks. System performance is generally determined by multiplier performance, because the multiplier is usually the slowest element in the system and also occupies a large area. A multiplier uses an adder circuit repeatedly, so an efficient adder improves multiplier performance. The proposed work uses a new carry select adder (CSLA) to enhance multiplier performance. A CSLA offers good performance with respect to speed and area. A binary to excess-1 converter (BEC) based square root carry select adder (SQRT-CSLA) was designed previously, but its high data dependency incurs a speed penalty. An efficient CSLA design is obtained by using improved logic units that remove the data dependency and the redundant logic operations. In this work, the efficient SQRT-CSLA is compared with the BEC-based CSLA of the corresponding architecture; having been found more efficient in area and delay, it is then used in the multiplier design. The multiplier designed with the new efficient SQRT-CSLA gives better delay and area than the BEC-based CSLA multiplier.
I. INTRODUCTION
In digital computer arithmetic, the basic operations are addition, subtraction, multiplication, and division. Of these, multiplication is particularly important. Because it is carried out as repeated additions followed by shifts, this work deals with the additions used within the multiplication operation. In VLSI design, chip area, power, and delay are the main measures of the performance and efficiency of an architecture. Multiplication and addition are the most widely used arithmetic operations in digital signal processing applications such as FIR filters, FFT, and IIR filters. Addition is fundamental to any digital multiplication. The area efficiency, speed, and accuracy of a digital system are significantly influenced by the performance of the adders it uses.
An adder is the main component of an arithmetic unit. A complex digital signal processing (DSP) system requires ...

G B S R Naidu Sr

A conventional carry select adder (CSLA) uses a pair of ripple carry adders (RCAs) that produce a pair of sum words and output carry bits corresponding to the anticipated input carries (Cin = 0 and 1), and it selects one of each pair as the final sum and final output carry. A conventional CSLA has a smaller critical path delay (CPD) than an RCA, but the design is not attractive because the dual RCA uses more area. A BEC-based CSLA was subsequently implemented with one RCA and one add-one circuit in place of the two RCAs, where the add-one circuit is a binary to excess-1 converter (BEC). The BEC-based CSLA uses fewer logic resources than the conventional CSLA, but it has slightly higher delay. We observe that logic optimization depends mainly on the availability of redundant operations in the formulation, whereas adder delay depends mainly on data dependence. Based on this analysis, a new logic formulation for the CSLA is implemented. The main contribution of this work is a logic formulation based on data dependence, together with an optimized carry generator (CG) and carry selection (CS) design. The SQRT-CSLA using the new logic formulation has less delay and less area than the previous designs.
Using the new logic formulation of the SQRT-CSLA, a 128 × 128-bit unsigned multiplier is implemented. This work compares two multipliers built on different adders with respect to area and the time needed for evaluation. Compared with the BEC-CSLA based multiplier, the multiplier based on the new logic formulation of the CSLA occupies less area and has a shorter delay. The multiplier takes two 128-bit inputs (n × n) and produces a 256-bit (2n) output. The efficient multiplier design has 11% less delay and 6% less area than the BEC-based multiplier. The following sections describe the different adder designs, the multiplier design, and the performance comparison of the multipliers.
II. CARRY SELECT ADDER
A ripple carry adder (RCA) is constructed by cascading single-bit full adders. Each full adder cannot complete its computation until the carry-out of the preceding stage is ready, so the carry propagation path determines the critical path delay. For an N-bit adder, the critical path is the N-bit carry propagation path through the full adders. As N increases, the delay of the ripple carry adder therefore increases linearly. The carry select adder was introduced to overcome this limitation by removing the linear dependency between computation delay and input word length.
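The carry chain described above can be illustrated with a small behavioural model. This is a minimal Python sketch rather than the Verilog used in this work; the function names (`full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) are hypothetical, and the returned chain length only counts stages as a stand-in for delay.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first).

    Returns (sum_bits, carry_out, chain_length), where chain_length is the
    number of full adders the carry passes through one after another.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        # Stage i cannot finish before the carry of stage i-1 is known.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry, len(a_bits)


def to_bits(value, width):
    """Integer to bit list, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Bit list (LSB first) to integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For example, `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` passes the carry through four stages in sequence; doubling the word length doubles that chain.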
The CSLA has two units: 1) the sum and carry generator (SCG) unit and 2) the sum and carry selection unit. The SCG unit uses most of the logic resources of the CSLA and contributes considerably to the critical path. Different logic designs have been recommended for efficient implementation of the SCG unit. We studied the logic designs recommended for the SCG unit of the conventional and BEC-based CSLAs of [2] by means of appropriate logic expressions.
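The two-unit structure can be sketched behaviourally as follows. This is an illustrative Python model under stated assumptions, not the synthesized design: the function names are hypothetical, the RCA is modelled with integer addition, and the block widths (2, 2, 3, 4, 5) are an assumed 16-bit square-root grouping chosen only for the example.

```python
def rca(a, b, cin, width):
    """Behavioural n-bit ripple carry adder: returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def conventional_csla_block(a, b, cin, width):
    """One CSLA block: generate both candidate results, then select."""
    # Sum and carry generator (SCG): two RCAs with fixed carry inputs.
    s0, c0 = rca(a, b, 0, width)
    s1, c1 = rca(a, b, 1, width)
    # Sum and carry selection unit: a 2:1 multiplexer driven by Cin.
    return (s1, c1) if cin else (s0, c0)


def sqrt_csla_add(a, b, widths=(2, 2, 3, 4, 5), cin=0):
    """Chain CSLA blocks; the block widths are an assumption for illustration."""
    result, shift, carry = 0, 0, cin
    for w in widths:
        mask = (1 << w) - 1
        s, carry = conventional_csla_block(
            (a >> shift) & mask, (b >> shift) & mask, carry, w
        )
        result |= s << shift
        shift += w
    return result, carry
```

In hardware, both RCAs of every block work at the same time, so only the multiplexer selections are chained from block to block; the sequential loop here models the function, not the timing.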
The conventional square root carry select adder (SQRT-CSLA) has dual ripple carry adders followed by a 2:1 multiplexer for each sum and carry section. Its major disadvantage is the large area required by the multiple pairs of ripple carry adders. The conventional SQRT-CSLA is shown in Fig. 1. A new logic design of the conventional SQRT-CSLA offers scope for reducing both delay and area. In the conventional SQRT-CSLA, the sum and carry bits are calculated for both alternatives, Cin = 0 and Cin = 1. Once Cin arrives from the preceding stage, a multiplexer selects the correct result: rather than waiting for Cin before evaluating the sum, the precomputed sum is passed to the output as soon as Cin arrives. The logic formulation of the conventional CSLA, based on the RCA structure of the designs recommended in [1], is given in (1a)-(1c).

As shown in Fig. 3, RCA-1 calculates the n-bit sum S1 and the carry C1 corresponding to Cin = 0. The BEC unit receives S1 and C1 from the RCA and produces the excess-1 code. From the operation of the BEC-based CSLA in [1], the logic expressions of the RCA are the same as those given in (1a)-(1c), while the BEC unit of the n-bit BEC-based CSLA has its own logic expressions.
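The add-one behaviour of the BEC unit can be sketched as follows. This is a minimal Python model, not a reproduction of the logic expressions of [1]: the XOR/AND chain is the standard way to increment a binary word, and the names `bec` and `bec_csla_block` are hypothetical.

```python
def bec(bits):
    """Binary to excess-1 converter: add one to a bit list (LSB first)."""
    out = []
    carry = 1  # the constant one being added
    for bit in bits:
        out.append(bit ^ carry)
        carry = bit & carry
    return out


def bec_csla_block(a, b, cin, width):
    """One BEC-based CSLA block: a single RCA followed by a BEC and a mux."""
    # RCA-1: n-bit sum plus carry-out for Cin = 0 (modelled as an integer).
    total0 = a + b
    rca_bits = [(total0 >> i) & 1 for i in range(width + 1)]
    # The BEC can only start once the RCA-1 result is available.
    bec_bits = bec(rca_bits)
    total1 = sum(bit << i for i, bit in enumerate(bec_bits))
    # 2:1 multiplexer selects between the RCA and BEC results.
    total = total1 if cin else total0
    return total & ((1 << width) - 1), total >> width
```

The model makes the ordering visible: `bec` takes the RCA-1 output as its input, so the result for Cin = 1 cannot be produced in parallel with the result for Cin = 0.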
From Fig. 3 and the logic expressions, the BEC block must wait for the result of the RCA-1 block, which increases the data dependency of the design and therefore its delay. The next section presents the new logic formulation of the CSLA, which is designed to address this.
IV. NEW LOGIC FORMULATION OF SQRT-CSLA
The new logic formulation of the SQRT-CSLA reduces data dependency by using the improved logic formulation given in (4a)-(4g).

The half-sum generation (HSG) unit receives two n-bit operands (A and B) and produces the half-sum word S0 and the half-carry word C0, each n bits wide. Both CG0 and CG1 receive S0 and C0 from the HSG unit and produce two n-bit full-carry words, C1^0 and C1^1, corresponding to input carry 0 and 1, respectively. The logic circuits of CG0 and CG1 are optimized to take advantage of the fixed input-carry bits. The CS unit selects one final carry word from the two carry words at its inputs using the control signal Cin: it selects C1^0 when Cin = 0 and C1^1 otherwise. The CS unit is implemented using an n-bit 2-to-1 MUX. This feature is used to optimize the logic of the CS unit. The optimized designs of the internal blocks are presented in [1].

As shown in Fig. 4, carry selection is performed before full-sum generation (FSG), which improves the logic optimization. In the BEC-based CSLA, RCA-1 generates the half sum and full sum for Cin = 0, and the BEC unit must then wait for the result of RCA-1. In the new logic formulation, the half sum is generated independently and the carries for Cin = 0 and Cin = 1 are generated in parallel. Carry selection is then performed with Cin as the select line, and the full sum is generated from the selected carry and the half sum. Because this logic design removes redundant logic operations and reduces data dependency, it reduces both area and delay compared with the BEC-based CSLA.
The new logic formulation of the CSLA has less delay and less area than the BEC-based CSLA.
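The data flow of this section (half-sum generation, two carry generators, carry selection, then full-sum generation) can be sketched as below. This is an illustrative Python model built from the textual description only; it does not reproduce the optimized gate-level designs of [1], and all function names are hypothetical.

```python
def hsg(a_bits, b_bits):
    """Half-sum generation: per-bit half sum and half carry."""
    s0 = [a ^ b for a, b in zip(a_bits, b_bits)]
    c0 = [a & b for a, b in zip(a_bits, b_bits)]
    return s0, c0


def carry_generator(s0, c0, fixed_cin):
    """CG0 (fixed_cin=0) or CG1 (fixed_cin=1): full-carry word."""
    carries = []
    carry = fixed_cin
    for s, c in zip(s0, c0):
        carry = c | (s & carry)
        carries.append(carry)
    return carries


def carry_select(c1_0, c1_1, cin):
    """CS unit: n-bit 2-to-1 multiplexer controlled by Cin."""
    return c1_1 if cin else c1_0


def fsg(s0, carries, cin):
    """Full-sum generation from the half sum and the selected carry word."""
    carry_in = [cin] + carries[:-1]
    return [s ^ c for s, c in zip(s0, carry_in)]


def new_csla_block(a, b, cin, width):
    """One block of the new formulation: returns (sum, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    s0, c0 = hsg(a_bits, b_bits)
    # CG0 and CG1 depend only on the HSG output, not on each other.
    c1_0 = carry_generator(s0, c0, 0)
    c1_1 = carry_generator(s0, c0, 1)
    carries = carry_select(c1_0, c1_1, cin)
    sum_bits = fsg(s0, carries, cin)
    return sum(bit << i for i, bit in enumerate(sum_bits)), carries[-1]
```

Neither carry generator consumes the other's output, which is the reduced data dependency described above; the half sum is also computed once and shared instead of being recomputed for each assumed carry.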
V. MULTIPLIER DESIGN
The multiplication operation requires the generation of partial products, one for each digit in the multiplier. These partial products are then summed to produce the final product. The multiplication of two n-bit unsigned integers results in a product of 2n bits in length. The implementation of the multiplication operation for unsigned data is given in the following algorithm.
Step 1: Examine the least significant bit of the multiplier. If it is a 1, copy the multiplicand and call it the first partial product. If the least significant bit is zero, enter zero as the first partial product and preserve the partial product.
Step 2: Examine the bit to the left of the bit examined last. If it is a 1, do step 3; otherwise, do step 4.
Step 3: Add the multiplicand to the previously stored partial product after shifting the partial product one bit to the right. This sum becomes the new partial product. Go to step 5.
Step 4: Get new partial product by shifting the previous partial product one bit to the right.
Step 5: Repeat steps 2 to 4 until all bits in the multiplier have been considered. The final value obtained for the partial product is the product of the multiplicand and the multiplier.
In the above multiplication process, as shown in Fig. 5, the controller block gates the multiplicand with the current LSB of the multiplier in each cycle and also controls the shifting operations of the entire block. The product of each bit of multiplier X with multiplicand Y is placed in the 2n-bit register: the n-bit value is shifted left to its bit position and the remaining bits are filled with zeros. This 2n-bit result is added to the previous product Z, and the process continues until all bits of multiplier X have been processed.
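Steps 1 to 5 can be traced with a short behavioural model. This is a minimal Python sketch, assuming the multiplicand is aligned to the upper part of the 2n-bit register so that the right shifts lose no bits; the function name `shift_add_multiply` and the pluggable `add` parameter are hypothetical and stand in for the adder block of the multiplier.

```python
def shift_add_multiply(multiplicand, multiplier, n, add=lambda x, y: x + y):
    """Unsigned n-bit by n-bit shift-and-add multiplication.

    `add` stands in for the adder block; any function that returns the
    sum of two integers can be plugged in.
    """
    # Align the multiplicand so that n-1 right shifts bring it to weight 1.
    aligned = multiplicand << (n - 1)
    # Step 1: the first partial product depends on the multiplier LSB.
    partial = aligned if multiplier & 1 else 0
    # Step 5: repeat steps 2 to 4 for the remaining multiplier bits.
    for i in range(1, n):
        # Steps 3 and 4: shift the stored partial product one bit right.
        partial >>= 1
        # Step 2: examine the next multiplier bit.
        if (multiplier >> i) & 1:
            # Step 3: add the multiplicand to the shifted partial product.
            partial = add(partial, aligned)
    return partial
```

For example, `shift_add_multiply(13, 11, 4)` returns 143. Each iteration performs at most one addition, so an n-bit multiplication uses the adder up to n - 1 times, which is why the adder delay dominates the multiplier delay.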
In the adder block, two techniques are used: the BEC-based CSLA and the new logic formulation of the CSLA, because the adder is the main block of the multiplier. The delay and area of the resulting multipliers are then compared, because speed is the main consideration for multipliers in digital design systems.
VI. PERFORMANCE COMPARISON OF MULTIPLIERS USING SYNTHESIS RESULTS
The proposed multiplier based on the new logic formulation of the CSLA and the BEC-CSLA based multiplier were developed in Verilog HDL and synthesized in Xilinx ISE 14.2 as 128 × 128 unsigned multipliers. The same design flow was followed for both multipliers. The results for the two multipliers are compared in terms of delay and area.
With delay measured in nanoseconds, the multiplier based on the new logic formulation of the CSLA has 11% less delay and occupies 6% less area than the BEC-CSLA based multiplier.
