Abstract-A low power parallel multiplier based on the Optimized-Equal-Bypassing-Technique is proposed in this paper. We first develop a new full adder architecture that is capable of bypassing the addition operation when the two summand signals are equal. We then optimize the full adder at the transistor level for lower power and smaller area. After that, we employ the novel full adder to construct a parallel multiplier. The multiplier design is implemented in TSMC 0.18um technology and simulated with HSPICE to estimate power dissipation. The simulation results indicate that, compared with other designs in the literature, the proposed multiplier consumes less power and requires less hardware overhead.
I. INTRODUCTION
FAST parallel multipliers are among the most essential components of modern microprocessors, digital signal processors and many other computer arithmetic VLSI systems. However, one of the crucial problems many parallel multipliers suffer from is their high power consumption. In modern digital CMOS VLSI systems, the total power dissipation comprises three main parts: switching power, short-circuit power and static power. The first two components are referred to as dynamic power. Dynamic power is generated by the charging and discharging of the node capacitances, and it accounts for the majority of the overall power consumption in VLSI circuits.
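The total power consumption of a CMOS circuit can be expressed by the following equation [1]:

$$P_{total} = V_{dd} \cdot V_{swing} \cdot f_{clk} \sum_i C_{load,i}\, p_i + V_{dd} \sum_i I_{isc,i} + V_{dd}\, I_l \tag{1}$$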
where $V_{dd}$ is the supply voltage; $V_{swing}$ is the voltage swing of the node; $C_{load,i}$ is the load capacitance of node $i$; $f_{clk}$ is the frequency of the system clock; $p_i$ is the switching activity of node $i$; $I_{isc,i}$ is the short-circuit current of node $i$; and $I_l$ is the leakage current. To cut the power consumption of a circuit significantly, the natural target is its dynamic power, which is the major portion of total power dissipation in deep submicron VLSI systems. A close look at equation (1) shows that one of the most effective ways to reduce dynamic power dissipation is to suppress the total switching activity, namely the total number of signal transitions in the circuit.
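To illustrate why switching activity is the preferred lever, the switching term can be evaluated numerically. The sketch below assumes the usual form of that term, the sum over nodes of switching activity times load capacitance, multiplied by the supply voltage, the voltage swing and the clock frequency. The function name `switching_power` and all node values are hypothetical and are not taken from the simulated circuits.

```python
def switching_power(nodes, v_dd, v_swing, f_clk):
    """Estimate switching power in watts.

    nodes   -- iterable of (p_i, c_load_i): switching activity and load
               capacitance in farads for each node (hypothetical values)
    v_dd    -- supply voltage in volts
    v_swing -- voltage swing of the nodes in volts
    f_clk   -- clock frequency in hertz
    """
    # Assumed form of the switching term: sum_i p_i * C_i * Vdd * Vswing * f_clk
    return sum(p * c for p, c in nodes) * v_dd * v_swing * f_clk


# Two illustrative nodes at 1.8 V full swing and 100 MHz.
baseline = [(0.5, 10e-15), (0.25, 20e-15)]
# Same nodes with the switching activity of each node halved.
reduced = [(p / 2, c) for p, c in baseline]

p_base = switching_power(baseline, 1.8, 1.8, 100e6)
p_reduced = switching_power(reduced, 1.8, 1.8, 100e6)
```

Because the term is linear in the switching activity, halving every activity in this toy example halves the switching power, while the supply voltage, capacitance and clock frequency stay untouched.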
In this paper, we propose a novel low-power full adder architecture that uses bypassing and isolation mechanisms, named Optimized-Equal-Bypassing-Technique, to cut down switching activity and thereby dynamic power consumption. We then employ the new full adder to construct a parallel multiplier and implement the multiplier in TSMC 0.18um technology. Finally, various previous parallel multiplier designs and the proposed design are simulated with HSPICE to compare their power dissipation.
The rest of this paper is organized as follows. Section II briefly describes the previous bypassing full adder designs in the literature. Section III develops a novel low power full adder based on the Optimized-Equal-Bypassing-Technique, and Section IV proposes a new multiplier architecture based on this full adder circuit. Section V describes the simulation method and results. Finally, the conclusion is given in the last section.
II. PREVIOUS DESIGNS OF BYPASSING FULL ADDERS
Full adders are the critical components of the CSA array or CSA tree in parallel multipliers, and they often account for most of the power dissipation of the whole multiplier circuit. Hence, it is of great significance to develop a low power full adder cell for multiplier applications that demand low power consumption.
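All full adders obey the generic arithmetic equation:

$$A + B + C = 2\,C_{out} + Sum \tag{2}$$

A classic full adder structure is illustrated in Fig. 1, and the outputs of this full adder can be expressed as:

$$Sum = (A \oplus B) \oplus C, \qquad C_{out} = (A \oplus B)\, C + \overline{(A \oplus B)}\, A \tag{3}$$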
Based on expression (2), an Equal-Bypass-Technique (EBT) full adder with a bypass scheme was first proposed in [2], as illustrated in Fig. 2. The addition operation in the full adder can be bypassed if summand A equals summand B, that is, if the XOR of A and B is zero. Otherwise, the addition operation must be executed. In this structure, the full adder executes the A+1 addition only if A XOR B is 1, and the sum bit is taken directly from C if A XOR B is 0. Therefore, the sum bit can be bypassed from C, and the full adder cell can be replaced with a low-cost incremental adder A+1. The multipliers based on the EBT architecture decrease both the
III. PROPOSED FULL ADDER DESIGNS
Based on the Equal-Bypass-Technique described above, we propose a novel optimized full adder architecture named Optimized-Equal-Bypassing-Technique for even lower power dissipation and smaller area overhead. The proposed gate-level architecture is shown in Fig. 3. We first optimize the buffered A+1 adder circuit. In the proposed scheme, the buffered A+1 adder is replaced with a 6-transistor tri-state inverter as shown in Fig. 4, and the right input of the left multiplexer is directly connected to the input C. If A XOR B is zero, the sum bit is bypassed from C, and the right input of the right multiplexer is isolated by the tri-state inverter. This architecture has two advantages: on one hand, the power consumption is reduced by the bypassing and isolation schemes; on the other hand, compared with the EBT design, the critical path of our full adder is shortened because fewer gates lie in the signal propagation path.
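The bypass behaviour described above can be checked with a small behavioural model. The following is a minimal sketch rather than the circuit of Fig. 3: it assumes that the tri-state inverter supplies the inverted C as the sum when A XOR B is 1, and that the carry output follows A when the summands are equal and C otherwise, which is what the full adder truth table requires. The names `oebt_full_adder` and `matches_full_adder` are hypothetical.

```python
from itertools import product


def oebt_full_adder(a, b, c):
    """Behavioural model of an equal-bypass full adder.

    Takes one-bit inputs a, b, c and returns (sum, carry_out).
    """
    if a ^ b == 0:
        # Equal summands: the addition is bypassed.
        # The sum bit comes straight from c and the carry equals a (= b).
        return c, a
    # Unequal summands: a + b == 1, so the sum is the inverted c
    # (assumed to be the tri-state inverter path) and the carry is c.
    return c ^ 1, c


def matches_full_adder():
    """Check a + b + c == 2 * carry_out + sum for all eight input patterns."""
    for a, b, c in product((0, 1), repeat=3):
        s, carry = oebt_full_adder(a, b, c)
        if a + b + c != 2 * carry + s:
            return False
    return True
```

In this model the only logic on the sum path is one inversion of C, used in the unequal case, which reflects why the bypassing cell can be cheaper than a conventional full adder.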
Based on the novel architecture, we further optimize the XOR-XNOR gate and the full adder circuit at the transistor level. A traditional CMOS XOR-XNOR gate comprises 10 transistors. In this paper, we use a novel 7-transistor XOR-XNOR gate for low area and low power, as shown in Fig. 5.
This XOR-XNOR cell cascades a CMOS inverter and a pull-up PMOS transistor as a signal level restorer in the feedback path of the XNOR gate proposed in [3]. Hence, it features a small transistor count, low power, and good signal levels at its outputs.
The transistor-level circuit of the proposed full adder with the Optimized-Equal-Bypassing-Technique is illustrated in Fig. 6. It comprises a 7-transistor XOR-XNOR cell, a tri-state inverter, and multiplexers based on transmission gate logic. All of these subcircuits provide full voltage swing at all nodes, which avoids the static power caused by degraded logic levels. Note that some functions of the inverter can be shared by the tri-state inverter and the multiplexers, so that the transistor count of the full adder can be cut down further. In our design, we use only 19 transistors to implement the bypassing full adder, while the transistor counts of a conventional CMOS adder and the EBT adder are 28 and 22, respectively.
Where P is the product, is the multiplicand, and n is the bit length of the operands. A typical implementation of this multiplication is the Braun multiplier; a 4×4 Braun multiplier example is illustrated in Fig. 7.
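The array organisation of such a multiplier can be sketched behaviourally. The model below is a simplified carry-save array for unsigned operands, not the exact cell layout of Fig. 7: each row adds one partial-product row to the running sum and saves the carries for the next row, and a final ripple-carry row resolves the remaining carries. The adder cell is a parameter, so a conventional full adder and an equal-bypass cell can be swapped in. All function names are hypothetical.

```python
def full_adder(a, b, c):
    """Conventional one-bit full adder: returns (sum, carry_out)."""
    total = a + b + c
    return total & 1, total >> 1


def bypass_full_adder(a, b, c):
    """Equal-bypass cell: when a == b the sum is c and the carry is a."""
    if a == b:
        return c, a
    return c ^ 1, c


def braun_multiply(x, y, n, cell=full_adder):
    """Multiply two unsigned n-bit integers with a carry-save adder array."""
    width = 2 * n
    s = [0] * width              # running sum bits, indexed by bit weight
    c = [0] * width              # carries saved by the previous row
    for j in range(n):           # row 0 holds the first partial-product row
        s[j] = (x >> j) & (y & 1)
    for i in range(1, n):        # rows 1..n-1 are carry-save adder rows
        y_i = (y >> i) & 1
        next_c = [0] * width
        for j in range(n):
            k = i + j            # bit weight of this column
            pp = ((x >> j) & 1) & y_i
            s[k], next_c[k + 1] = cell(pp, s[k], c[k])
        c = next_c
    carry = 0
    for k in range(n, width):    # final ripple-carry row
        s[k], carry = cell(s[k], c[k], carry)
    return sum(bit << k for k, bit in enumerate(s))
```

Calling `braun_multiply(x, y, 4, cell=bypass_full_adder)` yields the same product as the conventional cell for every 4-bit operand pair, since both cells implement the same truth table; only their internal switching behaviour differs.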
By employing the Optimized-Equal-Bypassing-Technique full adder, a 4×4 low-cost bypassing-based multiplier is built as shown in Fig. 8. In this multiplier architecture, the total transistor number of an n×n multiplier is only
V. SIMULATION AND RESULT
To compare the power efficiency of various multipliers, we design conventional CMOS Braun multipliers, multipliers based on EBT, and multipliers based on the proposed architecture in different operand sizes, and implement these designs in TSMC 0.18um technology. To evaluate the power consumption of these multipliers, we use MATLAB to generate 5000 random input vectors as stimuli, and then run HSPICE at the transistor level at 100 MHz. To make the simulation more realistic, CMOS buffers are added to all inputs and outputs of the test circuit. This configuration resembles a realistic situation in which the multiplier core has driving stages and driven stages in the VLSI system [5]. The simulation results are summarized in Table I, and the comparison of the power dissipation of the different designs is illustrated in Fig. 9.
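The stimulus-generation step can be mimicked outside MATLAB. The following Python sketch is not the script used for the simulations; it assumes uniformly distributed unsigned operands and a fixed seed for repeatability, and `make_stimuli` and `input_toggles` are hypothetical helper names. The second helper counts input bit transitions between consecutive vectors, a rough indicator of how much activity the stimuli inject at the multiplier inputs.

```python
import random


def make_stimuli(count=5000, bits=16, seed=0):
    """Return `count` random (x, y) pairs of unsigned `bits`-bit operands."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [(rng.getrandbits(bits), rng.getrandbits(bits)) for _ in range(count)]


def input_toggles(vectors):
    """Count the input bits that change between consecutive vectors."""
    total = 0
    for (x0, y0), (x1, y1) in zip(vectors, vectors[1:]):
        total += bin(x0 ^ x1).count("1") + bin(y0 ^ y1).count("1")
    return total


stimuli = make_stimuli(count=5000, bits=16, seed=1)
```

Each pair would then be applied for one clock period in the transistor-level netlist; converting the vectors into a simulator input format is left out here because it depends on the tool setup.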
The simulation results indicate that our proposed design consumes less power in all cases than both the Braun multiplier and the multiplier proposed in [2]. For 16×16 multipliers, our design saves 37.9% and 25.5% of the power consumption compared with the Braun multiplier and the multiplier in [2], respectively. The area overhead of the three designs, measured in transistor count, is listed in Table II. Our multiplier design has the smallest transistor count among the compared designs. In summary, the proposed multiplier outperforms the other designs in both power consumption and hardware overhead.
VI. CONCLUSION
In this paper, we present a low power full adder architecture named Optimized-Equal-Bypassing-Technique, which bypasses the addition operation when the two summands are equal so as to suppress dynamic power. We then optimize this architecture at the transistor level to further reduce power and area, and finally employ it in a parallel multiplier design. The simulation results show that the proposed multiplier consumes less power and uses fewer transistors than the previous designs it is compared with.
