In this paper, we propose a methodology that optimizes the power-delay product of a binary adder based on the heterogeneous adder architecture. We formulate the power-delay product of the heterogeneous adder using integer linear programming (ILP). To make ILP optimization applicable, we adopt a transformation technique that converts the initial non-linear expression for the power-delay product into a linear expression. The experimental results show that the proposed method achieves a lower power-delay product than designs that use only a conventional adder.
The power-delay product (PDP) measures how power-effective a digital circuit is by considering its delay together with its power consumption. For modern digital circuits, a binary adder should be optimized for PDP rather than for power and delay independently [1]. Mixing multiple adder implementation types, such as the ripple-carry adder (RCA), carry-lookahead adder (CLA), and carry-skip adder (CSKA), to optimize delay, area, and power was introduced in past research [2] [3].
In this paper, we formulate an ILP model for the PDP based on the heterogeneous adder architecture. The delay and power formulations of the heterogeneous adder are taken from those introduced in [4]. Simply multiplying the two expressions for the power and the delay of the heterogeneous adder does not yield a valid ILP formulation, because the product contains terms in which two integer variables are multiplied together. Having more than one integer variable in any single term of an ILP model violates the linear form that ILP requires. This prevents ILP from being applied directly to optimize the PDP of the heterogeneous adder. Thus, the non-linear expression of the PDP must be transformed into a linear expression.
By using the heterogeneous adder architecture and the non-linear-to-linear transformation scheme, a lower PDP can be achieved than with a conventional adder architecture. This is due to the exploration of the expanded PDP design space of the heterogeneous adder, as we observed for area-constrained delay optimization and vice versa [5]. The optimization of the PDP of the heterogeneous adder is performed by a linear program solver [6]. The reduction of PDP in the heterogeneous adder, compared to the PDP of a conventional adder, is shown in the experimental results.
II. Background
The generalized architecture and delay model of the heterogeneous adder are illustrated in Fig. 1 [3].
SAi(ni) indicates a sub-adder with propagation scheme SAi and bit-width ni. With I available sub-adders, the n-bit heterogeneous adder is defined as the concatenation of the sub-adders SAi(ni), where $\sum_{i=1}^{I} n_i = n$. The carry-out signal of SAi, Cout(SAi), is used as the carry-in signal of SAi+1, Cin(SAi+1).
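As a minimal sketch of the design space this definition implies, the following Python generator enumerates every bit-width assignment (n_1, ..., n_I) whose widths sum to n. The function name, the `step` parameter, and the convention that a width of 0 means the sub-adder type is not used are assumptions made for illustration; they are not part of the formulation in [3].

```python
def bit_width_assignments(n, num_sub_adders, step=1):
    """Yield tuples (n_1, ..., n_I) of non-negative multiples of `step` that sum to n.

    A width of 0 is taken to mean "this sub-adder type is absent" (an assumption).
    """
    if num_sub_adders == 1:
        # The last sub-adder must absorb whatever width is left.
        if n % step == 0:
            yield (n,)
        return
    for width in range(0, n + 1, step):
        for rest in bit_width_assignments(n - width, num_sub_adders - 1, step):
            yield (width,) + rest


# Example: an 8-bit adder built from two sub-adder types in 4-bit steps.
small = list(bit_width_assignments(8, 2, step=4))  # [(0, 8), (4, 4), (8, 0)]
```

Each yielded tuple is one candidate heterogeneous adder; an optimizer then has to pick the tuple with the best metric.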
Combining the sub-adders SAi(ni) and varying ni for each SAi enables us to explore a more fine-grained design space for performance metrics such as delay, area, and power than a conventional adder allows [4], [5]. Deciding the proper ni of each sub-adder SAi(ni) for PDP optimization in the heterogeneous architecture is the main goal of the approach presented in this paper.
The PDP metric is especially meaningful in digital signal processing applications and mobile systems, because those systems require not only low power consumption but also high-speed operation [7]. As the PDP of a system becomes smaller, the system becomes more power-effective, meaning that it consumes less power at the same operating speed.
For a specific binary adder type, the PDP can vary with the type of implementation, the degree of optimization, and the process technology. Generally, the carry-lookahead adder (CLA) is known to be the best in terms of PDP [8]. Figure 2 shows the PDP of binary adders implemented with a 0.18 µm CMOS library for varying bit-widths. At a bit-width of 128, the PDP decreases in the order ripple-carry adder (RCA), carry-skip adder (CSKA), CLA. This comparison implies that, although the CLA consumes more power than the CSKA and RCA, the delay reduction from the carry-lookahead architecture more than compensates for the increase in power consumption. It indicates that the CLA is the most power-efficient when its delay is taken into account.
Thus, for an application that requires low power consumption together with high speed, the CLA is the most appropriate among the adder types shown in Fig. 2. As the bit-width of each sub-adder becomes larger, its PDP becomes larger too. The RCA always has a larger PDP than the CLA and CSKA. However, for CLA the PDP is smaller when its bit-width is lower than 64. At the bit-width 128, CLA has the smallest PDP.
By using the heterogeneous adder architecture, we can exploit heterogeneous adder designs in the design space represented by the area between each PDP curve.
III. ILP Formulation for PDP Optimization of Heterogeneous Adder
The heterogeneous adder architecture is presented in [5] and is applied to delay-constrained power optimization using ILP in [4]. In this section, the ILP formulation for the PDP optimization of the heterogeneous adder is proposed. Specifically, the ILP formulation for area-constrained PDP optimization is presented, since the delay and power of a digital circuit usually trade off against its area.
The ILP formulation requires transforming a non-linear expression into a linear expression, since the original PDP expression, obtained by multiplying the delay and the power of a heterogeneous adder, is non-linear. The average power consumption and the delay of the heterogeneous adder can each be represented as an integer linear expression. The PDP of the heterogeneous adder is then the product of the two integer linear expressions representing the power consumption and the delay, respectively.
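To see where the non-linearity comes from, suppose, purely for illustration, that the power and one delay term are both linear in binary selection variables. The symbols $x_k$ (selection variables), $p_k$ (power coefficients), and $d_l$ (delay coefficients) are assumptions introduced here and are not the notation of [4]. Their product expands as:

```latex
\left(\sum_{k} p_k\, x_k\right)\left(\sum_{l} d_l\, x_l\right)
  = \sum_{k}\sum_{l} p_k\, d_l\; x_k\, x_l
```

Every term on the right-hand side contains a product $x_k x_l$ of two decision variables, which a linear model cannot contain directly. This is the kind of term the transformation has to remove.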
As presented in [4] , POWER(Heterogeneous Adder) and AREA(Heterogeneous Adder) can be expressed as follows :
Equation (1) and (2) are subject to      ≤ .
韓國컴퓨터情報學會 論文誌(2010. 10.)
In the above Equation (1) and (2)
The order of the sub-adders affects the delay of a heterogeneous adder. Depending on the order, the carry generation of sub-adders located in the most significant bit (MSB) part can overlap the sum generation of sub-adders located in the least significant bit (LSB) part, as shown in Fig. 1. The order of the sub-adders is fixed such that SA1 = CLA, SA2 = CSKA, and SA3 = RCA. Fixing the order reduces the ILP design space for PDP optimization, since this order minimizes the delay of the heterogeneous adder for a given combination of sub-adders.
Therefore, the delay of the heterogeneous adder is defined as follows:

$$\mathrm{DELAY}(\text{Heterogeneous Adder}) = \max\{D_1, D_2, \ldots, D_I\}$$

Here, $D_1$ and $D_i$ are defined as follows:
(4)
In (3) and (4)
Thus, area-constrained PDP optimization is formulated as follows:
In the above expressions, θAREA denotes the upper bound of area allowed for PDP optimization of the heterogeneous adder instance.   is a variable indicating the upper bound of PDP, and it is also used as the minimax objective in the ILP formulation for area-constrained PDP optimization [9]. Thus, the PDP of the heterogeneous adder can be modeled as follows : in Equation (6). However, both SAi and SAj indicate the same sub-adder instance assigned to a heterogeneous adder instance, since Equation (6) implies the case in which only one type of sub-adder is assigned.
In Equations (6) and (7), by defining a new variable and introducing additional constraints, we obtain a proper formulation that ILP can solve for the PDP optimization of the heterogeneous adder.
In other words, the following condition must be satisfied:
are binary variables. To satisfy the above condition, the following additional constraints are required [9].
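As a hedged illustration of this kind of transformation, a standard way to replace the product of two binary variables x and y by a new binary variable z uses three linear constraints: z ≤ x, z ≤ y, and z ≥ x + y − 1. The names x, y, and z are assumptions chosen for this sketch, and the constraints may differ in notation from those of [9]. The short Python check below confirms by exhaustive enumeration that these constraints leave z = x·y as the only feasible value.

```python
from itertools import product


def feasible_z(x, y):
    """Return the binary values z allowed by z <= x, z <= y, z >= x + y - 1."""
    return [z for z in (0, 1) if z <= x and z <= y and z >= x + y - 1]


# Map every binary pair (x, y) to the z values the linear constraints permit.
table = {(x, y): feasible_z(x, y) for x, y in product((0, 1), repeat=2)}

for (x, y), zs in table.items():
    # Each row should list exactly one feasible z, equal to the product x * y.
    print(x, y, zs)
```

Because the only feasible z always equals the product, any term x·y in the objective or constraints can be replaced by z without changing the set of solutions.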
By incorporating the newly defined variable and the additional constraints, we obtain the following ILP formulation of the PDP of the heterogeneous adder.
for all SAi , SAj , ni , and nj (1≤i≤I, 1≤j≤I, 1≤ni ≤ n, and 1≤nj ≤n)
IV. Experimental Results
To show the effectiveness of the proposed method, an experiment on PDP optimization was performed with the derived ILP models. Three types of sub-adders, CLA (=SA1), CSKA (=SA2), and RCA (=SA3), were used, and their sizes varied from 4 bits to 128 bits in steps of 4 bits. All the sub-adder instances were implemented with a Synopsys tool and the ANAM 0.18 µm CMOS library [10]. The delay and the average power consumption were obtained from the timing and power simulation results of the tool.
In Fig. 3, the PDP design space generated by all possible combinations of sub-adder bit-widths (here, I=3) is depicted as a 3-dimensional surface. The X-axis and the Y-axis indicate the bit-widths of the CLA and the CSKA assigned to a heterogeneous adder, respectively. The Z-axis gives the PDP value at the point designated by each sub-adder combination. For example, when X = n1 = 128 and Y = n2 = 0, the remaining n3 becomes 0 and the corresponding PDP value is 8.911 pJ. Finding a solution of the ILP formulation for PDP optimization amounts to seeking the lowest point on this surface. As shown in Fig. 3, the PDP of the 128-bit CLA is the lowest in the whole design space. Figure 4 shows the result of PDP optimization while increasing the area upper bound in steps of 25. The unit of PDP is pJ, since the product of delay (ns) and average power consumption (µW) has the unit of energy. Without any area upper bound, the optimized PDP is obtained with a single 128-bit CLA. This means that the CLA is the most beneficial in terms of PDP among the three types of sub-adders.
Also, in Fig. 4, each combination of sub-adder bit-widths found by ILP optimization is given with a pair consisting of the optimized PDP and the actual area at that point. For example, at the area upper bound 2300, the pair is shown in parentheses as (11.381, 2290) with the combination of sub-adders 'CLA112+CSKA12+RCA4'. Here, the operator '+' means the concatenation of sub-adders. In the interval θAREA < 1200, only combinations of CSKA and RCA (without CLA) are used for the optimized PDP. At the area upper bound 1200, the optimized PDP of the heterogeneous adder is 13.649 pJ with the actual area 1164. In the interval 1175 < θAREA ≤ 2100, the heterogeneous adder is configured as 'CSKA124+RCA4', as shown in Fig. 4.
Figure 5 displays the PDP reduction, that is, the ratio by which the PDP in area-constrained optimization is reduced by using heterogeneous adders instead of conventional adders. In the interval 675 < θAREA ≤ 1200, up to 57% PDP reduction is obtained; in the interval 1200 < θAREA ≤ 2125, about 3%; and in the interval 2125 < θAREA ≤ 2500, up to 35%. These improvement numbers are not absolute, since they derive from the areas, delays, and powers of the specific sub-adder implementations. The improvement would change if other circuit-level optimizations, such as transistor sizing, were applied to the sub-adder types or if different design libraries were used to implement the sub-adder components.
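The area-constrained selection reported for Fig. 4 can be pictured as a filter-then-minimize step over candidate designs. The sketch below is a plain exhaustive selection, not the ILP solver used in the experiment; the function name and the tuple layout are assumptions for illustration. The two candidates are the (PDP, area) pairs quoted above, and the second label is descriptive only.

```python
def best_under_area(candidates, area_bound):
    """Return the (label, pdp_pj, area) tuple with the lowest PDP whose area
    does not exceed area_bound, or None if no candidate fits."""
    feasible = [c for c in candidates if c[2] <= area_bound]
    return min(feasible, key=lambda c: c[1]) if feasible else None


# (label, PDP in pJ, area) pairs quoted in the text for Fig. 4.
reported = [
    ("CLA112+CSKA12+RCA4", 11.381, 2290),
    ("design reported at area bound 1200", 13.649, 1164),
]

loose = best_under_area(reported, 2300)  # the 11.381 pJ design fits
tight = best_under_area(reported, 1200)  # only the 13.649 pJ design fits
none_fit = best_under_area(reported, 1000)  # no candidate fits
```

Tightening the area bound removes the lower-PDP design from the feasible set, which is the trade-off that the area sweep in Fig. 4 traces out.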
V. Conclusions
In this paper, an ILP formulation for the PDP of the heterogeneous adder is presented, and experimental results of optimizing the PDP of the heterogeneous adder are provided. A technique that transforms a non-linear expression into a linear expression is also adopted for the ILP-based PDP formulation. Without that transformation, the PDP of the heterogeneous adder cannot be modeled in ILP form because the original PDP formulation is non-linear.
The experimental results showed the optimized PDP values of the heterogeneous adders under area constraints. With the proposed methodology, the trade-off design space of the heterogeneous adder can also be exploited for PDP optimization.
In future research, we plan to extend the proposed method to handle a different input arrival time for each input bit of a binary adder.
