Abstract
Introduction
During logic synthesis, the functionality of nodes internal to a circuit can be manipulated within the Don't Care (DC) set to minimize area or delay. Although a reduction in area often corresponds to a reduction in power, this does not hold in CMOS technologies if the switching activity increases. In circuits with distributed input switching probabilities, even a minimal Boolean difference in functionality at a node may dramatically alter the switching activity.
Low-power synthesis algorithms which address the issue of distributed input switching probabilities already exist [1] [2]. These approaches either try to reduce the functional support from high-activity inputs or guide sub-expression extraction through simple high-level power approximations. However, these works do not present any formalization of minterm probability distributions within the Boolean space.
In this paper we present an efficient technique for exploring the Boolean space to identify minterms highly appropriate for influencing switching activity. This allows us to tune existing area optimization strategies towards reduced power consumption. The minterms in the Boolean space are separated by probability into a set of classes. The overlap of the high-probability minterm classes with the DC set at a node is used to bias area optimization towards reducing switching activity. The small-probability minterm classes in general have large cardinality. These subsets of the DC set provide significant flexibility for area-optimization without strong influence upon activity. The concept of the probability distribution within the Boolean space is described in Sect. 3 after a quick definition of CMOS power consumption (Sect. 2). In Sect. 4 an outline is given for the use of these classes in power optimization. Sect. 5 is a presentation of the results of applying this theory to standard benchmarks, and Sect. 6 is a description of the pros and cons of our method.
Power Dissipation in CMOS Logic
In this paper, the well-accepted switching-activity-dominated model of [3] is used to model power dissipation. Namely,
$$Pow_i = \frac{1}{2} \cdot \frac{C_i V_{dd}^2}{T} \cdot E_i$$
where $Pow_i$ denotes the average power dissipated by gate $g_i$, $C_i$ is the load capacitance at the output of gate $g_i$, $V_{dd}$ is the supply voltage, $T$ is the clock period, and $E_i$ is the average number of gate output transitions per clock cycle. We assume that the primary inputs of combinational logic blocks are independent. Furthermore, it is assumed that functional switching power (the zero-delay model) is the predominant effect in determining power consumption. Under this assumption, the power consumption at a gate $g_i$ with onset probability $p_i$ is given by:
$$Pow_i = \frac{C_i V_{dd}^2 \, p_i (1 - p_i)}{T}$$
When load capacitance is constant, functional power minimization is equivalent to optimizing $p_i$ to be close to zero or one.
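The dependence of functional power on onset probability can be sketched numerically. The snippet below is a minimal illustration, assuming the usual zero-delay relation $E = 2p(1-p)$ between onset probability and transitions per cycle; the function and parameter names are hypothetical and are not taken from [3].

```python
def switching_activity(p: float) -> float:
    """Expected output transitions per clock cycle for onset probability p.

    Assumption: consecutive output values are independent (zero-delay model),
    so a 0->1 or 1->0 transition occurs with probability 2 * p * (1 - p).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


def gate_power(c_load: float, v_dd: float, t_clk: float, p: float) -> float:
    """Average functional switching power: 0.5 * C * Vdd^2 / T * E."""
    return 0.5 * c_load * v_dd ** 2 / t_clk * switching_activity(p)
```

Under this model the activity peaks at $p = 0.5$ and falls to zero as $p$ approaches 0 or 1, which is why moving a node's onset probability away from one half reduces power for a fixed load.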
Partitioning the Boolean Space
To identify the subset of a Boolean space most influential upon switching activity, we want to find all minterms with probability of occurrence higher than the average minterm probability. Let $\Omega = B^n$ be the Boolean space described by the variable set $x = \{x_1, \ldots, x_n\}$ where $x_i \in B^1\ \forall i = 1, \ldots, n$. Let $P = \{(p_1, x_1), \ldots, (p_n, x_n)\}$ be the associated probability set where $p_i \in [0,1]$ are independent random variables associated with each $x_i$ such that $p_i = \Pr(x_i = 1)$. For any minterm $m \in B^n$, let [...] To complete the proof, we need only show that $S$ [...], i.e. $\forall m \in S$ [...]
We may replace every instance of $p$ with $p = (\frac{1}{2} + \epsilon)$, $\epsilon > 0$. If $|I|$ is even, this case is [...]
We conjecture that the above proposition is true when the input onset probabilities are non-unique and every input $i$ to a network has an onset probability $p_i$ [...]
When all inputs have the same onset probability $p \neq \frac{1}{2}$, the probability contained in $C_{ps}$ may be computed as (assume $p > 0.5$) [...]
In summary, we conjecture that for any Boolean space $\Omega = B^n$ and associated input probability set $P$ where $p_i \in P$ are distributed on $[0,1]$ it can be shown that greater than 50% of the probability is contained in $C_{ps}$ and $|C_{ps}| \le$ [...]
Finding $C_{ps}$ exactly is an exponentially difficult problem. We propose an efficient solution to approximate this class by first partitioning the space into classes of similar-probability minterms. $C_{ps}$ is then approximated as the union of all classes with an average minterm probability $\ge 1/2^n$, the average minterm probability over the whole space.
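For very small input counts the high-probability class can still be enumerated directly, which makes the exponential cost concrete. The following is a brute-force sketch only, not the approximation proposed here; it assumes independent inputs and takes "higher than average" to mean strictly greater than $2^{-n}$. The function name is hypothetical.

```python
from itertools import product


def high_probability_class(probs):
    """Enumerate all 2^n minterms and collect those above the average.

    probs[i] is the onset probability Pr(x_i = 1) of input i.
    Returns (members, total probability of members, fraction of the space).
    """
    n = len(probs)
    threshold = 0.5 ** n  # average minterm probability over the space
    members = []
    total = 0.0
    for bits in product((0, 1), repeat=n):
        pr = 1.0
        for bit, p in zip(bits, probs):
            pr *= p if bit else 1.0 - p
        if pr > threshold:
            members.append(bits)
            total += pr
    return members, total, len(members) / 2 ** n
```

For two inputs with onset probability 0.9 each, only the minterm (1, 1) exceeds the average, so roughly 81% of the probability sits in a quarter of the space. The loop visits every minterm, so this is usable only as a reference check for a handful of inputs.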
Let $I_p$ be the largest subset of network inputs $I$ such that every input in $I_p$ has onset probability $p$. The set of minterm probabilities in the subspace defined by these variables is given by $p^j (1 - p)^{|I_p| - j}\ \forall j: 0 \le j \le |I_p|$. The sets of minterms corresponding to each of these probabilities form a set of classes $\varphi^0_{I_p}$ which partition this subspace, the zero superscript indicating that approximating the minterm probability in each class with the class average is exact.
By construction, $|\varphi^0_{I_p}| = (|I_p| + 1)$. Let $P$ be the set of onset probabilities for the entire input set $I$, $|P| \le |I|$. A set (although not necessarily minimum) of exact-average classes which partition the $|I|$-variable Boolean space is then given by the product of the exact-average classes for each $I_p$, $p \in P$, i.e. $\varphi^0_I = \prod_{p \in P} \varphi^0_{I_p}$.
The number of zero-error classes is therefore $O(\prod_{p \in P} (|I_p| + 1))$, which is usually impractically large even if $|P| \ll |I|$. Accepting a small minterm probability variance in a class allows the total number of partitions to be reduced by collapsing classes in each $\varphi^0_{I_p}$.
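The exact-average classes and their count can be sketched as follows. This is an illustrative reading of the construction, assuming inputs are grouped by identical onset probability; the helper names are hypothetical.

```python
from collections import Counter
from math import comb, prod


def exact_classes(group_size: int, p: float):
    """Zero-error classes for `group_size` inputs sharing onset probability p.

    Class j holds the C(group_size, j) minterms with exactly j ones, each of
    probability p^j * (1 - p)^(group_size - j). Returns (size, probability)
    pairs for j = 0 .. group_size.
    """
    return [(comb(group_size, j), p ** j * (1.0 - p) ** (group_size - j))
            for j in range(group_size + 1)]


def zero_error_class_count(onset_probs) -> int:
    """Number of product classes: product over distinct p of (|I_p| + 1)."""
    groups = Counter(onset_probs)  # distinct probability -> group size
    return prod(size + 1 for size in groups.values())
```

With three inputs at 0.9 and two at 0.2, the product yields (3 + 1)(2 + 1) = 12 zero-error classes. When most inputs have distinct probabilities each group has size one and the count approaches $2^{|I|}$, which is the growth that motivates collapsing classes.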
(This also tends to 'smooth' the functions that define the classes.) Minterm probabilities within a class are not normally distributed, so we want to establish the maximal class count reduction for the smallest maximum error increase. The maximum error in a class $C_i$ is defined as $\max_{A \subseteq C_i} |\Pr(A) - |A| \cdot W|$, where $W$ is the class average minterm probability. This corresponds to the sum of all positive differences between minterm probabilities and the class average minterm probability. For each $p \in P$, an error vs. class count curve can be established.
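The equivalence between the subset-maximum definition and the sum of positive differences can be checked numerically on a small class. The sketch below assumes a class is represented simply as a list of its minterm probabilities; both function names are hypothetical.

```python
from itertools import combinations


def class_error(minterm_probs):
    """Maximum class error as the sum of positive deviations from the mean."""
    if not minterm_probs:
        return 0.0
    mean = sum(minterm_probs) / len(minterm_probs)
    return sum(pr - mean for pr in minterm_probs if pr > mean)


def class_error_bruteforce(minterm_probs):
    """Same quantity from the definition: max over subsets A of
    |Pr(A) - |A| * mean|. Exponential; intended only as a cross-check."""
    if not minterm_probs:
        return 0.0
    mean = sum(minterm_probs) / len(minterm_probs)
    best = 0.0
    for k in range(1, len(minterm_probs) + 1):
        for subset in combinations(minterm_probs, k):
            best = max(best, abs(sum(subset) - k * mean))
    return best
```

The worst-case subset is the one containing exactly the above-average minterms, which is why the linear-time form suffices. A class whose minterms all share one probability has zero error, matching the exact-average case.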
A plot of the total error (which is the sum of the error for each class) in $\varphi_{I_p}$ against the total number of classes is shown in Fig. 1. Now consider the set of classes described by the Boolean product of classes chosen for each input subset $I_p$. As each set of classes $\varphi_{I_p}$ covers the entire space with mutually exclusive sets, the overall set of classes formed from the product maintains this necessary property. Let $\epsilon_p$ be the total error for the set of classes $\varphi_{I_p}$. An upper bound on the error for the set of classes on $I$ defined by the product is given by: [...]
Using this formulation and the error curves for each $p \in P$, the sensitivity of reduction in class count to increase in error can be specified. This provides a simple numerical way of finding the minimum cardinality set of classes for a given maximum error tolerance. (In practice, a set of elements $P'$ is used to approximate all elements of the set $P$ to within a small error. This helps reduce computational complexity with marginal loss of accuracy.) It was found that a maximum error tolerance of 10% reduces the number of classes by orders of magnitude. These classes are then grouped together based upon the similarity of average minterm probabilities in the original classes, and an attempt is made to partition the total probability equally. An example of the distribution of minterms between 100 classes is shown in Fig. 2 for a Boolean space of 41 variables. The curves show the cumulative coverage of the Boolean space (y-axis) against the cumulative consumption of total probability (x-axis) in growing from the class with the largest to the class with the smallest average minterm probability. As the total probability is almost equally distributed amongst the classes, the point at which to collapse the classes to form $C_{ps}$ is equivalent to the point on the curve where [...] distributions: the first uniform; the second Gaussian centered on 0.5 with $\sigma = 0.1$. The third curve, the straight line, indicates how cumulative probability relates to the cumulative proportion of the Boolean space when all inputs have the same onset probability of 0.5.
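A curve of the kind described for Fig. 2 can be sketched for the simplified case in which all inputs share a single onset probability, so the exact-average classes are available in closed form. This is an assumption made for illustration; the distributed-probability curves in the figure require the class product and grouping described above. The function name is hypothetical.

```python
from math import comb


def coverage_curve(n: int, p: float):
    """Cumulative (probability, fraction of space) points for n inputs that
    all have onset probability p, visiting classes from the largest to the
    smallest minterm probability."""
    classes = [(comb(n, j), p ** j * (1.0 - p) ** (n - j))
               for j in range(n + 1)]
    classes.sort(key=lambda c: c[1], reverse=True)
    points = []
    cum_prob = 0.0
    cum_count = 0
    for size, pr in classes:
        cum_prob += size * pr      # probability consumed so far (x-axis)
        cum_count += size          # minterms covered so far
        points.append((cum_prob, cum_count / 2 ** n))  # y-axis: coverage
    return points
```

With $p = 0.5$ every point lies on the diagonal, reproducing the straight-line case. For $p$ away from 0.5 the early points consume most of the probability while covering little of the space.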
From Area to Power Optimization
The goal of logic synthesis is generation and optimization of a multi-level logic description which implements a specified function [...]. To guide standard area optimization towards activity-sensitive decisions, we want to split the observability don't care (ODC) set at each node into power-sensitive and non-power-sensitive sets. We ensure that the functional flexibility offered to the area-optimization algorithm is not severely compromised by performing a two-phase process which first addresses reduction in switching activity, then reduction in area. The concept of separating the optimization phases for reduction in activity and area is illustrated in Fig. 3. Fig. 3a indicates the partition of the Boolean space into high and low probability minterm classes, $C_{ps}$ and $\overline{C_{ps}}$ respectively. Fig. 3b is a node function, $f_n$, and its ODC set, $D_n$. Assume that the onset probability for $f_n$ is greater than $\frac{1}{2}$. To reduce switching activity, we want to increase the onset probability by absorbing the large-probability minterms within the set described by $f(D_n) \cdot \overline{f_n} \cdot f(C_{ps})$ (i.e. the black shaded area of Fig. 3c). Although an expansion will benefit switching activity, this is usually a very small subset of the DC set and unlikely to provide sufficient area optimality. Optimization for area requires the flexibility of a large subset of the DC set. However, it is important to avoid functional contraction within the set $C_{ps}$, as even a small excursion in that direction could sharply increase switching activity (i.e. $\Pr(f_n)$ would move closer to $\frac{1}{2}$). The set providing maximum flexibility without allowing small functional changes to have a strongly detrimental influence upon switching activity is $f(D_n) \cdot (\overline{f_n} + f_n \cdot f(\overline{C_{ps}}))$ (i.e. the black shaded area of Fig. 3d). A pseudo-code form of the ideas presented above is provided in [...]. Note that the final area optimization step deals with the function $f_n$ after any activity optimization.
The restriction of the DC set in that step therefore prevents the area optimization from undoing any critical expansion/contraction already achieved within the set $C_{ps}$.
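The two don't-care subsets used by the two phases can be illustrated with explicit minterm sets. This is a toy sketch for the case where the onset probability exceeds one half; it represents functions as Python sets of minterm indices rather than the covers or BDDs a synthesis tool would use, and the function name is hypothetical.

```python
def power_sensitive_dc_sets(onset, dont_care, high_prob, space):
    """Split a node's don't-care set for the two-phase optimization.

    onset:     minterms where the node function f_n is 1
    dont_care: the node's ODC set D_n
    high_prob: the high-probability class C_ps
    space:     all minterms of the Boolean space
    Returns (activity_dc, area_dc).
    """
    not_onset = space - onset      # complement of f_n
    low_prob = space - high_prob   # complement of C_ps
    # Phase 1: D_n . ~f_n . C_ps, high-probability minterms that may be
    # absorbed into the onset to push Pr(f_n) towards one.
    activity_dc = dont_care & not_onset & high_prob
    # Phase 2: D_n . (~f_n + f_n . ~C_ps), any expansion is allowed, but
    # contraction only over low-probability minterms.
    area_dc = dont_care & (not_onset | (onset & low_prob))
    return activity_dc, area_dc
```

For example, with `space = set(range(8))`, `onset = {0, 1, 2, 3, 4}`, `dont_care = {3, 4, 5, 6}` and `high_prob = {4, 5}`, the first set is `{5}` and the second is `{3, 5, 6}`. Minterm 4 is a don't care inside both the onset and the high-probability class, so it is withheld from area optimization to prevent a contraction that would raise switching activity.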
Results
The algorithm outlined in this paper was implemented inside the SIS logic synthesis package to guide the node minimization phase during multi-level logic optimization. The resulting programs for power-sensitive node minimization, power_simplify() and power_full_simplify(), are counterparts of the area optimization routines simplify() and full_simplify() of SIS. For benchmarking purposes, we replaced the occurrences of simplify() and full_simplify() in script.rugged with our power_simplify() and power_full_simplify() commands to obtain script.power. A subset of circuits from the MCNC benchmark set was used to obtain the experimental results. All circuits were mapped using mmgenlib. Power estimation and switching activity computation were performed using the symbolic simulation method of [3] with a zero-delay model. All input probabilities were chosen from a uniform distribution in the range [0,1]. All experiments were run on a DEC-station ALPHA with 160Mb of memory.
The results comparing the area and power reduction obtained by optimization via script.rugged and script.power are presented in Table 1. The names of the benchmark circuits tested are given in Column 1; Column 2 contains the number of literals in the factored form. In Columns 3 and 4 respectively we present the total probability contained in the set $C_{ps}$ and the proportion of the Boolean space it encompasses. Column 5 shows the area of the circuit prior to optimization. Columns 6 and 7 present the results after network optimization using script.rugged and script.power respectively. Similarly, Column 8 shows the power dissipation of the unoptimized circuit and Columns 9 and 10 present the power dissipation results after optimization using script.rugged and script.power respectively. Columns 11 and 12 show the % change in area and power results from using script.power instead of script.rugged.
The results demonstrate that in most cases script.power yields a circuit with lower power dissipation than script.rugged. There is no significant trade-off in terms of circuit area. In fact, on the whole, script.power performs better in area optimization as well. Note that script.power does not always give a better result than script.rugged. This is to be expected since our approach attempts to favorably bias (from the power perspective) the network optimization process at the node-minimization level, but cannot guarantee that this will always translate into a lower power network. At the same time, since the area optimization flexibility is not strongly affected by our algorithm, we do expect that in most cases our algorithm will yield a lower power network without any area penalty. This assertion is validated by the experimental results (a total of 16% reduction in power and a 12.5% reduction in area over the set of benchmark circuits, averaging 8.0% reduction in power and 7.5% reduction in area for each individual case).
Pros and Cons of the Technique
The positive and negative aspects of this approach both arise from the same overall property of the results: this algorithm reduces both area and power. The computational cost of this gain is a doubling of the time complexity of simplify() and full_simplify(), two key steps in area optimization.
Although this trade-off may be regarded as acceptable, it is clear that the use of power-sensitive DC sets does not achieve a controllable de-correlation of reduction in activity and reduction in area. This follows from the attempt to reduce switching activity at the output of a complex node through high-level area optimization. In general, the output activity of a complex node does not dominate the power consumption of its internal structure. It would therefore be necessary to bias the extraction of internal structure (e.g. [2]) to reduce the area-dominant effect of power reduction.
Ongoing work in this area is examining the application of activity-sensitive classes to extraction of internal node structure as well as guided functional alteration through engineering change. In the latter case, power reduction should be achieved with zero change in area as only existing gates are used in the optimization.
Conclusions
We have shown that for a network with distributed input probabilities there is often a small subset of input combinations (minterms) which encompasses the majority of occurrence probability. For a uniform distribution of input probabilities in the interval [0, 1] we have generated experimental results which show this effect to be of the order of > 80% of the probability in < 15% of the Boolean space.
This has prompted the definition of a power-sensitive minterm class. An efficient technique for analyzing the probability distribution within the Boolean space and identifying this class has been developed. We have designed and implemented a technique for using the power-sensitive class to guide area optimization towards more activity-optimal decisions. The results in comparison to standard area optimization are quite promising.
Acknowledgements

