Abstract-Recent research has shown that voltage scaling is a very effective technique for low-power design. This paper describes a voltage scaling technique to minimize the power consumption of a combinational circuit. First, the converter-free multiple-voltage (CFMV) structures are proposed, including the p-type, the n-type, and the two-way CFMV structures. The CFMV structures make use of multiple supply voltages and do not require level converters. In contrast, previous works employing multiple supply voltages need level converters to prevent static currents, which may result in large power consumption. In addition, the CFMV structures group the gates with the same supply voltage in a cluster to reduce the complexity of placement and routing for the subsequent physical layout stage. Next, we formulated the problem and proposed an efficient heuristic algorithm to solve it. The heuristic algorithm has been implemented in C and experiments were performed on the ISCAS85 circuits to demonstrate the effectiveness of our approach.
I. INTRODUCTION

A. Background and Related Work
Power dissipation has become one of the most significant parameters in very large scale integration design due to the trend toward portable computing and communications systems. For portable devices, power dissipation limits the battery life and the available time. Even for nonportable devices, power dissipation affects the cost of packaging and cooling equipment.
Total power dissipation in a digital CMOS circuit can be obtained from the sum of three components: static dissipation, dynamic dissipation, and short-circuit dissipation [1]. In general, the total power dissipation is dominated by the dynamic dissipation and may be estimated by $P_d = a \cdot f_{clk} \cdot C_L \cdot V_{DD}^2$, where $a$ is the activity factor, $f_{clk}$ the switching frequency, $C_L$ the total node capacitance, and $V_{DD}$ the supply voltage. This formula is the basis of previous research in low-power CMOS digital design [2]-[11].
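The quadratic dependence on the supply voltage can be illustrated with a small numerical sketch. The function name and the operating point (activity factor, clock frequency, capacitance) are hypothetical values chosen only for illustration; they are not taken from the experiments in this paper.

```python
def dynamic_power(activity, f_clk, c_load, v_dd):
    """Dynamic power estimate P_d = a * f_clk * C_L * V_DD^2, in watts."""
    return activity * f_clk * c_load * v_dd ** 2


# Hypothetical operating point (assumed values, for illustration only).
p_5v = dynamic_power(activity=0.2, f_clk=50e6, c_load=10e-12, v_dd=5.0)
p_4v = dynamic_power(activity=0.2, f_clk=50e6, c_load=10e-12, v_dd=4.0)

# With everything else fixed, the ratio depends only on (4/5)^2.
ratio = p_4v / p_5v
print(f"P(4 V) / P(5 V) = {ratio:.2f}")
```

Under these assumptions, lowering the supply from 5 V to 4 V scales the estimated dynamic power by $(4/5)^2 = 0.64$, independent of the other three factors.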
As the dynamic power dissipation is proportional to the square of the supply voltage, voltage scaling is evidently the most effective technique to minimize the power dissipation. Moreover, the conclusion of [8] provides us a clear goal in minimizing the power dissipation, i.e., operate the circuits as slowly as possible, with the lowest possible supply voltage.
The most popular voltage scaling technique is to operate all the gates in a circuit with a reduced supply voltage, which is limited by the critical paths. However, the gates that are not on the critical paths could operate slower with lower supply voltages. This motivated some researchers to operate gates with two or more supply voltages in a circuit [10]-[12]. Usami et al. proposed a clustered voltage scaling (CVS) technique to reduce the power consumption with two supply voltages [11]. The block diagram of CVS is shown in Fig. 1. They arranged supply voltages such that the voltage swings of all paths are in decreasing order. Then level converters and latches with the level-conversion function are inserted before the primary outputs to prevent the static current.
B. Motivation and Goals
In previous works, level converters and latches with the level-conversion function were inserted to prevent the static current in circuits with multiple supply voltages. However, there exist some overheads with level converters.
First, the power consumption of level converters is not negligible. From our circuit simulation, the power consumption of a level converter is about four times that of an inverter. Next, the insertion of level converters introduces extra delays into the circuits. The rising delay of a level converter is also about four times that of an inverter. Finally, the insertion of level converters changes the topology of circuits.
Consequently, Usami et al. used level converters with care and just inserted level converters in front of the primary outputs to minimize the number of level converters [11] . Instead of using level converters with care, we try to find a voltage scaling technique without level converters in this paper.
To reduce the complexity of placement and routing when multiple supply voltages are used in physical layout, gates with the same supply voltage should be placed in a cluster. This is especially important for a standard-cell design since the gates in a standard-cell design are arranged in rows and their power lines are connected directly. Hence, we would like to preserve the clustering property in this paper as Usami et al. did in the CVS technique.
Finally, the logic structure discussed in this paper is CMOS complementary logic. Other logic structures may have more sophisticated effects at the interface of different supply voltages, which is beyond the scope of this paper.
The rest of this paper is organized as follows. In Section II, we propose the converter-free multiple-voltage (CFMV) structures, which need no level converters, make use of multiple voltages, and have gates with the same supply voltage in a cluster. In Section III, we give some definitions, formulate our problem, and propose a heuristic algorithm for the problem. Then, experimental results are shown in Section IV and are compared to the results of previous work. Finally, concluding remarks and future works are provided in Section V.
II. CFMV STRUCTURES
A. Elimination of Level Converters
When multiple supply voltages are applied in a CMOS digital circuit, there might exist a static current flowing from the supply voltage to the ground at the interface of gates with different supply voltages. Level converters are usually used at the interface to prevent the static current. To avoid using level converters in a CMOS circuit with multiple supply voltages, we put constraints on the voltage differences between adjacent gates with different supply voltages.
A simple analysis with the first-order MOS model predicts that there will be no static current if the supply voltage of a driver gate is higher than the supply voltage of the driven gate minus the threshold voltage of a PMOS. Take [...] However, the subthreshold effect makes the above prediction imprecise [13]. As we can see from Fig. [...] and unacceptable. Therefore, the best way to determine the reduced voltage is with a circuit simulator, such as HSPICE, once the acceptable value of the static current is given. For example, we could use HSPICE to simulate the circuit in Fig. 2(a). If the acceptable value of the static current is 1 µA, we can determine $V_{DDR}$ to be 4.4 V, with which the static current will be less than 1 µA.
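The first-order prediction can be written as a one-line check. This is only a sketch of that idealized rule: the function name and the PMOS threshold magnitude of 0.9 V are assumptions made for illustration, and the check ignores the subthreshold effect, which is why the text relies on circuit simulation for the actual limit.

```python
def first_order_static_current(v_driver, v_driven, v_tp_abs):
    """Return True if the first-order MOS model predicts a static current.

    The driven gate's PMOS is predicted to stay on (and conduct) when the
    driver's high level is not above v_driven - |V_tp|.
    """
    return v_driver <= v_driven - v_tp_abs


V_TP_ABS = 0.9  # assumed |V_tp| in volts, hypothetical value

# Driver at 4.4 V feeding a gate at 5 V: 4.4 > 5.0 - 0.9, so no static
# current is predicted by the first-order model.
print(first_order_static_current(4.4, 5.0, V_TP_ABS))

# Driver at 3.0 V feeding a gate at 5 V: 3.0 <= 4.1, so a static current
# is predicted.
print(first_order_static_current(3.0, 5.0, V_TP_ABS))
```

Because subthreshold conduction is not modeled, a pair that passes this check may still leak more than the acceptable static current.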
B. Arrangement of Supply Voltages
From the above section, we know that if the voltage difference between a driven gate and its driver gate is less than a specific value, level converters are not necessary at the interface of gates with different supply voltages. We call this value the safe threshold voltage, denoted as $V_{st}$.
Next, we'll discuss how to arrange multiple supply voltages in a circuit to eliminate level converters.
Assume that we have a set of $n$ supply voltages, $\{V_{dd_0}, V_{dd_1}, \ldots, V_{dd_{n-1}}\}$, such that 1) $V_{dd_0} > V_{dd_1} > \cdots > V_{dd_{n-1}}$, 2) $V_{dd_i} - V_{dd_{i+1}} < V_{st}$ for $i = 0, 1, \ldots, (n-2)$, and 3) $V_{dd_i} - V_{dd_j} > V_{st}$ for $j - i > 1$. Also, we have a set of $n$ clusters of gates $\{C_0, C_1, \ldots, C_{n-1}\}$, where the gates in different clusters are supplied with different voltages and each cluster has at most two adjacent clusters. If we want to assign these supply voltages to the clusters such that there is no static current and no level converters, the only solution is the one shown in Fig. 3.
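The three conditions on the supply voltages can be checked mechanically. The sketch below is one possible reading, in which condition 3) applies to every non-adjacent pair with the higher voltage listed first; the function name and the example value of $V_{st}$ (0.7 V) are assumptions for illustration only.

```python
def satisfies_cfmv_conditions(vdds, v_st):
    """Check conditions 1)-3) for a list of supply voltages.

    vdds : voltages listed from V_dd0 (highest) to V_dd(n-1) (lowest)
    v_st : safe threshold voltage
    """
    n = len(vdds)
    # 1) strictly decreasing sequence
    decreasing = all(vdds[i] > vdds[i + 1] for i in range(n - 1))
    # 2) adjacent voltages differ by less than V_st
    adjacent_ok = all(vdds[i] - vdds[i + 1] < v_st for i in range(n - 1))
    # 3) non-adjacent voltages differ by more than V_st
    nonadjacent_ok = all(
        vdds[i] - vdds[j] > v_st
        for i in range(n)
        for j in range(i + 2, n)
    )
    return decreasing and adjacent_ok and nonadjacent_ok


V_ST = 0.7  # assumed safe threshold voltage in volts, hypothetical

print(satisfies_cfmv_conditions([5.0, 4.4, 3.8], V_ST))  # steps of 0.6 V
print(satisfies_cfmv_conditions([5.0, 4.0, 3.0], V_ST))  # steps too large
print(satisfies_cfmv_conditions([5.0, 4.7, 4.4], V_ST))  # 5.0 and 4.4 too close
```

Condition 3) is what forces the clusters into a single chain: two clusters whose voltages are not neighbors in the list cannot be connected directly without a level converter.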
We call the structure shown in Fig. 3 [...] it an n-type CFMV structure. If both $V_{dd}$ and $V_{ss}$ are scalable, it is called a two-way CFMV structure. Whichever CFMV structure is used, the voltage swings along all paths are in increasing order. The set of supply voltages $\{(V_{dd_0}, V_{ss_0}), (V_{dd_1}, V_{ss_1}), \ldots\}$ is called the feasible supply voltage set.
1 All the circuit simulations of this paper used the level-3 SPICE models of the TSMC 0.8-µm single-poly double-metal process.
[...] can define the depth of a vertex $u$ in a graph $G$ to be the number of edges in the longest path from $u$ to a sink of $G$. The depth of a sink is defined to be zero and the depth of $u$, denoted as $\mathrm{dep}(u)$, can be determined by $\mathrm{dep}(u) = 1 + \max_{v \in \Gamma(u)} \mathrm{dep}(v)$. In addition, the depth of a graph $G = (V, E)$ is defined by $\max_{v \in V} \mathrm{dep}(v)$.
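The depth recurrence can be sketched as a memoized recursion over a directed acyclic graph. The adjacency-dictionary representation, the function name, and the small example graph are assumptions for illustration; the sketch also assumes that the set appearing under the maximum is the set of immediate successors of $u$.

```python
from functools import lru_cache


def vertex_depths(successors):
    """Compute dep(u) for every vertex of a DAG.

    successors : dict mapping each vertex to a list of its immediate
                 successors; sinks map to an empty list.
    """
    @lru_cache(maxsize=None)
    def dep(u):
        succ = successors.get(u, [])
        if not succ:          # a sink has depth zero
            return 0
        return 1 + max(dep(v) for v in succ)

    return {u: dep(u) for u in successors}


# Hypothetical four-vertex graph: a -> b -> d and a -> c.
graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
depths = vertex_depths(graph)
graph_depth = max(depths.values())
print(depths, graph_depth)
```

For large circuits a recursive formulation may hit Python's recursion limit; processing the vertices in reverse topological order would avoid that.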
III. ALGORITHMS FOR THE CFMV STRUCTURES
A. Preliminaries
From Lemma 1, we know that if a vertex is in $V_2$, then all the vertices in its reachable set must be included in $V_2$ as well. Now we can define the proper-directed cut, which will be used to partition graphs in the following subsections.
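The closure property stated above can be tested for a candidate vertex set. This sketch checks only that property, not the full definition of a proper-directed cut. The helper names and the example graph are hypothetical, and the reachable set of a vertex is assumed here to contain the vertices reachable through one or more edges, excluding the vertex itself.

```python
def reachable_set(successors, u):
    """Vertices reachable from u through one or more edges."""
    seen = set()
    stack = list(successors.get(u, []))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(successors.get(v, []))
    return seen


def is_closed_under_reachability(successors, v2):
    """True if every vertex reachable from a member of v2 is also in v2."""
    members = set(v2)
    return all(reachable_set(successors, u) <= members for u in members)


# Hypothetical graph: a -> b -> d and a -> c.
graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}

print(is_closed_under_reachability(graph, {"b", "d"}))  # d is included
print(is_closed_under_reachability(graph, {"a"}))       # b, c, d are missing
```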
B. Problem Formulation
Now we can formulate the problem that we would like to solve in this paper as follows. Given a circuit with timing constraints and a feasible supply voltage set, scale the supply voltages of a subset of gates with positive slacks to minimize the total power consumption for the CFMV structure.
Note that the formulated problem is for the CFMV structure and therefore has a much smaller solution space than a generic multiple-voltage scaling problem, since the voltage sequence in the CFMV structure is a contiguous subsequence of the feasible supply voltage set.
When there are only two elements in the given feasible supply voltage set, the optimal solution can be easily obtained by the depth-first search algorithm shown in Fig. 4. However, when there are more than two elements in the given feasible supply voltage set, the problem becomes much more difficult. Thus we next give an asymptotic bound on the solution space of the formulated problem. The bound is exponential, $2^{n/(p+1)}$, where $n$ is the number of vertices in the graph $G$ and $p$ is the depth of $G$.
C. Heuristic Algorithm
Since the number of proper-directed cuts of a graph grows exponentially with its number of vertices, it is impractical to search all the proper-directed cuts for the optimal solution. Therefore, we propose a heuristic algorithm to search only a subset of the solution space.
Let $G'$ be the induced subgraph of $G$ on the vertices whose voltage level is $m$. After DFS($m$) [...] Next, CFMV(2, 1) is called and it calls DFS(1), which assigns the voltage levels of $\{p, q, r\}$ to two. The $ML$ of CFMV(2, 1) is then $\{(m, 1), (n, 1), (o, 1), (p, 2), (q, 2), (r, 2)\}$. Before CFMV(2, 1) returns, the voltage levels of $\{p, q, r\}$ are assigned back to one. When CFMV(2, 1) returns, $L_2 = \{(m, 1), (n, 1), (o, 1), (p, 2), (q, 2), (r, 2)\}$ and $P_2 = 3.00$. In the following, we give an asymptotic bound on the computational complexity of CFMV.
Theorem 6: Let $n$ be the number of vertices in the graph $G$ and $l$ be the number of elements in the feasible supply voltage set. Then, the computational complexity of CFMV on $G$ is $O(n^{l-1})$, for $l = 2, 3, \ldots$.
IV. EXPERIMENTAL RESULTS
We have implemented our heuristic algorithm in C on a Pentium II 450 PC running Linux (RedHat 6.0) with 128-MB memory, and performed experiments on all the ISCAS85 circuits. In addition, we implemented the CVS technique for comparison.
The experimental environment is shown in Fig. 8. The control file provides the feasible supply voltage set. In our experimental cell library, the length of each MOS is 0.8 µm, the width of each PMOS is 16.8 [...] Table I. When $n$ voltage levels are used, the feasible supply voltage set is $\{(V_{dd_0}, V_{ss_0}), \ldots, (V_{dd_{n-1}}, V_{ss_{n-1}})\}$.
First of all, we compare the results of CFMV with those of CVS to show the effectiveness of the CFMV technique. Since the CVS technique uses two supply voltages, we compare it with the CFMV technique with two voltage levels. The comparison results are shown in Table II. The CFMV technique outperforms the CVS technique in most cases, the exceptions being c880 and c5315. On average, the power reduction of CVS (5 V, 4 V) is 7.17%, that of CVS (5 V, 3 V) is 8.99%, and that of the two-way CFMV (2 levels) is 13.65%.
Next, we perform experiments on three types of CFMV structures with more voltage levels to find the effect of the number of voltage levels, as shown in Table III. We find that the more voltage levels are provided, the more power reduction we can obtain. For example, from Table III, the average power reduction of a two-way CFMV is 13.65% with two voltage levels, 18.05% with three voltage levels, and 18.73% with four voltage levels. Though more power reduction can be obtained with more voltage levels, each additional level contributes a smaller increment. There is thus a tradeoff between the power reduction and the cost of providing more voltage levels.
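The diminishing return can be made explicit by differencing the average reductions quoted above for the two-way CFMV structure. The snippet is only arithmetic on those three reported averages; the variable names are illustrative.

```python
# Average power reduction (%) of the two-way CFMV structure, by number of
# voltage levels, as quoted in the text from Table III.
avg_reduction = {2: 13.65, 3: 18.05, 4: 18.73}

levels = sorted(avg_reduction)
# Gain in percentage points obtained by adding one more voltage level.
gains = {
    levels[i + 1]: round(avg_reduction[levels[i + 1]] - avg_reduction[levels[i]], 2)
    for i in range(len(levels) - 1)
}

for level, gain in gains.items():
    print(f"going to {level} levels adds {gain:.2f} percentage points")
```

Going from two to three levels adds 4.40 percentage points, while going from three to four adds only 0.68.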
[...], where $l$ is the number of supply voltages and $n$ is the number of gates.
In this paper, we have proposed a multiple-voltage scaling technique to minimize the power consumption of a combinational circuit. We put constraints on the voltage differences between connected gates to eliminate the necessity of level converters, which were used in previous works to prevent static currents. With such constraints, we formulated the problem and found that the size of its solution space has an asymptotic bound of $2^{n/(p+1)}$, where $n$ is the number of gates and $p$ is the depth of the graph. Though the solution space of the formulated problem is much smaller than that of the generic multiple-voltage scaling problem, it still grows exponentially with the number of gates.
Therefore, we tried to find a practical solution and proposed a heuristic algorithm for the formulated problem. The complexity of our heuristic algorithm is shown to be $O(n^{l-1})$. Furthermore, we implemented the heuristic algorithm in C, performed experiments on all the ISCAS85 circuits, and compared the results with those of the CVS technique. The experimental results show that the CFMV technique can reduce the power consumption by up to 33.39%. On average, 9-18% power reduction can be obtained using the CFMV technique.
In this paper, we used the maximum voltage difference allowed in the CFMV structures. In future work, we will find what the voltage differences should be to obtain maximum power reduction.
In addition, we used monotonically increasing supply voltages in the CFMV structures to have both the converter-free and the clustering features. If the clustering constraint is relaxed, it is not necessary for the voltage sequences to increase monotonically. In the future, we will also explore the solution of such a formulation.
Last but not least, if the converter-free constraint is relaxed, the problem becomes the generic one and has the largest solution space. This is a challenging research topic.
