Considering the voltage drop constraint over a distributed model of the power/ground (P/G) network, we study two problems in the physical synthesis of sleep transistors: the min-area sleep transistor insertion (and sizing) (TIS) problem with respect to a fixed P/G network, and the simultaneous sleep transistor insertion and P/G network sizing (TIPGS) problem, which minimizes the weighted area of sleep transistors and the P/G network. We show that multiple sleep transistor insertion solutions may all lead to the same minimum area in both the TIS and TIPGS problems. We develop optimal algorithms for the TIS and TIPGS problems by modeling the circuit as a single current source, and then extend them to the case where the circuit is modeled as distributed current sources. Compared with the best known approach, our algorithms reduce area by up to 44.1% and 61.3% for TIS and TIPGS, respectively.
INTRODUCTION
Leakage power has gained increasing importance as VLSI technology advances into the deep-submicron era. The key to sizing sleep transistors includes 1) characterization of the switching current and 2) physical design of the sleep transistors. Most existing work studies the characterization of switching current. Some recent papers have also studied the synthesis of sleep transistors. In [2], the discharging pattern of the switching current is exploited to save sleep transistor area. In [3], circuits are divided into clusters and each cluster is connected to a sleep transistor. To reduce the size of sleep transistors, techniques such as bin-packing and set-partitioning are employed to reduce the simultaneous switching current in the clusters. In [4], a mesh of distributed sleep transistors is proposed to save sleep transistor area by taking advantage of the discharge balancing property of the switching current. In addition, [5] employs a distributed P/G model and proposes two design styles for laying out sleep transistors: in one style, they are inserted between each row of standard cells and the P/G network; in the other, they form an external ring between all gates and the external power supply pins. However, all of the above work assumes ideal or fixed P/G networks, and there is no automatic method to simultaneously optimize sleep transistors and P/G networks.
In this paper, we develop automatic physical synthesis of sleep transistors with a distributed P/G model. Specifically, we study two problems: the sleep transistor insertion (and sizing) (TIS) problem with a fixed P/G network, and the simultaneous sleep transistor insertion and P/G network sizing (TIPGS) problem, which sizes both the sleep transistors and the P/G network wires. The rest of the paper is organized as follows. We present modeling and problem formulations in Section 2, and solve the TIS and TIPGS problems in Sections 3 and 4, respectively. We present the experimental results in Section 5 and conclude the paper in Section 6. The proofs of all lemmas and theorems are included in a technical report [6].
MODELING AND PROBLEM FORMULATIONS
We summarize the notations frequently used in this paper in Table 1 . 
Switching current model
The switching current of gates is time-variant and depends on the input of the circuit. To reduce complexity, it has been modeled as a time-invariant variable in [7] [8] [9]. In this paper, we model the switching current as a time-invariant maximum current and will extend to a time-variant current model in the future.
P/G network model
P/G networks include power networks and ground networks. A power network can be transformed into a ground network by reversing the directions of the currents. Therefore, in this paper we consider only the ground network, without loss of generality.
The P/G network is modeled as an adjoint multi-port resistive network with one common terminal, the ground (GND). The resistance of a P/G branch is
$$R_p = \rho_p \frac{L_p}{W_p}$$
where ρp, Lp and Wp are the sheet resistance, length, and width of the P/G branch, respectively. We illustrate the modeling of the P/G network in Fig. 2. As shown in the figure, gates are modeled as current sources and connect to the P/G network through tapping points (TP). P/G branches are modeled as resistors.
Figure 2: An example of P/G network modeling, showing GND pins and TP.
A resistive network can be represented as a graph Γ(V, B), where V is the vertex set and B is the branch set. Of particular interest are special subsets of B called cut-sets, defined as follows.
A cut-set C ⊆ B is a set of branches such that removing all branches in C disconnects the network, but removing any proper subset of C keeps the network connected. Among all cut-sets, those that disconnect all TP from the power supply pins are called TP cut-sets and denoted as $C_{TP}$ (see Fig. 2 for an example).
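The cut-set definition can be checked mechanically on a small graph. The following is a minimal sketch, not part of the original formulation: the function names, the edge-list representation, and the four-node example network are all assumptions made for illustration, and all ground pins are merged into a single node `gnd`, following the common-terminal model.

```python
from collections import deque

def component_of(start, edges, removed):
    """Vertices reachable from `start` once the branches in `removed` are deleted."""
    adj = {}
    for idx, (u, v) in enumerate(edges):
        if idx not in removed:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def is_connected(vertices, edges, removed):
    vertices = set(vertices)
    return component_of(next(iter(vertices)), edges, removed) == vertices

def is_cut_set(vertices, edges, cut):
    """Removing `cut` disconnects the graph; removing any proper subset does not."""
    cut = set(cut)
    if is_connected(vertices, edges, cut):
        return False
    # Connectivity is monotone in the removed set, so it suffices to test
    # the subsets obtained by putting back a single branch.
    return all(is_connected(vertices, edges, cut - {e}) for e in cut)

def is_tp_cut_set(vertices, edges, cut, tps, gnd):
    """A cut-set that leaves no tapping point on the ground side."""
    if not is_cut_set(vertices, edges, cut):
        return False
    return not (component_of(gnd, edges, set(cut)) & set(tps))

# Hypothetical example: one tapping point, two internal nodes, one ground node.
VERTICES = ["tp", "a", "b", "gnd"]
EDGES = [("tp", "a"), ("tp", "b"), ("a", "gnd"), ("b", "gnd"), ("a", "b")]

for cut in ({0, 1}, {2, 3}, {0, 3, 4}, {0, 1, 4}):
    print(sorted(cut), is_tp_cut_set(VERTICES, EDGES, cut, ["tp"], "gnd"))
```

In this example the branch set {0, 1, 4} separates the tapping point from ground but is rejected, because its proper subset {0, 1} already disconnects the network.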
Sleep transistor insertion and sizing
We formulate the sleep transistor insertion problem as follows.
Formulation 1. Given a fixed P/G network Γ(V, B), the min-area sleep transistor insertion (and sizing) problem (TIS) finds a set of branches C ⊆ B in which to insert sleep transistors with minimum area such that all paths between TP and power pins are interrupted and the voltage drop constraints are satisfied.
Theorem 1. The optimal solution to the TIS problem must be a $C_{TP}$.
A $C_{TP}$ divides V into two disjoint subsets, with all TP in one set V1 and all external power pins in the other set V2. Although the net current flows from V1 to V2, the current directions in particular branches of the $C_{TP}$ could differ. Intuitively, non-uniform current directions in a $C_{TP}$ result in a larger sleep transistor area for the given voltage drop constraints. Therefore, we only consider a $C_{TP}$ with a uniform current direction from V1 to V2. This kind of $C_{TP}$ is denoted as $\vec{C}_{TP}$ in the following.
Simultaneous sleep transistor insertion and P/G network sizing
Under a constant voltage drop constraint, increasing the area of the sleep transistors leads to a smaller voltage drop across them, which allows us to save area on the P/G network, and vice versa. In this sense, the areas of the P/G network and the sleep transistors are exchangeable. This area exchangeability can be used to reduce the total chip area. For example, in a design with a small number of metal layers, the routing area may be the bottleneck that determines the size of the chip. In this case, budgeting a relatively large area for sleep transistors but a small routing area for the P/G network can reduce the total chip area.
To provide a smooth trade-off between the area of P/G network and that of sleep transistors, we formulate the simultaneous sleep transistor insertion and P/G network sizing problem as follows:
Formulation 2. Simultaneous sleep transistor insertion and P/G network sizing (TIPGS): Given the P/G network topology and voltage drop constraint, the TIPGS problem finds a $\vec{C}_{TP}$ in which to insert sleep transistors and determines the sizes of the sleep transistors and P/G branches such that αAp + βAs is minimized, where α and β are given constants, and Ap and As are the areas of the P/G network and sleep transistors, respectively.
TIS PROPERTIES AND ALGORITHMS
We first solve TIS on a Single Source Network (SSN), where all gates are modeled as a single current source, and then extend the solution to a Multiple Source Network (MSN), where gates are modeled as distributed current sources.
Single source network
SSN falls into the category of one-port two-terminal resistive networks, as shown in Fig. 3. The two terminals are TP and ground (GND). In this network, the driving-point impedance is defined as
$$R = \frac{V}{I}$$
where V and I are the voltage and current between TP and GND, respectively. In this network, TP is a single node and we have:
Lemma 1. For an arbitrary $\vec{C}_{TP} = \{c_1, c_2, \ldots, c_k\}$, if the resistance of the resistor in each branch ci increases by ∆ri > 0, we have
where ∆R is the increase of the driving-point impedance.
Lemma 2. For an arbitrary $\vec{C}_{TP} = \{c_1, c_2, \ldots, c_k\}$, if the current on P/G branch ci is ii and $\sum_{i=1}^{k} 1/\Delta r_i$ is given, the following conditions minimize ∆V on TP (the increase of voltage after increasing the resistance):
Lemma 3. All the sleep transistors have the same voltage drop in an optimal TIS solution.
Lemmas 1, 2 and 3 reveal the following solution to TIS in SSN.
leads to an optimal solution for TIS, where ii is the current in ci, V is the voltage constraint on TP, and Vp is the voltage on TP before the insertion of sleep transistors.
Theorem 3. Any $\vec{C}_{TP}$ leads to an optimal solution of TIS with the same area.
Note that Theorems 2 and 3 solve TIS optimally and indicate that the optimal solution of TIS is not unique. This design freedom could be used to optimize for other design constraints such as routing congestion.
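The non-uniqueness stated by Theorem 3 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the sizing rule ∆ri = (V − Vp)/ii is our reading of Lemma 3 (equal voltage drop across all sleep transistors), the area of a sleep transistor is assumed proportional to its conductance 1/∆ri, and the small network, its resistances, and all function names are hypothetical.

```python
def solve_voltages(nodes, branches, sources, gnd="gnd"):
    """Nodal analysis. branches: (u, v, resistance); sources: {node: injected current}."""
    unknown = [name for name in nodes if name != gnd]
    idx = {name: k for k, name in enumerate(unknown)}
    size = len(unknown)
    G = [[0.0] * size for _ in range(size)]
    rhs = [sources.get(name, 0.0) for name in unknown]
    for u, v, r in branches:
        for p, q in ((u, v), (v, u)):
            if p == gnd:
                continue
            G[idx[p]][idx[p]] += 1.0 / r
            if q != gnd:
                G[idx[p]][idx[q]] -= 1.0 / r
    # Gaussian elimination with partial pivoting, then back substitution.
    for c in range(size):
        piv = max(range(c, size), key=lambda row: abs(G[row][c]))
        G[c], G[piv] = G[piv], G[c]
        rhs[c], rhs[piv] = rhs[piv], rhs[c]
        for row in range(c + 1, size):
            f = G[row][c] / G[c][c]
            rhs[row] -= f * rhs[c]
            for k in range(c, size):
                G[row][k] -= f * G[c][k]
    x = [0.0] * size
    for c in reversed(range(size)):
        tail = sum(G[c][k] * x[k] for k in range(c + 1, size))
        x[c] = (rhs[c] - tail) / G[c][c]
    volt = {name: x[idx[name]] for name in unknown}
    volt[gnd] = 0.0
    return volt

def insert_sleep_transistors(nodes, branches, sources, cut, v_limit, tp="tp"):
    """Add a series resistance to every cut branch so that TP rises exactly to v_limit.

    Assumes the cut has a uniform current direction and no zero-current branch.
    """
    volt = solve_voltages(nodes, branches, sources)
    dv = v_limit - volt[tp]                 # voltage budget left for sleep transistors
    sized, total_conductance = list(branches), 0.0
    for k in cut:
        u, v, r = branches[k]
        i_k = abs(volt[u] - volt[v]) / r    # branch current before insertion
        dr = dv / i_k                       # assumed rule: equal voltage drop dv
        sized[k] = (u, v, r + dr)
        total_conductance += 1.0 / dr       # area proxy (assumption)
    return sized, total_conductance

# Hypothetical single-source network (arbitrary consistent units).
NODES = ["tp", "a", "b", "gnd"]
BRANCHES = [("tp", "a", 1.0), ("tp", "b", 1.0), ("a", "gnd", 1.0),
            ("b", "gnd", 3.0), ("a", "b", 2.0)]
SOURCES = {"tp": 1.0}
V_LIMIT = 1.5

results = {}
for cut in ((0, 1), (2, 3)):
    sized, g_total = insert_sleep_transistors(NODES, BRANCHES, SOURCES, cut, V_LIMIT)
    results[cut] = (solve_voltages(NODES, sized, SOURCES)["tp"], g_total)
    print(cut, results[cut])
```

Both cut-sets bring the TP voltage to the limit with the same total conductance, which equals I/(V − Vp) in this sketch. The branch currents are unchanged by the insertion because every cut branch receives the same additional voltage drop.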
Multiple source network
where Ii is the current source placed between terminal i and GND, and ∆Vi is the increase of voltage at terminal i.
TIS in MSN can be solved based on Hypothesis 1. By Hypothesis 1, we have
where V is the voltage drop constraint on TP, Ii is the current on TPi, vp,i is the voltage on TPi with no sleep transistors inserted, and rs,i is the resistance of sleep transistors. Similar to Theorem 2 in SSN, we have
The right-hand side of (7) is the lower bound on the area of sleep transistors in MSN. One solution that achieves the minimum area is to find a separable $\vec{C}_{TP}$, which is defined as follows.
such that 1) For any
is a $\vec{C}_{TP}$ for TPi.
One way to obtain a separable $\vec{C}_{TP}$ is to use all P/G branches directly connected to a current source as $\vec{C}_{TP}^{(i)}$. In summary, an algorithm is described in Fig. 5. Hypothesis 1 will be verified experimentally in Section 5.
Figure 5: Hypo1-based algorithm for TIS in MSN: for each tapping point, and for each branch in its $\vec{C}_{TP}$, insert a sleep transistor.
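The nested loop of the algorithm in Fig. 5 can be sketched as follows. This is a hedged illustration rather than the algorithm as published: the sizing rule r = (V − vp,i)/i, the dictionary layout, and the numeric values are assumptions, chosen so that every tapping point ends exactly at the voltage drop constraint when the cut-set of each TPi consists of the branches directly connected to it.

```python
def size_sleep_transistors_msn(tp_data, v_limit):
    """Return {branch: sleep transistor resistance} for a separable cut-set.

    tp_data maps each tapping point to its voltage without sleep transistors
    ("v_p") and to the currents of the branches directly connected to it
    ("branch_currents"). Both are assumed to come from a prior network solve.
    """
    sizes = {}
    for tp, info in tp_data.items():            # for each tapping point
        dv = v_limit - info["v_p"]              # remaining voltage budget at this TP
        if dv <= 0:
            raise ValueError(f"{tp} already violates the voltage drop constraint")
        for branch, current in info["branch_currents"].items():   # for each cut branch
            sizes[branch] = dv / current        # assumed rule: equal drop dv per branch
    return sizes

# Hypothetical pre-solved data for two tapping points (arbitrary consistent units).
TP_DATA = {
    "tp1": {"v_p": 0.8, "branch_currents": {"b1": 0.3, "b2": 0.2}},
    "tp2": {"v_p": 1.0, "branch_currents": {"b3": 0.4}},
}
V_LIMIT = 1.2

sizes = size_sleep_transistors_msn(TP_DATA, V_LIMIT)
total_conductance = sum(1.0 / r for r in sizes.values())   # area proxy (assumption)
print(sizes, total_conductance)
```

With this rule the total conductance reduces to the sum over tapping points of Ii/(V − vp,i), where Ii is the sum of the branch currents at TPi.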
TIPGS PROPERTIES AND ALGORITHMS
As in Section 3, we first solve TIPGS in SSN and then extend the solution to MSN in this section.
Single source network
Let Ap be the area of the P/G network, we have
To solve the TIPGS problem for SSN, we first introduce the following lemmas.
Lemma 4. In a min-area P/G network satisfying voltage drop constraint V at tapping points, the product of the P/G area $A_p^*$ and V is a constant. We define the constant product as
$$K_p^* = A_p^* \cdot V$$
Lemma 4 indicates that $A_p^*$ is inversely proportional to Vp and shows that the optimal sizing solution under a voltage drop constraint Vp,1 can be extended to another voltage drop constraint Vp,2 by scaling branches with the ratio Vp,1/Vp,2. Similar to Lemma 4, we have the following lemma for sleep transistors.
Lemma 5. For a given P/G network, assume that sleep transistors inserted at an arbitrary $\vec{C}_{TP}$ have a voltage drop equal to or below Vs. The product of the minimum sleep transistor area $A_s^*$ and Vs is a constant. We define the constant product as
$$K_s^* = A_s^* \cdot V_s$$
Lemma 5 indicates the same property for sleep transistors as Lemma 4 does for the P/G network.
Lemma 6. Given the voltage drop constraint on the TP in SSN as V, we have
In other words, (11) provides a lower bound on the weighted area of P/G network and sleep transistors.
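The scaling property behind Lemma 4 can be illustrated on a small example. The sketch below uses a hypothetical series chain of P/G branches with the branch resistance ρp·Lp/Wp; the function names and numbers are assumptions. Widening every branch by the ratio Vp,1/Vp,2 scales the voltage drop to Vp,2 and leaves the product of area and voltage drop unchanged.

```python
def chain_voltage(current, rho, branches):
    """Voltage drop of a series chain; each branch is (length, width)."""
    return current * sum(rho * length / width for length, width in branches)

def chain_area(branches):
    """Metal area of the chain: sum of length * width."""
    return sum(length * width for length, width in branches)

def rescale_widths(branches, v_old, v_new):
    """Scale all widths by v_old / v_new to move from constraint v_old to v_new."""
    ratio = v_old / v_new
    return [(length, width * ratio) for length, width in branches]

# Hypothetical values: sheet resistance in ohm/square, sizes in um, current in A.
RHO, CURRENT = 0.05, 0.1
BRANCHES = [(100.0, 2.0), (200.0, 4.0)]

v1 = chain_voltage(CURRENT, RHO, BRANCHES)
scaled = rescale_widths(BRANCHES, v1, v1 / 2.0)      # halve the voltage drop
v2 = chain_voltage(CURRENT, RHO, scaled)
print(v1, chain_area(BRANCHES), v1 * chain_area(BRANCHES))
print(v2, chain_area(scaled), v2 * chain_area(scaled))
```

The same argument carries over to any resistive network driven by fixed current sources, since scaling all widths by a common factor scales every node voltage by its inverse.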
With a total voltage drop V over P/G network and sleep transistors, we denote the voltage drop constraint on sleep transistors as Vs and the voltage drop constraint on P/G network by removing sleep transistors as Vp.
Theorem 4. In an optimal TIPGS solution, Vs and Vp must be
and
respectively.
Note that TP is a single node in TIPGS.
Theorem 5. Inserting sleep transistors at any $\vec{C}_{TP}$ leads to optimal TIPGS solutions with the same weighted sum of P/G network and sleep transistor area.
Theorem 4 is a necessary condition to minimize the weighted sum of P/G network and sleep transistor area. To make it sufficient, additionally we need to 1) optimally size P/G network to minimize Ap under the voltage drop constraint Vp determined by (13) 
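The trade-off between Vs and Vp can be explored numerically. The sketch below is a derivation under stated assumptions and is not taken from the theorem itself: it assumes Vp + Vs = V, uses Lemmas 4 and 5 to write the weighted area as α·Kp/Vp + β·Ks/Vs, and minimizes that expression in closed form. The function names and numeric values are hypothetical.

```python
import math

def weighted_area(v_p, v_total, alpha, beta, k_p, k_s):
    """alpha*Ap + beta*As when Ap = k_p / v_p and As = k_s / (v_total - v_p)."""
    return alpha * k_p / v_p + beta * k_s / (v_total - v_p)

def split_voltage(v_total, alpha, beta, k_p, k_s):
    """Closed-form minimizer of weighted_area over 0 < v_p < v_total.

    Setting the derivative to zero gives v_p : v_s = sqrt(alpha*k_p) : sqrt(beta*k_s).
    """
    a = math.sqrt(alpha * k_p)
    b = math.sqrt(beta * k_s)
    v_p = v_total * a / (a + b)
    v_s = v_total * b / (a + b)
    best = (a + b) ** 2 / v_total      # weighted area at the optimum
    return v_p, v_s, best

# Hypothetical constants (arbitrary consistent units).
V_TOTAL, ALPHA, BETA, K_P, K_S = 0.1, 1.0, 1.0, 4.0, 1.0

v_p, v_s, best = split_voltage(V_TOTAL, ALPHA, BETA, K_P, K_S)
grid_min = min(weighted_area(V_TOTAL * k / 1000.0, V_TOTAL, ALPHA, BETA, K_P, K_S)
               for k in range(1, 1000))
print(v_p, v_s, best, grid_min)
```

A brute-force scan over Vp is included only as a sanity check on the closed form; under these assumptions no grid point yields a smaller weighted area than the closed-form value.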
Multiple source network
Similar to SSN, $K_p^*$ and $K_s^*$ can be defined for MSN. Then, the counterpart of Lemma 6 is presented as follows.
Hypothesis 2. Given the voltage drop constraint on TP in MSN as V, we have
In other words, (14) provides a lower bound on the weighted area of P/G network and sleep transistors in MSN.
If Hypothesis 2 holds, Theorems 4 and 5 hold for MSN, too. Therefore, a TIPGS algorithm for MSN can be developed as in Fig. 6. However, no algorithm has been proposed in the literature to optimally size the P/G network (step 2 in Fig. 6). Nevertheless, we can construct the best available algorithm to minimize αAp + βAs based on the best known algorithm for sizing the P/G network.
EXPERIMENT
In this section, we first verify Hypotheses 1 and 2 by experiments, and then compare the Hypo1-based TIS algorithm in Fig. 5 and the Hypo2-based TIPGS algorithm in Fig. 6 with alternative algorithms based on sequential linear programming.
Verification of Hypothesis 1
For the purpose of verifying Hypothesis 1, we define effective area ratio (EAR) as
where Ii, ∆Vi, and ∆ri are the same as in Hypothesis 1. If Hypothesis 1 holds, we have
To verify Hypothesis 1, we compute the EAR for nine mesh networks, as shown in Table 2, under 100,000 random solutions. For each solution, the values of the current sources, the $\vec{C}_{TP}$, and the sizes of the sleep transistors are randomly chosen, and the EAR is obtained by solving the networks with a linear solver integrated in SIS1.2 [10]. We report the computed EAR in column 4 of Table 2.
Table 2: Random solutions (100,000×) to compute the maximum EAR.
Column 4 of Table 2 shows that the maximum EAR values in all networks are equal to or less than 1. This means that the TIS solution produced by the algorithm in Fig. 5 has the smallest area among all of these 100,000 random solutions, which strongly supports Hypothesis 1.
Verification of Hypothesis 2
To verify Hypothesis 2, we define effective area ratio as
If Hypothesis 2 holds, we have
We compute the EAR for TIPGS in the same fashion as for TIS. For each circuit, we generate 100,000 random solutions to find the maximum EAR. However, in TIPGS, $K_p^*$ and $K_s^*$ are needed to compute the EAR. According to Lemma 4,
Since $A_p^*$ is unavailable in the experiments, we approximate
where S represents the set of all solutions. $K_s^*$ is computed by
Ii.
We report the computed EAR in column 5 of Table 2. According to column 5 of Table 2, the maximum EAR is always less than or equal to 1 among the 100,000 random solutions for all networks. This strongly supports Hypothesis 2.
Comparison between algorithms for TIS and TIPGS
Table 3: Comparison between SLP-based and Hypo1-based algorithms for TIS. Columns: Circuit, # Block, # GND, SLP-based As (%), Hypo1-based As (%).
Algorithms
We have revised the sequential linear programming algorithm proposed in [7] to solve TIPGS (denoted as the SLP-based algorithm) as a comparison baseline. The sequential linear programming algorithm in [7] is used to size the P/G network, where each branch of the P/G network is modeled as a resistor. Because sleep transistors are also modeled as resistors, we are able to modify [7] to size both the P/G network and the sleep transistors simultaneously (see [11] for details of the algorithm).
In fact, the SLP-based algorithm provides a comparison baseline for both the Hypo1-based algorithm for TIS and the Hypo2-based algorithm for TIPGS. The Hypo1-based algorithm follows the exact steps in Fig. 5. The Hypo2-based algorithm follows the steps in Fig. 6 with minor modifications. Because there is no optimal algorithm available to minimize Ap, we employ the SLP-based algorithm to obtain the "optimal" P/G network under given voltage drop constraints.
For all algorithms in the experiments, we have chosen the same separable $\vec{C}_{TP}$, the one directly adjacent to the tapping points. Theorems 3 and 5 indicate that all $\vec{C}_{TP}$ have the same optimal value for both TIS and TIPGS, but experimental results have shown that this $\vec{C}_{TP}$ produces a relatively good result for the SLP-based algorithm. Therefore, the experimental setting is favorable to the SLP-based algorithm.
Results
The SLP-based, Hypo1-based, and Hypo2-based algorithms have been applied to NCSU benchmarks [12]. Switching current is modeled as time-invariant and the current density is 300 mA/mm², which is similar to that of the Alpha microprocessor in [13]. We assume a P/G pitch of 50 µm and present Ap and As as percentages of chip area.
Table 4: Columns: Circuit, # Block, # GND, SLP-based (%), Hypo2-based (%).
To compare the SLP-based algorithm with the Hypo1-based algorithm, we first apply the SLP-based algorithm to find the sizes of the P/G network branches and of the sleep transistors. Then, we fix the sizes of the P/G network branches and resize the sleep transistors using the Hypo1-based algorithm. We compare the total area of sleep transistors obtained by the SLP-based algorithm and the Hypo1-based algorithm in Table 3. For the TIS problem, we found that the Hypo1-based algorithm is consistently better than the SLP-based algorithm and can reduce the transistor area by up to 44.1%. As shown in Table 4, for the TIPGS problem, the Hypo2-based algorithm reduces the total area significantly (by up to 61.3%) with α and β both set to 1.0.
Discussion
It is observed in our experiments that the SLP-based algorithm usually terminates when only one TP reaches the voltage drop constraint V. The voltage drop slack on the other TP leads to extra P/G network and/or sleep transistor area. From Figs. 5 and 6, one can see that in the Hypo1-based and Hypo2-based algorithms, the voltage drops on all TP are uniformly equal to the voltage drop constraint, which leads to significant area reduction.
DISCUSSION AND CONCLUSION
Under a distributed P/G network model, we have studied the sleep transistor insertion (and sizing) problem (TIS) and the simultaneous sleep transistor insertion and P/G sizing problem (TIPGS). We have developed effective algorithms to solve these two problems by revealing their optimal solutions. Compared with the best known approach using sequential linear programming, our algorithms reduce area by up to 44.1% and 61.3% for TIS and TIPGS, respectively. Our TIS and TIPGS algorithms are also extremely efficient, as all steps are based on closed-form formulas. We have shown that there exist multiple optimal solutions to these problems, which offer design freedom to consider other design constraints such as routing congestion.
The time-invariant current model is assumed in this paper. In the future, we intend to extend our problem formulations and algorithms to a time-variant current model.
