Abstract-Power grid voltage integrity verification requires one to check if all the voltage drops on the grid are less than a certain threshold. This paper addresses the problem of correcting the grid when some voltage drops exceed this threshold, by making minor modifications to the existing design. The method uses current constraints that capture the uncertainty about the underlying circuit behavior to find the maximum voltage drop on the grid, and then to estimate the voltage drop as a function of the metal widths on the grid. It formulates a non-linear optimization problem and finds the required change in widths that reduces the maximum voltage drop below the threshold while keeping the total area cost at a minimum.
I. INTRODUCTION
As supply voltages have been reduced in nanometer chip technologies, modern integrated circuit (IC) designs have become more susceptible to supply voltage fluctuations. With lower supply voltages, smaller voltage drops become more significant and can reduce the timing performance of the chip, leading to soft errors. Thus, voltage integrity verification has become a crucial step in reliable high-speed chip design.
Power grid verification is traditionally done by simulation, which requires full knowledge of the current waveforms drawn by every circuit block attached to the grid. These waveforms would be used to simulate the grid and determine the voltage drop at every node. However, this approach requires 1) a comprehensive set of currents to be simulated, and 2) full knowledge of current waveforms which is a problem if one would like to verify the grid early in the design flow, before all the circuit details are available.
To overcome these problems, the concept of current constraints was proposed in [1]. These current constraints are a set of upper bounds on the currents that would be drawn by the underlying circuit. They can be obtained by simulations or from knowledge of the overall power dissipation of the circuit blocks. Using these constraints, a linear program (LP) is formulated to check if the voltage drop at any node exceeds a certain threshold under all possible current waveforms that satisfy the constraints. If all the nodes meet their voltage drop requirements, we call the grid a robust grid. An important advantage of the constraint-based approach over the simulation-based approach is that it can be applied early in the design process, when grid modifications can be most easily incorporated.
Once the power grid verification has been done, some nodes may be found to exceed the threshold. In this case, it is critical to find some means to fix this problem without the need to re-design the whole grid from scratch. This paper proposes a novel approach to correct a given non-robust grid by making minor changes, namely by changing the widths of metal branches on some level or levels of the grid. Previous work [2], [3] tackled the problem of determining the widths of metal branches to achieve a robust grid for a given set of currents drawn by the underlying circuit. This work, on the other hand, determines the required widths with incomplete information on the circuit currents.
This work was supported in part by the Semiconductor Research Corporation (SRC) and by the Natural Sciences and Engineering Research Council (NSERC) of Canada.
The method presented here builds on linear programming theory to find the maximum voltage drop on the grid as a function of the metal widths. Using non-linear optimization, it then finds the required change in parameters that reduces the maximum voltage drop below the threshold. In this paper, the method is restricted to "DC grids", i.e., for the case where all the currents are DC. We are working to extend this to the general case of time-varying currents.
The paper is organized as follows. In the next section, the grid model and the constraint-based approach are explained. In section III, the basic linear programming terminology that will be used to formulate the proposed method is introduced. The problem is defined in section IV, and the correction approach is formulated in section V. Finally, in sections VI and VII the experimental results and some concluding remarks are given.
II. POWER GRID VERIFICATION
The power grid model and the constraint-based voltage integrity verification method first introduced in [1] will lay the groundwork of our proposed grid correction approach. This is a vectorless method in the sense that it does not require complete information on the circuit currents. It can be performed early in the design process, when the circuit currents are not yet known.
A. The Power Grid Model
Consider an RC model of the power grid, where each branch is represented by a resistor and where there exists a capacitor from every node to ground. In addition, some nodes have ideal current sources (to ground) to represent the current drawn by the underlying circuit, and some grid nodes have ideal voltage sources (to ground) to represent the connections to the external voltage supply. Let the power grid consist of n + p nodes, where nodes 1, ..., n have no voltage sources attached, and nodes (n + 1), ..., (n + p) are the nodes where the p voltage sources are attached. Let c_k be the capacitance from node k to ground. Let i_k(t) be the current source connected to node k, where the direction of current is from the node to ground. We assume that i_k(t) ≥ 0 and that i_k(t) is defined for every node, so that nodes with no current source attached have i_k(t) = 0, ∀t. Let i(t) be the vector of all current sources i_k(t), and u(t) be the vector of all node voltages. If we apply Modified Nodal Analysis (MNA) to the grid, we have:
$$G u(t) + C \dot{u}(t) = -i(t) + G V_{dd} \tag{1}$$
where G is the n × n conductance matrix of the grid, C is the n × n diagonal capacitance matrix, and V_dd is a constant vector each entry of which is equal to the voltage source value. Let v(t) = V_dd − u(t) be the vector of voltage drops. Then, (1) can be written as:
$$G v(t) + C \dot{v}(t) = i(t) \tag{2}$$
This is a revised system equation that represents the same circuit, but with all the current sources reversed and the voltage sources set to zero. In the rest of the paper, we consider the DC version of this model, which is simply:
$$G v = i \tag{3}$$
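As a small illustration of the DC model, in which the voltage drops v solve G v = i, the sketch below builds G for a hypothetical three-node resistor chain fed from one end and solves for the drops. The chain topology, the conductance value, and the helper names `solve_linear` and `chain_conductance` are assumptions made for this example only; they are not taken from the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [A | b] so the inputs are left untouched.
    M = [[float(a) for a in A[r]] + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def chain_conductance(n, g):
    """Conductance matrix of an n-node chain (hypothetical example grid).

    Node 1 connects to the supply through conductance g, and every later
    node connects to its predecessor through conductance g.
    """
    G = [[0.0] * n for _ in range(n)]
    for k in range(n):
        G[k][k] += g              # branch toward the supply side
        if k + 1 < n:
            G[k][k] += g          # branch toward the next node
            G[k][k + 1] -= g
            G[k + 1][k] -= g
    return G


G = chain_conductance(3, 1.0)     # 1 S per branch
i = [0.0, 0.0, 1e-3]              # 1 mA drawn at the far end only
v = solve_linear(G, i)
print(v)                          # drops grow along the chain: about 1, 2, 3 mV
```

Because the whole load current flows through every branch of the chain, the drop accumulates by roughly 1 mV per branch, which is what the solve returns.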
B. Current Constraints
Local constraints are upper bounds on individual current sources. One may specify that the current i_k at node k does not exceed a certain bound, i_{L,k}. This value may be known from prior simulation or it might be the result of engineering judgment, based on the area of the cell or block. We assume that every current source tied to the grid has an upper bound associated with it, and that if a node does not have a current source attached, the upper bound for that current is 0. Collecting the bounds i_{L,k} into a vector i_L, we can express these constraints as:
$$0 \le i \le i_L \tag{4}$$
Global constraints are upper bounds on the sums of currents for groups of current sources. For example, if the total power consumption of a certain functional block is known, then an upper bound can be specified on the sum of currents drawn by all its internal sub-blocks or cells. Assuming we have a total of m global constraints, we can express them in matrix form as:
$$0 \le S i \le i_G \tag{5}$$
where S is an m × n matrix that contains only 0s and 1s, which indicate whether each node is included in a given constraint, and i_G is the vector of the upper bound values. The local and global constraints can be combined into a single inequality as follows:
$$0 \le U i \le i_m \tag{6}$$
where U is an (n + m) × n matrix whose first n rows form an identity matrix corresponding to the local constraints, and whose remaining m rows form the S matrix, and where i_m is an (n + m) × 1 vector formed by stacking the i_L and i_G vectors.
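The stacking of local and global constraints into a single matrix U and bound vector i_m can be sketched as follows. This is a minimal illustration, assuming the constraints take the form 0 ≤ U i ≤ i_m; the function names and the example numbers are hypothetical.

```python
def build_constraints(i_local, groups, i_global):
    """Return (U, i_m) for the combined constraint 0 <= U i <= i_m.

    i_local  : n per-node upper bounds (0 for nodes with no current source)
    groups   : m lists of 0-based node indices, one per global constraint
    i_global : m upper bounds on the group sums
    """
    n = len(i_local)
    identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    # Each row of S marks which nodes take part in one global constraint.
    S = [[1 if c in group else 0 for c in range(n)] for group in groups]
    return identity + S, list(i_local) + list(i_global)


def satisfies(U, i_m, i):
    """Check 0 <= U i <= i_m row by row."""
    for row, bound in zip(U, i_m):
        total = sum(u * x for u, x in zip(row, i))
        if total < 0 or total > bound:
            return False
    return True


# Three nodes; node 3 has no source; nodes 1 and 2 share a block budget.
U, i_m = build_constraints([1.0, 2.0, 0.0], [[0, 1]], [2.5])
print(satisfies(U, i_m, [1.0, 1.5, 0.0]))   # within every bound
print(satisfies(U, i_m, [1.0, 2.0, 0.0]))   # violates the shared budget
```

The second current vector respects both local bounds but exceeds the block budget, which is the kind of combination that only the global rows of U can exclude.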
C. DC Robustness
A grid is called robust if the maximum worst-case voltage drop of all nodes is less than a given threshold. Therefore, checking the robustness of a grid entails checking if the voltage drop at a node is lower than a threshold over all the possible currents that satisfy (6) . A solution for the DC problem is presented in [1] which formulates the problem so that it can be solved as an LP.
Making use of (3), we can express the DC constraints in terms of DC voltages and DC currents in the voltage domain as:
$$0 \le U G v \le i_m \tag{7}$$
Let e_k be an n × 1 vector consisting of all 0s, except that its k-th entry is 1. We can, therefore, express the DC power grid verification problem for the k-th node as:
$$\text{Maximize} \quad v_k = e_k^T v \quad \text{such that} \quad 0 \le U G v \le i_m \tag{8}$$
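To make the verification problem concrete, the sketch below computes the worst-case drop at one node for a deliberately restricted case: local bounds on every node plus a single global bound on the total current. In that special case the LP reduces to a fractional-knapsack problem, so a greedy fill is exact and no LP solver is needed. This is not the Simplex-based procedure used in the paper; the function names and the example grid are assumptions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(a) for a in A[r]] + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def worst_case_drop(G, k, i_local, i_total):
    """Largest v_k over 0 <= i <= i_local with sum(i) <= i_total.

    Special case only: local bounds plus ONE global bound over all nodes.
    """
    n = len(G)
    e_k = [1.0 if j == k else 0.0 for j in range(n)]
    # G is symmetric, so row k of G^{-1} is the solution of G w = e_k.
    # Its entries are nonnegative because G is an M-matrix.
    w = solve_linear(G, e_k)
    budget = i_total
    worst = 0.0
    # Spend the shared budget on the nodes with the largest influence first.
    for j in sorted(range(n), key=lambda j: w[j], reverse=True):
        take = min(i_local[j], budget)
        worst += w[j] * take
        budget -= take
    return worst


# Hypothetical 3-node chain with 1 S branches, fed from node 1's side.
G = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]
print(worst_case_drop(G, 2, [2e-3, 2e-3, 2e-3], 3e-3))
```

For the far-end node the influence weights are about (1, 2, 3), so the worst case places 2 mA at node 3 and the remaining 1 mA at node 2, giving a drop of about 8 mV. With several overlapping global constraints the greedy argument no longer applies and a general LP solver is required.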
III. LINEAR PROGRAMMING BASICS
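The inequality constraints in (8) can be converted into equality constraints by introducing slacks, and redefining the variables as follows:
$$A = \left[\begin{array}{cc} UG & \\ & I \\ -UG & \end{array}\right], \quad x = \begin{bmatrix} v \\ s \end{bmatrix}, \quad b = \begin{bmatrix} i_m \\ 0 \end{bmatrix}, \quad c = \begin{bmatrix} e_k \\ 0 \end{bmatrix} \tag{9}$$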
where s ≥ 0 is a (2n + 2m) × 1 vector of slack variables, I is the identity matrix of size 2n + 2m, b is a vector of size 2n + 2m, and c is a vector of size 3n + 2m. Because the source current vector is element-wise nonnegative and the conductance matrix G is an M-matrix [4], we have v = G^{-1} i ≥ 0. Using this new notation, we can write the problem in the standard LP form, as:
$$\text{Maximize} \quad v_k = c^T x \tag{10}$$
$$\text{such that} \quad A x = b \tag{11}$$
$$x \ge 0 \tag{12}$$
Using standard linear programming terminology [5], any vector x that satisfies (11) is called a solution of the LP. If it also satisfies (12), x is called a feasible solution. The set of all feasible solutions X can be expressed as:
$$X = \{ x : A x = b, \; x \ge 0 \} \tag{13}$$
There is a column a_j in A corresponding to every variable x_j. Because A contains an identity matrix of size 2n + 2m, it has rank 2n + 2m, and we can always find 2n + 2m linearly independent columns {a_{j_1}, a_{j_2}, ..., a_{j_{2n+2m}}} of A. These columns form a basis, and the corresponding variables {x_{j_1}, x_{j_2}, ..., x_{j_{2n+2m}}} are called basic variables of the LP. Given a basis, we will denote the index set of these variables by $\mathcal{B}$ = {j_1, j_2, ..., j_{2n+2m}}, and the index set of the remaining variables by $\mathcal{R}$. To simplify the notation, we can assume that the columns forming the basis are moved to the first 2n + 2m columns of A, by permutation. Therefore, we can write:
$$A = \begin{bmatrix} B & R \end{bmatrix}, \quad x = \begin{bmatrix} x_B \\ x_R \end{bmatrix}, \quad c = \begin{bmatrix} c_B \\ c_R \end{bmatrix} \tag{14}$$
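where B = A_{\mathcal{B}} and R = A_{\mathcal{R}} are the submatrices of A corresponding to the index sets $\mathcal{B}$ and $\mathcal{R}$. Using (14), we can rewrite (10) and (11) as follows:
$$v_k = c_B^T x_B + c_R^T x_R \tag{15}$$
$$B x_B + R x_R = b \tag{16}$$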
Because the columns of B are linearly independent, B^{-1} exists, and from (16), we have:
$$x_B = B^{-1} b - B^{-1} R x_R \tag{17}$$
From this, we can see that the values of the basic variables are uniquely determined by the values of the non-basic variables. A feasible solution for which x_R = 0 is said to be a basic feasible solution, and it has:
$$x_B = B^{-1} b \tag{18}$$
Theorem 1. If the LP has a feasible solution, then it also has a basic feasible solution that gives the same objective function value v k .
The proof of this theorem can be found in [5] . A feasible solution is called optimal if it solves the LP. As a corollary, if the problem (10-12) has an optimal solution, then it has an optimal basic solution, and it is enough therefore to deal with basic feasible solutions only.
Given a basis B, if, upon setting xR = 0, we get an xB ≥ 0, so that the resulting solution x is feasible, then B is said to be a feasible basis. If the resulting x is in fact an optimal solution of the LP, then B is said to be an optimal basis, and (18) gives the optimal solution of the LP.
Theorem 2. A feasible basis B is optimal if
$$d = c_R - R^T B^{-T} c_B \le 0 \tag{19}$$
The proof of this theorem is given in [5] . The Simplex Method [6] uses the above result to find the optimal solution of the LP. In this method, a starting basis is selected, and the columns of A are swapped in and out of the basis until d ≤ 0. Thus, the final basis obtained by solving the LP using Simplex Method is an optimal basis, and it satisfies the condition (19). Most solvers that implement this method return the optimal basis. Therefore, once we have solved the LP using Simplex, we can make use of the available optimal basis.
Assuming that the power grid is connected, and assuming that the local and global constraints are not (trivially) all zero, the worst-case voltage drop for any node k cannot be zero, and must be strictly positive, v_k > 0. Recall, from (9) and the definition of e_k, that c has at most one non-zero entry. Therefore, from (14), it must be the case that either c_R = 0 or c_B = 0. Clearly, if, for an optimal basis B, we have c_B = 0, then the optimal basic solution of the LP is v_k = 0, due to (15) and (18). This contradicts our assertion that the worst-case voltage drop on any node must be strictly positive. Therefore, c_B ≠ 0, and the optimal basis must be such that c_R = 0.
As a result, when the Simplex method has terminated, we must have not only (19), but also in fact:
$$y = R^T B^{-T} c_B \ge 0 \tag{20}$$
IV. PROBLEM DEFINITION
If a power grid is found to be non-robust, one would like to know how to modify it so that it becomes robust. Typically, this would involve increasing the width of metal branches on some level of the grid. Let r be the vector of parameters that can be changed. In this paper, we assume that the elements of r correspond to metal widths, so that the grid conductance matrix G = G(r) is linear in r. Other quantities, such as A, x, and v_k, are also functions of r, so that the verification LP can be restated as:
$$\text{Maximize} \quad v_k(r) = c^T x(r) \tag{21}$$
$$\text{such that} \quad A(r)\, x(r) = b \tag{22}$$
$$x(r) \ge 0 \tag{23}$$
where an initial value for r is available, which we denote as r0. We refer to the verification problem at r0 as the nominal problem, and its solution the nominal solution. In order to determine the impact of a change in r on the worstcase voltage drop, the brute-force approach would be to re-solve the verification problem at every value of r, but this is too expensive. Instead, we propose an approach where the nominal solution at r0 is used to directly find the worst case solutions in a neighborhood around it. This approach will rely on the validity of the optimal basis at r0 in a neighborhood around r0. In fact, Theorem 2, along with (23), can be used to determine the boundary of the neighborhood around r0 in which the same optimal basis remains valid, as we will see below.
V. PROPOSED SOLUTION
Suppose we identify a neighborhood around r0 in which y ≥ 0 throughout. For any r in this neighborhood, let us maintain the same basis B that was found to be optimal at r0, so that c_R = 0 is also maintained. If we can find a basic feasible solution at r, then, with c_R = 0 and y ≥ 0, this x(r) must be optimal, by Theorem 2. This can be achieved by requiring (22) to ensure that x(r) is a solution, requiring (23) to ensure that it is feasible, and setting x_R(r) = 0 to ensure that x(r) is basic. Using (15) and (16), this leads to:
$$v_k(r) = c_B^T x_B(r) \tag{24}$$
$$B(r)\, x_B(r) = b \tag{25}$$
which, as long as x_B(r) ≥ 0, gives us directly the optimal solution of the LP at r. This leads us to a revised definition of a (possibly smaller) neighborhood around r0, determined by:
$$x_B(r) \ge 0, \quad y(r) \ge 0 \tag{26}$$
Throughout this neighborhood, the solution of (24) and (25) is the optimal solution of the LP. We will call this neighborhood the safety region and the points along its boundary the breakpoints. Thus, we see that the notion of the safety region is crucial to an efficient approach, because it allows us to discover the solution of the LP efficiently, by simply solving the linear system in (25). Our approach follows from the single and multi-variable Taylor series expansions of xB and y around the initial operating point r0, which we use to determine the breakpoints. Because the conductance matrix is linear in r, then B and R are also linear in r and the second derivatives of G, B, and R with respect to r are always zero. This fact will be useful in the following sections. Furthermore, to simplify the notation, we will drop the arguments r or r0 in connection with the matrices like G, B, and R.
A. Single Parameter Variations
In this section, we assume that there is only one parameter that can be changed, i.e., r consists of a single element; it is a scalar. This makes the safety region simply an interval around r0 in one dimension.
1) Voltage Drop Estimation:
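The Taylor series expansion for x_B(r) around the nominal point r0 gives:
$$x_B(r) = \sum_{i=0}^{\infty} \frac{x_B^{(i)}(r_0)}{i!} (r - r_0)^i \tag{27}$$
where x_B^{(i)}(r0) is the i-th derivative of x_B(r), evaluated at r0. It is shown in [7] that this derivative is given by:
$$x_B^{(i)}(r_0) = (-1)^i \, i! \left( B^{-1} \frac{dB}{dr} \right)^i x_B(r_0) \tag{28}$$
for i ≥ 1. We can also write (28) as:
$$x_B^{(i)}(r_0) = -i \, B^{-1} \frac{dB}{dr} \, x_B^{(i-1)}(r_0) \tag{29}$$
Combining (27) with (24) leads to:
$$v_k(r) = c_B^T \sum_{i=0}^{\infty} \frac{x_B^{(i)}(r_0)}{i!} (r - r_0)^i \tag{30}$$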
In practice, the summation does not need to be carried out to infinity. In fact, in all test cases that we have seen, the nonlinearity is not severe, so that truncation can be done quite accurately, and we write:
$$v_k(r) \approx c_B^T \sum_{i=0}^{N} \frac{x_B^{(i)}(r_0)}{i!} (r - r_0)^i \tag{31}$$
where N is a small integer, in practice about 3. In order to make use of this result, we need to 1) have x_B(r0), which we already know from the nominal solution, 2) evaluate dB/dr at r0, which can be easily done because the dependence of G on r is well known from the construction of G from element stamps during modified nodal analysis (MNA), and 3) be able to compute the product of B^{-1} by a vector, which we can do using LU factorization. As a result, we have an easy-to-evaluate polynomial expression for the solution v_k(r) throughout the neighborhood. It remains to discover the safety region.
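The truncated series can be sketched numerically as follows. When B(r) = B0 + (r − r0)·dB is linear in r, differentiating B(r) x_B(r) = b repeatedly gives the recursion x_B^{(i)}(r0) = −i B0^{-1} dB x_B^{(i−1)}(r0), so each scaled Taylor term is obtained from the previous one by one multiplication and one solve with B0. The names `taylor_xB`, `B0`, `dB`, and `dr`, and the 2 × 2 example, are assumptions for illustration; the sketch also re-runs elimination for every term instead of reusing a single LU factorization.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(a) for a in A[r]] + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def taylor_xB(B0, dB, b, dr, order):
    """Truncated Taylor estimate of x_B(r0 + dr) for B(r) = B0 + (r - r0) * dB."""
    n = len(b)
    term = solve_linear(B0, b)            # zeroth-order term: x_B(r0)
    estimate = list(term)
    for _ in range(order):
        # Scaled term t_i = x_B^{(i)}(r0) * dr^i / i! = -dr * B0^{-1} (dB t_{i-1})
        rhs = [-dr * sum(dB[r][c] * term[c] for c in range(n)) for r in range(n)]
        term = solve_linear(B0, rhs)
        estimate = [e + t for e, t in zip(estimate, term)]
    return estimate


B0 = [[2.0, -1.0], [-1.0, 2.0]]           # hypothetical basis matrix at r0
dB = [[1.0, 0.0], [0.0, 0.0]]             # only one entry depends on r
b = [1.0, 1.0]
exact = solve_linear([[2.2, -1.0], [-1.0, 2.0]], b)   # B at dr = 0.2
print(taylor_xB(B0, dB, b, 0.2, 3), exact)
```

In this example the third-order estimate agrees with the direct solve to within about 3e-4, while the first-order estimate is off by more than 1e-2, which is consistent with a small N being sufficient when the nonlinearity is mild.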
2) Safety Region Estimation:
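In order to discover the safety region, we will develop an expression for y(r). We start by writing the Taylor series expansion for:
$$y(r) = R^T(r) \, B^{-T}(r) \, c_B \tag{32}$$
as we did for x_B(r), which gives:
$$y(r) = \sum_{i=0}^{\infty} \frac{y^{(i)}(r_0)}{i!} (r - r_0)^i \tag{33}$$
We define:
$$\pi(r) = B^{-T}(r) \, c_B \tag{34}$$
so that
$$y(r) = R^T(r) \, \pi(r) \tag{35}$$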
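Claim. For i ≥ 1, the i-th derivative of y(r) with respect to r is given by:
$$y^{(i)}(r) = R^T \pi^{(i)}(r) + i \left( \frac{dR}{dr} \right)^T \pi^{(i-1)}(r) \tag{36}$$
Proof: We will prove this claim by induction. The base case, for i = 1, is trivially true due to (35), which gives:
$$y^{(1)}(r) = R^T \pi^{(1)}(r) + \left( \frac{dR}{dr} \right)^T \pi(r) \tag{37}$$
Now, assuming the claim is true for i − 1, then:
$$y^{(i-1)}(r) = R^T \pi^{(i-1)}(r) + (i-1) \left( \frac{dR}{dr} \right)^T \pi^{(i-2)}(r) \tag{38}$$
Let us take the derivative of this equation, recalling that the second derivative of R with respect to r is zero:
$$y^{(i)}(r) = R^T \pi^{(i)}(r) + \left( \frac{dR}{dr} \right)^T \pi^{(i-1)}(r) + (i-1) \left( \frac{dR}{dr} \right)^T \pi^{(i-1)}(r) = R^T \pi^{(i)}(r) + i \left( \frac{dR}{dr} \right)^T \pi^{(i-1)}(r) \tag{39}$$
which is the desired result. As was done for x_B, we can write the i-th derivative of π(r) with respect to r at r0 as:
$$\pi^{(i)}(r_0) = -i \, B^{-T} \left( \frac{dB}{dr} \right)^T \pi^{(i-1)}(r_0) \tag{40}$$
If we insert this in (36) at r0 and reorder, we get:
$$y^{(i)}(r_0) = i \left[ \left( \frac{dR}{dr} \right)^T - R^T B^{-T} \left( \frac{dB}{dr} \right)^T \right] \pi^{(i-1)}(r_0) \tag{41}$$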
In (27) and (33) we model the behavior of x_B(r) and y(r) based on their Taylor series expansions around r0. In practice these Taylor series can be safely truncated at an appropriate order, say N. Since we require all the entries of these two vectors to remain nonnegative, we can equate the resulting N-th order polynomials to zero and find their roots, i.e., ∀j:
$$\sum_{i=0}^{N} \frac{x_{Bj}^{(i)}(r_0)}{i!} (r - r_0)^i = 0 \tag{42}$$
$$\sum_{i=0}^{N} \frac{y_j^{(i)}(r_0)}{i!} (r - r_0)^i = 0 \tag{43}$$
From this, the smallest root to the right of r0 and the largest root to the left of r0 determine the breakpoints. Between these two breakpoints, we are guaranteed that y(r) ≥ 0 and xB(r) ≥ 0.
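A minimal sketch of the breakpoint search to the right of r0 is given below. Each entry of x_B and y is represented by the coefficients of its truncated Taylor polynomial in t = r − r0 (lowest order first), and the breakpoint is the smallest positive t at which any of them crosses zero. The sign-scan followed by bisection is a simple stand-in chosen for this example, not the root finder used in the paper; it can miss a double root or two roots that fall inside one scan interval, and `t_max` and `samples` are assumed tuning parameters.

```python
def poly_eval(coeffs, t):
    """Evaluate sum(coeffs[i] * t**i) by Horner's rule."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * t + c
    return value


def first_positive_root(coeffs, t_max, samples=1000):
    """Smallest t in (0, t_max] where the polynomial reaches zero, or None."""
    lo, f_lo = 0.0, poly_eval(coeffs, 0.0)
    for s in range(1, samples + 1):
        hi = t_max * s / samples
        f_hi = poly_eval(coeffs, hi)
        if f_hi == 0.0:
            return hi
        if f_lo * f_hi < 0.0:
            # A sign change brackets a root; refine it by bisection.
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = poly_eval(coeffs, mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            return 0.5 * (lo + hi)
        lo, f_lo = hi, f_hi
    return None


def right_breakpoint(polys, t_max):
    """Smallest positive root over all entries, or None if none lies in range."""
    roots = [first_positive_root(p, t_max) for p in polys]
    roots = [t for t in roots if t is not None]
    return min(roots) if roots else None


# Hypothetical truncated polynomials for two entries of x_B and two of y.
polys = [
    [1.0, -2.0],        # 1 - 2t, root at t = 0.5
    [1.0, 1.0],         # 1 + t, stays positive
    [0.3, -1.0, 1.0],   # no real root
    [0.2, -1.0, 0.5],   # roots at t = 1 -/+ sqrt(0.6)
]
print(right_breakpoint(polys, 1.0))
```

Here the last polynomial reaches zero first, at t = 1 − sqrt(0.6), roughly 0.225, so that value bounds the safety interval on the right. The breakpoint to the left of r0 can be found the same way after substituting t → −t in each polynomial.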
B. Multi-Parameter Variations
In this section we will generalize our findings to the case of a vector of parameters. This corresponds to changing the widths of different metal branches, possibly by different amounts.
1) Voltage Drop Estimation:
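We can write the multivariable Taylor series expansion for x_B(r) around r0 using multi-index notation (see Appendix 1) as:
$$x_B(r) = \sum_{|\alpha| \ge 0} \frac{\partial^{\alpha} x_B(r_0)}{\alpha!} (r - r_0)^{\alpha} \tag{44}$$
Similar to (29) in the single parameter case, ∂^α x_B(r) is given by (see Appendix 2 for proof):
$$\partial^{\alpha} x_B(r_0) = -\sum_{\beta} i_{\beta} \, B^{-1} \left( \partial^{\beta} B \right) \partial^{\alpha - \beta} x_B(r_0) \tag{45}$$
where the sum is over the multi-indices β of norm 1 with β ≤ α, and the scalar constant i_β is the entry of α at the position of the nonzero entry of β. For example, let us take α = {α_1, α_2, ..., α_p}. For β = {1, 0, ..., 0}, i_β is α_1, for β = {0, 1, ..., 0}, i_β is α_2, and so on. Using (24), we can express the voltage drop in the safety region as:
$$v_k(r) = c_B^T \sum_{|\alpha| \ge 0} \frac{\partial^{\alpha} x_B(r_0)}{\alpha!} (r - r_0)^{\alpha} \tag{46}$$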
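2) Safety Region Estimation:
Let us write the multivariable Taylor series expansion for y(r) = R^T π(r) around the initial point r0 as:
$$y(r) = \sum_{|\alpha| \ge 0} \frac{\partial^{\alpha} y(r_0)}{\alpha!} (r - r_0)^{\alpha} \tag{47}$$
Similar to (36), ∂^α y(r0) is given by:
$$\partial^{\alpha} y(r_0) = R^T \partial^{\alpha} \pi(r_0) + \sum_{\beta} i_{\beta} \left( \partial^{\beta} R \right)^T \partial^{\alpha - \beta} \pi(r_0) \tag{48}$$
where the previous definition of i_β is valid (see Appendix 2 for proof). And ∂^α π(r) is given by:
$$\partial^{\alpha} \pi(r_0) = -\sum_{\beta} i_{\beta} \, B^{-T} \left( \partial^{\beta} B \right)^T \partial^{\alpha - \beta} \pi(r_0) \tag{49}$$
which has the same form as (45). As a result, the safety region is defined by the values of r for which all the elements of the vectors defined in (44) and (47) stay nonnegative. In practice, these expressions can be truncated at an order N. Therefore, at the breakpoints, at least one entry, j, of one of these vectors satisfies:
$$\sum_{|\alpha| \le N} \frac{\partial^{\alpha} x_{Bj}(r_0)}{\alpha!} (r - r_0)^{\alpha} = 0 \tag{50}$$
$$\sum_{|\alpha| \le N} \frac{\partial^{\alpha} y_j(r_0)}{\alpha!} (r - r_0)^{\alpha} = 0 \tag{51}$$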
These are N th order polynomials in r. For example, in the case when N = 3, and say, there are two parameters under consideration, i.e. r = (r1, r2) and r0 = (r10, r20), (50) and (51) will have the form:
where the m's are constants that depend on the partial derivatives of x_{Bj}(r) and y_j(r). The breakpoint in a given direction is given by the intersection of these equations and the direction vector. For example, if the direction vector is r = (u_1 t, u_2 t) where u_1^2 + u_2^2 = 1, (52) will be a third-order polynomial in t:
where the ns are constants that depend on the ms in (52). We can easily solve these equations for all entries in xB and y to find the smallest t corresponding to the breakpoint.
C. Nonlinear Optimization
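Once we have parametric equations for the voltage drop and the breakpoints, we can construct a nonlinear optimization problem to minimize the voltage drop in the safety region:
$$\text{Minimize} \quad v_k(r) = c_B^T x_B(r) \quad \text{such that} \quad x_B(r) \ge 0, \; y(r) \ge 0$$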
where xB(r) and y(r) are given by (27) and (33) in the single parameter case, and (44) and (47) in the multiparameter case by truncating them up to an order N . A number of nonlinear optimization algorithms are given in [8] . We used the steepest descent line search method with cubic interpolation to determine the step length in our implementations.
Note that the safety region represents the region where the optimal basis (found at the nominal solution) remains optimal. If the algorithm reaches a breakpoint as the minimizer, we can re-solve the LP given by (21-23) to determine the new optimal basis and restate the functions to reduce the voltage drop in the new safety region.
Algorithm 1 describes the reduction of the worst-case voltage drop at a given node. It starts with the solution of the nominal linear program to determine the optimal basis. Using this basis, the expressions for the voltage drop estimation and the safety region are computed. Then, the nonlinear optimization algorithm is employed. First, the search direction is found. Then, the maximum step length that can be taken is calculated by finding the breakpoint in that direction. Using the maximum step length, an appropriate step length that reduces the voltage drop is computed. If a step length that reduces the voltage drop cannot be found, the algorithm exits. Using the step length and the direction, the parameter values are updated and the new voltage drop is calculated. If the maximum step is taken, the LP is re-solved to get the optimal basis in the new region. This procedure is repeated until the voltage drop on the node under consideration is less than the threshold. It remains to check that all node voltages are also now below the threshold.
Algorithm 1 (partial listing)
3: Solve the LP given in (21-23) using Simplex at r = r_req and get the optimal basis
4: Find the expression for v_k(r) using (31) or (46)
5: Find the safety region expressions using (27) and (33), or (44)
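The step-length selection inside the safety region can be sketched as below. The paper uses a line search with cubic interpolation; this sketch swaps in plain backtracking (halving from the maximum step) purely for brevity. The function name `choose_step`, the callable `f` (the estimated voltage drop along the search direction as a function of the step t), and the `shrink` and `min_step` parameters are assumptions.

```python
def choose_step(f, t_max, shrink=0.5, min_step=1e-9):
    """Pick a step t in (0, t_max] with f(t) < f(0).

    Returns (t, at_breakpoint). at_breakpoint is True when the full step to
    the breakpoint was taken, which signals that the LP should be re-solved
    to obtain the optimal basis of the next region.
    Returns (None, False) when no reducing step is found.
    """
    f0 = f(0.0)
    t = t_max
    while t >= min_step:
        if f(t) < f0:
            return t, t == t_max
        t *= shrink               # backtrack toward the current point
    return None, False


# Drop estimate with a minimum inside the safety region.
print(choose_step(lambda t: (t - 0.3) ** 2, 1.0))   # (0.5, False)
# Drop estimate that keeps decreasing up to the breakpoint at t = 0.4.
print(choose_step(lambda t: 1.0 - t, 0.4))          # (0.4, True)
# Drop estimate that increases in this direction: no acceptable step.
print(choose_step(lambda t: 1.0 + t, 1.0))          # (None, False)
```

The three calls correspond to the three outcomes described above: an interior step, a full step that lands on the breakpoint, and an exit because no step reduces the voltage drop.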
D. Top Level Algorithm
Recall that a robust grid refers to a grid in which the highest voltage drop among all nodes is less than a given threshold value. However, Algorithm 1 was focused on a single node. Algorithm 2 describes the overall procedure. It starts with identification of the maximum worst-case voltage drop on the grid and the offending node. If it is less than the threshold, the grid is deemed safe. If not, the parameter values that bring the maximum voltage drop on that node below the threshold are found using Algorithm 1. Then, the grid is re-verified using the new parameter values to check if any other nodes exceed the threshold, and the procedure is repeated until the maximum voltage drop on the grid is less than the threshold. In all cases that we tested, we found that no other nodes ever exceeded the threshold once the initial maximum node voltage drop was reduced below the threshold. Thus, a single run of Algorithm 1 was enough in all cases and, therefore, there was no practical benefit in re-verifying the whole grid after Algorithm 1 had terminated.
Algorithm 2 (partial listing)
4: Verify all nodes at r = r_f to identify the node, k, with the maximum worst-case voltage drop, v_max
5: if (v_max ≤ v_th) then
6: grid_unsafe = FALSE
7: else
8: Find the required change in parameter, r_req, to reduce the maximum voltage drop at k below v_th using Algorithm 1
9: r_f = r_req
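The top-level procedure can be sketched as a short loop. The two callbacks stand in for the full-grid verification and for the single-node correction of Algorithm 1; their names and signatures, as well as the `max_rounds` safeguard, are assumptions introduced for this example and are not part of the paper.

```python
def correct_grid(r0, v_th, verify_all, correct_node, max_rounds=10):
    """Verify the grid, fix the worst node, and re-verify until robust.

    verify_all(r)            -> (k, v_max): worst node and its worst-case drop
    correct_node(r, k, v_th) -> new parameter vector for which node k meets v_th
    Returns (r, robust), where robust is False if max_rounds was exhausted.
    """
    r = list(r0)
    for _ in range(max_rounds):
        k, v_max = verify_all(r)
        if v_max <= v_th:
            return r, True            # every node is below the threshold
        r = correct_node(r, k, v_th)  # fix the currently worst node
    return r, False


# Toy stand-ins: one width scale; the drop is inversely proportional to it.
def toy_verify(r):
    return 0, 0.12 / r[0]


def toy_correct(r, k, v_th):
    return [r[0] * 1.25]


print(correct_grid([1.0], 0.1, toy_verify, toy_correct))   # ([1.25], True)
```

In the toy run the initial drop of 0.12 exceeds the 0.1 threshold, one correction scales the width by 1.25, and the re-verification then reports 0.096, so the loop stops after a single correction.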
VI. EXPERIMENTAL RESULTS
The grid correction algorithm was implemented in C++. The parameters were taken to be the widths of metal lines on different layers. A number of test grids were generated based on user specifications, including grid dimensions, metal layers (M1-M9), pitch and width per layer, and current source distribution. For the full grid verification, the DC equivalent of the algorithm in [9] was used. The computations were carried out on a 64-bit Linux machine with 8GB memory.
As it stands, this algorithm can suggest changes of width that may be extremely small, such as an increase of 1% or less on many parameters. For practical reasons, it makes more sense to change the width only by some significant increment, say a 5% or 10% increase, and not to bother with very small increases. We can accommodate this easily by rounding the values returned by the algorithm to, say, the nearest 5% or 10% setting. Our implementation includes this optional rounding step, and, as a sanity check, we apply a final check that the voltage drop at this node is still below the threshold before leaving Algorithm 1. In all test cases that we have run, no further work was required, and the node voltage drop always remained below the threshold.
We tested the multi-parameter variations on a number of grids, varying the number of parameters between 6 and 18. The maximum voltage drops on the grids before and after correction are given in Table I. We also show the number of Simplex solutions required for Algorithm 1, the runtime of individual node correction, and the total runtime of Algorithm 2. The individual node correction time is the time it takes for Algorithm 1 to reduce the voltage drop on the node with the initial maximum voltage drop. We see that the grids were corrected in a reasonable amount of time.
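The optional rounding step can be sketched as follows, assuming each width is expressed as a scale factor relative to its nominal value (1.0 means unchanged). The function name and the default 5% step are assumptions. Because rounding to the nearest setting can also round a width down, the result should still be re-checked against the threshold, as described above.

```python
def round_width_scale(scale, step=0.05):
    """Round a relative width scale to the nearest multiple of `step` above 1.0.

    Uses Python's built-in round(), which resolves exact ties to the even
    multiple.
    """
    increase = scale - 1.0
    return 1.0 + step * round(increase / step)


for s in (1.004, 1.07, 1.08):
    print(s, "->", round(round_width_scale(s), 4))
# 1.004 -> 1.0   (a sub-1% change is dropped)
# 1.07  -> 1.05
# 1.08  -> 1.1
```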
The total runtime takes into account the full grid verifications done before the correction to determine the node with the maximum voltage drop and after the correction to see if any other voltage drop requires reduction. It can be improved with faster grid verification techniques. Table II gives the required changes in grid parameters to reduce the maximum voltage drops on the grid below the threshold. We can see that, in some cases, only some parameters may need to be changed, which means the voltage drop is more sensitive to those parameters. It can be seen that the required area increase is modest, generally below 10% for the test cases that we tested. In the absence of an approach like ours, the only option available today for correcting the grid is, perhaps, to simply increase the metal widths everywhere. It is clear that, generally, an increase in metal width everywhere will fix the grid. However, with no further guidance as to which metal widths should be increased, the required overall area increase can be substantial. We have tested this, for the grids under study, by increasing all metal line widths until the grid becomes robust, and the overall area increase is reported in the last column of Table II, "Fixed area cost". The overall metal area increase is large, ranging from 15% to 20%, and this area overhead can seriously complicate signal line routing. Thus, the value of our approach is that it allows one to selectively and intelligently increase metal width to achieve grid robustness.
In Figure 1, the correlation plot of the voltage drops before and after the correction for the 38,101-node grid is given. We see that the algorithm successfully reduced all the voltage drops exceeding the threshold. Finally, the histograms of the voltage drops before and after the correction are given in Figure 2.
VII. CONCLUSION
Voltage drop on the power grid is a key concern in the design of modern integrated circuits. Verification is traditionally done by comparing the voltage drops on the grid with a certain threshold that guarantees reliable circuit performance. In this work, we propose a novel method to correct the grid with minimal change when some nodes exceed the given threshold. It builds on linear programming and the current constraint-based verification approach, and formulates a non-linear optimization problem to find the required change in metal widths that reduces the maximum voltage drop on the grid below the threshold. We only considered DC currents in this work; correction in the case of transient currents is part of ongoing research.
Similarly, the partial differential operator: Next, we give the multivariable Taylor series expansion using multi-index notation:
Here, x and x0 are vectors of length n and the sum is over all norms of multi-indices of length n. The Taylor series expansion of order N sums the terms up to the multi-indices of norm N . For illustration we will "unpack" Taylor series expansion of order 3, in the case when n = 2. Let x = (u, v), x0 = (u0, v0), and 
