Verifying an RC model of the power grid requires checking whether the steady state voltage drops on all the nodes of the grid do not exceed a certain threshold. We propose an approach to correct the grid, in case some voltage drops violate the threshold condition, by making minor changes to the original design. Previous work in [1] addressed the DC model of the grid; this paper deals with the transient model. Rather than directly reducing the steady state voltage drops below the threshold, we work on reducing the first time step voltage drops. The method uses the current constraints proposed in [2] to find the first time step voltage drop whose distance to the corresponding threshold is the largest. It then estimates this voltage drop as a function of the metal widths on the grid. A nonlinear optimization problem is then formulated to determine the metal line width changes that reduce the first time step voltage drops by a sufficient amount. Reducing the first time step voltage drop by that amount makes the steady state voltage drops of all the nodes less than the threshold.
INTRODUCTION
With technology scaling, the supply and threshold voltages are decreasing. Hence, voltage drops on the power grid become more significant and can result in longer circuit delays, leading to soft errors. Checking the integrity of the voltage on the power grid has become crucial in reliable chip design.
Power grid verification via traditional circuit simulation requires full knowledge of the current waveforms drawn by every circuit block attached to the grid. Once these waveforms are known, the grid is simulated and the voltage drop at every node is determined. This voltage drop is then compared to a threshold to check if any node on the grid is unsafe. Verifying the grid using this approach requires the simulation of a comprehensive set of currents and full knowledge of the circuit. The latter is problematic when one would like to do the verification early in the design flow, before all the circuit details are available, which is typically the case.
* This work was supported by Advanced Micro Devices (AMD).
A new approach was proposed in [2] to deal with these problems. It is based on the current constraints concept to capture the uncertainty about the circuit details and circuit behavior. These current constraints can be obtained from simulations of the circuit or from the knowledge of the overall power dissipation of the circuit blocks. They are a set of upper bounds on the currents that would be drawn by the underlying circuit. A linear program (LP) is then formulated using these constraints, to check if the voltage drop at any of the nodes exceeds a certain threshold under all possible current waveforms that satisfy the constraints. In case none of the nodes violates its voltage drop requirement, the grid is said to be robust. Unlike the simulation based approach, the current constraints approach allows the verification to be done early in the design process when grid modifications can be most easily incorporated.
Once the grid verification is completed, some nodes may violate the voltage threshold. It then becomes useful to make minor changes to the grid so that it becomes safe, without having to redo the whole design from scratch. A method has been presented in [1] to correct an R model of the grid. This work presents an efficient method to correct an RC model of the grid. Rather than directly reducing the steady state voltage drops below the threshold, we work with the first time step voltage drops. This is possible, and in fact much easier, because of the relation between the upper bound of the steady state voltage drop and the first time step voltage drop. Reducing the first time step voltage drop also reduces the steady state voltage drop, as we will see in this paper. The grid is corrected by making minor changes to the widths of some metal branches.
The method presented here uses linear programming concepts to express the first time step voltage drops as a function of the width parameters. It then uses a nonlinear optimization method to find the width changes that cause a sufficient reduction in the first time step voltage drop. The steady state voltage drops are consequently reduced. This paper is organized as follows. In sections 2 and 3, the grid model and the constraint-based approach are explained. The concept of transient robustness is then defined in section 4. In section 5, the basic linear programming terminology that will be used to formulate the proposed method is introduced. The formulation of our problem is then presented in section 6, the correction approach is given in section 7, and the top level algorithm is described in section 8. Finally, in sections 9 and 10, the experimental results and some concluding remarks are given.
POWER GRID MODEL
We give a brief review of the RC model of the power grid, where each branch is represented by a resistor and where there exists a capacitor from every node to ground. In addition, some nodes have current sources (to ground) that represent the current drawn by the underlying circuit, and some grid nodes have voltage sources (to ground) that represent the external voltage supply.
Let the power grid consist of n + p nodes, where nodes 1, . . . , n have no voltage sources attached, and nodes (n + 1), . . . , (n + p) are the nodes where the p voltage sources are connected. Let c k be the capacitance from every node k to ground. Let i k (t) be the current source connected to node k, where the direction of current is from the node to ground. We assume that i k (t) ≥ 0 and that i k (t) is defined for every node so that the nodes which have no current source attached have i k (t) = 0, ∀t. Let i(t) be the vector of all current sources i k (t), and u(t) be the vector of all node voltages. If we apply Modified Nodal Analysis (MNA) to the grid, we have [3] :
where G is the n × n conductance matrix of the grid, C is the n × n diagonal capacitance matrix, and V dd is a constant vector each entry of which is equal to the supply voltage value. Let v(t) = V dd − u(t) be the vector of voltage drops. Then, (1) can be written as:
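As a sketch of the step from (1) to the voltage-drop form, the derivation below assumes the usual MNA equation for an RC grid, with the supply term written as G V dd. The exact right-hand side of (1) is an assumption here; only the resulting equation in v(t) is used in the rest of the paper.

```latex
\begin{align*}
% Assumed form of (1): node voltages u(t), supply term written as G V_dd
G\,u(t) + C\,\dot{u}(t) &= -\,i(t) + G\,V_{dd} \\
% Substitute u(t) = V_dd - v(t), so that \dot{u}(t) = -\dot{v}(t)
G\,\bigl(V_{dd} - v(t)\bigr) - C\,\dot{v}(t) &= -\,i(t) + G\,V_{dd} \\
% The supply terms cancel, leaving the voltage-drop form
G\,v(t) + C\,\dot{v}(t) &= i(t)
\end{align*}
```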
CURRENT CONSTRAINTS
As already stated, we deal with circuit current uncertainties by using the current constraints approach proposed in [2] , which we now review briefly. We distinguish two types of current constraints, local constraints and global constraints.
Local constraints are upper bounds on individual current sources. For example, one may specify that the peak value of the current i k at node k is less than a certain bound, i L,k . We obtain this value from prior simulations of the block, from power budgets, or from engineering judgment, based on the area of the cell or block. We assume that every current source tied to the grid has an upper bound associated with it, so that if a node does not have a current source attached, the upper bound for that current is 0. We can express these constraints as:
Global constraints are upper bounds on the sums of certain subsets of current sources. For example, if the total power consumption of a certain functional block is known, then an upper bound can be specified on the sum of currents drawn by all its internal sub-blocks or cells.
Assuming we have a total number of m global constraints, then we can express them in matrix form as:
where S is an m × n matrix that contains only 0s and 1s, which indicate if that node is included in the constraint or not, and i G is the vector of the upper bound values. The local and global constraints can be combined into a single inequality as follows:
where U is an (n + m) × n matrix whose first n rows form an identity matrix corresponding to the local constraints, and whose remaining m rows form the S matrix, and where im is a (n+m)×1 vector which is the combination of the vectors i L and i G .
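The stacking of local and global constraints can be illustrated with a minimal sketch. It assumes the combined inequality has the form 0 ≤ U i ≤ im; the function names and the three-node example data are hypothetical and serve only as illustration.

```python
def build_constraints(S, i_local, i_global):
    """Stack an n x n identity (local constraints) on top of the
    m x n 0/1 selection matrix S (global constraints)."""
    n = len(i_local)
    identity = [[1 if row == col else 0 for col in range(n)] for row in range(n)]
    U = identity + [list(row) for row in S]
    i_m = list(i_local) + list(i_global)
    return U, i_m


def satisfies_constraints(i, U, i_m):
    """Return True if the current vector i satisfies 0 <= U i <= i_m."""
    if any(value < 0 for value in i):
        return False
    for row, bound in zip(U, i_m):
        if sum(coeff * value for coeff, value in zip(row, i)) > bound:
            return False
    return True


# Hypothetical example: 3 nodes, one global constraint on nodes 1 and 2.
S = [[1, 1, 0]]
U, i_m = build_constraints(S, i_local=[1.0, 1.0, 2.0], i_global=[1.5])
```

In this example the vector [1.0, 1.0, 0.0] meets every local bound but violates the global one, which is the kind of waveform the global constraints are meant to exclude.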
TRANSIENT ROBUSTNESS
Applying a time-discretization to (2) leads to:
As in previous work on power grid verification [4] , the problem is formulated by assuming the grid had zero current stimulus for all time t ≤ 0 and then monitoring the solution of the grid under all possible feasible currents as t → ∞. With zero initial currents, v(0) = 0, and at the first time step, t = Δt, (6) leads to:
where
can be shown to be a symmetric positive definite M -matrix [5] , for which A −1 ≥ 0 so that v(Δt) ≥ 0. In general, at t = pΔt we have [4] :
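For orientation, the relations referred to above can be sketched as follows, assuming a backward Euler discretization of G v(t) + C dv/dt = i(t). The choice of scheme is an assumption, made because it yields a matrix A with the properties stated in the text.

```latex
\begin{align*}
% Backward Euler step of G v + C \dot{v} = i
\Bigl(G + \tfrac{C}{\Delta t}\Bigr) v(t) &= i(t) + \tfrac{C}{\Delta t}\, v(t - \Delta t) \\
% First time step, with v(0) = 0
A\, v(\Delta t) &= i(\Delta t), \qquad A = G + \tfrac{C}{\Delta t} \\
% Unrolling the recurrence down to v(0) = 0
v(p\,\Delta t) &= \sum_{q=0}^{p-1} \Bigl(A^{-1}\tfrac{C}{\Delta t}\Bigr)^{q} A^{-1}\, i\bigl((p-q)\,\Delta t\bigr)
\end{align*}
```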
In the rest of the paper, we use the term steady state voltage drop to refer to the value of the worst-case voltage drop when p tends to infinity. We are now ready to define transient robustness. A grid is said to be robust if, for every node k, the steady state voltage drop at k over all possible currents satisfying (5) is less than a given threshold v th . Let v ub (pΔt) be the vector of upper bounds on the peak voltage drop at time point pΔt, defined in [6] . In particular, we will denote v ub (Δt) by Va. From [6] , we have that, as p tends to infinity, v ub (pΔt) converges to:
for all Δt ≥ 0. Hence, in this paper we consider that a grid is robust if the following inequality is satisfied:
where V th is a vector whose every entry is equal to v th . Using [2] , the k th entry Va k of Va can be expressed as the linear program (LP):
where e k is an n × 1 vector of all 0's except that its k th entry is 1, and the constraints are obtained using (5) and (7). 
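A small numerical sketch may help fix ideas. It covers only the special case of local constraints (no global constraints): because A −1 ≥ 0, the LP maximum is then attained at i = i L for every k, so Va = A −1 i L without any LP solver. The two-node grid values, the backward Euler update, and all names are assumptions chosen for illustration.

```python
def solve2(M, b):
    """Solve a 2x2 linear system M x = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


# Hypothetical 2-node grid: conductance matrix G, diagonal of C / dt, local bounds i_L.
G = [[3.0, -1.0], [-1.0, 2.0]]
c_over_dt = [2.0, 2.0]
i_L = [1.0, 2.0]

# A = G + C / dt (assumed backward Euler system matrix).
A = [[G[0][0] + c_over_dt[0], G[0][1]],
     [G[1][0], G[1][1] + c_over_dt[1]]]

# Worst-case first time step drop with local constraints only: Va = A^{-1} i_L.
Va = solve2(A, i_L)

# Keep applying the maximum current; the drop grows towards its steady state value.
v = [0.0, 0.0]
for _ in range(200):
    rhs = [i_L[j] + c_over_dt[j] * v[j] for j in range(2)]
    v = solve2(A, rhs)

# DC solution G v = i_L, which the time-stepping above converges to.
v_dc = solve2(G, i_L)
```

The first time step drop Va is smaller than the steady state drop at every node, which is why a separate, tighter threshold is needed when working at the first time step.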
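Claim: Va ≤ Va th , where Va th is the first time step threshold vector (which depends on G, C, and Δt), results in V∞ ≤ V th . To prove that the converse of the statement is not true, we will consider the following counter example where: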
LINEAR PROGRAMMING
We solve the LP problem presented in (12)-(13) using a similar approach to the one given in [1] , which we now briefly review. The inequality constraints in (13) are converted to equality constraints by introducing slacks:
where s ≥ 0 is a (2n + 2m) × 1 vector of slack variables, I is the identity matrix of size 2n + 2m, b is a vector of size 2n + 2m, and c is a vector of size 3n + 2m. The problem can be written in standard LP form as:
The set of all feasible solutions X [7] is defined by:
D has a rank of at least 2n + 2m because it includes an identity matrix of size 2n + 2m. Hence, there are at least 2n + 2m linearly independent columns {d 
Using (19), rewrite (15) and (16) as:
Because B has full rank, then B −1 exists, and we have:
A basic feasible solution is a feasible solution for which
If the LP has a feasible solution, then it also has a basic feasible solution that gives the same objective function value v k (Δt). The reader is referred to [7] for a detailed proof. So, if a problem has an optimal solution then it also has an optimal basic feasible solution and it is enough to deal with the basic feasible solutions only.
A feasible basis B is optimal if:
The Simplex Method [8] uses this criterion to find the optimal solution of the LP. When the Simplex Method terminates, we must have, not only (24), but also:
because, for our problem, c R = 0 [1].
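The quantities used in this section can be summarized in a short sketch. It assumes the LP is written as a maximization of c T x subject to D x = b, x ≥ 0, with the columns of D partitioned into a basis B and the remaining columns R; this is textbook simplex bookkeeping rather than a transcription of the numbered equations.

```latex
\begin{align*}
% Partitioned equality constraints
B\,x_B + R\,x_R = b \quad &\Rightarrow\quad x_B = B^{-1} b - B^{-1} R\, x_R \\
% Basic feasible solution: non-basic variables set to zero
x_R = 0 \quad &\Rightarrow\quad x_B = B^{-1} b \ge 0 \\
% Optimality of the basis B for a maximization problem
\pi = B^{-T} c_B, \qquad y &= R^{T}\pi - c_R \ge 0 \\
% With c_R = 0, as in this problem
c_R = 0 \quad &\Rightarrow\quad y = R^{T}\pi \ge 0
\end{align*}
```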
GRID CORRECTION
If the grid is found to be unsafe, our goal is to make minor changes to the design so that it becomes safe. Let r be the vector of parameters that can be modified, typically the widths of metal branches in the grid. In our case, we consider that the grid may span several metal layers, each of which may be composed of several regions, and each region has metal lines of uniform width value r i . Given this, G(r) is linear in r because a change in the width of any branch has a directly proportional effect on the conductance of that branch. The diagonal capacitance matrix C also depends on r. Each entry C ii (r i ) is the sum of the capacitance due to the grid metal branches and other fixed capacitance due to the circuit MOSFETs and other sources, which we denote by C ii (0). Using the simple parallel plate capacitance model, C ii (r i ) is linear in r i and we can write:
where ε ox is the permittivity of the metal-oxide, l is the length of the branch, and d is the thickness of the dielectric. In addition, D, x, and v k are functions of r. Our LP problem formulated in (15)-(17) can hence be rewritten as:
Solving this LP corresponds to finding the k th entry Va k (r) of the vector Va(r) defined earlier.
Recall from section 4 that Va th is a function of G and C, hence Va th is also a function of r and it can be expressed as:
To determine the effect of changing r on the worst-case steady state voltage drop, the brute-force approach would be to solve the LP (27)-(29) at every value of r, for all values of k, which gives Va(r), multiply Va(r) by the matrix that relates Va to V∞ in (10), and then check if V∞(r) ≤ V th . However, this is too expensive. Instead, we propose a much more efficient approach based on the claim presented earlier, as follows.
PROPOSED SOLUTION
Due to the claim of section 4, if we are able to reduce Va(r) below Va th (r), then we are guaranteed that V∞(r) is less than V th and hence that the grid is robust. However, this might lead to an overestimation of the width changes needed to fix the grid, because the condition is only sufficient but not necessary, and thus even if Va is not less than Va th , the grid may already be safe. What we propose, therefore, is to use nonlinear optimization to reduce Va k (r) as an objective function, but also to occasionally compute V∞(r) and check if V∞(r) ≤ V th . The next two subsections will describe, respectively, the computation of Va k (r) and the nonlinear optimization loop around it.
First Time Step
An expensive brute-force approach to evaluate the objective function Va k (r) would be to solve the LP at every given value of r. Instead, we borrow an approach from [1] , which turns out to require far fewer solutions of the LP. To briefly summarize this approach, suppose we have solved the LP at some nominal point r 0 . It can be shown that, in a neighborhood around this value of r 0 , defined by:
the solution of the LP (27), i.e. Va k (r), can be obtained by solving the linear system:
This neighborhood is referred to as the safety region and the points along its boundary as the breakpoints. This approach is efficient because solving the linear system is much faster than solving the LP. In fact, we will see below that it is possible to estimate Va k (r) inside the safety region using an even faster approach based on Taylor series expansion, which works very well in practice, as we will observe later on. If, in modifying r, we reach the boundary of the safety region, then a new LP is formulated and solved, a new safety region is discovered, and the algorithm continues modifying r and reducing Va k (r).
In order to identify the safety region and the breakpoints, we start by finding the multi-variable Taylor series expansions of x B and y around the initial operating point r 0 . Because the conductance and capacitance matrices are linear in r, A is linear in r, and hence B and R are also linear in r and the second derivatives of A, B, and R with respect to r are zero. Furthermore, to simplify the notation, we will drop the arguments r or r 0 in connection with matrices like A, B, and R.
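The idea of replacing repeated solves by a Taylor estimate can be sketched for a single parameter r. The sketch assumes x B (r) solves B(r) x B = b with B(r) = B0 + r B1 (linear in r), so that the first derivative is −B −1 B1 x B . The 2 × 2 matrices and all names are hypothetical, and only the first-order term is kept.

```python
def solve2(M, b):
    """Solve a 2x2 linear system M x = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def matvec2(M, x):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]


# Hypothetical basis matrix, linear in the single parameter r: B(r) = B0 + r * B1.
B0 = [[4.0, 1.0], [1.0, 3.0]]
B1 = [[1.0, 0.0], [0.0, 2.0]]
b = [1.0, 2.0]


def exact_xB(r):
    """Reference value: rebuild B(r) and solve the linear system."""
    B = [[B0[i][j] + r * B1[i][j] for j in range(2)] for i in range(2)]
    return solve2(B, b)


# Expansion point r0 = 0: value and first derivative dx/dr = -B0^{-1} B1 x0.
x0 = exact_xB(0.0)
dx = [-d for d in solve2(B0, matvec2(B1, x0))]


def taylor_xB(r):
    """First-order Taylor estimate of x_B(r) around r0 = 0."""
    return [x0[j] + r * dx[j] for j in range(2)]
```

For a small step such as r = 0.01, the estimate agrees with the exact solve to within an error that is second order in r, and it only reuses the factorization at the expansion point.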
First Time Step Voltage Drop Estimation
As in [1] , the multi-variable Taylor series expansion for x B (r) around r 0 can be written using multi-index notation [9] as:
where,
and the scalar constant i β is the value of the nonzero index β in α. For example, let us take α = {α 1 , α 2 , . . . , αp}. For β = {1, 0, . . . , 0}, i β is α 1 , for β = {0, 1, . . . , 0}, i β is α 2 , and so on. Using (32), we can express the worst-case first time step voltage drop in the safety region as:
Safety Region Estimation
As in [1] , the multi-variable Taylor series expansion for y(r) = R T π(r) (we define π(r) = B −T (r)c B in (25)) around the initial point r 0 can be written as:
where ∂ α y(r 0 ) is given by:
and where the previous definition of i β is valid, and ∂ α π(r) is given by:
which has the same form as (35). As a result, the safety region is defined by the values of r for which all the elements of the vectors defined in (34) and (37) remain nonnegative. In practice, these expressions can be truncated up to an order N . At the breakpoints, at least one entry, j, of one of these vectors must satisfy: where the m's are parameters that depend on the partial derivatives of x Bj (r) and y j (r). The breakpoint in a given direction is given by the intersection of these equations and the direction vector (which will be explained below). For example, if the direction vector is r = (u 1 t, u 2 t) where u 2 1 + u 2 2 = 1 and t is a scalar variable, then (42) would be a third order polynomial in t:
where the n's are parameters that depend on the m's in (42). We can easily solve these equations for all entries in x B and y to find the smallest t corresponding to the breakpoint.
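The breakpoint search along a direction can be sketched in its simplest form. The sketch assumes the expansions are truncated at first order (N = 1), so that each entry of x B and y is linear in t and the breakpoint reduces to a ratio test; higher-order truncations, such as the cubic case mentioned above, would need a polynomial root finder instead. The function name and inputs are hypothetical.

```python
import math


def first_order_breakpoint(values, slopes):
    """Smallest t > 0 at which some entry values[j] + t * slopes[j] reaches zero.

    values: entries of x_B and y at the current point (assumed nonnegative).
    slopes: their directional derivatives along the search direction.
    Returns math.inf if no entry decreases along this direction.
    """
    t_break = math.inf
    for value, slope in zip(values, slopes):
        if slope < 0:
            # This entry decreases and hits zero at t = -value / slope.
            t_break = min(t_break, -value / slope)
    return t_break
```

The returned value plays the role of the maximum step length in the search direction before the optimal basis may change.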
Nonlinear Optimization
Inside a safety region, and in order to reduce Va k (r), we make use of the nonlinear optimization problem:
where x B (r) and y(r) are given by (34) and (37). A number of nonlinear optimization algorithms are available to solve such problems [10] . We use the steepest descent line search method, with cubic interpolation, which leads to our node correction algorithm summarized in Algorithm 1. In the remainder of this section, we will describe how this algorithm works. 
12: recompute Va th (rreq),
13: Using Taylor expansion with r = rreq, find Va k (rreq) = v k (Δt, rreq).
14: if (λ = λmax) then max step taken = TRUE.
As a first step, at line 3, we find the optimal basis using the solution of the nominal linear program at a given r. We then compute the Taylor expressions for the first time step voltage drop and the safety region, using this basis. The search direction is then found using the steepest descent line search method. The maximum step length that can be taken is calculated by finding the breakpoint in that direction. Having the maximum step length, we then compute an appropriate step length that reduces the objective function.
Step length computation is done using the cubic interpolation method [10] . Because the first time step threshold vector Va th is not a constant and varies with r, it is re-evaluated after every step. The cost involved in this computation is due to the LU factorizations. However, in all test cases, we observed that no more than 35 LU factorizations were needed and the cost was only moderate. Using the step length and the direction, the parameter values are updated and the new Va k (r) is computed. If the maximum step is taken (a breakpoint is reached), the LP is re-solved to get the optimal basis in the new region.
While doing all the above steps, we track the difference between Va k (r) and Va th k (r). When this difference decreases by 10%, we check if V∞(r) ≤ V th based on (10) . Finding V∞(r) effectively checks the voltage drop at all the grid nodes, not just at node k. If this check succeeds, then the grid is robust and the algorithm terminates, returning with the appropriate flag setting to the Top Level Algorithm (Algorithm 2) which causes that algorithm to perform no additional work and exit. Note that it is important to keep the number of V∞ checks somewhat small, because each check is as expensive as a full grid verification which can be costly.
However, it is possible for the sufficient condition in the claim of section 4 to be met before a V∞ check is performed. In this case, we know that node k is safe, but we do not know about the safety of other nodes. So, the algorithm exits and returns to the Top Level Algorithm with a different setting of the flag, which then checks if all nodes have indeed been corrected.
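The rule for deciding when to run a V∞ check can be sketched as follows. The class and method names are hypothetical; the 10% figure is the one used in the text, and each later check is triggered relative to the gap recorded at the previous check.

```python
class SteadyStateCheckSchedule:
    """Trigger a steady state check each time the gap Va_k - Va_th_k has
    dropped by a given fraction since the last check (or since the start)."""

    def __init__(self, initial_gap, fraction=0.10):
        self.reference_gap = initial_gap
        self.fraction = fraction

    def should_check(self, gap):
        """Return True, and reset the reference, when the gap has dropped enough."""
        if gap <= (1.0 - self.fraction) * self.reference_gap:
            self.reference_gap = gap
            return True
        return False
```

Keeping the reference gap inside the object makes it easy to keep the number of expensive full grid verifications small.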
TOP LEVEL ALGORITHM
Our overall approach, which uses the node correction as a subroutine, is given in Algorithm 2 and works as follows. It starts with a grid verification for the given r, and finds V∞(r). If V∞(r) ≤ V th , then the grid is safe. Otherwise, it finds Va th (r) and identifies the node k such that Va k −Va th k is the largest. The required parameter values that result in Va k less than Va th k or V∞ less than V th are found using Algorithm 1. In the first case, the grid is re-verified using the new parameter values to check if any other nodes exceed the threshold and the above procedure is repeated until V∞ ≤ V th . In the second case, the verification was already done in Algorithm 1 and we therefore exit the Top Level Algorithm. In all our test cases, Algorithm 1 was called only once and we did not need to re-verify all the nodes in the Top Level Algorithm which was hence executed only once.
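The control flow described above can be sketched as a loop. This is a hedged outline rather than the paper's Algorithm 2: the callables verify_grid, first_step_drops, first_step_threshold, and correct_node are hypothetical placeholders for the full grid verification, the computation of Va and Va th , and the node correction routine, and max_rounds is an added safeguard.

```python
def top_level_correction(r, v_th, verify_grid, first_step_drops,
                         first_step_threshold, correct_node, max_rounds=10):
    """Repeat: verify the grid, pick the worst node, correct it.

    verify_grid(r)          -> list of steady state drops V_inf(r)
    first_step_drops(r)     -> list Va(r)
    first_step_threshold(r) -> list Va_th(r)
    correct_node(r, k)      -> (new_r, grid_verified), where grid_verified is
                               True if a V_inf check already passed inside
    Returns (r, safe).
    """
    for _ in range(max_rounds):
        v_inf = verify_grid(r)
        if all(drop <= v_th for drop in v_inf):
            return r, True  # grid is robust, nothing left to do
        va = first_step_drops(r)
        va_th = first_step_threshold(r)
        # Node whose first time step drop exceeds its threshold the most.
        k = max(range(len(va)), key=lambda j: va[j] - va_th[j])
        r, grid_verified = correct_node(r, k)
        if grid_verified:
            return r, True  # verification already done during correction
    return r, False
```

When correct_node reports that only node k was fixed, the loop re-verifies the whole grid on the next pass, which mirrors the first case described above.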
EXPERIMENTAL RESULTS
We implemented the grid correction algorithm in C++. Our algorithm was tested on a set of grids generated according to user specifications of metal layers (M1-M9), pitch and width per layer, current source distribution, and grid dimensions. The computations were performed on a 64-bit Linux machine with 24 GB memory. Some of the width changes returned by the algorithm are very small, such as a 1% increase or less on many parameters, and the largest width changes are less than 16%.
In our tests, we varied the number of parameters between 10 and 20. Table 1 shows the percentage width increase for each of the 10 parameters used so that the steady state voltage drops on all the nodes are below the threshold. Table 2 gives the time needed to correct a single node and the total CPU time of the algorithm. The latter includes the full grid verification done before the correction algorithm, to pick the node whose first time step voltage drop peak is farthest from its corresponding threshold, and the time needed to perform the steady state check. Therefore, at least 2 full grid verifications are required. In all our tests, no more than 2 full grid verifications were needed. The total run time of the algorithm is larger than that of the correction of a single node because of the full grid verifications, which are highly parallelizable operations. Hence, the total run time can be reduced with more efficient full grid verification techniques.
While testing our algorithm on different grids, we made several observations. As the algorithm progresses, Va th k increases and Va k decreases. Hence the two curves eventually intersect. However, this intersection usually happened after a large number of Simplex solutions because 1) the increase in Va th k and the decrease in Va k slow down as the algorithm progresses, and 2) for most of the grids we tested, the gap between Va th k and Va k was very large (in fact, in most cases Va th k / Va k ≤ 2.6%). The need to check the voltage drops at steady state after a certain number of Simplex solutions is therefore justified. This number is determined by tracking the difference between Va th k and Va k . The first infinity check is done when that difference has dropped by 10% from its initial value. Subsequent steady state checks are done when the difference drops by 10% from its value at the last steady state check. In all our tests, the 10% decrease was sufficient and the grid was fixed the first time we did an infinity check. Hence, only two full grid verifications were needed before the grid was deemed to be safe. The first was needed before the correction algorithm to select the node to be corrected, and the second was needed when we did an infinity check inside the correction algorithm. The intersection point (corresponding to Va k = Va th k ) was never reached and the correction happened much earlier than that.
Fig. 1 illustrates how our algorithm avoids correction overestimation on a 598 node RC grid. If no infinity checks were done and we waited for the intersection between Va k and Va th k to happen, then we would end up with 55 Simplex solutions and therefore an overestimated correction. In fact, the grid was safe after only 8 Simplex solutions, and therefore for smaller changes in r. Our algorithm successfully captured that by performing an infinity check.
In Fig. 3, the histograms of the voltage drops before and after correction are given for a 70421 node grid. We only show the voltage drops larger than 60 mV for better visibility. We see from the plots that the algorithm successfully reduced all the voltage drops exceeding the threshold of 100 mV. The correlation plot in Fig. 2 also shows that all the voltage drops on a 1594 node grid were reduced below the threshold of 100 mV.
CONCLUSION
With the rising demand for low voltage designs, supply voltage integrity verification has become a crucial step in reliable high-speed chip design. In this paper, we propose a fast and easy approach to correct the grid with minimal changes when some nodes do not satisfy the threshold. We extend the work done in [1] to the case of an RC model of the grid. Instead of reducing the steady state voltage drops directly, we work on reducing the first time step voltage drops, which in turn reduces the voltage drops at infinity. We use linear programming concepts and the current constraints-based verification approach, and we formulate a nonlinear optimization problem to find the required width changes. Our method showed good results: a grid of 70421 nodes was corrected in a total time of 6.8 hours, and the changes needed in the widths of metal branches on some levels of the grid were less than 16%.
