As VLSI technology scales toward 65nm and beyond, both the timing and the power performance of integrated circuits are increasingly affected by process variations. In practice, process corner based methodologies often treat the systematic components of these variations, which are generally traceable through process models, in the same way as random variations. Consequently, the process corner models are unnecessarily pessimistic. In this paper, we propose a new cell characterization methodology which captures lithography induced gate length variations. A new technique of dummy poly insertion is suggested to shield inter-cell optical interference. This technique, together with standard cells characterized using our methodology, lets current design flows account for the variations with almost no changes. Experimental results on industrial designs indicate that, on average, our methodology reduces the timing variation window by 8%-25% and the power variation window by 55% when compared to a worst case approach. For an industrial low power design, a reduction of over 300ps in path delay variation is obtained by using cells characterized according to our methodology.
INTRODUCTION
The International Technology Roadmap for Semiconductors (ITRS) projects that process variations present a critical challenge for both manufacturing yield and parametric yield of integrated circuit products. The process variations consist of systematic components and random components. The systematic variations represent process parameter variations caused by predictable design and process procedures, such as CD (Critical Dimension) variations from different poly gate pitches and metal thickness variations that occur during Chemical Mechanical Planarization (CMP). Therefore, systematic variations behave deterministically.
It is reported in [2] that more than 50% of transistor gate length variations are due to systematic sources. As VLSI technology aggressively scales to 65nm and beyond, the influence of both systematic and random variations keeps growing. The resulting expansion of process corners forces designers to set aggressive timing targets, which intensifies both the design productivity crisis and the power crisis [1]. Therefore, significant pessimism in process corner estimations is no longer tolerable, and systematic variations need to be considered differently from random variations.
Among systematic variations, transistor gate length variation has the largest impact on circuit timing and power performance, since it directly affects both transistor switching speed and leakage power. Fortunately, gate length variation largely depends on the lithography process and can be captured through lithography/OPC (Optical Proximity Correction) simulations. A pioneering work [3] tried to estimate gate length variations through computationally expensive aerial image process simulations. Recently, a post OPC extraction methodology was proposed [4] for timing analysis of critical paths in a design, at the cost of expensive litho simulation and a complicated timing characterization scheme. Another work [5] proposed a timing analysis methodology that is aware of lithography induced gate length variations according to different poly pitches. For a standard cell, 81 variants are characterized for different contexts. The timing characteristic of a cell instance in a layout is obtained by matching its surrounding layout pattern with one of the 81 variants.
In this paper, we present a new standard cell characterization methodology which considers lithography induced gate length variations. Based on our methodology, the new litho aware timing analysis flow has three major advantages: the current standard cell based ASIC design flow is not disturbed at all; the time-consuming lithographic simulations at block and chip level are avoided; and the characterization of a given standard cell is performed only once, in the same way as in traditional standard cell development. In addition to these main contributions, our methodology includes exposure dose and focus variations in the lithography/OPC simulations. We applied our methodology to industrial library cell designs. The experimental results indicate that, on average, our methodology reduces the timing variation window by 8%-25% and the power variation window by 55% when compared to an existing approach. For an industrial low power high speed design, a reduction of over 300ps in path delay variation is obtained by using cells characterized with our methodology.
OVERVIEW OF METHODOLOGY
2.1 Standard Cell Architecture
The work in [4] claimed that lithography simulation on an individual standard cell does not accurately represent the lithographic effect on that cell in block level designs, because the effect depends on the proximity of that standard cell. However, increasing the distance between one shape and other shapes greatly reduces the lithographic impact of those other shapes. It is also important to note that the closest neighbors of a shape are the dominant factors in the model based OPC process and in Sub-Resolution Assist Feature (SRAF) generation for that shape.
We ran Mentor Graphics Calibre LFD with foundry litho models on the two sets of test structures shown in Figure 1. Each structure in the first set has three shapes with distance L between them. Each structure in the second set has five shapes, where the distance between the middle shape and the shapes on both sides is also L and the shapes immediately next to the middle shape are L1 away from it. In both sets L is varied, while L1 in the second set is fixed. The CD (Critical Dimension) of the middle shape is recorded from our lithography simulation tool.
The result is shown in Figure 2. We can see that the CD of test structure 2 varies much less than that of test structure 1, i.e., having neighboring shapes at a fixed distance substantially reduces the printability variation of a shape. Other test structures we ran also showed that a shape has minimal impact on another shape's printed image if there are two or more shapes between them.
Figure 2: L impact on CD.
The ranges of L and L1 that we chose to exercise the test structures are not arbitrary. In our 65nm standard cell library implementation, L spans the range of possible poly gate spacings when two standard cells are placed adjacent to each other. If two neighboring cells have a gate spacing larger than L, a filler cell with a dummy poly is inserted between them for Design Rule Check (DRC) and power connectivity purposes. For the first set of test structures, we observe over 10% CD variability over the range of L. However, in our standard cell library architecture, we can put a dummy poly shape at the cell boundary without introducing any area penalty. In this case, the value of L1 in our test structures represents the minimum spacing between the dummy poly and the active transistors in the standard cell. When two cells are placed side by side, the dummy poly shapes of both cells overlap exactly.
We would also like to point out that the dummy poly shapes we insert are field poly, i.e., they do not form new devices because they fall in the gap between the diffusions of two closely placed standard cells. Thus, these shapes do not cause extra LVS verification effort. Our standard cell designs keep the dummy poly lines DRC clean because the diffusion gap is large enough. In fact, most standard cell libraries used in the industry can easily support dummy poly insertion, either without area penalty or with very minimal area impact.
By adding dummy poly shapes to the original standard cells, we introduce fixed closest neighbors for the poly gates at the cell boundary, thus greatly reducing the CD variations caused by the varying proximity of the standard cell in the design, as shown in Figure 2. In fact, the dummy poly shapes effectively "shield" all the internal transistors from the lithographic effects of neighboring structures.
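The filler rule described above (insert a filler cell with dummy poly when the gate spacing between two neighboring cells exceeds the largest spacing in the range L) can be sketched as follows. This is only an illustrative sketch: the `Cell` fields, the function names, and all dimensions are hypothetical and are not taken from the library described here.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    name: str
    x: float             # left edge of the cell in the row, nm (hypothetical units)
    width: float         # cell width, nm
    left_margin: float   # distance from left cell edge to the nearest poly gate, nm
    right_margin: float  # distance from right cell edge to the nearest poly gate, nm

def gate_spacing(left: Cell, right: Cell) -> float:
    """Spacing between the boundary gates of two neighboring cells."""
    gap = right.x - (left.x + left.width)
    return left.right_margin + gap + right.left_margin

def filler_sites(row, max_gate_spacing):
    """Return the neighbor pairs whose gate spacing exceeds the allowed maximum."""
    ordered = sorted(row, key=lambda c: c.x)
    return [(a.name, b.name)
            for a, b in zip(ordered, ordered[1:])
            if gate_spacing(a, b) > max_gate_spacing]

# Hypothetical row: INV and NAND abut, NOR sits 400 nm further to the right.
row = [
    Cell("INV", 0.0, 600.0, 100.0, 100.0),
    Cell("NAND", 600.0, 800.0, 100.0, 100.0),
    Cell("NOR", 1800.0, 800.0, 100.0, 100.0),
]
sites = filler_sites(row, max_gate_spacing=300.0)
```

With a dummy poly at every cell boundary, as proposed in this section, the nearest neighbor of each boundary gate stays at the fixed distance L1 regardless of which pairs the check above reports.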
Calculation of Effective Gate Length
In the lithography process, there are several contributors to poly gate length variation (see Figure 3). One of them is the variation caused by the poly pitch to neighboring shapes. Another is the L-shaped poly cornering effect, which is particularly important for small transistors. To obtain an accurate prediction of transistor behavior, we need to account for multiple sources of poly gate length variation.
Figure 3: Litho systematic poly gate CD variations
The lithography induced deviations of the critical dimension, which is the poly gate length, are usually on the order of a few nanometers. For other, larger shapes, the relative shape deviation is much smaller. In a typical 65nm design, poly gate width and diffusion dimensions are at least 2 or 3 times the gate length. This means that lithography induced circuit performance variations are mostly due to gate length deviation. It is therefore sufficient to extract the lithography information for the poly gate length only. The printed image of the gate poly shapes across the process window is used to replace the gate length offset parameters introduced by traditional process corner models. We keep all the other process corner parameters unchanged, such as threshold voltage variation and gate oxide variation.
After lithography/OPC simulation, we have an estimate of the printed images of the poly shapes. However, current device models in SPICE can handle only rectangular transistors, while the lithography/OPC simulation results are often irregular shapes. To resolve this mismatch, we compute an effective gate length which provides the same timing/power performance as the post lithography/OPC simulation poly shape. In general, the on current $I_{on}$ of a transistor determines its timing performance. The leakage power of a transistor mostly depends on its off current $I_{off}$. Since the on and off currents of a transistor usually have different sensitivities to gate length variations, we need to use different effective gate lengths for timing and leakage.
We use a segmentation technique to compute the effective gate lengths for timing and leakage. First, we construct two lookup tables, one for transistor $I_{on}$ and one for $I_{off}$. In both tables, each row corresponds to a specific transistor gate width and each column to a specific transistor gate length. Each entry of a table represents the $I_{on}$ or $I_{off}$ of a transistor with the gate width and length specified by the row and column indices. The ranges of transistor width and gate length, i.e., the ranges of the row and column indices, are based on the typical transistor sizes allowed in fabrication. The values of $I_{on}$ and $I_{off}$ are obtained through SPICE simulations.
Next, we chop a poly shape from lithography/OPC simulation into multiple segments, each of which can be approximated by a rectangle. This is illustrated in Figure 4. The $I_{on}$ and $I_{off}$ of each small segment can be obtained from a simple calculation based on the lookup tables. Note that the width of a segment is usually much smaller than the sizes allowed in fabrication, so it cannot be matched to any row index in the lookup tables. In addition, the current per unit width depends on the overall transistor width. Therefore, we need to use the lookup table row for the original nominal transistor width instead of the segment width. For example, consider a transistor with a nominal gate length of 60nm and a width of 200nm. We chop its gate poly shape from lithography/OPC simulation into 10 segments, so each segment $i$ has a length $l_i$ and a width of 20nm. We then find the on current $I_{on}(l_i, 200\,\mathrm{nm})$ from the lookup tables. The on current of this segment can be approximated as $I_{on}(l_i, 20\,\mathrm{nm}) = I_{on}(l_i, 200\,\mathrm{nm})/10$. The off current $I_{off}(l_i, 20\,\mathrm{nm})$ can be calculated in the same way.
Once the on and off currents of all segments are available, the overall currents of the entire transistor based on the lithography/OPC simulated poly shape can be calculated as:

$$I_{on,shape} = \sum_{i=1}^{n} I_{on}(l_i, w), \qquad I_{off,shape} = \sum_{i=1}^{n} I_{off}(l_i, w)$$

where $n$ is the number of segments and $w$ is the width of each segment. Last, the effective gate length $L_{eff,timing}$ can be found based on the estimated $I_{on,shape}$ and the lookup table.
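The segmentation procedure can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in this work: it assumes equal-width segments, linear interpolation between lookup table columns, and a current that decreases monotonically with gate length. The function names and all table values are hypothetical.

```python
from bisect import bisect_left

def interp(x, xs, ys):
    """Piecewise-linear interpolation of ys over ascending xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_left(xs, x)
    t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + t * (ys[j] - ys[j - 1])

def shape_current(segment_lengths, table_lengths, table_currents):
    """Sum the segment currents of one gate shape.

    table_currents is the lookup table row for the nominal transistor width W;
    each of the n equal-width segments contributes I(l_i, W) / n.
    """
    n = len(segment_lengths)
    return sum(interp(l, table_lengths, table_currents) / n for l in segment_lengths)

def effective_length(target_current, table_lengths, table_currents):
    """Invert the table row: the length L at which I(L, W) equals target_current.

    Assumes the current decreases monotonically with gate length, so the
    reversed row is ascending and can be interpolated directly.
    """
    return interp(target_current, table_currents[::-1], table_lengths[::-1])

# Hypothetical table row for W = 200 nm (lengths in nm, currents in arbitrary units).
lengths = [55.0, 60.0, 65.0]
i_on_row = [110.0, 100.0, 90.0]   # roughly linear in L
i_off_row = [4.0, 2.0, 1.0]       # strongly nonlinear in L

# Ten segments of a printed contour: half print at 58 nm, half at 62 nm.
segments = [58.0] * 5 + [62.0] * 5

leff_timing = effective_length(shape_current(segments, lengths, i_on_row), lengths, i_on_row)
leff_leakage = effective_length(shape_current(segments, lengths, i_off_row), lengths, i_off_row)
```

With these made-up numbers the two effective lengths differ, which illustrates why separate values are kept for timing and for leakage.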
Netlist Back Annotation and Cell Characterization
After we calculate the effective gate lengths $L_{eff,timing}$ and $L_{eff,leakage}$ of each poly gate shape, we need to back annotate the standard cell SPICE netlist with these effective gate lengths. The layout of a transistor may consist of multiple fingers, and lithography usually affects each finger differently depending on the layout environment. Therefore, we need to treat these fingers separately even though they belong to the same transistor. We run Mentor Graphics Calibre LVS with the ixf option, which writes the x and y coordinates of each poly gate shape into a layout netlist even when some poly gate shapes are fingers of the same transistor. The layout netlist is fed into our extraction tool to generate a netlist with parasitics. The extracted netlist also keeps each poly gate shape as a separate device. We then use the x and y coordinates to match each poly shape in the extracted netlist with the poly shape contours from lithography/OPC simulations, and back annotate $L_{eff,timing}$ and $L_{eff,leakage}$ into the extracted netlist. Thus, for each standard cell, we generate one netlist for timing simulation and another netlist for leakage power simulation. By including dose and focus variations in the lithography/OPC simulations, as well as other process corners such as doping and threshold voltage, we obtain the extracted netlist for each cell at the worst and the best process corners.
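The coordinate matching step can be illustrated with the sketch below. It is not the actual tool flow: the `Device` and `Contour` records, the nearest-neighbor matching, and the tolerance value are all assumptions made for illustration, and real netlist parsing is omitted.

```python
from dataclasses import dataclass, replace
from math import hypot

@dataclass(frozen=True)
class Device:
    name: str      # one device per poly gate shape (finger)
    x: float       # gate coordinates from the layout netlist, um
    y: float
    length: float  # drawn gate length, nm

@dataclass(frozen=True)
class Contour:
    x: float             # location of the simulated poly gate contour, um
    y: float
    leff_timing: float   # effective gate lengths computed for this contour, nm
    leff_leakage: float

def back_annotate(devices, contours, tol=0.01):
    """Return (timing_devices, leakage_devices) with gate lengths replaced.

    Each device is matched to the nearest contour; a match farther away
    than tol (hypothetical value, in um) is treated as an error.
    """
    timing, leakage = [], []
    for dev in devices:
        nearest = min(contours, key=lambda c: hypot(c.x - dev.x, c.y - dev.y))
        if hypot(nearest.x - dev.x, nearest.y - dev.y) > tol:
            raise ValueError(f"no contour within {tol} um of {dev.name}")
        timing.append(replace(dev, length=nearest.leff_timing))
        leakage.append(replace(dev, length=nearest.leff_leakage))
    return timing, leakage

# Two fingers of the same transistor, annotated independently (made-up values).
devices = [Device("M1_f1", 0.10, 0.50, 60.0), Device("M1_f2", 0.30, 0.50, 60.0)]
contours = [Contour(0.30, 0.50, 61.2, 60.4), Contour(0.10, 0.50, 59.1, 58.3)]
timing_devices, leakage_devices = back_annotate(devices, contours)
```

Keeping two output lists mirrors the two netlists per cell described above, one for timing simulation and one for leakage simulation.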
Overall Methodology Flow
The overall flow of our cell characterization methodology is shown in Figure 5 . For given standard cell designs, we first insert dummy poly at their boundaries. Next, we stream out the GDS to feed into our lithography/OPC simulation tool. We then use the simulation results to generate the lookup tables and the new netlists with back annotated effective gate length for timing and leakage. Last, we run the standard cell characterization with our standard flow and tools. 
EXPERIMENT
3.1 Standard Cells
We run lithography/OPC simulation using Mentor Graphics Calibre LFD on our original standard cell library layout. Since our lithography/OPC simulator provides the printed images of the transistor poly gates across the process window, we can calculate the longest and shortest effective gate length for each gate. Starting from the original standard cell netlist at the worst timing corner, which has the worst case RC parasitics, we change the length of each gate to the longest $L_{eff}$ of that specific gate. This gives us the annotated cell netlist at the worst timing corner. We do the same for the original cell netlist at the best timing corner, except that we use the shortest $L_{eff}$ for each gate. That way, we generate the annotated cell netlist at the best timing corner. The same procedure is repeated to obtain the annotated cell netlists at the best and the worst leakage corners. All of the generated netlists have been characterized with our standard cell characterization flow.
We present the timing and leakage variabilities of a set of representative standard cells in Table 1. In columns 2 and 3, we report the timing variation between the two timing signoff corners with the original standard cell netlists and with our new netlists; the percentage changes are reported in column 4. In columns 5 and 6, we report the leakage ratio between the two leakage analysis corners with the original cell netlists and with our new netlists. All data are obtained with an input slew of 180ps and an output load of 4.7fF. We observe an average decrease of 11% in delay variability. Because we performed lithography/OPC simulation across the process window, through exposure dose and depth of focus, rather than at a nominal process condition, the lithography induced variation is a significant source of the variability. The leakage data show that the new netlists of the standard cells have far less variability, with an average ratio less than half of that of the original netlists.
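The selection of per-gate effective lengths for the two timing corners can be sketched as follows. The data layout (a mapping from gate name to effective length per dose/focus condition), the function names, and all values are hypothetical; only the rule itself (longest effective length for the worst timing corner, shortest for the best) comes from the description above.

```python
def leff_extremes(leff_by_condition):
    """Return (shortest, longest) effective length of one gate across the process window."""
    values = list(leff_by_condition.values())
    return min(values), max(values)

def timing_corner_lengths(window):
    """Pick per-gate lengths for the worst and best timing corner netlists.

    window maps gate name -> {(dose, focus): L_eff,timing in nm}.
    """
    worst, best = {}, {}
    for gate, by_condition in window.items():
        shortest, longest = leff_extremes(by_condition)
        worst[gate] = longest   # longest gate is slowest: worst timing corner
        best[gate] = shortest   # shortest gate is fastest: best timing corner
    return worst, best

# Made-up effective lengths for two gates at three dose/focus conditions.
window = {
    "MN1": {("nominal", "nominal"): 60.0, ("low", "defocus"): 62.5, ("high", "defocus"): 57.8},
    "MP1": {("nominal", "nominal"): 60.3, ("low", "defocus"): 61.9, ("high", "defocus"): 58.6},
}
worst_lengths, best_lengths = timing_corner_lengths(window)
```

The leakage corners would be handled by the same kind of selection applied to the leakage effective lengths.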
Since transistor leakage depends exponentially on transistor gate length, estimating leakage using our methodology can help identify lithography sensitive design patterns for leakage and thus help to improve standard cell design robustness to leakage variations. With the improvement of OPC and SRAF generation from the foundry, we believe that we will see better design variability control for both timing and leakage.
Design Implementation
We apply our newly characterized standard cell library to one of the low power, 250MHz hard macros in our 65nm designs for timing analysis. We use our standard timing signoff flow for this analysis, with the same timing constraints as the original design. Figure 6 shows the timing variability reduction for the 400 most timing critical paths in the design. Timing variability for a path is defined as the path delay difference between the best timing corner and the worst timing corner. Using our methodology, we reduce the variability by about 330ps on average. With an average path delay variability of 3ns, this is a reduction of 11%, which is consistent with our standard cell analysis.
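The variability metric and the reported percentage can be checked with a short sketch. The function names and the per-path delays are hypothetical; only the definition of variability and the figures of 330ps and 3ns come from the text.

```python
def path_variability(best_delay_ps, worst_delay_ps):
    """Timing variability of one path: worst corner delay minus best corner delay."""
    return worst_delay_ps - best_delay_ps

def average_reduction(original, annotated):
    """Average variability reduction over paths.

    original and annotated are lists of (best_delay_ps, worst_delay_ps),
    one entry per path, in the same path order.
    """
    diffs = [path_variability(*o) - path_variability(*a)
             for o, a in zip(original, annotated)]
    return sum(diffs) / len(diffs)

# Made-up delays for two paths, chosen so the average reduction is 330 ps.
original = [(1000, 4000), (2000, 5000)]
annotated = [(1100, 3800), (2150, 4790)]
reduction_ps = average_reduction(original, annotated)

# Relative reduction for the figures quoted above: 330 ps out of 3 ns.
relative_reduction = 330 / 3000
```

Here `relative_reduction` evaluates to 0.11, matching the 11% quoted for the hard macro.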
CONCLUSION
In this paper, we proposed a new methodology of standard cell characterization considering lithography effects on transistor gate length variations. Our methodology can be easily incorporated into current design flow with virtually no impact on design schedule. We performed a lithography simulation with foundry validated and calibrated production lithography models across process window. We extracted the electrical parameters from the lithography images. As a result, the standard cell library characterization we have generated reduced the pessimism introduced by the process corners for timing and leakage analysis and we confirmed that with simulations for real 65nm designs.
