Process variations due to lens aberrations are to a large extent systematic, and can be modeled for purposes of analyses and optimizations in the design phase. Traditionally, variations induced by lens aberrations have been considered random due to their small extent. However, as process margins reduce, and as improvements in reticle enhancement techniques control variations due to other sources with increased efficacy, lens aberration-induced variations gain importance. For example, our experiments indicate that lens aberration can result in up to 8% variation in cell delay. In this paper, we propose an aberration-aware timing-driven analytical placement approach that accounts for aberration-induced variations during placement. Our approach minimizes the design's cycle time and prevents hold-time violations under systematic aberration-induced variations. On average, the proposed placement technique reduces cycle time by ∼5% at the cost of a ∼2% increase in wirelength.
INTRODUCTION
Aberrations can be described as the departure from ideal imaging induced by an imperfect lens system, as shown in Figure 1. Undesirable imaging artifacts from aberration are uncorrectable and, indeed, are sometimes exacerbated through use of resolution enhancement techniques (RETs) such as phase-shift mask and off-axis illumination [1]. Zernike's coefficients capture the deviation from ideal imaging and may be used during lithography simulation to predict the impact of lens aberration on critical dimension (CD). CD variation caused by lens aberration is relatively small compared to that caused by defocus and pattern proximity. However, most CD error caused by proximity can be corrected by RETs. Thus, lens aberration has turned out to be a major source of residual errors in across-field linewidth variation (AFLV) [3].
Recent studies of lens aberration control have focused on measurement systems [4] and pattern sensitivity of aberration [12], as well as lens mounting systems to compensate for the aberration [11]. However, despite these efforts, the impact of lens aberration on CD will be an ever-present barrier to manufacturing yield as minimum design rules are pushed closer to fundamental resolution limits. From the design perspective, variations in CD affect the delays, slews, input capacitances and leakage of a given logic cell. We also observe that the maximum difference in delays of all timing arcs in a cell (delay skew) increases significantly due to lens aberration, as different MOS devices in the layout are affected differently by aberration.
In this paper, we first describe a novel aberration-aware timing analysis flow that integrates: (i) results of lithography simulation to measure CD across the lens field, (ii) SPICE simulation-based library characterization that captures the impact of CD variation due to aberration on timing and power, and (iii) placement information. In addition to aberration-aware timing analysis, we propose an aberration-aware timing-driven analytical placement framework that minimizes clock cycle time and avoids hold-time violations, without significantly increasing total wirelength. The placer is driven by models that capture the impact of lens position on timing-arc delays in cells, and by weighted-wirelength models. Essentially, we preferentially place cells that are setup-time (resp. hold-time) critical at lens field locations where aberrations cause the cell delay to decrease (resp. increase). The contributions of our work are as follows.
• Using industry OPC recipes and aberration parameters, and realistic design testcases, we show that the variation in timing due to lens aberration can be significant. Over the cells in a 90nm foundry library, we observe average cell delay to change by 2% to 8%. The maximum difference in delays of all timing arcs of a cell (delay skew) increases significantly.
• We develop a novel aberration-aware timing analysis flow that allows more accurate timing analysis, taking into account the position of the chip in the lens field. It also considers the increase in delay skew caused by aberration.
• We propose an aberration-aware, timing-driven analytical placement flow that considers the impact of lens aberrations on timing to minimize clock period and avoid hold time violations without significantly increasing total wirelength. On average, cycle time reduces by ∼ 5% at the cost of ∼ 2% increase in wirelength, and there are no hold-time violations.
The remainder of this paper is organized as follows. In Section 2, we describe lens aberration and study its impact on CD and gate delay. Section 3 presents our novel aberration-aware timing analysis flow. Section 4 describes our aberration-aware analytical placement formulation and implementation details. Test designs, experimental conditions and experimental results are presented in Section 5. We conclude in Section 6 with a brief description of ongoing research. 
DESIGN IMPACT OF LENS ABERRATION
In this section we briefly describe how lens aberration impacts CD and consequently circuit delay.
CD impact of Lens Aberration
Several manufacturing process steps are involved in transfer of the pattern on the mask to the photoresist, and then to the wafer. Lens aberration comes into play when the photoresist is exposed to light during lithography. Modern lithography systems use step-and-scan to expose small portions of the wafer at a time, and then shift to the next region. The portion of the wafer that gets exposed in a step is called the lens field, or simply field. In each step, the photoresist is exposed to light through a slit that is scanned from one side of the field to another. Lens aberration parameters (Zernike's coefficients), which capture the divergence from ideal behavior of light, change as the slit translates horizontally. Hence, the CD error induced by lens aberration varies along the horizontal direction but stays constant along the vertical direction. While the variation in CD along the horizontal direction is continuous, it is reasonable to discretize it and assume it to remain constant over small regions, as shown in Figure 2.
Based on industry-supplied Zernike's coefficients at multiple locations in the lens field, we run lithography simulation on some frequently-used standard cells from a 90nm foundry library, and study the impact on CD. Figure 3 shows average CD variation of devices in BUFX4, INVX2, NAND2X4 and NOR2X1 cell instances as their position within the lens field is varied. For example, average gate CD variation of NAND2X4 with 100nm worst defocus is up to 8nm across the entire lens field. In addition, we investigate the CD skew (maximum difference in CD over all devices in a cell) of different cells. Large CD skew can imbalance the timing arcs of a cell, as we discuss in greater detail below (Section 3).
Delay Impact of Lens Aberration
Variations in CD directly and indirectly affect circuit delay. At the device level, an increase in gate CD causes an approximately proportional decrease in the on-current of the device. Since lens aberration affects different devices in a cell differently, each of the cell's timing arcs can be affected differently. Most standard cells are designed such that the difference in delays of timing arcs (delay skew) is small. Due to lens aberration, however, this delay skew can increase; for example, arcs that are governed by larger-than-nominal CDs will be slowed down, while those governed by smaller-than-nominal CDs will be sped up. Figure 4 shows how the delay, averaged over all timing arcs, changes for four cell masters as the cell instance location is varied from the lens center. CD variations also cause variations in cell input capacitance and output slews (transition times). Input capacitance affects the loading of fanin cells and consequently their delays. Similarly, slews affect the output slews and delays of cells in the fanout cone. Again, to avoid unnecessary guardbanding, the performance analysis flow (library model characterization, timing/SI analysis, etc.) must comprehend these systematic variations.
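The two per-cell metrics used in this section, delay skew and delay averaged over all timing arcs, can be illustrated with a minimal sketch. The arc names and delay values below are hypothetical and serve only to show how aberration can widen the skew while moving the average only slightly.

```python
def delay_skew(arc_delays):
    """Maximum difference in delay over all timing arcs of one cell."""
    values = list(arc_delays.values())
    return max(values) - min(values)


def average_delay(arc_delays):
    """Delay averaged over all timing arcs of one cell."""
    return sum(arc_delays.values()) / len(arc_delays)


# Hypothetical arc delays (ps) for a two-input cell; not measured data.
nominal = {"A->Y rise": 40.0, "A->Y fall": 42.0, "B->Y rise": 41.0, "B->Y fall": 43.0}
aberrated = {"A->Y rise": 38.0, "A->Y fall": 45.0, "B->Y rise": 41.0, "B->Y fall": 46.0}

print(delay_skew(nominal), delay_skew(aberrated))
print(average_delay(nominal), average_delay(aberrated))
```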
ABERRATION-AWARE TIMING ANALYSIS
In this section we describe our aberration-aware timing analysis flow. While the flow is complete and self-contained, it is at the same time designed for, and will be used by, the analytical placement framework described in Section 4. Our aberration-aware timing analysis flow involves two main steps: (1) constructing timing libraries of all standard cells for different locations in the lens field, and (2) using placement information of the design to compute the location of all cell instances in the lens field, then using this information to look up appropriate models in the timing library for use with off-the-shelf static timing analysis tools.
A straightforward timing library technique would create a priori variants for each cell master, such that there is one variant for every possible assignment of CDs to devices. This means that given any assignment of CDs to devices, an exactly matching, pre-characterized cell variant can be found. After lithography simulation provides CDs of all devices in all cells, a correctly matching variant can be picked for use in timing analysis. Though this flow is very accurate, it requires a very large number of cell variants (exponential in the number of devices in the cell); this is infeasible with respect to both characterization time and library size. In our flow, variants are instead created for each cell for different lens field locations. Figure 5 illustrates our timing library construction flow. We begin with standard-cell GDSII files and use Mentor Graphics Calibre v9.3_5.11 for sub-resolution assist feature (SRAF) generation and model-based OPC. We use Zernike's coefficients for eight sampling positions in the lens field from a major chip maker, and compute the coefficients at 19 different locations with 1.5mm step size across the field using linear interpolation. Using the post-OPC standard-cell GDSIIs and Zernike's coefficients, we perform lithography simulation at the 19 field locations with wavelength λ = 193nm, numerical aperture NA = 0.75, and annular aperture σ = 0.75/0.50. After lithography simulation, we have 19 PrintImage GDS files for each standard cell, and we measure the CD of each MOS device in each file.
The CDs measured from the contours generated by Mentor Graphics PrintImage are then used to alter the SPICE netlists of the standard cells before library characterization. A complication arises because GDSII typically does not have device names, but SPICE netlists only reference devices by device names. We solve this problem by applying LVS (layout vs. schematic) to obtain a mapping between device locations and device names. After modifying the SPICE netlists, we run Cadence SignalStorm v4.1 to perform library characterization. Since lens aberrations affect different devices in a cell differently, the altered SPICE netlists may no longer have equal CD for all devices. We call our characterized library a transistor-level timing library (TTL); it accurately captures the delay skew induced by CD skew while adding manageable complexity to the characterization effort and library size.
Our test library contains 50 combinational cells. For each we create 19 variants corresponding to 19 field locations. Library characterization requires approximately 6 hours (wall time) running on 18 CPUs ranging from Intel Xeon 1.4GHz to AMD Opteron 2.2GHz. We do not create variants for the 13 sequential cells in our library due to large CPU time (estimated at 60 hours on our machines) required by their characterization. We note that the characterization time can be significant but is a one-time task for each process.
ABERRATION-AWARE TIMING-DRIVEN PLACEMENT

Introduction of Analytical Placement
We now briefly introduce the APlace analytic placement framework [5, 7, 8, 9], which forms the foundation of our proposed aberration-aware timing-driven placement method. APlace casts global placement as a constrained nonlinear optimization problem: the layout area is uniformly divided into global bins, and APlace minimizes total half-perimeter wirelength (HPWL) while maintaining an equalized cell area in each global bin. The formulation is as follows:
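Based on the symbol definitions given next, the constrained problem referred to as Eqn. 1 can be written roughly as below. This is a reconstruction from the surrounding text, not the original typesetting.

```latex
\begin{equation}
\min_{x,\,y} \; HPWL(x, y)
\quad \text{s.t.} \quad
D_g(x, y) = D \;\; \text{for each global bin } g
\tag{1}
\end{equation}
```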
where $(x, y)$ is the vector of center coordinates of cells, $HPWL(x, y)$ is the total HPWL of the current placement, $D_g(x, y)$ is a density function that equals the total cell area in a global bin $g$, and $D$ is the average cell area over all global bins. APlace applies smooth approximations of the HPWL and density functions and solves the constrained optimization problem in Eqn. 1 using the simple quadratic penalty method and a Conjugate Gradient (CG) solver. The general APlace framework has been extended to address a variety of placement tasks such as mixed-size placement, power-aware placement, voltage-drop aware placement, etc., and is shown to be competitive in a wide variety of contexts [2, 10, 7].
Aberration-Aware Placement Formulation
Here we propose a novel aberration-aware timing-driven placement objective for improved timing yield after manufacturing and describe its integration in an analytical placement framework. We perform aberration-aware timing-driven placement by optimizing for a hybrid placement objective. Besides the typical objective of minimizing total timing-weighted net wirelength, we also minimize total timing-weighted delays of timing-critical cells. The aberration-aware timing-driven placement formulation is as follows:
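A form of the hybrid objective (Equation 2) that is consistent with the definitions given next is sketched below. The summation range (timing-critical cells $v$) and the reuse of the density constraint from Eqn. 1 are assumptions. The text does not show how hold-critical cells, whose delay should increase, enter the sum, so the sketch leaves that to the sign or definition of $w(v)$.

```latex
\begin{equation}
\min_{x,\,y} \; WWL(x, y) \; + \; W_a \sum_{v} w(v) \cdot \max_{1 \le i \le n} g^{i}_{t_v}(x_v)
\quad \text{s.t.} \quad
D_g(x, y) = D \;\; \text{for each global bin } g
\tag{2}
\end{equation}
```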
where $WWL(x, y)$ is the total timing-weighted net HPWL of the current placement and $W_a$ is the weight for the aberration-aware timing-driven objective, which is the sum of timing-weighted delays of timing-critical cells. In the formulation, $g_{t_v}(x_v)$ is the delay function for cell $v$'s model $t_v$; it is a function of cell $v$'s current horizontal position $x_v$ in the chip. When there are multiple copies ($n > 1$) of the chip in the reticle, we let $g^{i}_{t_v}(x_v)$ be the delay function for the $i$-th chip, and we consider the maximum delay of cell $v$ over all copies so that the performance of the slowest chips is improved. Like traditional net weighting methods, we assign timing weights to cells based on timing criticality and path sharing. First, a cell along a timing-critical path should receive a heavy weight. Second, a cell with many timing-critical paths passing through it should have a large weight as well. Therefore, the weight $w(v)$ assigned to a cell $v$ is as follows.
where
and
Here, $\delta$ is the criticality exponent, and $u$ is the expected improvement of the longest (or shortest) path delay after this timing-driven iteration. $T$ is $T_s = (1 - u) \cdot \max_{\pi}\{delay(\pi)\}$ for setup-critical paths or $T_h = (1 + u) \cdot \min_{\pi}\{delay(\pi)\}$ for hold-critical paths. $slack_s(\pi) = T_s - delay(\pi)$ is the slack of a setup-critical path $\pi$, and $slack_h(\pi) = delay(\pi) - T_h$ is the slack of a hold-critical path $\pi$. In Equation 3, we compute a weight for each timing-critical path based on its slack and obtain the timing weight of a cell by summing up the weights of timing-critical paths passing through it. Note that similar forms of the function have been previously applied to assign timing weights to nets for timing-driven placement [6].
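The weighting scheme described above can be sketched in a few lines of Python. The targets and slacks follow the definitions in the text. The per-path weight `(1 - slack / T) ** delta` is an assumption chosen for illustration, since the exact function is not recoverable here, and the function name and data layout are hypothetical.

```python
from collections import defaultdict


def cell_weights(paths, u, delta, mode="setup"):
    """Sum per-path weights over the cells each timing-critical path visits.

    paths: list of (delay, [cell names]) tuples.
    u:     expected improvement of the longest (or shortest) path delay.
    delta: criticality exponent.
    """
    delays = [delay for delay, _ in paths]
    if mode == "setup":
        target = (1.0 - u) * max(delays)      # T_s
    else:
        target = (1.0 + u) * min(delays)      # T_h
    weights = defaultdict(float)
    for delay, cells in paths:
        # Slack as defined in the text for setup- and hold-critical paths.
        slack = target - delay if mode == "setup" else delay - target
        # ASSUMED path weight: larger when slack is more negative; clamped at 0.
        path_weight = max(0.0, 1.0 - slack / target) ** delta
        for cell in cells:
            weights[cell] += path_weight
    return dict(weights)
```

With this form, a cell shared by several critical paths accumulates the weight of each, which matches the path-sharing intent stated above.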
Placement Flow
The aberration-aware timing-driven placement and evaluation flow is shown in Figure 6 . Besides the design netlist, inputs to the aberration-aware placer also include delay functions of cell models, which represent how the delays of given cell models change with horizontal position in the chip.
The timing-driven process in our placer may include several iterations. As shown in Figure 6, during each iteration, we send the intermediate placement to TrialRoute (Cadence First Encounter v04.10) to perform fast global and detailed routing, and extract RC. Then we change the type of each cell in the netlist according to its horizontal position within the lens field and use a commercial tool, Synopsys PrimeTime (version W-2004.12-SP2), to perform accurate aberration-aware static timing analysis (STA) with the transistor-level timing libraries (TTLs) described in Section 3. The resulting critical paths are imported into the placer to decide timing weights for nets and cells. The total timing-weighted cell delay is then minimized using the Conjugate Gradient solver, together with the timing-weighted wirelength objective and subject to density constraints.
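The retyping step, in which each cell instance is switched to the library variant that matches its horizontal field position, might look like the following sketch. The `_Fnn` variant naming, the uniform division of the field into 19 strips, and all function and parameter names are assumptions made for illustration; the actual naming and binning used in the flow are not specified.

```python
def variant_name(master, x_um, chip_origin_um, field_width_um, num_locations):
    """Return a hypothetical variant cell name for a cell at chip coordinate x_um."""
    x_field = chip_origin_um + x_um              # horizontal position in the lens field
    step = field_width_um / num_locations        # width of one characterized strip
    index = min(max(int(x_field / step), 0), num_locations - 1)
    return f"{master}_F{index:02d}"


def retype_netlist(instances, chip_origin_um, field_width_um, num_locations=19):
    """Map instance name -> variant cell name.

    instances: dict of instance name -> (cell master, x coordinate in the chip).
    """
    return {
        name: variant_name(master, x_um, chip_origin_um, field_width_um, num_locations)
        for name, (master, x_um) in instances.items()
    }


retyped = retype_netlist(
    {"u1": ("NAND2X4", 0.0), "u2": ("INVX2", 25999.0)},
    chip_origin_um=0.0,
    field_width_um=26000.0,
)
print(retyped)
```

The vertical coordinate is not needed because the aberration-induced CD error is modeled as constant along the vertical direction.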
Implementation Details
We compute the weight of the aberration-aware objective $W_a$ in Equation 2 according to the x-gradients derived from the wirelength and delay terms so that the scaled gradients of delay functions are comparable to the wirelength gradients:
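One form consistent with this description is sketched below. Summing absolute x-gradients over all cells is an assumption, since the text only states that the scaled delay gradients are made comparable to the wirelength gradients.

```latex
\begin{equation}
W_a \;=\; \alpha \cdot
\frac{\sum_{v} \left| \dfrac{\partial\, WWL(x, y)}{\partial x_v} \right|}
     {\sum_{v} \left| w(v) \, \dfrac{\partial\, g_{t_v}(x_v)}{\partial x_v} \right|}
\end{equation}
```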
where the delay ratio α decides the ratio of the delay gradients to the wirelength gradients, and needs to be carefully tuned according to the impact of reduced cell delay and increased net wirelength on design performance. We derive the delay of a cell at a specific horizontal field position by averaging the rise and fall delays of all timing arcs with zero wire load, according to the transistor-level timing libraries. Therefore, the delay functions represent how gate delays vary with horizontal locations and gate CDs.
Due to simulation limits, delay functions only have accurate values at discrete horizontal coordinates, and thus are expressed as look-up tables (LUTs). We obtain delay at continuous positions using linear interpolation and compute gradients accordingly.
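A minimal sketch of such a look-up table evaluation is given below: it returns both the linearly interpolated delay and its x-gradient, which is what a gradient-based solver needs. The function name and the choice to use the right-hand segment at interior breakpoints are assumptions for illustration.

```python
from bisect import bisect_right


def lut_delay_and_gradient(xs, delays, x):
    """Piecewise-linear delay and d(delay)/dx at x from a sorted look-up table.

    xs:     strictly increasing horizontal coordinates of the LUT entries.
    delays: cell delay at each coordinate in xs.
    """
    if len(xs) != len(delays) or len(xs) < 2:
        raise ValueError("need at least two matching LUT points")
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the look-up table range")
    # Segment [xs[i], xs[i + 1]] containing x; at an interior breakpoint the
    # right-hand segment is used, and at the last point the final segment.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    slope = (delays[i + 1] - delays[i]) / (xs[i + 1] - xs[i])
    return delays[i] + slope * (x - xs[i]), slope
```

Because the interpolant is piecewise linear, the gradient is constant within a segment and changes only at LUT coordinates.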
EXPERIMENTS
In this section, we empirically test our approach on two real designs within a standard industry flow using leading-edge tools, and we measure impacts on timing, wirelength and runtime. The experimental flow is shown in Figure 6. The inputs for each design include technology libraries, synthesized netlists, floorplan, timing constraints, aberration-aware timing libraries and delay look-up tables. For each design, our aberration-aware timing-driven placer, AberrPl, is applied to perform two placement runs: (1) with HPWL objective and no RC extraction before timing analysis (AberrPl WL), and (2) with timing-driven wirelength objective and RC extraction before timing analysis (AberrPl TD). By comparing with the placement runs of wirelength-driven APlace (APlace WL) and timing-driven APlace (APlace TD) respectively, we show how much our aberration-aware timing-driven objective improves chip performance, without and with the traditional timing-driven wirelength objective and interconnect load and delay during timing analysis.
Intuitively, chips with large sizes will benefit more from our aberration-aware placement technique, since there is larger CD and delay variation induced by an imperfect lens system across the layout region. However, available testcases are not large enough to clearly show the effect of lens aberration. We illustrate the aberration effects and show the effectiveness of our method by scaling the CD and delay functions along the horizontal direction to control the amount of variation within the layout region.
Table 2: Comparison of aberration-aware placement against traditional wirelength or timing-driven placement for AES and JPEG.
After each placement, we perform fast global and detailed routing, RC extraction and finally aberration-aware timing analysis using PrimeTime. Minimum cycle time (MCT) of the slowest chip is reported by the aberration-aware STA to measure performance of timing-driven placements, together with HPWL and runtime (minutes) of the placers, and routed wirelength and the number of vias of TrialRoute's results. All the experiments are performed on Linux machines with 2.4GHz CPUs and 4GB memory.
Experimental Results. Table 2 summarizes the results of AberrPl WL and AberrPl TD for AES and JPEG. According to the results, AberrPl WL reduces MCT by 4.7% with 3.0% HPWL increase and 1.4% increase of trial-routed wirelength for AES, and reduces MCT by 4.2% with 1.2% HPWL increase and 1.3% increase of trial-routed wirelength for JPEG. When combined with a traditional timing-driven placement method, our aberration-aware placer (AberrPl TD) reduces MCT by 8.4% with 2.7% HPWL increase and 2.4% increase of trial-routed wirelength for AES, and reduces MCT by 9.8% with 1.5% HPWL increase and 1.7% increase of trial-routed wirelength for JPEG.
Impact of Delay Ratio.
The second set of experiments is performed for circuit AES with a variety of delay ratios (α) ranging from 0 to 0.225 in steps of 0.025. The results are summarized in Table 3. For each delay ratio, we perform aberration-aware placements using AberrPl WL and AberrPl TD, and compare the results to the reference runs by APlace WL and APlace TD. Note that during each timing iteration we assign a set of timing weights to nets and cells according to the current placement. This makes the analytical placement unstable, since the placement objective keeps changing. Therefore, we only apply small delay ratios here in order to reduce the instability of the placements.
The results of AberrPl WL clearly show the performance improvements obtained using our aberration-aware placement method. According to the first part of Table 3, our aberration-aware placer can reduce MCT by 4.7% with 3.0% HPWL increase and 1.4% increase of trial-routed wirelength. Figure 7(a) shows the MCT and wirelength impacts of AberrPl WL as functions of delay ratio. We see that MCT improvement generally increases with delay ratio, reaching 4.7% when α = 0.150, while wirelengths also generally increase. When combined with a traditional timing-driven placement method, AberrPl TD achieves an MCT reduction of 8.4% when α = 0.075 with 2.7% increase of placed HPWL and 2.4% increase of trial-routed wirelength, according to the second part of Table 3. However, since timing analysis is very sensitive to the actual placement when wire load is considered, this in turn increases the instability of timing weights and thus of placement results. Therefore, the MCT curve in Figure 7(b) is very unsteady and shows no clear trend.
Impact of Scaling.
A third set of experiments is designed to show the effect of chip size on the performance improvement obtained with our aberration-aware placement method. We perform aberration-aware placements for circuit AES using AberrPl WL with a delay ratio of 0.15 and a variety of scaling factors, so that the number of copies within the reticle is 1x1, 2x2, 4x4, 6x6, and 8x8. The results are summarized in Table 4. Figure 8 shows the curves of MCT, HPWL and routed wirelength impacts as functions of the scaling factor. We see that the performance improvement obtained decreases with the number of copies in the field. When the chip size is small, although there is significant CD and delay variation across the reticle, the variation within the layout area is too small to achieve any benefit from aberration-aware placement methods.
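The trend can be illustrated with a small sketch that evaluates the worst-case delay of a cell over all chip copies in one row of the reticle, as in the placement objective, and then measures how much that worst-case delay can change as the cell moves across the chip. The V-shaped delay function, the 26mm field width, and all names are hypothetical and used only for illustration.

```python
def worst_case_delay(delay_fn, x_chip, field_width, copies_per_row):
    """Maximum delay of a cell at chip position x_chip over all copies in a row."""
    chip_width = field_width / copies_per_row
    return max(delay_fn(i * chip_width + x_chip) for i in range(copies_per_row))


def delay_spread_within_chip(delay_fn, field_width, copies_per_row, samples=50):
    """Range of worst-case delay over sampled positions across one chip."""
    chip_width = field_width / copies_per_row
    values = [
        worst_case_delay(delay_fn, chip_width * k / samples, field_width, copies_per_row)
        for k in range(samples + 1)
    ]
    return max(values) - min(values)


def example_delay(x_field_mm):
    """Hypothetical cell delay (ps): fastest at the field center (13mm)."""
    return 100.0 + 0.5 * abs(x_field_mm - 13.0)


for copies in (1, 2, 4):
    print(copies, delay_spread_within_chip(example_delay, 26.0, copies))
```

Under these assumptions the spread available to the placer shrinks as more copies share the field, which is in line with the diminishing improvement reported for smaller chips.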
CONCLUSION AND ONGOING WORK
We proposed an accurate aberration-aware timing analysis flow and a novel aberration-aware timing-driven placement technique, AberrPl, as a practical and effective approach to improve timing yield after manufacturing. We implement our method based on a general analytical placement framework and test it within a standard industry flow using leading-edge tools. For two benchmark designs in 90nm technology, AberrPl achieves an average improvement of ∼5% in minimum clock cycle time with a wirelength increase of ∼2% on average. The benefits of AberrPl are expected to increase in future technology nodes. Our ongoing work explores other aberration-aware techniques to increase timing and leakage yield and to increase the value per wafer when chips may be speed-binned. We also plan to enhance traditional model-based OPC (that is applied at chip level) to minimize aberration-induced variations.
Table 4: Results of aberration-aware placements (AberrPl WL) with a variety of scaling factors for circuit AES.
