Ultra-deep submicron manufacturability impacts physical design (PD) through complex layout rules and large guardbands for process variability; this creates requirements for new manufacturing-aware PD technologies. The first part of this tutorial reviews PD complications and methodology changes, notably in the detailed routing arena, that arise from subwavelength lithography and deep-submicron manufacturing (antennas, metal planarization and mask-wafer mismatch). Process variations and their sources are taxonomized for modeling and simulation. A framework of design for cost and value is described. The second part covers yield-constrained optimizations in PD, especially "beyond corners" approaches that escape today's pessimistic or even incorrect corner-based approaches. Statistical timing and noise analyses enable optimization of parametric yield and reliability. Yield-aware cell libraries and "analog" design rules (as opposed to "digital", 0/1 rules) can help designers explore yield-cost tradeoffs, especially for low-volume parts. We then examine performance impact-limited fill insertion, which goes beyond mere capacitance rules. Modeling, objectives, and filling strategies are discussed. Finally, we discuss current and near-term prospects for the overall design-to-manufacturing PD methodology. Key aspects include better integration with analysis and manufacturing interfaces, cost-benefit tradeoffs for "regular" layout structures that are likely beyond 90nm, cost optimizations for low-volume production, and the role of robust and/or stochastic optimization in PD.
1 Introduction
Moore's law continues to drive higher performance with smaller circuit features. Aggressive technology scaling has introduced new variation sources and made process variation control more difficult. As a result, semiconductor manufacturing equipment will be strained to maintain constant process variation levels in future technology nodes. Despite the relaxation of some 3-sigma tolerances, there are no known solutions for a number of near-term variability control requirements (according to the ITRS [5]). Consistent improvements in the resolution of optical lithography techniques have been a key enabler of the continuation of Moore's law. However, as minimum feature sizes continue to shrink, the wavelength of light used in modern lithography systems is no longer several times larger than the minimum line dimensions to be printed; e.g., today's 130nm CMOS processes use 193nm exposure tools. As a result, modern CMOS processes are operating in a sub-wavelength lithography regime.
Increasingly complex design rules impact detailed routing, physical verification, resolution enhancement techniques (RET) and mask data preparation (MDP). The loss in design tool quality, as well as in design productivity, has resulted in increased project uncertainty and manufacturing NRE. Designer, EDA, and process communities must cooperate and co-evolve to maintain the cost (value) trajectory of Moore's law. At the 90nm-to-65nm transition, this is a matter of survival for the worldwide semiconductor industry. There needs to exist a bidirectional design-manufacturing data pipe with cost and value as the fundamental drivers. Limits of the mask flow need to be passed on to design, while functional intent needs to be fed into the mask flow.
The next section briefly discusses PD complications due to deep-submicron manufacturing issues, notably subwavelength lithography and process variation. Section 3 outlines various elements of yield-constrained design optimization. Section 4 reviews area fill insertion methods that minimize performance impact. Finally, Section 5 describes prospects for closer links between mask and process technology and the physical design flow.
2
This section of the paper reviews recent PD complications, and their impact on methodology, that arise from subwavelength lithography and deep-submicron manufacturing. We also discuss the mask NRE component of design cost, and taxonomize process variations and their sources for modeling and simulation. This leads us to a framework of design for cost and value.
2.1 Subwavelength Lithography
Optical lithography is being pushed to new extremes, with 193nm lasers currently used to fabricate devices with dimensions of 50nm or less. The extension of optical lithography has been enabled by several developments such as chemically amplified photoresists and anti-reflective coatings. By predicting the physical phenomena (especially diffraction and interference) behind optical systems and systematically compensating for them, the minimum feature and pitch that can be resolved are significantly extended. These resolution enhancement techniques (RETs) are aimed at three major optical wave components, namely direction, amplitude and phase. One consequence is the impact on routing algorithms. As we move into the nanometer regime, requirements such as spacing rules, reliability rules and process antenna rules impose severe constraints on routing algorithms [19]. Some of these restrictive rules are listed below.
"Antennas" are formed by metal traces that accumulate static charge during manufacturing. Without a safe discharge path (through the reverse-biased diode at the output stage of a logic gate) any connected gate may be damaged due to electrostatic discharge. "Antenna rules" establish maximum allowable ratios of metal area to gate area in the absence of discharge path. The pure router-based solution is bridging (layerhopping) to limit the amount of metal connected to a gate; this creates more wiring, vias and congestion.
The combined router- and library-based solution is to drop reverse-biased diodes (source-drain contacts) close to the gate, i.e., (ECO substitution of) dioded cell variants, with negative area and power implications. Tightening of antenna ratios has lowered completion rates of detailed routers and led to more antenna waivers [12].
Should liberal use of dioded cells be required, there will be high costs with respect to chip area and power metrics as well as non-trivial balancing of two sources of yield loss: increased die area versus antenna damage.
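As an illustration of the kind of check a router must maintain incrementally, the following sketch evaluates a per-layer antenna ratio for a net. The rule format, layer names, and the maximum ratio of 400 are illustrative assumptions, not values from any particular process.

```python
# Hypothetical antenna-rule check: ratio of metal area connected to a gate
# (per routing layer, before an upper-layer discharge path exists) to gate area.

MAX_ANTENNA_RATIO = 400.0  # assumed limit; real processes specify per-layer limits

def antenna_violations(gate_area_um2, metal_area_by_layer, has_diode=False):
    """Return layers whose accumulated-metal-to-gate-area ratio exceeds the limit.

    metal_area_by_layer: dict layer_name -> metal area (um^2) electrically
    connected to the gate while that layer is the topmost patterned layer.
    A reverse-biased diode (dioded cell variant) provides a discharge path,
    so no violation is flagged in that case.
    """
    if has_diode or gate_area_um2 <= 0:
        return []
    violations = []
    for layer, metal_area in metal_area_by_layer.items():
        ratio = metal_area / gate_area_um2
        if ratio > MAX_ANTENNA_RATIO:
            violations.append((layer, ratio))
    return violations

# Example: a long metal2 trace connected to a small gate violates the rule;
# "bridging" to metal3 near the gate would shrink the metal2 area charged
# against this gate, while a dioded cell variant removes the check entirely.
print(antenna_violations(0.5, {"metal1": 80.0, "metal2": 350.0}))
print(antenna_violations(0.5, {"metal1": 80.0, "metal2": 350.0}, has_diode=True))
```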
Via stacking and minimum area rules arise because stacking of vias through multiple layers can cause minimum-area violations with respect to stacking-dependent alignment tolerances. Signal routing layers are often divided into local, intermediate, and global layers; layers within the same group have the same pitches and parasitics. At the highest layer of a given group, the overhang of the "up-via" can be significantly larger than that of the "down-via" [19]. In addition, use of multiple-cut via cells to increase BEOL yield is complicated by dependencies on the layers and wire segment widths to be connected.
Width- and length-dependent spacing rules make minimum spacing a function of both wire width and the length of parallel adjacencies. This means that edge costs during heuristic search are dependent on path history. Especially pernicious are influence rules (stub rules, halo rules), where a wide wire influences the spacing rule within its surroundings. This results in strange jogs and spreading when wires enter an influenced area, as well as complicated ECO effects. Another aspect of reliability which is gaining prominence is resist pattern collapse [12, 38]: resist features collapse upon formation at high aspect ratios. Pattern collapse probability is length-dependent, which contributes to length-dependent spacing rules: longer parallel runs of wires require more spacing. Inserting jogs in the routing can avoid such effects.
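A minimal sketch of how a detailed router might encode width- and length-dependent spacing: the table values below are invented for illustration, but they show why edge costs become history-dependent, since the required spacing cannot be known until both the neighbor's width and the accumulated parallel run length are known.

```python
# Hypothetical width/length-dependent spacing table (all values in nm, illustrative).
SPACING_TABLE = [
    # (width >=, parallel run >=, required spacing)
    (0,    0,    140),
    (200,  0,    200),   # wide-wire "influence": wider wires demand more space
    (200,  1000, 300),   # long parallel runs next to wide wires demand even more
    (400,  2000, 500),
]

def required_spacing(wire_width, parallel_run_length):
    """Return minimum spacing to a neighbor of given width after a given
    parallel adjacency length. Takes the largest spacing of all matching rows."""
    spacing = 0
    for w, prl, s in SPACING_TABLE:
        if wire_width >= w and parallel_run_length >= prl:
            spacing = max(spacing, s)
    return spacing

# During maze routing the accumulated parallel run length becomes part of the
# search state, so the same grid edge can have different legality or cost
# depending on the path taken to reach it.
print(required_spacing(100, 5000))   # narrow neighbor: 140
print(required_spacing(250, 500))    # wide neighbor, short run: 200
print(required_spacing(250, 1500))   # wide neighbor, long run: 300
```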
2.2 Process Variation: Taxonomy and Simulation
Aggressive technology scaling has introduced new variation sources and made process control more difficult. As a result, future technology nodes are expected to see increased process variation and decreased predictability of nanometer-scale circuit performance [5, 13, 2]. Variability impact extends beyond performance alone; for instance, leakage power has an exponential dependence on gate length and hence on its variation. Circuit variability arises from chip fabrication or from circuit operation itself.
Based on inherent spatial scales, parametric variations are separated into two categories [1]: inter-die and intra-die variation.
Inter-die variation is the difference in the value of a parameter across nominally identical die. These may be fabricated on the same wafer, on different wafers, or in different lots. It is mostly design-independent and is related to equipment properties, wafer placement, processing temperatures, etc. [28].
Intra-die variation is the deviation occurring spatially within any die. Such variation can arise from wafer-level trends as well as layout-pattern dependencies.
Fluctuations in channel doping, gate oxide thickness, and ILD permittivity are primarily due to random variation. This type of variation is likely to have spatial correlation, making nearby devices more similar than ones that are across the die from one another.
Both kinds of variation can have systematic and random components. An example of systematic wafer-scale variation is the bowl-shaped or slanted-plane nature of some deposition processes. Similarly, on the die scale, variation due to layout patterns, which arises from the design, is a systematic and largely predictable effect. When simulating the impact of variability using Monte Carlo or other means, a few cautions are in order.
Accounting for systematic variation correctly. If the nature of systematic variation is known, the distribution of delay variation due to the systematic source can be accurately modeled. For instance, using a symmetric Gaussian distribution for a variation source which is known to follow a bowl-shaped profile across the wafer is a modeling error, since there are more die at the periphery of the wafer than at the center. Filtering out systematic within-die variation is also important; whether measured variation has been correctly decomposed into its systematic and random components is an important question to ask.
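The following sketch illustrates this caution: sampling die locations uniformly over the wafer area (so that more die fall near the periphery) and adding a bowl-shaped systematic component plus a random component gives a delay distribution that a single symmetric Gaussian would misrepresent. All numerical values are illustrative assumptions.

```python
import math
import random

WAFER_RADIUS = 100.0       # arbitrary units
BOWL_DEPTH = 0.05          # assumed 5% systematic delay increase at the wafer edge
RANDOM_SIGMA = 0.02        # assumed 2% sigma for the random component
NOMINAL_DELAY = 1.0

def sample_die_delay(rng):
    # Uniform sampling over wafer *area*: radius ~ R*sqrt(U), so more die land
    # near the periphery than a naive uniform-radius model would suggest.
    r = WAFER_RADIUS * math.sqrt(rng.random())
    systematic = BOWL_DEPTH * (r / WAFER_RADIUS) ** 2   # bowl-shaped profile
    random_part = rng.gauss(0.0, RANDOM_SIGMA)
    return NOMINAL_DELAY * (1.0 + systematic + random_part)

rng = random.Random(0)
delays = [sample_die_delay(rng) for _ in range(100000)]
mean = sum(delays) / len(delays)
sigma = math.sqrt(sum((d - mean) ** 2 for d in delays) / len(delays))
print(f"mean delay = {mean:.4f}, sigma = {sigma:.4f}")

# Modeling the same population as a symmetric Gaussian centered at the
# wafer-center value (1.0) would misplace the mean and misestimate yield
# at a fixed selling-point delay.
selling_point = 1.05
print("parametric yield at 1.05:",
      sum(d <= selling_point for d in delays) / len(delays))
```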
2.3 Mask Cost Model
With the growing complexity and expense of equipment, microlithography comprises over 30% of the cost of a new fabrication facility [30]. According to [32], the major contributors to mask cost are:
1. low mask yield (due to OPC and PSM as well as stringent CD requirements);

2. increased data preparation time;

3. equipment cost; and
4. low equipment throughput.

A value function V(f) gives the market value of the chip for some performance measure f (e.g., speed, power). The total value of a given process is then obtained by weighting V(f) by the parametric yield at each achievable performance point and summing over f. Optimizing nominal performance and optimizing this total value may not be equivalent. This calls for research into such probabilistic optimizations, as well as efforts to quantify the potential value and costs associated with both manufacturing and design solutions to the process variability issue.
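A minimal sketch of this design-for-value view, under assumed numbers: parts are binned by achieved speed, each bin has a market value, and expected value per die is the sum of bin values weighted by the fraction of parts landing in each bin. Two designs with the same nominal performance can then differ in total value once the spread is accounted for.

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical speed bins (GHz) and their market values ($ per good die),
# listed fastest bin first.
BINS = [(1.2, 50.0), (1.0, 30.0), (0.8, 15.0)]

def total_value_per_die(mu_speed, sigma_speed):
    """Expected value per die assuming die speed ~ N(mu, sigma); a die is sold
    in the fastest bin whose minimum speed it meets, otherwise it is scrapped."""
    value = 0.0
    prev_fraction = 0.0
    for min_speed, bin_value in BINS:
        frac_at_least = 1.0 - normal_cdf(min_speed, mu_speed, sigma_speed)
        value += (frac_at_least - prev_fraction) * bin_value
        prev_fraction = frac_at_least
    return value

# Same nominal speed, different variability: expected value differs, so
# optimizing the nominal point and optimizing total value are not
# equivalent objectives.
print(total_value_per_die(1.1, 0.05))
print(total_value_per_die(1.1, 0.15))
```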
3 Yield-Constrained Optimization in PD
This section of the paper covers yield-constrained optimizations in PD, especially "beyond corners" approaches that escape today's pessimistic or even incorrect corner-based approaches. Statistical timing and noise analyses enable optimization of parametric yield and reliability. Yield-aware cell libraries and "analog" design rules (as opposed to "digital", 0/1 rules) can help designers explore yield-cost tradeoffs, especially for low-volume parts.
A great deal of research effort has been spent on deterministic design optimization. Aggressive nominal optimization can harm the design yield or lead to drastic overdesign. One way to correct this is to model systematic and random effects and explicitly optimize for yield. Another way is to design variation-insensitive circuits. In current design methodologies, well-tuned circuits have a large number of equally critical paths. In the variation regime, the circuit delay is determined by the maximum of the path delay distributions. Circuit delay variance thus increases with the number of equally critical paths. As a result, the circuit yield at the desired selling-point performance may be much less than optimal even though the nominal optimum has been achieved.
The authors of [20] present a sizing approach which penalizes having a large number of equally critical paths, to avoid a high "wall" of critical paths. Two important components of a yield-based optimization are a statistical timer and usable models of variation in terms of libraries and design rules. We discuss these further in this section.
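The "wall of critical paths" effect is easy to reproduce numerically. In the sketch below (all distributions assumed, for illustration only), circuit delay is the maximum over N nominally identical, independent Gaussian path delays; as N grows, the mean of the maximum shifts right and yield at a fixed selling point drops even though each path's nominal delay is unchanged.

```python
import random

def circuit_delay_samples(num_paths, mu=1.0, sigma=0.05, trials=20000, seed=1):
    """Monte Carlo samples of circuit delay = max over equally critical paths.
    Paths are modeled as independent N(mu, sigma) delays (an assumption;
    real paths are correlated, which softens but does not remove the effect)."""
    rng = random.Random(seed)
    return [max(rng.gauss(mu, sigma) for _ in range(num_paths))
            for _ in range(trials)]

selling_point = 1.10  # fixed target delay
for n in (1, 10, 100):
    d = circuit_delay_samples(n)
    mean = sum(d) / len(d)
    yld = sum(x <= selling_point for x in d) / len(d)
    print(f"{n:4d} equally critical paths: mean delay {mean:.3f}, "
          f"yield at {selling_point} = {yld:.3f}")

# Aggressive nominal optimization that equalizes many paths therefore trades
# away parametric yield, motivating sizing approaches that penalize a high
# "wall" of critical paths [20].
```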
3.1 Statistical Timing Analysis
Traditionally, inter-die variation has been handled by case analysis, and intra-chip variability is accounted for by heuristic derating factors which slow down data with respect to clock or vice versa. This is implemented as a linear combination of delays in IBM's EinsTimer and as an on-chip variation mode in Synopsys' PrimeTime [21]. There are a number of reasons why the conventional deterministic timing analysis paradigm is breaking down. With fast-scaling critical dimensions, the variability in physical parameters is increasing. Temperature-, IR drop- and coupling noise-induced variability, though systematic, is difficult to analyze. Timing runs incorporating all these effects are mandatory in current timing sign-off, and trying to worst-case all these variables gives rise to a formidable number of corners. Time-to-market constraints do not allow such an exploding number of static timing runs for timing verification. Besides having feasibility issues, corner-based analysis suffers from pessimism, but at the same time can be not "pessimistic enough" because it is impossible to exhaustively enumerate all possible cases. A simple but typically overlooked example is that of computing worst-case clock skew: if two clock paths are not identical and do not track perfectly, then the peak skew between them may not occur at any of the delay corners. This problem, which is the subject of some of our current research, is only further complicated by intra-chip variability.
The solution to these problems lies in statistical timing analysis, which, if implemented correctly, can reduce pessimism as well as improve timing verification turnaround time. Yield loss can be classified as catastrophic or functional (occurring due to dust particles and other random defects during manufacturing, rendering the chip nonfunctional) versus parametric (causing the chip to function with a range in performance figures of merit). Statistical timing predicts parametric yield. A statistical timer essentially propagates probability distributions. The probability distributions may be of sources of variation (such as Leff, tox, etc.) or of the performance measure (such as delay) itself.
It also has to take into account correlations between these distributions. The output is a circuit delay distribution or a distribution of slacks at each of the outputs. The major bottleneck in statistical timing analysis is efficiently computing the maximum of two correlated probability distributions. Approaches in the literature range from smart Monte Carlo [24] to bounding distributions [25, 23].
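A minimal sketch of this core operation, assuming Gaussian arrival times: the moment-matching approximation attributed to Clark (1961) gives the mean and variance of max(X, Y) for correlated Gaussians, here checked against brute-force Monte Carlo. Real statistical timers must also propagate correlations forward through the circuit graph, which this sketch omits.

```python
import math
import random

def phi(x):   # standard normal pdf
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):   # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_of_gaussians(mu1, s1, mu2, s2, rho):
    """Clark-style mean/variance of max(X, Y) for correlated Gaussians X, Y."""
    theta = math.sqrt(s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2)
    if theta == 0.0:                       # perfectly tracking arrivals
        return max(mu1, mu2), s1 * s1
    a = (mu1 - mu2) / theta
    mean = mu1 * Phi(a) + mu2 * Phi(-a) + theta * phi(a)
    second = ((mu1 * mu1 + s1 * s1) * Phi(a)
              + (mu2 * mu2 + s2 * s2) * Phi(-a)
              + (mu1 + mu2) * theta * phi(a))
    return mean, second - mean * mean

def mc_check(mu1, s1, mu2, s2, rho, n=200000, seed=3):
    """Monte Carlo check of max(X, Y) with the same correlation."""
    rng = random.Random(seed)
    vals = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mu1 + s1 * z1
        y = mu2 + s2 * (rho * z1 + math.sqrt(1 - rho * rho) * z2)
        vals.append(max(x, y))
    m = sum(vals) / n
    v = sum((t - m) ** 2 for t in vals) / n
    return m, v

print(max_of_gaussians(10.0, 1.0, 10.5, 1.5, 0.3))
print(mc_check(10.0, 1.0, 10.5, 1.5, 0.3))
```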
An important improvement in deterministic as well as statistical timing analysis can be accounting for systematic variation. Examples of such variation are through-pitch and through-focus CD variation. Pattern-dependent linewidth variation arising from iso-dense bias is predictable after placement. A limited but representative set of environments for each cell can be simulated for the printed wafer image. The results can be used to predict the CD of each gate instance and hence the delay of each timing arc in the design. An in-context timing analysis can then be performed to yield better accuracy as well as achieve tighter statistical distributions (as through-pitch variation is the major component of CD variation).
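A rough sketch of such in-context analysis, with all table values invented for illustration: each gate's printed CD is predicted from its local pitch environment (iso-dense bias), and the cell delay is then derated accordingly before timing.

```python
# Hypothetical iso-dense bias table: local pitch (nm) -> printed poly CD delta (nm).
# A real flow would obtain these values from lithography simulation of
# representative cell environments after placement.
ISO_DENSE_BIAS = {250: -3.0, 400: +1.0, 600: +4.0, "isolated": +6.0}

NOMINAL_CD_NM = 65.0
DELAY_SENSITIVITY = 0.012   # assumed fractional delay change per nm of CD change

def in_context_delay(nominal_delay_ps, local_pitch_nm):
    """Scale a timing arc's nominal delay by the CD predicted for this
    instance's pitch context (nearest table entry, 'isolated' if very sparse)."""
    if local_pitch_nm is None or local_pitch_nm > 800:
        bias = ISO_DENSE_BIAS["isolated"]
    else:
        pitch = min((p for p in ISO_DENSE_BIAS if p != "isolated"),
                    key=lambda p: abs(p - local_pitch_nm))
        bias = ISO_DENSE_BIAS[pitch]
    printed_cd = NOMINAL_CD_NM + bias
    return nominal_delay_ps * (1.0 + DELAY_SENSITIVITY * (printed_cd - NOMINAL_CD_NM))

# Dense and isolated instances of the same cell get different arc delays,
# tightening the residual (random) variation the statistical timer must cover.
print(in_context_delay(100.0, 250))   # dense context: smaller CD, faster
print(in_context_delay(100.0, None))  # isolated context: larger CD, slower
```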
3.2 Yield-Aware Libraries and Design Rules
The usual medium of communication between design and process development is a set of design rules. In the presence of RET and other manufacturing constraints, these rules are becoming overly restrictive. Given adequate models of MDP, RET and litho flows, design tools can and should optimize parametric yield, $/wafer, and profits. The prerequisites for such interesting optimizations are "analog" rules and yield-aware libraries. Current design rules are hard constraints which result in a nightmarish number of yes/no checks at the physical verification stage. Instead, what is required are degrees of meeting various design rules, with an indication of the corresponding yield penalty. Design rules are decided by the process community, which is often oblivious to design requirements; this results in huge pessimism in the resulting rules. A cognizance of yield loss mechanisms (both parametric and catastrophic) is required in design so that the penalty of not being 100% design-rule correct can be computed. An example of such a pessimistic rule is the metal density constraint, which is the same throughout the design. A metal density versus copper thickness tradeoff curve can be more useful for a design tool, which can then use area fill more intelligently based on the timing criticality of nets.
Yield-aware timing libraries are absolutely necessary for statistical timing analysis. These libraries can range from specifying simple (mu, sigma) pairs for each timing arc (assuming a Gaussian distribution of performance) to specifying complex distributions that decompose variability into its various components (systematic versus random, intra-die versus inter-die) and the various correlations between them. The complexity of the yield library is a tradeoff between the complexity of process characterization and statistical timer implementation, and the accuracy of the result. Care is required to pick the right point on this tradeoff curve so that after statistical timing, the result is not meaningless (i.e., an incorrect distribution instead of incorrect corner points).
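As a sketch of the simplest useful library format implied above (structure and field names are assumptions, not any vendor's format): each timing arc carries a nominal delay plus a variance decomposed into inter-die, correlated intra-die, and independent random components, which is what a correlation-aware statistical timer needs to combine arcs correctly.

```python
import math
from dataclasses import dataclass

@dataclass
class YieldAwareArc:
    """One timing arc of a hypothetical yield-aware library.
    Delay model: d = mu + X_inter + X_intra_corr + X_rand, all Gaussian."""
    name: str
    mu_ps: float
    sigma_inter_ps: float        # shared by all arcs on a die
    sigma_intra_corr_ps: float   # spatially correlated within the die
    sigma_rand_ps: float         # independent per instance

    def total_sigma(self):
        return math.sqrt(self.sigma_inter_ps ** 2
                         + self.sigma_intra_corr_ps ** 2
                         + self.sigma_rand_ps ** 2)

def path_delay(arcs):
    """Mean/sigma of a path: inter-die parts add linearly (perfectly correlated),
    independent random parts add in quadrature; correlated intra-die parts are
    treated here as fully correlated along the path (a simplifying assumption)."""
    mu = sum(a.mu_ps for a in arcs)
    inter = sum(a.sigma_inter_ps for a in arcs)
    intra = sum(a.sigma_intra_corr_ps for a in arcs)
    rand = math.sqrt(sum(a.sigma_rand_ps ** 2 for a in arcs))
    return mu, math.sqrt(inter ** 2 + intra ** 2 + rand ** 2)

inv = YieldAwareArc("INVX1 A->Y rise", 42.0, 2.0, 1.5, 1.0)
nand = YieldAwareArc("NAND2X1 A->Y fall", 55.0, 2.5, 2.0, 1.2)
print(path_delay([inv, nand, inv]))
```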
Another aspect of future library design is making cells variation-aware. For instance, current industry practice is to design entire standard cell libraries with the minimum critical poly width permissible by the technology. The advantage is a better performance-to-area ratio. The important thing to note here is that CD variation due to lithography is "absolute". Therefore, an intentional increase in Leff can lead to cells which are less susceptible to variation. Such cells would have a poorer performance-to-area ratio but better leakage and predictability characteristics. A select number of variation-sensitive cells in the design can be replaced with these more reliable versions from the library. Moreover, certain transitions in a cell will tend to be better controlled.
For example, a transition involving only PFET fingers of an inverter which are densely packed will tend to time close to nominal (since the lithography process is better controlled for tighter pitches). The relative sensitivities of various timing arcs within a cell can be used in the synthesis flow to yield less variation-sensitive designs. Finally, it is also of interest to generate cell libraries which are conducive to application of post-tapeout RET [6] .
4 Performance-Impact Limited Fill Insertion
Chemical-mechanical planarization (CMP) and other manufacturing steps in nanometer-scale VLSI processes have varying effects on device and interconnect features, depending on local attributes of the layout. To improve manufacturability and performance predictability, foundry rules require that a layout be made uniform with respect to prescribed density criteria, through insertion of area fill ("dummy") geometries.
Work at MIT Microsystems Technology Laboratories [16] proposes a rule-based area fill methodology. To minimize the added interconnect capacitance resulting from fill, a dummy fill design rule is found by modeling the effects on interconnect capacitance of different design rules (which are consistent with the fill pattern density requirement). Work at Motorola by Grobman et al. [7] points out that the main parameters influencing the change in interconnect capacitance due to fill insertion are feature ("block") sizes and proximity to interconnect lines. The larger the size of the block, the larger the consequent interaction between interconnect lines. Similarly, the closer blocks are to interconnect lines, the stronger their interaction will be. Lee et al. [15] describe the methodology used at Samsung for chip-level metal fill modeling. Their approach replaces the metal fill layer by an effective (i.e., equivalent) high-k dielectric. The increments of capacitance due to floating metal fill are dependent on the signal line width and spacing, inter-metal dielectric thickness and permittivity, density of metal fill, metal fill feature size, and metal layer thickness. RC extraction results in [15] show that the total interconnect capacitance increase can be up to 15% for some nets in a 0.18um design. Thus, floating dummy metal fill should be included in chip-level RC extraction and timing analysis to avoid timing errors.
The first work aimed at true performance-aware fill insertion was presented in [10]. The authors give the first formulations of the Performance Impact Limited Fill (PIL-Fill) problem, with the objective of either minimizing total delay impact (MDFC) or maximizing the minimum slack over all nets (MSFC), subject to inserting a given prescribed amount of fill. Using integer linear programming as well as efficient greedy techniques, [10] achieves up to 90% reduction in the total delay overhead of dummy fill. In the MSFC formulation, to maximize minimum slack over all nets in the post-fill layout, we propose an Iterated Greedy approach based on iterations between the static timing analysis (STA) tool and the area fill synthesis. Capacitance impact due to fill feature insertion during area fill synthesis is written in Reduced Standard Parasitic Format (RSPF) as a file input to the STA tool. The results given in [10] suggest that performance awareness in fill insertion can result in significant timing improvements without compromising layout density. This work ignores fill impact on fringing and overlap capacitances; a more accurate fill insertion method with cognizance of multiple layers remains an open research question.
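The sketch below is not the algorithm of [10], but a greatly simplified greedy allocation in the same spirit: a prescribed total amount of fill is distributed across layout windows, each of which has a minimum-density deficit and a delay-impact weight derived from the timing criticality of nearby nets, so that fill is steered toward windows where it hurts timing least.

```python
import heapq

def greedy_pil_fill(windows, total_fill):
    """windows: list of dicts with keys
         'deficit'  - fill area needed to reach the minimum density rule,
         'capacity' - fill area allowed before hitting the maximum density rule,
         'delay_cost_per_unit' - estimated delay impact per unit of fill
                                 (e.g., high near small-slack nets).
       First satisfy every window's minimum-density deficit (mandatory),
       then place the remaining budget into the cheapest windows by delay cost."""
    allocation = [w['deficit'] for w in windows]
    remaining = total_fill - sum(allocation)
    if remaining < 0:
        raise ValueError("budget cannot even satisfy minimum-density rules")
    # Min-heap of (delay cost, window index) for discretionary fill.
    heap = [(w['delay_cost_per_unit'], i) for i, w in enumerate(windows)]
    heapq.heapify(heap)
    while remaining > 0 and heap:
        cost, i = heapq.heappop(heap)
        room = windows[i]['capacity'] - allocation[i]
        add = min(room, remaining)
        allocation[i] += add
        remaining -= add
    return allocation

windows = [
    {'deficit': 10, 'capacity': 40, 'delay_cost_per_unit': 0.8},  # near critical net
    {'deficit': 5,  'capacity': 50, 'delay_cost_per_unit': 0.1},  # slack to spare
    {'deficit': 0,  'capacity': 30, 'delay_cost_per_unit': 0.3},
]
print(greedy_pil_fill(windows, 60))
```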
A caveat here is that all fill insertion approaches currently rely on foundry-specified minimum and maximum layout densities. These rules are not aware of the design, as a result of which they tend to be pessimistic. Accurate CMP models, as well as multiple levels of density control, can help achieve less pessimistic and timing-correct fill insertion.
5 Manufacturing-Aware Physical Design Futures
In this section we describe futures for manufacturing-aware PD. This includes cost-benefit tradeoffs for "regular" layout structures that are likely beyond 90nm, cost optimizations for low-volume production, and the role of robust and/or stochastic optimization in PD. Future physical design needs to be tied more closely to manufacturing, and several novel objectives may need to be considered. For instance, "fracturing-aware design" may be beneficial, whereby OPC, phase-shifter, and functional feature shapes are chosen or perturbed for reduced shot count. Layouts can also be stretched (via insertion of submicron-scale "dead space") to help definition of major field boundaries (or soft field boundaries) for mask writing. Our current research develops such optimizations. More complex extraction and characterization capabilities may also be required; for example, extraction and characterization of nonuniform poly CD can yield better estimates of timing and leakage.
5.1
Traditionally, design, mask making and process engineering have depended on rule sets to isolate themselves from having to understand one another's technology. With the number and complexity of these design rules exploding and yields ever decreasing, the traditional isolated deterministic design paradigm is breaking down. Close interaction between manufacturing, mask and design communities is inevitable. Figure 2 shows a near-term design-to-manufacturing flow.
5.2 Regular Layout Fabrics
As uncertainty increases and design guardbanding approaches ridiculous extents, it is increasingly important to ensure predictable printability. New solutions being explored at the design end include regularity.
5.3
Traditionally, no concept of function is injected into the mask flow. Thus, mask writers work equally hard in perfecting a dummy fill shape, a piece of the company logo, a gate in a critical path, and a gate in a non-critical path; errors in any of these shapes will trigger rejection of the mask in the inspection tool. One can instead envision an algorithm to analyze a design and output an allowable edge placement error (EPE) for every edge in the layout to meet a given yield or guardbanding constraint. These instance-specific EPEs can then be used in the OPC flow for faster, less pessimistic OPC. Another related objective of physical design can be to maximize the minimum CD tolerance over the whole layout. This can lead to a better process window for lithographers, as the process window is predominantly determined by the CD tolerance specification. Typically, a process is tuned to print a particular pitch very well. Moreover, this tuned pitch may be changed (for example, by changing the nominal exposure) on a design-to-design basis. Physical design tools can then help choose the most critical pitch in the design, which needs the most predictable and accurate printability. Another yet-to-be-explored area of optimization in CAD is designing variation-insensitive solutions to design problems. One example is designing variation-insensitive power distribution networks. A power distribution network obeys GV = I, where G is the conductance matrix, V the vector of node voltages, and I the vector of current loads; a perturbation E of the conductance matrix must maintain the structure of the conductance matrix. The quantity ||G|| ||G^-1|| is referred to as the condition number of G. It is an indicator of the robustness of the IR-drop solution with respect to small variations in the conductance matrix as well as the currents. Systematic perturbation of the power mesh conductance matrix G to yield a better condition number can lead to more robust power distribution networks. Such probabilistic and robust optimization methods can be key to future CAD algorithms.
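To make the robustness measure concrete, the sketch below (assuming NumPy is available) builds the conductance matrix of a tiny assumed resistive grid, computes its condition number ||G|| ||G^-1||, and shows how the IR-drop solution of GV = I moves under a small structure-preserving perturbation of G. All values are illustrative only.

```python
import numpy as np

def grid_conductance(n, g_branch, g_pad):
    """Conductance matrix of an n x n resistive mesh: g_branch between
    orthogonal neighbors, g_pad from every node to the supply pad (keeps G
    nonsingular). Illustrative construction only."""
    N = n * n
    G = np.zeros((N, N))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            G[i, i] += g_pad
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < n and cc < n:
                    j = rr * n + cc
                    G[i, i] += g_branch
                    G[j, j] += g_branch
                    G[i, j] -= g_branch
                    G[j, i] -= g_branch
    return G

n = 6
G = grid_conductance(n, g_branch=1.0, g_pad=0.05)
I = np.full(n * n, 0.01)            # uniform current loads, illustrative
V = np.linalg.solve(G, I)           # nominal IR-drop solution of G V = I

cond = np.linalg.norm(G, 2) * np.linalg.norm(np.linalg.inv(G), 2)
print(f"condition number ||G|| ||G^-1|| = {cond:.1f}")

# Perturb branch conductances by 5% (a structure-preserving perturbation,
# since only existing mesh entries of G change) and re-solve.
G_pert = grid_conductance(n, g_branch=1.05, g_pad=0.05)
V_pert = np.linalg.solve(G_pert, I)
print(f"max IR-drop change under 5% branch perturbation: "
      f"{np.max(np.abs(V_pert - V)):.4f}")

# A better-conditioned grid (e.g., stronger pad connections) is less sensitive.
G_strong = grid_conductance(n, g_branch=1.0, g_pad=0.5)
cond2 = np.linalg.norm(G_strong, 2) * np.linalg.norm(np.linalg.inv(G_strong), 2)
print(f"with stronger pad ties, condition number = {cond2:.1f}")
```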
