ATPG tools generate test vectors assuming a zero-delay model for logic gates. In reality, however, gates have finite rise and fall delays that depend on process, voltage, and temperature variations across different dies on a wafer and within a die. A test engineer must verify the vectors for timing correctness before they are handed off to the product engineer. Currently, tests are validated by dynamic simulation of the circuit using the test vectors. A test vector is invalidated if it cannot reliably distinguish between a good and a faulty circuit under the signal placement and observation error window of the tester equipment. Since structural tests can result in much more switching activity in the circuit than what is estimated during normal functioning, the IR drop in the power and ground lines can be significant, adversely impacting path delays. As a result, the validation performed by simulation can be error prone. Oversizing the power rails to address this problem impacts the yield. We therefore propose the verification of test vectors for IR drop failure and present a flow for identifying failing vectors. Attempting to address this verification in dynamic simulation would force the use of circuit simulation or mixed-level simulation techniques, which are expensive in terms of run time. We discuss a static approach to validate the test vectors for failure in the presence of IR drop problems.
Introduction
In order to simplify the process of test vector generation, an ATPG tool assumes that logic gates and wires are ideal components with no parasitic delays. Real gates have delays that depend on the manufacturing process used (strong or weak) as well as operating conditions such as temperature and voltage. The operating conditions can vary from one die to another, and within the die. A system-on-chip design built using submicron technology exhibits several nonlinear effects such as crosstalk coupling between wires, resistive drop in the power supply lines, dynamic voltage drop due to power line inductance, and so on. A test vector becomes invalid if it cannot reliably distinguish between a good and a faulty circuit under the signal placement and observation error window of the tester equipment. After a test vector is applied, the output of a gate may glitch one or more times before settling down to the final value. The presence or absence of glitches and the settling delay can vary from one instance of the circuit to another due to variations in Process-Voltage-Temperature (PVT). Due to these problems, test vectors are validated for timing stability through simulation at several PVT corners.
Unfortunately, simulation is a CPU-intensive task since the circuits are large (several million gates), the number of test vectors is large (several million vectors), and the number of PVT corners is large. This is despite the fact that logic simulators do not comprehend delay variations due to deep submicron effects such as crosstalk and IR drop. The quality check made on test vectors, therefore, is suspect if the impact of second order effects is substantial. Normally, a designer builds extra margin into the design to overcome these problems; e.g., the power lines can be made wider to reduce the impact of IR drop.
At-speed tests [3] are at greatest risk of IR-drop related tester failure. In this paper, our objective is to describe a static approach that can factor in the impact of IR drop during the validation of transition delay fault test vectors. We believe ours is the first attempt to solve this problem. Since we avoid simulation of netlists, our solution is much faster. Essentially, our approach is to eliminate a large number of vectors based on a short-listing algorithm that estimates the switching activity on power rails for each vector and rejects vectors that result in more switching activity than a rail can tolerate. This algorithm takes into account the strengths of the power rails. The vectors so short-listed are analyzed for toggle count, and the resulting IR drops are analyzed using an IR drop estimation tool.
The paper is organized as follows. In the following section, we briefly introduce the background required for the paper. Section 3 explains the proposed flow for validation of transition delay tests and the algorithms used for estimating the power rail switching. Implementation of the flow is explained in Section 4. Results are included in Section 5. Conclusions are presented in Section 6.
Background
At-speed testing verifies the timing correctness of the manufactured circuit.
This form of testing is critical for nanometer technologies since their timing is impacted due to a variety of reasons such as crosstalk and IR drop. Embedded memories are almost always tested at-speed using built-in self-test.
Apart from functional at-speed tests, path delay and transition delay tests are two forms of structured at-speed tests for logic [2, 3]. In the former, we select a set of paths whose timing is critical to the correct at-speed functioning of the chip and ensure that the effect of rising and falling transitions at the input pin of a path can be observed at the output pin of the path. Path delay testing is expensive since the test data volume for this form of testing is high even for modest coverage. In transition delay testing (also called gate delay testing), the fault model consists of a slow-to-rise transition or a slow-to-fall transition at the output of a gate. The number of transition delay faults is linear in the number of gates, as opposed to the number of path delay faults, which is exponential in the number of gates.
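The linear-versus-exponential contrast can be made concrete with a small counting sketch. The circuit below is a hypothetical toy, not one of the designs discussed in this paper: it chains a number of identical blocks, and in each block the signal fans out to two parallel gates that reconverge at a third gate. The function name `count_faults` is an illustrative choice.

```python
def count_faults(stages: int) -> tuple:
    """Return (transition_faults, path_delay_faults) for a toy circuit.

    Assumed toy structure: `stages` blocks in series; in each block the
    signal fans out to two parallel gates and reconverges at a third gate.
    """
    gates = 3 * stages             # two parallel gates plus one merge gate per block
    paths = 2 ** stages            # each block doubles the number of input-to-output paths
    transition_faults = 2 * gates  # slow-to-rise and slow-to-fall at each gate output
    path_delay_faults = 2 * paths  # a rising and a falling transition per path
    return transition_faults, path_delay_faults


for k in (1, 10, 20):
    print(k, count_faults(k))
```

With 20 blocks, this toy circuit has only 60 gates and 120 transition delay faults, but more than two million path delay faults.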
The test for a transition delay fault in a full-scan logic circuit consists of the following steps, assuming that the fault is a slow-to-rise transition at the output of a gate g. See Figure 1.
(1) Find a scan test vector V1 that places a logic 0 on the output of g. V1 may be applied at a slower speed to avoid excessive dynamic power dissipation due to the switching activity that results from scanning in V1.
(2) Find a way to launch a rising transition at g. Suppose this can be done by applying a test vector V2. In the "launch by capture" method of ATPG, we generate V1 such that V2 is the state of the circuit after applying vector V1 and then switching to the normal mode of operation. Thus, after scanning in V1, we deactivate the Scan Enable signal and do a capture of the circuit state at a slow speed. The transition has now been launched.
(3) We now do a "fast capture" and capture the state information in the scan flops after time T, which is the period of the device under test. If a delay fault exists in the circuit, the effect of the delay fault will be captured into the scan flops.
(4) Scan out the contents of the scan flops.
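As a minimal illustration of the launch-by-capture idea, the sketch below enumerates the scan states of a made-up two-flop circuit and keeps those V1 for which the gate under test outputs 0, while the functional next state V2 drives it to 1. The gate function, the next-state logic, and all names are assumptions chosen for illustration; a real ATPG tool works on the full netlist and must also propagate the fault effect to a scan flop.

```python
from itertools import product


def gate_g(state):
    """Output of the gate under test; toy choice: g = a AND b."""
    a, b = state
    return a & b


def next_state(state):
    """Normal-mode next state of the two scan flops (toy logic, assumed)."""
    a, b = state
    return (a | b, a | b)


def launch_by_capture_vectors():
    """Return all (V1, V2) pairs with g = 0 under V1 and g = 1 under V2,
    where V2 is the state captured after applying V1 in normal mode."""
    found = []
    for v1 in product((0, 1), repeat=2):
        v2 = next_state(v1)
        if gate_g(v1) == 0 and gate_g(v2) == 1:
            found.append((v1, v2))
    return found


print(launch_by_capture_vectors())
```

In this toy circuit, scanning in (0, 1) or (1, 0) and pulsing one functional clock launches the rising transition at g.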
If the switching activity causes considerable current to flow in the power line, the resistive drop (as well as the L di/dt drop) in the power line can lower the supply voltage seen by the cells. Since cell delays are inversely proportional to the supply voltage, the responses of the circuit can be delayed beyond the expected value, and the fast capture will fail to capture the correct response.
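A small worked example shows how this can push a path past the capture edge. It takes the inverse-proportionality statement above literally as a first-order model; the function name and all numbers (a 9 ns path, a 1.2 V supply, a 10 ns period) are hypothetical and are not measurements from the designs in this paper.

```python
def capture_check(path_delay_ns, vdd, ir_drop, period_ns):
    """Scale a nominal path delay by vdd / (vdd - ir_drop) and compare it
    with the capture period. Returns (passes, scaled_delay_ns).

    First-order model only: delay is taken to be inversely proportional
    to the supply voltage actually seen by the cells.
    """
    effective_vdd = vdd - ir_drop
    if effective_vdd <= 0:
        raise ValueError("IR drop must be smaller than the supply voltage")
    scaled = path_delay_ns * vdd / effective_vdd
    return scaled <= period_ns, scaled


# Hypothetical numbers: 9 ns nominal path, 1.2 V supply, 10 ns period.
for drop in (0.06, 0.15):
    ok, delay = capture_check(9.0, 1.2, drop, 10.0)
    print(f"IR drop {drop:.2f} V -> delay {delay:.2f} ns, capture ok: {ok}")
```

With a 0.06 V drop the scaled delay is about 9.47 ns and still meets the 10 ns period; with a 0.15 V drop it grows to about 10.29 ns, so the fast capture misses the correct response even though the path is fault-free.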
This will result in a potential yield loss, since the circuit is indeed functional, but the test declares it faulty. The same problem holds for memory BIST tests. Memories are tested at-speed using built-in self-test circuitry. When designing memory BIST circuitry, the designer often concentrates on minimizing the number of controllers and the test application time. In the process, a large number of memories may get tested at the same time; such memory access patterns may not occur in functional mode. This can again cause IR-drop induced speed failure of memory BIST vectors.
One way to overcome the above problem is to simulate the test vectors taking second order effects into account. However, since there are a large number of test vectors, such a solution is very expensive. We propose a static method to verify test vectors. As part of our validation flow, we are interested in identifying test vectors that sink large amounts of dynamic power.
The subject of power estimation and the impact of instantaneous power on power and ground busses has been reported by several researchers. Before we survey the pertinent literature, we comment that our work is different in the following aspects: (1) Much of the existing work on current estimation in power and ground lines is for combinational circuits, whereas we consider full scan circuits including embedded memories. (2) We report results on industrial-sized benchmarks, whereas the existing literature focuses on small benchmark circuits with less than 10,000 gates. (3) Much of the recent work on power and ground current estimation focuses on second order effects such as simultaneous switching noise, the L di/dt effect, primary input misalignment, etc., and the techniques reported provide accurate estimations on small circuits. In this work, our focus is on IR drop without considering the second order effects, since modeling them would impact the run times on industry-sized benchmarks adversely without adding significant value.
An early power estimator was reported by Haroun et al. [10]. A hybrid methodology for switching activity estimation, which considers both simulation and probabilistic estimation techniques, is presented by Cheng et al. [5]. They used simulation for control paths and probabilistic techniques in data paths. In the existing approaches towards vector construction for worst-case power, the actual functionality of the circuit is ignored. If we have several combinational blocks separated by (scan) registers, an extension of the above techniques for constructing a worst-case vector may not be possible for two reasons: (1) Two vectors which result in worst-case power dissipation in two separate blocks A and B may have contradicting inputs. (2) Initialization of scan chains to simultaneously maximize the power in two or more blocks may be impossible. In this paper, we attempt the solution of a somewhat different problem, where structural test vectors are provided and they must be verified for potential failure on the tester due to IR drop and the resulting delay faults.
IR-Drop Aware Validation of Test Vectors
Since an SoC has several million gates and several megabytes of memory, it becomes necessary to have multiple power rails in order to supply power to all the circuits. The power rails are normally optimized taking the functional power dissipation into account. Let us first look at the validation of delay test vectors assuming full-scan based test application. Generally, the natural hierarchy in the design is used in designing the scan chains for the chip. Similarly, a hierarchical approach is followed during power supply design. It is possible that supply rail j is connected to several scan chains. See Figure 2 below, where we show two subblocks B1 and B2 in a hierarchical design. We show a scan multiplexer that allows sharing of the scan input and scan output pins between the two blocks. The blocks are tested one at a time. Although a single scan chain is shown (as a dotted line), we can have several scan chains in practice. The figure shows two power rails. Rail R1 supplies power to gates in both blocks, whereas rail R2 supplies power to only block B2.
Validation of a test vector for IR-drop failure must test for the following conditions:
C1: During test application, the IR drop in at least one power rail will exceed the margin built into the design by the physical designer.
C2: There exists a path that is exercised by the test vector which is adversely impacted by the IR drop.
Condition C2 is the stronger condition, and testing for it will be time consuming. We therefore propose a static method to short-list test vectors based on condition C1.
Overview of the method
Our IR-drop aware test vector validation algorithm is called TestRail. Its inputs include the transition delay test vectors, the gate-level netlist, the physical layout database, and the delay information for the circuit in Standard Delay Format (SDF).
Vector Short-listing
From the information in test vectors, we extract the toggle count on each of the primary inputs, outputs, and scan flops of the circuit.
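The text does not spell out how toggles are extracted from the vector data, so the following is only a sketch under an assumed representation: each test is given as a list of snapshots, where a snapshot maps an element name (primary input, primary output, or scan flop) to its logic value at one time step, and a toggle is any change between consecutive snapshots. The function and element names are hypothetical.

```python
def toggle_counts(snapshots):
    """Count value changes per element across consecutive snapshots.

    snapshots: list of dicts mapping element name -> 0/1 (assumed format).
    Returns a dict mapping element name -> toggle count.
    """
    if not snapshots:
        return {}
    counts = {name: 0 for name in snapshots[0]}
    for prev, cur in zip(snapshots, snapshots[1:]):
        for name in counts:
            if prev[name] != cur[name]:
                counts[name] += 1
    return counts


# Three snapshots of one primary input and two scan flops.
example = [
    {"pi0": 0, "ff0": 0, "ff1": 1},
    {"pi0": 1, "ff0": 0, "ff1": 0},
    {"pi0": 0, "ff0": 0, "ff1": 1},
]
print(toggle_counts(example))
```

Here `pi0` and `ff1` each toggle twice and `ff0` never toggles.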
Let $T$ be the set of test vectors and $N$ be the set of primary inputs, primary outputs, and scan flops in the circuit. For each $t \in T$ and each element $n \in N$, let $TC(t,n)$ be the toggle count on $n$ due to the application of $t$. A naive method to validate a test is to consider the sum $\sum_{n \in N} TC(t,n)$ and apply a threshold. This method does not factor in the power rail information. An improved technique is to consider the mapping of flops to power rails: for each power rail $R$, let $M(n,R)$ be a 0/1 variable which is 1 if and only if element $n$ is powered by rail $R$. Further, let $a(R)$ indicate the strength of the rail $R$. We choose the strength function to return a positive integer that is in inverse relation to the threshold on the amount of toggling activity the rail can permit, so that a rail which tolerates less toggling receives a larger value.
We propose the use of the following measure to validate a test:
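$$A(t) = \sum_{R} \sum_{n \in N} TC(t,n) \cdot M(n,R) \cdot a(R)$$
where the outer sum runs over all power rails $R$.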
The reader may verify that the activity factor $A(t)$ reflects both the toggling activity generated by test $t$ as well as the stress this causes to the power rails. The vector short-listing algorithm we propose evaluates $A(t)$ for all test vectors and constructs a histogram of $A(t)$. The vectors whose activity factor is far above the average (larger than average + $k$ × standard deviation, where $k$ can be specified) are short-listed. The subset of test vectors short-listed is subjected to further analysis, namely, power rail analysis and critical path analysis.
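The short-listing step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TestRail implementation: the activity factor is taken to be the sum over elements and rails of toggle count times rail membership times rail strength, the 0/1 mapping M(n,R) is encoded as a set of rails per element, and the population standard deviation is used. All names and numbers are made up.

```python
from statistics import mean, pstdev


def activity_factor(tc, rails_of, strength):
    """A(t): sum of TC(t,n) * a(R) over every rail R that powers element n.

    tc:       element name -> toggle count for one test
    rails_of: element name -> set of rails powering it (encodes M(n,R) = 1)
    strength: rail name -> strength a(R)
    """
    return sum(count * strength[r] for n, count in tc.items() for r in rails_of[n])


def shortlist(factors, k):
    """Return the tests whose activity factor exceeds mean + k * std dev."""
    values = list(factors.values())
    threshold = mean(values) + k * pstdev(values)
    return [t for t, a in factors.items() if a > threshold]


# One test on a toy circuit: ff2 sits on a weaker rail with a larger strength value.
rails_of = {"ff0": {"R1"}, "ff1": {"R1"}, "ff2": {"R2"}}
strength = {"R1": 1, "R2": 2}
print(activity_factor({"ff0": 3, "ff1": 1, "ff2": 2}, rails_of, strength))

# Ten tests with made-up activity factors; only t9 stands out.
factors = {f"t{i}": 10 for i in range(9)}
factors["t9"] = 100
print(shortlist(factors, k=2))
```

In the second example the mean is 19 and the standard deviation is 27, so with k = 2 the threshold is 73 and only `t9` is short-listed. Note that with a small number of vectors a large k can make the threshold unreachable, so k needs to be chosen with the size of the vector set in mind.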
Power Rail Analysis
A switching activity propagator propagates the toggle information from the set N to the internal nodes of the circuit [15]. An IR drop estimator is used to obtain estimates of the IR drop in the power rails; we used a commercial tool (Synopsys AstroRail [14]). The tool can generate a color-coded plot which displays in red the areas of the chip where the IR drop exceeds the margin. Typically, such areas are in the central portions of the chip. We use the short-listed vectors to compute the toggling on the nets and map this information to IR drops in the AstroRail flow. A vector for which IR drop violations are pointed out by the power rail analysis step is treated as a candidate for further analysis.
Critical Path Analysis
In this step, we wish to identify vectors which activate critical paths that may fail due to the adverse impact of IR drop on gate delays. A static timing analysis tool (PrimeTime [12]) is used to compute the critical paths in the circuit prior to IR drop analysis. Three different methods were considered for updating the chip delay information after the worst-case IR drop has been computed for all nets. Device simulation through SPICE would be an accurate approach, but it is impractical due to the large design sizes.
Results
We tested the flow presented in the previous sections on two industrial designs, which we will call Chip-A and Chip-B (not the actual names). Chip-A is a memory-intensive design with 70K logic gates and 64KB of memory, with a single scan chain (flip-flops). This chip has only two power rails, namely, VDD and VSS. Chip-B has around 3 million gates and about 840KB of read/write memory, and has 7 scan chains (flip-flops). The chip has 12 power rails.
The design Chip-A was selected as a pilot test case to verify our flow. In this relatively small design with a single scan chain and a single VDD power bus, it was possible for us to manually construct test vectors that will fail due to IR drop. We also constructed functional test vectors for this circuit. There are 5 repairable RAMs in the design, all powered by the same bus, and all tested concurrently using the SMARCHKBCil algorithm (serial March, checkerboard) during test mode. In addition to the memories, the design also has a 32-bit sequential multiplier, which was tested using scan ATPG transition-delay test vectors generated using TetraMAX. The test vectors were taken through both the TestRail flow and the standard simulation-based verification flow. The test vectors passed the latter flow, but the TestRail flow caught the failing vector, as shown in the AstroRail plot of IR drops (Figure 5(a)). The central portion of the chip is red since the IR drop here exceeds the margin. The failing vector exercised all five memories using BIST and the multiplier using the transition delay test. For contrast, we also created functional test vectors which exercised only subsets of the five memories (it is not functionally possible to exercise all memories ...).
Figure 4 shows the results of the vector short-listing technique proposed in this paper. We plot the toggle counts for the initial 19 test vectors (selected randomly to illustrate the point). As we can see, the toggle count for Vector 5 is unusually large, larger than the average + 3 × standard deviation, making it a strong candidate for short-listing. In fact, Vector 5 indeed failed on this chip, and is the failing vector mentioned in Figure 6(b). The short-listing technique saves a significant amount of run time by avoiding further analysis of the vectors that are not short-listed.
Figure 6 shows the AstroRail plots of the failing test vector for Chip-B. In Figure 6(a), the IR drops are averages, calculated in a pattern-independent fashion. We see that the red area of Figure 6(a) shows potential problems due to IR-drop failure. In Figure 6(b), we plot the average IR drops again, but the averaging is done only for the vectors short-listed by our technique. As can be seen, the area marked red in Figure 6(b) is not only smaller, but has no overlap with the red area in Figure 6(a). This means two things: (1) A designer who looks at plot 6(a) and fixes the power rail to avoid excessive IR drops may not really address the actual problem while still consuming costly silicon real estate. (2) In the example of Figure 6(b), the critical path that failed due to excessive IR drop indeed passes through the red area; recalling that this area has no overlap with the red area of Figure 6(a), we conclude that a vector-independent approach to IR-drop failure analysis can be misleading.
We further verified the test vectors that failed the margin test by taking them through the static timing analysis flow and found that the vectors indeed failed due to timing faults in the critical path. The typical run times for TestRail are shown in Table 1 for the different phases of the total flow. As can be seen, for a large design like Chip-B, the vector short-listing can proceed at the rate of about 1 vector per 4 seconds.
The toggle count estimation was done for the one short-listed vector in the case of Chip-B, and this step took about 5 min.
This clearly brings out the benefit of the vector short-listing proposed in this paper: had we performed toggle count estimation for all the 80 test vectors, the total run time for even the toggle count estimation phase would have exceeded 3 hrs, not to speak of the run times during the AstroRail and timing analysis phases of the flow. However, due to the impact of overdesign on chip yield, designers are wary of using large design margins. The technique presented in this paper is a design aid that can help save precious silicon area without significantly impacting verification time.
