Clock-Gating-Aware Low Launch WSA Test Pattern Generation for At-Speed Testing by Lin  Yi-Tsung et al.
Clock-gating-aware low launch WSA test pattern
generation for at-speed scan testing
Authors: Lin Yi-Tsung, Huang Jiun-Lang, Wen Xiaoqing
Journal or publication title: 2011 IEEE International Test Conference
Year: 2012-01-26
Other title: Clock-Gating-Aware Low Launch WSA Test Pattern
Generation for At-Speed Testing
URL: http://hdl.handle.net/10228/00007608
doi: info:doi/10.1109/TEST.2011.6139132
Clock-Gating-Aware Low Launch WSA Test Pattern
Generation for At-Speed Testing
Yi-Tsung Lin¹, Jiun-Lang Huang¹, and Xiaoqing Wen²
¹Graduate Institute of Electronics Engineering
National Taiwan University, Taipei 106, Taiwan
²Department of Computer Science and Electronics
Kyushu Institute of Technology, Iizuka 820-8502, Japan
email: jlhuang@ntu.edu.tw
Abstract—Capture power management has become a necessity
to avoid at-speed testing yield loss, especially for modern complex
and low power designs. This paper proposes a test pattern
generation methodology that utilizes the available clock-gating
mechanism, a popular low power design technique, to reduce
the capture cycle weighted switching activity (WSA) for at-speed
testing. Compared to previous techniques that consider clock-
gating, a very significant test power reduction is achieved without
severe test pattern inflation.
Keywords-test pattern generation, clock-gating, test power
reduction, at-speed testing
I. INTRODUCTION
It is known that the signal switching activity during man-
ufacturing testing can be more than twice that during the
functional mode [8]. The excessive switching activity may
damage the circuit under test (CUT) or cause the CUT
to malfunction during test application, which leads to test-
incurred yield loss.
This work concerns the negative impact of excessive switching
activity on the capture cycles during at-speed testing. At-speed
testing in general utilizes the two-pattern approach: the first
pattern sets the circuit state and the second pattern activates
the desired transition at the fault site. The fault is detected
if the transition fails to propagate to the target flip-flop(s)
within the functional clock period. Figure 1 depicts the timing
diagram of the launch-on-capture (LOC) at-speed scan testing
scheme. The interval between the rising edges of the two capture
cycles, C1 and C2, corresponds to the functional clock cycle,
called the launch cycle hereafter. If the transition launched at
C1 does not propagate to the target flip-flop(s) before C2, the
chip under test is classified as faulty.
A. Yield Loss due to Excessive Launch-Cycle IR-Drop
The LOC scheme suffers yield loss caused by the power
supply noise in the launch cycle. If the power network syn-
thesis process fails to consider the excessive switching activity
during test application, the power network IR-drop during the
launch cycle may be so large that the resulting extra delay [15]
causes a good CUT to malfunction and fail the test. [7], [5]
reported that, in a 130 nm ASIC design running at 150 MHz







[Timing diagram: shift cycles, capture cycles, shift cycles; the launch cycle corresponds to the functional clock cycle.]
Fig. 1. The launch-on-capture (LOC) at-speed testing scheme.
only if the supply voltage is above 1.55 V; otherwise, they
fail.
Assessing the possible yield loss associated with a test
pattern due to excessive switching activity is non-trivial, if
possible at all. First, the whole chip IR-drop profile with
respect to a test pattern depends on the spatial and temporal
distribution of the switching activity as well as the power grid
structure [13], [14]. Second, even if the spatial and temporal
IR-drop profile is available, deriving the resulting path delay
is still difficult.
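As a tradeoff between computation efficiency and estimation
accuracy, this work utilizes the launch cycle weighted
switching activity (WSA), called launch WSA hereafter for
convenience, associated with a test pattern to assess its
potential yield loss. The launch WSA is defined as
\[ \mathrm{WSA} = \sum_{i} w_i \cdot s_i \tag{1} \]
where
\[ s_i = \begin{cases} 1 & \text{if signal } i \text{ switches} \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
and $w_i$, the weight of signal $i$, equals $i$'s fanout size plus one.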
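The weighted count described above (each switching signal contributes its fanout size plus one) can be illustrated with a minimal sketch. The function name `launch_wsa`, the dictionary-based netlist representation, and the three-signal example are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import Dict


def launch_wsa(fanout: Dict[str, int],
               before: Dict[str, int],
               after: Dict[str, int]) -> int:
    """Weighted switching activity of one launch cycle.

    fanout: fanout size of each signal (hypothetical netlist summary).
    before/after: logic value of each signal before and after the launch edge.
    A signal that switches contributes weight = fanout size + 1.
    """
    total = 0
    for signal, n_fanout in fanout.items():
        if before[signal] != after[signal]:  # indicator: signal switches
            total += n_fanout + 1            # weight of the signal
    return total


# Hypothetical three-signal example: a and c switch, b does not.
fanout = {"a": 2, "b": 0, "c": 3}
before = {"a": 0, "b": 1, "c": 1}
after = {"a": 1, "b": 1, "c": 0}
print(launch_wsa(fanout, before, after))  # (2 + 1) + (3 + 1) = 7
```

Because the metric only needs logic values and fanout counts, it can be evaluated from logic simulation alone, which is what makes it a cheap proxy compared with a full IR-drop analysis.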
B. Related Works
Many launch WSA reduction techniques have been pro-
posed. They can be roughly categorized into three classes:
X-filling, power-aware ATPG, and partial capture.
1) X-Filling: Given a set of partially specified test patterns,
X-filling techniques fill the X bits to minimize the difference
between the two at-speed patterns; this reduces the launch
WSA [6], [12], [11]. If the given test set is fully specified, test
relaxation techniques [2], [4] can first be applied to uncover the
X bits in the test vectors without degrading fault coverage.
X-filling techniques incur neither test inflation nor circuit
modification; however, the achievable test power reduction depends
on the original test set.
2) Power-Aware ATPG: A power-aware ATPG integrates
the launch WSA constraint into its decision-making mechanism [10].
The advantage, compared to X-filling, is that it explores a larger
search space and has the potential of finding the optimal solution.
In general, power-aware ATPGs are effective in lowering
launch WSA but often cause high test inflation. Note that,
in a power-aware ATPG, the random-fill stage prior to fault
dropping is often replaced with low launch WSA X-filling.
3) Partial Capture: Partial capture techniques reduce
launch WSA by capturing only a fraction of the test response at
a time [9]. These approaches often require circuit modification
to enhance fault coverage and lower test inflation.
Recently, techniques that utilize gated clocks to facilitate
launch WSA reduction have been proposed [3], [1]. The work in [3]
presented the two-stage CTX scheme, which belongs to the
X-filling category. In stage one, CTX aims to deactivate as many
clock control signals as possible; this reduces the number of
flip-flops that capture the test response. In stage two, CTX
reduces the number of flip-flops that have signal transitions
during the launch cycle. CTX incurs no test set inflation
because it only modifies the given fully specified test patterns.
Without fault coverage loss, CTX achieves around 30% test
power reduction. The drawback is the long fault simulation
time needed to retain fault coverage. Furthermore, the solution
space is limited by the given test set.
The technique in [1] is a power-aware ATPG approach. It
associates with each clock control signal a default pattern,
which is a cube that contains the care bits required to deactivate
the clock control signal. During test generation, ATPG merges
the default pattern with the test pattern whenever possible.
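Merging a default pattern with a test cube amounts to a bitwise compatibility check over care bits. The following is a minimal sketch of that generic cube-merging idea, not the implementation of [1]; the string encoding with `X` for unspecified bits, the function name `merge_cubes`, and the example cubes are assumptions.

```python
from typing import Optional


def merge_cubes(a: str, b: str) -> Optional[str]:
    """Merge two test cubes over the alphabet {'0', '1', 'X'}.

    Returns the merged cube, or None if the cubes conflict, i.e. some
    position carries opposite care bits in the two cubes.
    """
    if len(a) != len(b):
        raise ValueError("cubes must have equal length")
    merged = []
    for x, y in zip(a, b):
        if x == "X":
            merged.append(y)       # take whatever the other cube specifies
        elif y == "X" or x == y:
            merged.append(x)       # compatible care bit
        else:
            return None            # 0 versus 1: merge not possible
    return "".join(merged)


# Hypothetical cubes: a test cube and a default pattern that would
# deactivate one clock control signal.
test_cube = "1XX0X"
default_pattern = "X0XXX"
print(merge_cubes(test_cube, default_pattern))  # 10X0X
print(merge_cubes(test_cube, "0XXXX"))          # None (conflict on bit 0)
```

When the merge returns `None`, the care bits needed to detect the fault conflict with those needed to deactivate the clock control, so the default pattern cannot be applied to that test cube.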
Comments.
C. The Proposed Low Launch WSA TPG Methodology
This paper presents a test pattern generation (TPG) method-
ology that utilizes the clock-gating mechanism to reduce
launch WSA. To improve launch WSA reduction without
incurring too much test inflation, the proposed technique
introduces two test generation stages.
1) Gated-Clock-Intact TPG: In this TPG stage, faults are
detected without activating or deactivating any clock control
signals. The idea is to detect as many faults as possible before
using the clock-gating mechanism to reduce launch WSA,
which tends to cause test inflation.
2) FF-Activation Reluctant TPG: This stage prefers faults
whose fault effects can be captured in already activated
flip-flops. During test generation, clock controls are enabled only
when this is necessary to detect the target fault. The goal is to
detect faults without unnecessarily activating new flip-flops.
The proposed technique then utilizes X-filling techniques
to (1) deactivate as many clock controls as possible, and (2)
reduce the number of flip-flop transitions during the launch
cycle.
D. Contributions
The main contributions of this work are as follows.
• It introduces the gated-clock-intact TPG stage. By leaving
all clock controls unspecified, this stage lowers test
inflation while helping retain the launch WSA reduction
quality.
• It proposes to use the FF-activation reluctant TPG strat-
egy to detect faults with as few flip-flops activated as
possible. This significantly improves the launch WSA
reduction.
Experimental results on larger ITC’99 and IWLS’05 benchmark
circuits show that the proposed technique outperforms
previous techniques in launch WSA reduction without incurring
severe test inflation.
E. Paper Organization
The paper organization is as follows. Section II gives the
necessary background of this work. Section III and Section IV
describe the basic and enhanced flows of the proposed method-
ology, respectively, and present the experimental results. Fi-
nally, Section V concludes this work.
II. PRELIMINARIES
A. Clock-Gating
define terms: active FF: a FF whose clock is enabled.
deactive FF: a FF whose clock is disabled. clock control,
group, clock control enable/disable.
B. Low Capture Power X-Filling
C. TPG Model in the Presence of Gated Clock
III. BASIC LOW LAUNCH WSA TPG METHODOLOGY
To better understand the proposed TPG methodology, this











Fig. 2. The basic flow.
The basic flow consists of the proposed “FF-activation reluctant
TPG” stage and the following X-filling stages.
The basic flow is designed to utilize the clock-gating
mechanism to reduce the launch WSA without paying too
much attention to the test inflation issue. As the experimental
results show, it achieves significant launch WSA reduction but
suffers test inflation.
A. Overview of the Basic Flow
Figure 2 illustrates the basic flow. It starts with the “FF-
Activation Reluctant TPG (FAR-TPG)” stage which intends
to detect as many faults as possible while at the same time
activating as few flip-flops as possible. Note that FAR-TPG
limits activation but not deactivation of flip-flops because the
latter helps reduce launch WSA.
The remaining unspecified clock controls are processed in
the “FF-Deactivation” and “FF-Silencing” stages. The idea
is similar to CTX [3]—the former justifies as many clock
controls to zero as possible; the latter minimizes flip-flop
transitions.
In each iteration, the basic flow generates a fully specified
test pattern. It then performs fault dropping to detect more
faults.
B. FF-Activation Reluctant TPG
This is the main test generation procedure of the basic flow;
it aims to detect as many faults as possible without enabling
too many flip-flops.
Details of the FF-activation reluctant TPG (FAR-TPG) are
shown in Figure 3. First, a primary fault is selected and then
targeted by a regular TPG; this may activate or deactivate some
clock control signals. If the percentage of activated flip-flops
has reached a preset threshold, k%, the flow exits FAR-TPG.
Choice of k?
To limit the number of activated flip-flops, FAR-TPG selects
the secondary faults from the fanin cones of currently activated
flip-flops and POs because detecting other faults is unlikely
without activating more flip-flops. Secondary faults are targeted
by the “FF-activationless TPG.” The FF-activationless





[Flowchart; recoverable node labels: "> k% FFs enabled?", "pick a fault from fanin cones of active FFs", "FF-activationless TPG", "fault detected?", "G ⇐ largest unspecified group".]
Fig. 3. The basic flow details.
without activating any more clock control. If FF-activationless
TPG succeeds, the (inner) loop is repeated; otherwise, this
fault is targeted by the regular TPG, which will activate some
clock control signal(s).
Note that FAR-TPG sets no constraint on flip-flop deacti-
vation.
C. FF-Deactivation
This stage (see Figure 3 for details) deactivates as many
flip-flops as possible to boost launch WSA reduction.
In each iteration, the largest unspecified group G is identified.
Then, its clock control signal EN_G is justified to
zero. This process continues until there is no more untried
unspecified group.
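The greedy loop of the FF-deactivation stage can be sketched as follows. This is a simplified illustration under stated assumptions: the justification step is abstracted as a caller-supplied callback `try_justify_zero` (hypothetical; in a real flow it would be an ATPG justification call that may also specify other inputs), and the group names and sizes are invented.

```python
from typing import Callable, Dict, List, Set


def ff_deactivation(group_sizes: Dict[str, int],
                    unspecified: Set[str],
                    try_justify_zero: Callable[[str], bool]) -> List[str]:
    """Greedily deactivate clock-gating groups, largest first.

    group_sizes: number of flip-flops in each clock-gating group.
    unspecified: groups whose clock control is still unspecified.
    try_justify_zero: hypothetical callback that tries to justify the
        group's clock control signal to zero; returns True on success.
    Returns the groups that were deactivated, in processing order.
    """
    untried = set(unspecified)
    deactivated: List[str] = []
    while untried:
        # Pick the largest untried unspecified group (name breaks ties).
        group = max(untried, key=lambda g: (group_sizes[g], g))
        untried.remove(group)
        if try_justify_zero(group):
            deactivated.append(group)
    return deactivated


# Hypothetical example: justification fails only for G3.
sizes = {"G1": 32, "G2": 8, "G3": 16}
result = ff_deactivation(sizes, {"G1", "G2", "G3"}, lambda g: g != "G3")
print(result)  # ['G1', 'G2']
```

Processing the largest group first gives each successful justification the greatest possible reduction in capturing flip-flops. The sketch ignores that a successful justification may implicitly specify the clock controls of other groups.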
D. FF-Silencing
This step applies JP-fill [11] to the partially specified test
cube to reduce the number of transition flip-flops during the
launch cycle.
E. Basic Flow Experimental Results
The benchmark circuits include the bigger ones from
ITC’99 and IWLS’05. Commercial tools are utilized to synthe-
size clock gating circuitry. There is a fine/coarse grain option
for gated clock insertion. The former is much faster in terms
of synthesis time.
1) Benchmark Circuit Statistics: Table I lists the benchmark
circuits. The “.fine” and “.coarse” extensions denote the
fine and coarse grain options, respectively. For the ITC’99
































































the same result. For the IWLS’05 benchmark circuits, both
the fine grain and coarse grain options are applied.
Columns 2 and 3 list the numbers of gates and flip-flops,
respectively. Clock-gating synthesis increases both the gate
and flip-flop counts. Column 4 is the percentage of
flip-flops controlled by gated clocks. With the fine grain option,
this percentage exceeds 95%; with the coarse grain option, it
ranges from 20 to 35%. Column 5 lists the number of
clock-gating groups. The average number of flip-flops per group is
listed in column 6. Not shown in the table, the maximum and
minimum group sizes are 32 and 4, respectively, for all but
the original circuits.
2) Test Generation Results: Table II shows the test generation
results. Four test generation methodologies are compared.
• FAN-ATPG: This is the baseline ATPG without any
launch WSA consideration.
• CTX*: This is implemented according to [3] for com-
parison; it uses the FAN-ATPG as the underlying test
generation engine.
• default pattern*: This is implemented according to [1] for
comparison; it uses the FAN-ATPG as the underlying test
generation engine.
• basic flow: This is the proposed basic flow.
For the baseline FAN-ATPG, the fault coverage, the pattern
count, and the peak WSA are listed. For the other three
methodologies, fault coverage, test inflation percentage, and
peak WSA percentage compared to the baseline are shown.
(Test inflation for CTX* is always 0 and not shown.) As the
table shows, CTX* reduces peak WSA by 20 to 30% without
incurring any test inflation. “default pattern*” improves the
peak WSA reduction to 38 to 60%; however, it also
incurs significant test inflation (24 to 68%).
The basic flow further improves the peak WSA reduction
to more than 70%. In terms of test inflation, the result is
unacceptable for the ITC’99 circuits—from 81 to 100%. Test
inflation looks more reasonable (around 28%) for the IWLS’05
















Fig. 4. The proposed flow.
TABLE III























































IV. PROPOSED LOW LAUNCH WSA TPG METHODOLOGY
While the basic flow achieves very high peak launch WSA
reduction, it sometimes incurs unacceptable test inflation. The
reason is that the basic flow in Figure 2 pays little attention to test
inflation management.
To alleviate the test inflation problem, the proposed low
launch WSA TPG methodology (called the “enhanced flow”
hereafter) introduces a new gated-clock-intact (GC-intact)
TPG stage to the basic flow.
The GC-Intact TPG is depicted in Figure 4; it is also based
on the dynamic compaction flow. In each iteration, it tries
to detect as many faults as possible without activating or
deactivating any clock control signal.
A. Experimental Results
The experimental results of the enhanced flow are shown in
Table III. Test inflation and peak WSA results for the basic
flow are also listed for ease of comparison.
Compared to the basic flow, the enhanced flow significantly
reduces the test inflation to between 8.6 and 36.1%; at
the same time, the peak launch WSA performance remains
almost the same. This validates the effectiveness of the
GC-Intact TPG flow.
Results with respect to different k.
TABLE II























































































V. CONCLUSION
This paper presented a low launch WSA test pattern generation
methodology for at-speed testing. By introducing the
“GC-intact” and “FF-activation reluctant” test pattern generation
stages, the proposed methodology achieves very high
launch WSA reduction with acceptable test inflation.
Future work includes (1) CPU time improvement, and (2)
extension to handle circuits with multiple clock domains.
REFERENCES
[1] K. Chakravadhanula, V. Chickermane, B. Keller, P. Gallagher, and
P. Narang. Capture power reduction using clock gating aware test
generation. In International Test Conference, paper 4.3, 2009.
[2] A. El-Maleh and K. Al-Utaibi. An efficient test relaxation technique for
synchronous sequential circuits. IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, 23(6):933–940, June 2004.
[3] H. Furukawa, X. Wen, K. Miyase, Y. Yamato, S. Kajihara, P. Girard,
L.-T. Wang, and M. Tehranipoor. CTX: A clock-gating-based test relaxation
and X-filling scheme for reducing yield loss risk in at-speed scan testing.
In Asian Test Symposium, pages 397–402, 2008.
[4] K. Miyase and S. Kajihara. XID: Don’t care identification of test patterns
for combinational circuits. IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, 23(2):321–326, February
2004.
[5] S. Ravi. Power-aware test: Challenges and solutions. In International
Test Conference, pages 1–10, 2007.
[6] S. Remersaro, X. Lin, Z. Zhang, S. M. Reddy, I. Pomeranz, and J. Rajski.
Preferred Fill: A scalable method to reduce capture power for scan based
designs. In International Test Conference, paper 32.2, 2006.
[7] J. Saxena, K. M. Butler, V. B. Jayaram, S. Kundu, N. V. Arvind,
P. Sreeprakash, and M. Hachinger. A case study of IR-drop in structured
at-speed testing. In International Test Conference, pages 1098–1104,
2003.
[8] L.-T. Wang, C. E. Stroud, and N. A. Touba, editors. System on Chip
Test Architectures. Morgan Kaufmann Publishers, 2008.
[9] S. Wang and W. Wei. A technique to reduce peak current and average
power dissipation in scan designs by limited capture. In Asian and South
Pacific Design Automation Conference, pages 810–816, 2007.
[10] X. Wen, S. Kajihara, K. Miyase, T. Suzuki, K. K. Saluja, L.-T. Wang,
K. S. Abdel-Hafez, and K. Kinoshita. A new ATPG method for efficient
capture power reduction during scan testing. In VLSI Test Symposium,
pages 58–63, 2006.
[11] X. Wen, K. Miyase, S. Kajihara, T. Suzuki, Y. Yamato, P. Girard,
Y. Ohsumi, and L.-T. Wang. A novel scheme to reduce power supply
noise for high-quality at-speed scan testing. In International Test
Conference, paper 25.1, 2007.
[12] X. Wen, K. Miyase, T. Suzuki, S. Kajihara, Y. Ohsumi, and K. K. Saluja.
Critical-path-aware X-filling for effective IR-drop reduction in at-speed
scan testing. In Design Automation Conference, pages 527–532, 2007.
[13] M.-F. Wu, H.-C. Pan, J.-L. Huang, K.-H. Tsai, T.-H. Wang, and W.-T. Cheng.
Improved weight assignment for logic switching activity during at-speed
test pattern generation. In Asian and South Pacific Design Automation
Conference, pages 493–498, 2010.
[14] M.-F. Wu, K.-H. Tsai, W.-T. Cheng, H.-C. Pan, J.-L. Huang, and A. Kifli.
A scalable quantitative measure of IR-drop effects for scan pattern
generation. In International Conference on Computer-Aided Design,
pages 162–167, 2010.
[15] T. Yoshida and M. Watari. MD-Scan method for low power scan testing.
In Asian Test Symposium, pages 80–85, 2002.
