Novel approach to reduce power droop during scan-based logic BIST by Omana, Martin et al.
Novel Approach to Reduce Power Droop  
During Scan-Based Logic BIST* 
 
M. Omaña         D. Rossi         F. Fuzzi         C. Metra 
ARCES – DEI, University of Bologna 
Bologna, Italy 
{martin.omana, d.rossi, filippo.fuzzi, cecilia.metra}@unibo.it  
C. Tirumurti         R. Galivache 
Intel Corporation 




Abstract— Significant peak power (PP), thus power droop 
(PD), during test is a serious concern for modern, complex ICs. 
In fact, the PD arising during the application of test vectors
may delay signal transitions in the circuit under test. This
delay may be erroneously recognized as a delay fault, thus
producing an erroneous test fail and increasing yield loss.
Several solutions have been proposed in the literature to reduce
the PD during test of combinational ICs, while fewer approaches
exist for sequential ICs. In this paper, we propose a novel
approach to reduce peak power/power droop during test of
sequential circuits with scan-based Logic BIST. In particular, our
approach reduces the switching activity of the scan chains
between consecutive capture cycles. This is achieved by an
original generation and arrangement of test vectors. The proposed
approach has a very low impact on fault coverage and test time,
while requiring a very low area overhead.
Keywords— Logic BIST, Power Droop, Test, Microprocessor 
I.  INTRODUCTION 
The aggressive scaling of microelectronic technology is
allowing the fabrication of increasingly complex ICs. Together
with several benefits (improved functionality, decreased cost
per function, etc.), this comes with several challenges,
especially from the points of view of system test and reliability.
The increase in peak power (PP), and consequently in
power droop (PD), is a serious concern for IC test and
operation in the field. In particular, the PP and PD during test
may exceed those experienced during in-field operation of the IC,
due to the higher switching activity (SA) induced by the
applied test patterns [1, 2, 3, 4, 5, 6, 7]. As a consequence, a
delay effect may be generated on circuit under test (CUT)
signal transitions. This may be erroneously recognized as
a delay fault, leading to the erroneous generation of a test
fail (hereinafter referred to as false test fail) and thus to
an increase in yield loss [2, 3].
Many ATPG approaches have been proposed to avoid this 
problem. Most of them utilize don’t care (X) bits to reduce the 
SA induced by the applied test patterns (e.g., those in [9, 10]). 
However, such approaches cannot be used for system debug or 
field test, where Logic Built-In Self-Test (LBIST) is becoming 
increasingly vital [8]. 
As known, LBIST can take the form of “combinational 
LBIST” or “scan-based LBIST”, depending on whether the 
CUT is a combinational or a sequential circuit with scan [6, 
11]. A linear feedback shift register (LFSR) generates test 
patterns to be directly given to the CUT primary inputs, in case 
of combinational CUT, or to the scan chain inputs, in case of 
sequential CUT [1, 2, 6, 12, 13]. In scan-based LBIST, testing 
consists of two phases [6, 12]: a shift phase, during which the 
scan chains are filled with test patterns, and a capture phase, in 
which the test patterns are applied to the CUT and the 
produced outputs are sampled. Despite being a widely adopted 
design, LBIST, both combinational and scan-based, suffers 
from the PD-induced problems described above.  
As a significant case, in this paper we consider sequential
CUTs with scan-based LBIST. During the capture phase, this
kind of circuit suffers from the PD problems discussed above,
due to the high SA of the CUT induced by the applied
test patterns. The resulting delay effect can be erroneously
recognized as a delay fault, with the consequent
generation of a false test fail. Solutions to reduce PD during the
capture phase in scan-based LBIST are therefore needed, in
order to avoid an increase in yield loss.
Several solutions have been proposed in the literature to 
reduce PP, thus also PD, for combinational LBIST (e.g., [1, 3, 
6]), while fewer approaches exist for scan-based LBIST [2, 8, 
14]. The solutions for combinational LBIST in [1, 3, 6] modify
the internal structure of traditional LBIST LFSRs to generate
intermediate test vectors. Such vectors, inserted between each
pair of original test vectors, reduce the SA of the
CUT inputs, thus reducing the whole CUT SA [1]. Therefore,
the PP and PD are reduced as well. These techniques require
low area overhead and have negligible impact on fault
coverage (FC) and test time, but are not effective in reducing
PD during the capture cycles in scan-based LBIST.
To address the issue of PD reduction during the capture 
cycles in scan-based LBIST, the solutions in [2, 8, 14] have 
been proposed. Particularly, in [2] PD is reduced by alternately 
disabling groups of scan chains during test. This is a successful 
approach to reduce PD, but requires a significant increase in 
the number of test vectors, and consequently test time, to 
achieve the same FC as with conventional scan-based LBIST. 
In [8] PD is reduced by a multi-cycle BIST scheme with partial
observation. This solution does not significantly impact FC, but
it reduces PD by only 10% compared to conventional
scan-based LBIST.
*Work partially supported by Intel Corporation (Santa Clara, CA) research grant.
2013 18th IEEE European Test Symposium (ETS)
978-1-4673-6377-8/13/$31.00 ©2013 IEEE
Instead, compared to conventional scan-based LBIST, the 
solution in [14] inserts an additional phase, namely a “burst” 
phase, between the scan shift and capture cycles. Such an 
additional burst phase aims at increasing the current drawn 
from the power supply up to a value similar to that absorbed by 
the CUT during the capture cycle. This way, the inductive 
component of the PD occurs during the burst phase, and 
vanishes before the capture cycle. Therefore, the PD occurring 
during the capture cycle, consisting of the resistive component 
only, is considerably reduced. This solution does not impact 
test coverage and can be used together with other power 
reduction techniques. However, it requires an accurate 
modeling of the power supply network and increases the total 
power consumed during testing, as well as test time. 
Based on these considerations, in this paper we propose a
novel approach to reduce PD during the capture cycles in
scan-based LBIST, thus reducing the probability of false test
fails during test. Similarly to the solutions in [1, 6], our
proposed approach reduces the SA of the CUT by proper
modification of test vectors, compared to conventional
scan-based LBIST. This is accomplished by exploiting the phase
shifter, which is usually adopted in scan-based LBIST to
reduce the correlation among the test vectors applied to
adjacent scan chains [15].
In our proposed approach, the test vector to be applied at
the generic capture cycle i in conventional scan-based LBIST
is replaced by a new vector, hereinafter referred to as the
substitute test vector, which is generated starting from the test
vectors to be applied at capture cycles (i-1) and (i+1), in order
to reduce the CUT SA. These two test vectors are provided by the
phase shifter at proper outputs, as described in Section IV.
The substitute test vector is generated in order to reduce the
maximum number of transitions at the outputs of the scan
chains between capture cycles (i-1) and i, and i and (i+1),
compared to the original test sequence. This way, the CUT SA,
and thus also PD, is decreased during capture cycles. Our approach
allows a 50% reduction of the maximum SA during capture
cycles, with no impact on test length and fault coverage
compared to conventional scan-based LBIST. Moreover, it
requires a very limited area overhead.
The remainder of the paper is organized as follows. In 
Section II, we describe the considered scan-based LBIST. In 
Section III, we introduce our approach for PD reduction during 
capture cycles. In Section IV, we describe a possible 
implementation of our proposed approach. In Section V, we 
evaluate the effectiveness and costs of our approach, and 
compare them to those of conventional scan-based LBIST and 
of the solution in [2] providing a PD reduction similar to our 
approach. Finally, some conclusions are drawn in Section VI. 
II. CONSIDERED SCENARIO 
We consider the widely adopted scan-based LBIST 
architecture represented in Fig. 1 [1, 4, 6, 12, 15]. 
The state flip-flops of the CUT are converted into scan flip-
flops, and arranged into many short scan chains (s scan chains 
in Fig. 1). Additional scan flip-flops are included in such scan 
chains to drive and sample the primary inputs (PI) and primary 
outputs (PO), respectively. 
The Pseudo-Random Pattern Generator (PRPG) is
implemented by an LFSR [4, 12, 15]. The Phase Shifter (PS),
which reduces the correlation among the test vectors
applied to adjacent scan chains [15], is composed of an XOR
network expanding the number of outputs of the LFSR in order
to match the number of scan chains s. In fact, the number of
LFSR outputs is usually considerably smaller than the number
of scan chains [15]. At the same clock cycle, the PS provides at
its outputs the current LFSR sequence together with many
future/past sequences. As described later on, this feature is
exploited by our proposed solution to derive the new
test vectors that reduce PD during capture cycles.
The Space Compactor compacts the outputs of the s scan 
chains to match the number of inputs of the MISR. The MISR, 
as well as the Test Response Analyzer (TRA) and the BIST 
Controller are the same as in conventional scan-based LBIST 
[6, 12]. 
As is well known, two phases can be identified in scan-based
LBIST [12]: a shift phase, during which the scan chains are
filled with test vectors, and a capture phase, in which the test
vectors are applied to the CUT and the produced outputs are
sampled. In particular, during the shift phase, at each clock
cycle, the phase shifter provides a new bit to each of the s
scan chains (in parallel). Thus, in this phase, the test vector
$T_i^m$ to be applied to the CUT at the i-th capture cycle is
loaded into the m-th scan chain (m = 1..s) after n shift cycles
(where n is the number of scan flip-flops of the longest scan
chain). After these shift cycles, a single capture cycle is
performed, and the CUT response is sampled on the scan chains.
Then, another n scan shift cycles are required to shift out the
CUT response and to shift in the new test vector $T_{i+1}^m$
(m = 1..s).
As for scan-flip-flops, we have considered the widely 
adopted scheme in [16], which updates its output only at the 
beginning of capture cycles, while keeping it constant to its 
previous value loaded during the shift phase. 
III. PROPOSED APPROACH FOR POWER DROOP REDUCTION 
DURING SCAN-BASED LBIST 
In this section, we introduce our approach for PD reduction 
during the capture phase in scan-based LBIST. As previously 
introduced, our approach exploits the phase shifter (PS) to 
determine the substitute test vector $ST_i^m$ (Fig. 2(a)) replacing
the original test vector $T_i^m$ to be applied to the scan chain m
(m=1..s) at the i-th capture cycle. As will be shown later, the
PS allows $ST_i^m$ to be easily constructed, based on the structure
of test vectors $T_{i-1}^m$ and $T_{i+1}^m$ to be applied to scan
chain m at the (i-1)-th and (i+1)-th capture cycles. Starting from
the (i-1)-th capture cycle (as represented in Fig. 2), the new test
vector sequence in each scan chain m will be as follows:
$T_{i-1}^m$ – $ST_i^m$ – $T_{i+1}^m$ – $ST_{i+2}^m$ – $T_{i+3}^m$ …
Fig. 1. Considered scan-based LBIST architecture.
As schematically represented in Fig. 2(b), the substitute test
vector is constructed using random injection [6]. As can be
observed from Fig. 2(b), denoting by $ST_i^m(j)$, $T_{i-1}^m(j)$ and
$T_{i+1}^m(j)$ the values in position j in the test vectors $ST_i^m$,
$T_{i-1}^m$ and $T_{i+1}^m$, respectively, the value of $ST_i^m(j)$ is
determined as follows:
$$ST_i^m(j) = \begin{cases} T_{i-1}^m(j) & \text{if } T_{i-1}^m(j) = T_{i+1}^m(j) \\ R & \text{if } T_{i-1}^m(j) \neq T_{i+1}^m(j) \end{cases}$$
where R is a random bit. Therefore, in all positions j in which
vectors $T_{i-1}^m$ and $T_{i+1}^m$ coincide, $ST_i^m$ maintains the
same logic value as the previous vector $T_{i-1}^m$, while in the
positions j in which vectors $T_{i-1}^m$ and $T_{i+1}^m$ differ,
$ST_i^m$ assumes a random logic value R. The bit R can simply come
from one of the outputs of the LFSR itself, as suggested in [6].
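The construction rule above can be illustrated with a minimal Python sketch. It is not the hardware implementation: vectors are modelled as lists of 0/1 values, the function names `substitute_vector` and `switching_bits` are hypothetical, and Python's `random` module stands in for the LFSR output that supplies R.

```python
import random


def substitute_vector(t_prev, t_next, rng):
    """Return ST_i given T_{i-1} and T_{i+1} (equal-length lists of 0/1)."""
    if len(t_prev) != len(t_next):
        raise ValueError("vectors must have the same length")
    st = []
    for a, b in zip(t_prev, t_next):
        # Keep the common value where the neighbours agree; otherwise
        # inject a random bit R (assumption: software RNG, not the LFSR).
        st.append(a if a == b else rng.getrandbits(1))
    return st


def switching_bits(u, v):
    """Number of positions in which two vectors differ."""
    return sum(x != y for x, y in zip(u, v))


rng = random.Random(2013)  # arbitrary seed, for repeatability only
t_prev = [0, 1, 1, 0, 1, 0, 0, 1]
t_next = [0, 0, 1, 1, 1, 0, 1, 1]
st = substitute_vector(t_prev, t_next, rng)
total = switching_bits(t_prev, st) + switching_bits(st, t_next)
print(st, total)
```

Whatever values R takes, each position where the two neighbours differ switches exactly once across the two steps, and every other position never switches. Hence `total` always equals the number of positions in which the two neighbouring vectors differ (3 in this example).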
This way, $ST_i^m$ will present more bits equal to $T_{i-1}^m$ and
$T_{i+1}^m$ than the original $T_i^m$, and thus the number of
switching bits in the new sequence $T_{i-1}^m$ - $ST_i^m$ - $T_{i+1}^m$
will be smaller than that in the original test vector sequence of
conventional LBIST $T_{i-1}^m$ - $T_i^m$ - $T_{i+1}^m$. Moreover,
$ST_i^m$ presents a random bit R in the positions where $T_{i-1}^m$
and $T_{i+1}^m$ are different. This way, the new sequence
$T_{i-1}^m$ - $ST_i^m$ - $T_{i+1}^m$ preserves the randomness of
the original sequence in these bits [6].
As will be shown in Section V, our approach allows a 
reduction of approximately 50% in the SA of the CUT with 
respect to conventional LBIST, while featuring the same fault 
coverage and test length. In turn, this leads to a significant 
reduction of the PD and, consequently, of the probability to 
generate false test fails. 
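The 50% figure for the mean SA can be motivated by a simple probabilistic argument. The following is only a sketch under an idealized assumption, not stated in this form in the paper: the bits of the test vectors and the injected bit R are taken to be independent and equiprobable. It says nothing about the maximum SA, which is evaluated by simulation in Section V.

```latex
\begin{align*}
% Conventional LBIST: a scan-chain bit switches between capture cycles
% i-1 and i whenever two independent, equiprobable bits differ.
\Pr\left[T_i^m(j) \neq T_{i-1}^m(j)\right] &= \tfrac{1}{2} \\
% Proposed approach: the bit switches only if the two neighbours differ
% and the injected bit R differs from the previous value.
\Pr\left[ST_i^m(j) \neq T_{i-1}^m(j)\right]
  &= \Pr\left[T_{i-1}^m(j) \neq T_{i+1}^m(j)\right] \cdot
     \Pr\left[R \neq T_{i-1}^m(j)\right]
   = \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{1}{4} \\
% By symmetry, the same holds for the step from ST_i to T_{i+1}.
\Pr\left[T_{i+1}^m(j) \neq ST_i^m(j)\right] &= \tfrac{1}{4}
\end{align*}
```

Under these assumptions, the expected fraction of scan flip-flops switching between two consecutive capture cycles drops from one half to one quarter, i.e., by 50%.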
Since our approach reduces the number of switching bits in 
the new test sequence compared to conventional LBIST, the 
power consumption associated to glitches due to unbalanced 
paths within the CUT is expected to be reduced as well. 
It is worth noticing that, if a higher SA reduction is 
required, our solution can be properly scaled by introducing 
two or more proper substitute test vectors (depending on the 
target SA reduction) between two original test vectors. 
IV. POSSIBLE IMPLEMENTATION 
Our solution exploits the fact that, for each scan chain m
and capture cycle i, at every scan CK cycle j (j=1..n), the PS
provides at its outputs the values $T_i^m(j)$ (m = 1..s), together
with many of their past/future values. In fact, if the number of
outputs s of the PS is considerably larger than the depth n of
the longest scan chain (i.e., if s >> n), as is usually the case
in actual designs [15], then it is very likely that the values of
$T_i^m(j)$ at n past and future CK cycles are provided by other
outputs of the PS. Nevertheless, the PS can be designed
to provide all the values necessary for the application of the
proposed approach. Therefore, given the PS design, at each
scan CK cycle j, we can determine the past/future logic value
($T_{i-1}^m(j)$/$T_{i+1}^m(j)$) of each $T_i^m(j)$ (m=1..s) by
observing proper outputs of the PS itself.
Denoting by $O_m$ (m = 1..s) the PS output feeding the scan
chain m, the logic value in the j-th position of the i-th test
vector of the scan chain m, that is $T_i^m(j)$, is given by:
$$T_i^m(j) = O_m(\xi),$$
where $\xi = n(i-1) + j$ is the current shift clock cycle,
counted as the total number of shift clock cycles applied
by the LBIST architecture from the beginning of the test.
Therefore, considering that each capture cycle i requires n shift
cycles, the logic values in position j of the vector applied to the
scan chain m at the previous capture cycle and of the vector to be
applied at the next one ($T_{i-1}^m(j)$ and $T_{i+1}^m(j)$,
respectively) are the values assumed by $O_m$ n cycles before and
after, respectively, the current shift clock cycle $\xi$. Namely:
$$T_{i-1}^m(j) = O_m(\xi - n); \quad T_{i+1}^m(j) = O_m(\xi + n).$$
Since the PS provides at its outputs many past/future values of
each output $O_m$, we can determine the values of $O_m(\xi - n)$
and $O_m(\xi + n)$ from the current values present at two proper
PS outputs. Therefore, there exist two values k and p, with
k, p = 1..s, k ≠ p and both different from m, such that:
$$O_m(\xi - n) = O_k(\xi); \quad O_m(\xi + n) = O_p(\xi).$$
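The indexing above can be checked with a few lines of Python. This is only a bookkeeping sketch: `shift_cycle` and `vector_bits` are hypothetical helper names, and the stream of one PS output is modelled as a plain list whose entry `stream[xi - 1]` holds the value of that output at shift clock cycle xi.

```python
def shift_cycle(i, j, n):
    """Global shift clock cycle xi = n*(i-1) + j (i and j are 1-based)."""
    return n * (i - 1) + j


def vector_bits(stream, i, n):
    """Bits 1..n of the i-th test vector, taken from one PS output stream."""
    return [stream[shift_cycle(i, j, n) - 1] for j in range(1, n + 1)]


n = 3
# Labels instead of bits, so that each position is recognizable.
stream = ["a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2", "c3"]
xi = shift_cycle(2, 1, n)        # first bit of the second vector
prev_bit = stream[xi - n - 1]    # value n cycles earlier: first bit of vector 1
next_bit = stream[xi + n - 1]    # value n cycles later: first bit of vector 3
print(xi, vector_bits(stream, 2, n), prev_bit, next_bit)
```

With n = 3, the first bit of the second vector is shifted at cycle 4, and the corresponding bits of the previous and next vectors are found 3 cycles before and after it on the same output stream.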
As an example, Fig. 3 shows a possible implementation of 
our proposed scheme, for the case in which the depth of the 
longest chain(s) is n.  
As shown in Fig. 3, our approach requires 2 multiplexers
(M1 and M2) and an XOR gate for each scan chain m. The
multiplexer M2 loads into the scan chain m: 1) at the
(i-1)-th and (i+1)-th capture cycles, the test vectors $T_{i-1}^m$
and $T_{i+1}^m$ generated by the PS, by setting the selection signal
int=0; 2) at the i-th capture cycle, the substitute vector $ST_i^m$,
provided by the multiplexer M1, by setting int=1. The signal int is
generated in such a way that it toggles between 0 and 1 at
consecutive capture cycles.
Fig. 2. Schematic representation of our approach: (a) sequence of test vectors filling each scan chain, (b) substitute test vector $ST_i^m$ using the random




As for the XOR gate, at each scan CK cycle j, it compares
the logic value at PS output $O_k(\xi)$ (equal to $T_{i-1}^m(j)$)
with the logic value at PS output $O_p(\xi)$ (equal to
$T_{i+1}^m(j)$). Thus, the XOR makes sel=0 if $O_k(\xi) = O_p(\xi)$
(or, equivalently, if $T_{i-1}^m(j) = T_{i+1}^m(j)$), indicating that
the logic value of bit $ST_i^m(j)$ should be equal to
$O_k(\xi) = T_{i-1}^m(j)$. Instead, the XOR gate makes sel=1 if
$O_k(\xi) \neq O_p(\xi)$ (or, equivalently, if
$T_{i-1}^m(j) \neq T_{i+1}^m(j)$), indicating that the logic value of
$ST_i^m(j)$ should be a random value R. Therefore, when int=1,
depending on the sel value, M1 selects whether to drive into the
scan chain m the value on $O_k(\xi)$ or the random value R.
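The behaviour of the XOR gate and the two multiplexers can be summarized by the following bit-level Python model. It is a behavioural sketch only, written from the description above rather than from Fig. 3; the function name `scan_input` and the argument names are hypothetical.

```python
from itertools import product


def scan_input(o_m, o_k, o_p, r, int_sel):
    """Bit shifted into scan chain m at the current scan CK cycle.

    o_m: PS output feeding chain m (bit of the original vector)
    o_k: PS output carrying the bit of the previous vector
    o_p: PS output carrying the bit of the next vector
    r:   random bit R
    int_sel: 1 while a substitute vector is being loaded, else 0
    """
    sel = o_k ^ o_p                     # XOR gate: 1 when the neighbours differ
    m1_out = r if sel else o_k          # multiplexer M1
    return m1_out if int_sel else o_m   # multiplexer M2


# Exhaustive check of the model against the substitution rule.
for o_m, o_k, o_p, r in product((0, 1), repeat=4):
    assert scan_input(o_m, o_k, o_p, r, 0) == o_m
    expected = o_k if o_k == o_p else r
    assert scan_input(o_m, o_k, o_p, r, 1) == expected
```

With int=0 the chain receives the unmodified PS output, so the original vectors are loaded unchanged; with int=1 the output follows the substitution rule bit by bit.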
In order to better illustrate our proposed approach, let us
consider the simple scan-based LBIST structure schematically
represented in Fig. 4 as an example. It consists of: i) a 4-bit
LFSR ($x_1(\xi)$, $x_2(\xi)$, $x_3(\xi)$, $x_4(\xi)$), with the
characteristic polynomial $p(x) = x^4 + x + 1$; ii) a phase shifter
(PS), expanding the 4 bits of the LFSR to s=12 outputs:
$O_1(\xi)..O_{12}(\xi)$ (providing signals $T_i^1(j)$ .. $T_i^{12}(j)$),
where 12 is the number of scan chains in the considered scan-based
LBIST structure. Additionally, for simplicity, but without loss of
generality, we suppose that the longest scan chain is composed of
n=3 scan flip-flops. Therefore, in the considered example, each
shift phase requires 3 scan CK cycles.
As shown in Fig. 4, the PS has been designed to provide, at every
shift cycle ξ: i) the current state of the LFSR
(i.e., $x_1(\xi)..x_4(\xi)$), on $O_2(\xi)$, $O_5(\xi)$, $O_8(\xi)$,
$O_{11}(\xi)$; ii) the state of the LFSR at 3 scan CK cycles before
the current state (i.e., $x_1(\xi-3)..x_4(\xi-3)$), on $O_1(\xi)$,
$O_4(\xi)$, $O_7(\xi)$, $O_{10}(\xi)$; iii) the state of the LFSR at
3 scan CK cycles after the current state (i.e.,
$x_1(\xi+3)..x_4(\xi+3)$), on the remaining signals $O_3(\xi)$,
$O_6(\xi)$, $O_9(\xi)$, $O_{12}(\xi)$.
The logic operations performed by the PS to compute
$O_1(\xi)..O_{12}(\xi)$ as a function of the current state of the
LFSR $x_1(\xi)..x_4(\xi)$ are reported in the second column of
Table I. As can be seen, all $O_m(\xi)$ signals (m=1..12) are
expressed as linear combinations of the present state of the LFSR,
and can be computed by simple XOR trees.
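The reason why past and future LFSR states are XOR combinations of the current state can be illustrated with the Python sketch below. It assumes one particular LFSR structure for $p(x) = x^4 + x + 1$ (a shift towards $x_1$ in which $x_4$ receives $x_1 \oplus x_2$); the structure in Fig. 4 may follow a different convention, so the resulting XOR trees are not necessarily those of Table I. All function names are hypothetical.

```python
from functools import reduce
from itertools import product
from operator import xor

PERIOD = 15  # 2**4 - 1: period of a 4-bit LFSR with a primitive polynomial


def step(state):
    """One clock of the assumed LFSR: x4 receives x1 XOR x2."""
    x1, x2, x3, x4 = state
    return (x2, x3, x4, x1 ^ x2)


def advance(state, cycles):
    """Clock the LFSR the given number of times."""
    for _ in range(cycles):
        state = step(state)
    return state


def xor_taps(offset):
    """For k = 1..4, the current state bits whose XOR gives x_k(xi + offset)."""
    cycles = offset % PERIOD  # 3 cycles back is the same as 12 cycles forward
    taps = [[] for _ in range(4)]
    for src in range(4):
        unit = tuple(int(b == src) for b in range(4))
        image = advance(unit, cycles)  # by linearity, unit vectors suffice
        for k in range(4):
            if image[k]:
                taps[k].append(src + 1)
    return taps


def predict(state, offset):
    """Evaluate the XOR trees on the current state."""
    return tuple(reduce(xor, (state[s - 1] for s in t), 0)
                 for t in xor_taps(offset))


# Self-check over all 16 states: the XOR trees match clocking the LFSR.
for s in product((0, 1), repeat=4):
    assert predict(s, 3) == advance(s, 3)
    assert predict(advance(s, 3), -3) == s

print(xor_taps(+3))  # [[4], [1, 2], [2, 3], [3, 4]]
print(xor_taps(-3))  # [[1, 2, 3, 4], [1, 3, 4], [1, 4], [1]]
```

Under this assumed structure, each bit of the state 3 cycles ahead needs at most one 2-input XOR, while the state 3 cycles back needs slightly deeper XOR trees. In both cases the PS outputs remain linear functions of the current LFSR state.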
As previously discussed, since the depth of the longest scan
chain is n=3, for each scan chain m (m=1..12) and at each scan
CK cycle ξ, our approach needs to determine the value present
on the considered scan chain 3 scan CK cycles before (i.e.,
$O_m(\xi-3)$) and 3 scan CK cycles after (i.e., $O_m(\xi+3)$), to
determine $T_{i-1}^m(j)$ and $T_{i+1}^m(j)$, respectively. The third
and fourth columns of Table I report the PS outputs, or output
combinations, giving $O_m(\xi-3)$ and $O_m(\xi+3)$ for each
$O_m(\xi)$ (m = 1..12).
From Table I, we can observe that the logic values $O_m(\xi-3)$
and $O_m(\xi+3)$, for m = 1, 2, 5, 8, 11 and 12, are equal to
the values assumed by other outputs of the PS at the current
CK cycle ξ.
Instead, for m = 3, 4, 6, 7, 9 and 10, the past/future values
of $O_m(\xi)$ are not directly present on other outputs of the PS.
Nevertheless, they can be simply obtained as a linear
combination of the current outputs of the PS. This requires
a small extra area overhead, due to the additional XOR gates.
However, this extra area is negligible in practical designs with
a large number of PS outputs.
V. VERIFICATION AND COMPARISON 
In this section, we first report the results of the simulations
that we have performed with the Synopsys Design Compiler
tool to verify the effectiveness of our approach in reducing PD
during scan-based LBIST. In particular, we have evaluated the
SA at the outputs of the scan chains between two consecutive
capture cycles. We also report the results of the Synopsys
TetraMAX simulations that we have performed to evaluate the
fault coverage (FC) achieved with our solution.
Fig. 3. Schematic representation of a possible implementation of our proposed approach.
Fig. 4. Schematic representation of the simple LFSR and phase shifter considered here to illustrate the operation of our approach.
TABLE I. PS PERFORMED FUNCTIONS AND GENERATED OUTPUTS.
Finally, we also compare effectiveness and costs of our 
approach with those of conventional scan-based LBIST [12] 
(hereinafter referred to as Conv-LBIST) and the solution in [2], 
which provides a PD reduction similar to our proposed scheme. 
A. Verification and Comparison with Conv-Scan-Based LBIST 
We have considered the five ISCAS’89 benchmarks
reported in Table II and, for all circuits, we have used a 20-bit
LFSR, with the maximal-length characteristic polynomial
$p(x) = x^{20} + x^3 + 1$ [17]. The number of scan chains employed
for each benchmark circuit is reported in Table II. The PS has
been implemented so as to minimize area overhead,
according to the rules described in [15].
For our solution and the Conv-LBIST, Fig. 5 shows the
distribution of SA in all scan chains between any two
consecutive test vectors, after the application of 10000 test
vectors. As can be seen, for all considered cases our solution
considerably reduces (by approximately 50%, as
expected) the maximum SA ($SA_{MAX}$ in Table II) with respect to
the Conv-LBIST. Therefore, our solution allows a considerable
PD reduction compared to conventional scan-based LBIST.
Moreover, from Fig. 5 we can observe that, for all considered
cases, our solution also reduces the mean SA by
approximately 50% compared to Conv-LBIST. As a result, the
total power associated with the capture cycles is also
considerably reduced. As shown later, these results are
achieved without increasing the test length.
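The trend described above can be reproduced qualitatively with a small Monte Carlo experiment in Python. This is a sketch under simplifying assumptions, not the simulation flow used in the paper: test vectors are ideal random bit strings from Python's `random` module instead of LFSR/PS outputs, all scan flip-flops are lumped into one vector of hypothetical width 1000, and SA is counted as the number of bits that differ between consecutive vectors.

```python
import random


def hamming(a, b):
    """Number of differing bits between two integers used as bit vectors."""
    return bin(a ^ b).count("1")


def simulate(width=1000, n_vectors=2001, seed=1):
    """Return the per-step SA of the conventional and substituted sequences."""
    rng = random.Random(seed)
    conv = [rng.getrandbits(width) for _ in range(n_vectors)]
    # Proposed sequence: keep even-indexed vectors and replace odd-indexed
    # ones with substitutes built from their two neighbours.
    prop = list(conv)
    for i in range(1, n_vectors - 1, 2):
        diff = conv[i - 1] ^ conv[i + 1]   # positions where neighbours differ
        r = rng.getrandbits(width)         # random bits R (software RNG)
        prop[i] = (conv[i - 1] & ~diff) | (r & diff)
    sa_conv = [hamming(a, b) for a, b in zip(conv, conv[1:])]
    sa_prop = [hamming(a, b) for a, b in zip(prop, prop[1:])]
    return sa_conv, sa_prop


sa_conv, sa_prop = simulate()
mean_ratio = sum(sa_prop) / sum(sa_conv)
max_ratio = max(sa_prop) / max(sa_conv)
print(round(mean_ratio, 2), round(max_ratio, 2))
```

With ideal random vectors, both ratios are expected to lie in the neighbourhood of 0.5. The values reported in Fig. 5 and Table II come from the actual LFSR, PS and benchmark circuits, which this model does not capture.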
Fig. 6 shows the total number of 1s loaded on each scan FF (SFF)
after the application of all 10000 test vectors for both
Conv-LBIST and our solution. The large benchmark s38584 has been
considered. We can observe that the number of 1s in each SFF
of our solution (Fig. 6(b)) is approximately equal to 5000 (half
the number of applied test vectors, and approximately equal to
the number of 0s), which matches the number of 1s in each
SFF in the Conv-LBIST (Fig. 6(a)). Thus, we can expect that
our solution does not impact the randomness of test vectors
with respect to a conventional scan-based LBIST, as shown in
[6]. The preservation of the randomness of the test vectors, as
described in [6], and as proven later in this section, allows us to
reasonably expect that the FC achieved with our approach will
be approximately the same as that of Conv-LBIST.
Table II reports the values of the FC and the $SA_{MAX}$ for our
solution and conventional scan-based LBIST, as well as their
relative variations, for five different ISCAS’89 benchmarks, after
the application of 10000 test vectors. The relative variations of
FC and $SA_{MAX}$ are calculated as:
$$\Delta FC = 100 \cdot \frac{FC_{\text{OUR}} - FC_{\text{Conv-LBIST}}}{FC_{\text{Conv-LBIST}}}, \qquad \Delta SA_{MAX} = 100 \cdot \frac{SA_{MAX,\text{OUR}} - SA_{MAX,\text{Conv-LBIST}}}{SA_{MAX,\text{Conv-LBIST}}}.$$
As anticipated, we can observe that, in all cases, our
solution considerably reduces (by approximately 50%)
the $SA_{MAX}$ with respect to Conv-LBIST for the same number
of applied test vectors #TV, while featuring a similar FC.
Therefore, with respect to conventional scan-based LBIST, our
solution considerably reduces the PD, without
impacting either the FC or the test length.
Fig. 5. Distribution of the total switching activity in all scan-chains between two following test patterns, for both the conventional scan-based LBIST and our approach, for the benchmarks: (a) s9234, (b) s13207, (c) s38417 and (d) s38584.
Fig. 6. Total number of 1s on each scan FF after the application of all 10000 test vectors to the s38584 benchmark implemented using: (a) conventional scan-based LBIST, (b) our proposed scan-based LBIST solution.
As for area overhead, we evaluated it for the two largest
benchmarks (i.e., s38417 and s38584) as the
relative area increase in the PS, due to the extra XOR gates that
are needed to generate the missing past/future states in the PS.
For the s38417 benchmark circuit, this area overhead is
+0.3%, while for the s38584 circuit, it is +1.8%. Therefore, the
area increase required by our solution over the area of the
whole LBIST architecture is negligible.
It is worth noting that the increase in layout
complexity over conventional scan-based LBIST is negligible.
In fact, since the additional circuitry required by our solution
can be placed close to the PS, the routing of the signals from the
PS to the scan chains is not significantly modified.
B. Comparison with Alternative Solution
We have compared our solution with the alternative
technique proposed in [2] in terms of: 1) $SA_{MAX}$ in the scan
chains between consecutive capture cycles; 2) number of test
vectors (#TV) required to achieve a target FC. For comparison
purposes, we have considered the same benchmarks and
implementation of our solution as in the previous subsection.
The solution in [2] has been implemented considering
two scan-chain groups (i.e., the case of N=2 described in [2]),
thus obtaining a value of $SA_{MAX}$ similar to that
obtained with our approach.
The results of the comparison are summarized in
Table III, where the relative variations of $SA_{MAX}$ and #TV are
calculated as:
$$\Delta SA_{MAX} = 100 \cdot \frac{SA_{MAX,\text{OUR}} - SA_{MAX,[2]}}{SA_{MAX,[2]}}, \qquad \Delta\#TV = 100 \cdot \frac{\#TV_{\text{OUR}} - \#TV_{[2]}}{\#TV_{[2]}}.$$
As can be seen, to achieve the same FC, the compared
solutions present a similar $SA_{MAX}$, and thus both significantly
reduce the PD with respect to Conv-LBIST. However,
the solution in [2] requires more than twice (in the best case)
the number of test vectors required by both our approach and
Conv-LBIST. Therefore, our solution significantly reduces
the total test time with respect to [2], while
achieving the same FC and a similar PD reduction.
VI. CONCLUSIONS 
We have presented a novel approach to reduce peak power
and power droop during the capture cycles in scan-based Logic
BIST, thus reducing the probability that the induced delay
effect is erroneously recognized as a delay fault,
with the consequent erroneous generation of a test fail. We showed
that our approach reduces by approximately 50% the
switching activity (SA) in the scan chains between consecutive
capture cycles, with respect to standard scan-based LBIST.
This is achieved by exploiting the operation of the phase
shifter, usually inserted in LBIST structures in order to reduce
the correlation among the test patterns applied to adjacent scan
chains. We also showed that our approach requires a
significantly lower test time compared to the alternative, recent
technique in [2]. The proposed approach exhibits no impact on
test coverage and test time, while requiring a very low
area overhead. Moreover, it is fully compatible with
standard scan-based LBIST architectures.
REFERENCES 
[1] P. Girard, et al., “A Modified Clock Scheme for a Low Power BIST
Test Pattern Generator”, in Proc. of IEEE VLSI Test Symp., 2001, pp.
306 – 311.
[2] S. M. Reddy, et al., “A Low Power Pseudo-Random BIST Technique”,
in Proc. of IEEE Int’l On-Line Testing Workshop, 2002, pp. 140 – 144.
[3] M.Tehranipoor, M. Nourani, N. Ahmed, “Low Transition LFSR for 
BIST-Based Applications”, in Proc. of 14th Asian Test Symp., 2005, pp. 
138 – 143. 
[4] Y. Huang, X. Lin, “Programmable Logic BIST for At-Speed Test”, in
Proc. of 16th Asian Test Symp., 2007, pp. 295 – 300.
[5]  I. Polian, A. Czutro, S. Kundu, B. Becker, “Power Droop Testing”,  
IEEE Design & Test of Computers 24(3), 2007, pp. 276 – 284.  
[6] M. Nourani, et al., “Low-Transition Test Pattern Generation for BIST-
Based Applications”, IEEE Trans. on Comp., Vol. 57, No. 3, March 
2008, pp. 303 – 315. 
[7] X. Lin, E. Moghaddam, N. Mukherjee, J. Tyszer, “Power Aware 
Embedded Test”, in Proc. of IEEE Asian Test Symp., 2011, pp. 511 – 
516. 
[8]  Y. Sato, S. Wang, T. Kato, Kohei Miyase, S. Kajihara, “Low Power 
BIST for Scan-Shift and Capture Power”, in Proc. of IEEE Asian Test 
Symp., 2012, pp. 173 – 178. 
[9] X. Wen, Y. Yamashita, S. Kajihara, L-T. Wang, K. Saluja, K. 
Kinoshita, “On Low-Capture-Power Test Generation for Scan Testing”, 
in Proc. of IEEE VLSI Test Symp., 2005, pp. 265 – 270. 
[10] E. Moghaddam, J. Rajski, S. Reddy, “At-Speed Scan Test with Low
Switching Activity”, in Proc. of IEEE VLSI Test Symp., 2010, pp. 177
– 182.
[11] L-T Wang, C. Stroud, N. Touba, “System-on-Chip Test Architectures:
Nanometer Design for Testability”, Morgan Kaufmann, San Francisco,
Nov. 2007.
[12] G. Hetherington, T. Fryars, N. Tamarapalli, M. Kassab, A. Hassan, J. 
Rajski, “Logic BIST for Large Industrial Designs: Real Issues and Case 
Studies”, in Proc. of Int. Test Conference, 1999, pp. 358 – 367. 
[13] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, J. Figueras, S. 
Manich, P. Texeira, M. Santos, “Low-Energy BIST Design: Impact of 
the LFSR TGP Parameters on the Weighted Switching Activity”, in 
Proc. of IEEE Int’l Symp. on Circuits and Syst., 1999, pp. 110 – 113. 
[14] B. Nadeau-Dostie, K. Takeshita, J.-F. Cote, “Power-Aware At-Speed
Scan Test Methodology for Circuits with Synchronous Clocks”, in Proc.
of IEEE Int’l Test Conference, 2008, paper 9.3.
[15] J. Rajski, N. Tamarapalli, J. Tyszer, “Automated Synthesis of Large 
Phase Shifters for Built-In Self-Test”, in Proc. of Int. Test Conference, 
1998, pp. 1047 – 1056. 
[16] A. Mishra, N. Sinha, Satdev, V. Singh, S. Chakravarty, A. D. Singh, 
“Modified Scan Flip-Flop for Low Power Testing”, in Proc. of Asian 
Test Symposium, 2010, pp. 367 – 370. 
[17]   www.xilinx.com/support/documentation/application_notes/xapp052.pdf 
TABLE III. COMPARISON OF THE FAULT COVERAGE (FC), NUMBER OF TEST 
VECTORS (#TV) TO ACHIEVE A TARGET FC, AND MAXIMUM SWITCHING 
ACTIVITY (SAMAX) OF OUR APPROACH AND THE SOLUTION IN [2]. 
