Frequency-Independent Warning Detection Sequential for Dynamic Voltage and Frequency Scaling in ASICs by Das, Bishnu Prasad & Onodera, Hidetoshi
Title Frequency-Independent Warning Detection Sequential for Dynamic Voltage and Frequency Scaling in ASICs
Author(s) Das, Bishnu Prasad; Onodera, Hidetoshi




© 2014 IEEE. Personal use of this material is permitted.
Permission from IEEE must be obtained for all other uses, in
any current or future media, including reprinting/republishing
this material for advertising or promotional purposes, creating
new collective works, for resale or redistribution to servers or






Frequency-Independent Warning Detection
Sequential for Dynamic Voltage and Frequency
Scaling in ASICs
Bishnu Prasad Das and Hidetoshi Onodera, Senior Member, IEEE
Abstract—In this paper, a metastability immune warning
flip-flop (FF) is proposed which consists of an edge detector,
a warning window generator and a warning detector along
with a traditional FF. The delayed data is monitored during
the warning window to flag a warning signal before the data
enters the erroneous zone. In this scheme, the warning window
is independent of the input clock frequency and hence is suitable
for frequency scaling applications. A 16-bit Kogge-Stone adder
that uses the warning FF for dynamic voltage and frequency
scaling (DVFS) is implemented in 65 nm technology. The warning FF
based DVFS allows elimination of safety margins and operates
up to the point of first warning of the adder without any erroneous
results. Experiments were conducted with different supply
voltages, phase-shifted clocks and process conditions. The circuit
helps to determine when to stop further reduction in supply
voltage by producing the warning signal with pre-defined timing
slack in DVFS applications. The test chip results demonstrate
that the proposed circuit can track critical path delays of 2.4
ns to 7.5 ns at warning voltages of 1.15 V to 0.72 V, respectively.
The measured results from 10 different chips show the effectiveness
of the proposed concept across process variation.
Index Terms—Resilient circuits, error detection sequential
(EDS), warning prediction sequential, dynamic voltage scaling
(DVS).
I. INTRODUCTION
As the process scales to the nanometer regime, there is a
great demand to achieve maximum performance within a stringent
power envelope. Variation is a major bottleneck in achieving
such a goal. Variations can be static, like process variation,
or dynamic, like environmental variations. Static process
variations can be handled by chip binning or post-silicon
supply/body voltage tuning. However, dynamic variations are
transient phenomena that depend on time and environment.
Examples of dynamic variations are supply voltage
variation, temperature variation, and aging effects such as
negative bias temperature instability (NBTI). The traditional
approach uses the worst case supply voltage
and frequency to guard against such effects [1]–[3].
Adaptive techniques have been developed to eliminate design
margin by dynamically modifying the supply voltage, the body
bias and the clock frequency [4], [5].
Bishnu Prasad Das was with the Department of Communications and
Computer Engineering, Graduate School of Informatics, Kyoto University,
Kyoto, Japan, and is presently with Department of Electrical and Com-
puter Engineering, Carnegie Mellon University, Pittsburgh, USA (e-mail:
bishnu.iisc@gmail.com).
Hidetoshi Onodera is with the Department of Communications and Com-
puter Engineering, Graduate School of Informatics, Kyoto University, Kyoto,
Japan and also with JST, CREST. (e-mail:onodera@i.kyoto-u.ac.jp).
A traditional dynamic voltage scaling (DVS) approach em-
ploys look-up tables which store pre-characterized voltage and
frequency points [1]–[3]. In this approach, supply voltage of
the circuit is determined based on the frequency of the crit-
ical path monitoring circuit. However, the voltage-frequency
points are characterized by considering the worst case process,
voltage and temperature variations. Hence, the designer loses
some margins in this kind of pessimistic choice of voltage and
temperature points.
The Razor I flip-flop points the way toward reducing the worst
case margin [5]. It uses an error-detecting FF on the critical
path of the design to reduce the supply voltage, which finds
the first failure point for a given frequency. It allows a
reduction in design margins, leading to significant energy
savings. However, the technique requires additional circuitry
such as a shadow latch and a meta-stable detector for error detection.
The crucial limitation of this technique is that it checks for the
error at the output of a FF, which requires a meta-stable
detector to resolve the output of the FF. The canary FF uses
delayed data and a shadow FF along with the traditional FF
to detect the timing error [6], [16]. Since it compares the data
at the output of a FF, it also requires a meta-stable detector.
Razor II is another flavor of Razor I where data transition
is checked at the input of a FF [7], [8]. Hence, it does not
require a meta-stable detector. However, Razor I and Razor II
are used in a processor framework where the corrective action
is performed using re-execution of instructions.
Authors in [9], [10] proposed an error detection sequential
using transition detector with time borrowing (TDTB) and
double sampling with time borrowing (DSTB). However, all
these techniques are useful in a processor architecture where
instruction replay mechanism is readily available.
The yield enhancement technique using the defect predictive
FF (DPFF) is proposed in [11]. The DPFF produces a warning
signal based on the timing error, which is used to replace the
faulty block with a working block. However, the warning signal is
generated by comparing the outputs of two FFs. An issue arises
when the output of one of the FFs falls into the meta-stable
zone. Hence, it requires extra hardware such as a meta-stable
detector to resolve the data at the output of the FF.
A tunable replica circuit (TRC) is proposed in [12] which can be
used to tune the supply voltage, after fabrication, to match the
critical path on the die. An error detecting sequential is
used in the TRC to report a timing error if the delay of the
circuit exceeds the clock period of the TRC. However, this
technique requires post-silicon calibration, which increases
testing cost. Recently, error detection in a FF using a transition
detector was proposed in [13]. This technique uses an on-die
adaptive frequency controller to adjust the frequency based on
workload and error rate. The major limitation of [12], [13]
is that they require an error recovery mechanism, which is available
in the case of a processor. However, in the case of an application
specific integrated circuit (ASIC), the error would lead to
malfunctioning of the whole system.
Majority voting and glitch filtering are mostly used in single
event upset (SEU) and single event transient (SET) hardened
FF design. In the case of majority voting, delayed clocks
are used in three FFs in parallel and the outputs of the FFs
are compared using a voting unit [14]. The area overhead is
large as three FFs are used. In the case of glitch filtering, the
glitch generated due to SET is filtered by delaying the data
path [15]. The major difference between majority voting or
glitch filtering and our work is that majority voting and glitch
filtering are used for error correction whereas the proposed FF
is used for warning detection. Dynamic voltage and frequency
scaling (DVFS) requires a control signal that indicates when to stop
further reduction of the supply voltage or clock frequency. Hence,
majority voting and glitch filtering are not suitable for DVFS
applications.
Wear-induced failure prediction for various parts of the
microprocessor is presented in [17] using online delay
monitoring and statistical analysis of delay data. The basic
concept of the delaying mechanism was first proposed by Kehl [18]
for automatic tuning of hardware by monitoring the performance
of the system at different time intervals. Incoming data
is sampled at three different sampling times. All three samples
are compared to the incoming data and the resultant signal is
used to adjust the clock frequency. In [18], the incoming data
is sampled at three different sampling times, whereas in our
proposed technique the delayed data is sampled once by the edge
detector. The area requirement is higher in [18] compared to the
proposed technique.
Technology scaling allows packing billions of devices in a small
area. However, the impact of process variation is rapidly
increasing in scaled technology nodes. To counter process
variation, statistical static timing analysis (SSTA) has been
developed, which incorporates intra-die and inter-die
variations to provide statistical guarantees on the timing budget of
the circuit. The voltage and temperature scalable SSTA [19]
has been developed to analyze the statistical behavior of the
circuits at different supply voltage and temperature conditions
which is helpful to reduce the timing margins prior to circuit
fabrication. The post silicon tuning [20] is emerging as a
technique to reduce timing margins after the fabrication of
the chip.
Authors in [21], [22] proposed two aging sensors:
(1) a stability checker design and (2) a double sampling design.
Both the proposed technique and the stability checker design
take a similar approach, as both use an edge detection circuit.
However, the stability checker design in [21], [22] is frequency
dependent whereas the proposed technique is not frequency
dependent. To make a fair comparison, we have compared
the edge detection approach of [21], [22] with the proposed
technique in the paper. The double sampling design in [21],
[22] and the canary FF in [6] take a similar approach, as both
use two FFs. In this approach, the warning window is not
frequency dependent.
An aging sensor for the combinational part of the critical
path is proposed in [23]. The circuit compares two code words
in order to predict the aging effect. However, a comparator is
needed to monitor the two code words, which in turn increases
the area of the sensor circuit. A timing slack monitoring
circuit with a window generator and a sensor cell is presented
in [24]. However, a large number of buffers is required in the window
generator circuit, which leads to difficulty in balanced
clock tree synthesis and large clock power consumption. The
major contribution of this paper is the use of delayed
data in an edge detector, which makes the warning FF
frequency independent without any effect on the clock network.
To the authors' knowledge, this concept has not been published so far.
Previously published warning FFs use either the concept of
double sampling [21] and the canary FF [6], or the stability checker
with edge detection [21], which is not frequency independent.
The proposed technique in this paper is a metastability
immune warning FF which is used in an ASIC framework [25].
The proposed circuit uses the concept of delayed data in
the transition/edge detector, which flags a “warning” instead
of an “error”. Since it reports a warning, the appropriate
data is captured correctly by the FF in the same clock cycle.
The contribution of this paper includes an on-chip demonstration
of the proposed warning FF for DVFS applications in an ASIC
framework. The measured results from a test chip show the
effectiveness of the proposed concept across supply voltage, clock
frequency (or phase-shifted clock) and process conditions.
The paper is organized as follows. Section II describes
the concept of warning detection by defining various timing
parameters. The proposed warning flip-flop is described in
Section III. The test structure and testing strategies are
explained in Section IV. The experimental results are presented
in Section V. The use of the warning FF in a real application is
described in Section VI and the conclusion is presented in
Section VII.
II. CONCEPT OF WARNING DETECTION
Setup time: It is the time before the clock edge during
which the data must be stable so that the data can be
sampled properly by the FF. The worst case setup time is the
worst value of the setup time after performing statistically
meaningful simulations (e.g. Monte Carlo) considering
variation in process parameters such as the gate length
and the threshold voltage and in environmental parameters such
as the supply voltage and the temperature. The worst case
setup time is denoted as twsetup.
Hold time: It is the time after the clock edge during which
the data must be stable so that the data can be sampled
properly by the FF. The worst case hold time is found
by following the method explained in the definition of the worst
case setup time. The worst case hold time is denoted as twhold.
In our warning detection scheme, the warning is detected by
monitoring the delayed data transition. In the case of the delayed
data, the warning window twarning is after the rising edge of
the clock as shown in Fig. 1(a). The minimum delay
between the delayed data and the direct data is equal to the sum of
twsetup and twarning as shown in Fig. 1(a), so that the warning
window appears after the rising edge of the clock. The
same amount of delay between direct data and delayed data is
maintained in both Fig. 1(a) and Fig. 1(b). However, the delay
is shown only in Fig. 1(a) and not in Fig. 1(b),
to keep the figure clear. When the data arrives early,
both the data and the delayed data are outside the warning window
as shown in Fig. 1(a). When the data arrives late, the data is
outside the warning window and the delayed data is inside
the warning window as shown in Fig. 1(b). Since it is the delayed
data that is inside the warning window, the direct data is safely sampled
by the flip-flop at the rising edge of the clock as shown in
Fig. 1(b). The corrective action should be taken in the next
clock period so that the delayed data transition does not
happen in the warning window. In this work, the delayed data
transition is monitored in the warning window to flag the warning
signal.
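The early and late cases of Fig. 1 can be sketched as a small timing classifier. This is an illustrative model under stated assumptions, not the authors' circuit: the function name, the "missed" outcome and the example numbers are hypothetical. Times are measured relative to the rising clock edge at t = 0, and the delayed copy of the data transitions `t_buffer` after the direct data.

```python
def classify_arrival(t_data, t_buffer, t_warning, t_wsetup):
    """Classify one data transition relative to a rising clock edge at t = 0.

    t_data    : arrival time of the direct data transition (negative = before the edge)
    t_buffer  : delay between the direct data and the delayed data
    t_warning : width of the warning window, which opens at the clock edge
    t_wsetup  : worst case setup time of the main FF
    All arguments share one time unit (the examples below assume ps).
    """
    t_delayed = t_data + t_buffer
    if t_data > -t_wsetup:
        # The direct data violates setup: the main FF may capture wrong data.
        return "error"
    if t_delayed < 0:
        # The delayed data settles before the clock edge: plenty of slack.
        return "safe"
    if t_delayed <= t_warning:
        # The delayed data toggles inside the window: warn, data still captured.
        return "warning"
    # The delayed data toggles after the window closed although setup is met.
    # This can only happen if t_buffer exceeds t_wsetup + t_warning.
    return "missed"


# Hypothetical numbers: 50 ps setup, 100 ps window, 150 ps buffer delay.
for t in (-300, -100, -20):
    print(t, classify_arrival(t, t_buffer=150, t_warning=100, t_wsetup=50))
```

With the buffer delay set to exactly twsetup + twarning, every transition that meets the setup time lands either before or inside the window, so the "missed" branch is never reached.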
III. PROPOSED WARNING FLIP-FLOP
The circuit consists of an edge detector, a warning window
generator and a warning detector, along with a
traditional FF, as shown in Fig. 2. A timing error can be
monitored by two methods: (i) monitoring the input data
transition during the warning window and (ii) comparing the
output of a FF with that of another FF. The latter method
requires a meta-stable detector to resolve the time critical data.
In this work, the data transition at the input of the
FF is monitored to prevent the timing error. In the proposed
approach, the output of the warning detector would be in a
state of partial pull down if there is a small overlap between
the data edge and the warning window. The partial pull down of
the warning detector would only occur when the input data
edge is on the borderline of generating a warning signal. It
would only affect the adaptive response of the controller, as the
warning signal is not evaluated properly. The main FF would still
sample the correct value. Hence, the design is immune to the
impact of a partial pull down in the warning detector.
To avoid such an effect, the warning window width
should include the maximum propagation delay of the warning
detector considering worst case process variations and the number
of fan-outs of the warning signal. Hence, this approach does
not require a meta-stable detector and is otherwise termed the
metastability immune method, as in Razor II [7]–[10]. Instead of
feeding the direct data to the edge detector, as is done in an
error detection sequential, the delayed data is fed to the edge
detector to detect the warning in this work. The FF works with
the direct data whereas the delayed data is used in the edge
detector.
Figure 3 shows the conceptual timing diagram of the warning
flip-flop, which does not include the internal delay of each block.
The delayed data transition is monitored during the warning
window. The edge of the delayed data is generated using the edge
detector. The warning window is generated from the clock















Fig. 1. Proposed warning detection using the delayed data showing the
warning window after rising edge of the clock. (a) Early data arrival, and (b)
Late data arrival
is generated at the rising and falling transitions of the delayed
data whereas the warning window is created only at the rising
edge of the clock as in Fig. 3. In the case of late data arrival,
the delayed data enters the warning window first and flags a
warning signal. Since the warning signal is flagged based on
the transition of the delayed data, the direct data is sampled
safely by the FF before entering the erroneous window. The
warning signal is flagged at the rising edge of the third clock
signal as in Fig. 3. Since the warning signal is generated from
the transition of the delayed data, no erroneous data transition
occurs at the output of the FF. The corrective action can take
many clock cycles depending on the response time of the controller.
The corrective action includes adjusting the supply voltage, the
clock frequency and the body bias.
It is easy to control the slow changing variation such as
temperature variation and transistor aging as the change in





















Fig. 2. Block diagram of the proposed warning flip-flop
frequency voltage droop, the change in the critical path delay
is fast. The proposed warning FF has one limitation that is
common to any warning FF: it cannot function correctly
when the time available for detecting and responding to the
variation is too short to avoid an actual timing
violation. The proposed FF cannot predict timing
violations due to a high frequency voltage droop; rather, these
types of variations still require a guard band similar to that of
a conventional design.
The maximum critical path delay of the proposed technique
is larger than that of the conventional design because a number of
buffers are added in the path of the edge detector. In our proposed
technique, configurable delay buffers are used in the path of the edge
detector, which can adjust the critical path delay based on the
process corner to ensure the proper functionality of the warning
FF.
In the proposed warning FF, extra circuits such as an
edge detector, a warning window generator and a warning detector
are added to a conventional FF specifically to monitor the setup
time violation of the FF. However, the proposed FF is not
designed for monitoring the hold time constraint. We suggest that
hold time violations be fixed at circuit design
time for all operating conditions. One way of fixing
the hold constraint is to simulate the design at the best/fast process
corner. Hence, like [21], the proposed FF has no additional
hold time penalty compared to a conventional FF. In other
words, the hold time constraint does not need to be included in
the width of the warning window.
A. Edge Detector
Normal edge detectors use either static CMOS logic
style [13] or dynamic logic style [26]. However, the proposed
edge detector is a pass transistor based design as shown
in Fig. 4 and its conceptual timing diagram is shown in
Fig. 5. The proposed edge detector consists of two inverters
I1 and I2, a conditional inverter I3 and a transmission gate
T as shown in Fig. 4. In this approach, the output of the
conditional inverter I3 and the output of the transmission
gate T are connected to generate the output of the edge
detector. When Delayed data = 1, the conditional inverter
I3 behaves as a normal inverter and its output acts as the





















Fig. 3. Conceptual timing waveforms of warning flip-flop
is not operational. When Delayed data = 0, the transmission
gate T is operational and its output acts as the output of the
edge detector. In this case, the conditional inverter I3 is not
operational. The control signals (i.e. the output of I1 and the output
of I2) for the transmission gate T and inverter I3 are the same,
which allows either inverter I3 or transmission gate T to be
active at any one time. The input signal (i.e. delayed data) for
transmission gate T and inverter I3 is the same. However, the
delays of inverter I3 and transmission gate T are not the same.
Hence, there would be a small amount of race due to the delay
difference between inverter I3 and transmission gate T. We
have performed extensive Monte Carlo simulations as well as
corner simulations at the typical, slow, fast, fast-slow
and slow-fast corners of the proposed edge detector at a low
supply voltage of 0.5 V to check for contention or race issues.
The results show that the proposed edge
detector works properly down to 0.5 V. Hence, this edge detector
can be used in low supply voltage applications. We have also
performed a substrate noise simulation [33] of the edge
detector, which shows that it is immune to substrate noise even
at a low supply voltage of 0.5 V.
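At the logic level, an edge detector that produces a pulse on both rising and falling transitions behaves like an XOR of a signal with a delayed copy of itself. The sketch below is only that functional abstraction; it does not model the pass-transistor structure of Fig. 4, and the function name, the sample-based time axis and `width_steps` are assumptions made for illustration.

```python
def edge_pulses(data, width_steps):
    """Return a pulse of `width_steps` samples at every transition of `data`.

    data        : list of 0/1 samples of the (delayed) data signal
    width_steps : delay of the internal path in samples; it sets the pulse
                  width (hypothetical stand-in for the buffer chain)
    """
    pulses = []
    for i, d in enumerate(data):
        # Before the delayed path has history, assume the initial value.
        past = data[i - width_steps] if i >= width_steps else data[0]
        pulses.append(d ^ past)
    return pulses


data = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
print(edge_pulses(data, 2))
# -> [0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0]
```

In this abstraction, lengthening the internal delay widens the pulse, which corresponds to inserting more buffers to create a wider edge.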
The delay of an inverter is very small compared to other
gates in any technology node; hence, more buffers
are needed than the one buffer and two inverters
shown in Fig. 4. The buffer B1 and inverter I1 before the
node E1 in Fig. 4 determine the width of the edge as shown
in the conceptual timing diagram in Fig. 5. Since these buffers
are also needed for the edge detector designs in [13], [24],
[26], these extra buffers are excluded for each design in the
comparison in Table I. Inserting more buffers creates a wider
edge, which is needed for proper functioning of the warning
FF. In our approach, the buffers are inserted before inverter
I1 in Fig. 4 to create a wider edge. We have compared
different types of edge detectors based on area, speed, power
dissipation, maximum response time and minimum supply
voltage of operation; the results are presented in Table I. To make
a fair comparison, two extra delay buffers are added to each
type of edge detector. The power dissipation
of the proposed edge detector is 9 µW, which is the lowest among
all types of edge detectors. The dynamic implementation of
the previous edge detectors increases the power dissipation.
The maximum response time of the proposed edge detector
is 12 ps, which is the lowest among all the types of edge detectors.
This is because the previous implementations use
stacking of transistors, which increases the response time.
The minimum supply voltage of operation of the different edge
detectors is presented in the 6th column of Table I. It shows
that the proposed edge detector can operate at a minimum supply
voltage of 0.5 V, which is the lowest among the implementations.
The stacked transistor implementation of the other
edge detectors increases the minimum operating voltage. From
Table I, it is also found that the proposed edge detector uses
the fewest transistors among all of the previously
proposed ones. The simulation results in Table I were obtained
at the typical corner, a supply voltage of 1.2 V and a temperature of
25 °C. These results have been obtained by simulating the
schematic of the circuits with the diffusion capacitance of
each transistor. We have simulated the RC extracted layout of
the proposed edge detector and found a maximum response
time of 17 ps and a power dissipation of 14 µW, which is in close
agreement with the schematic simulation. We expect similar
variation in the other previously proposed edge detectors. Since
the previously proposed edge detectors are custom circuits, we
have compared all the approaches by performing schematic
simulation.
B. Warning Window Generator
The warning window is also known as the guard band interval
in [21], time window control (TWC) in [23] and the detection
window in [24]. The creation of the warning window is
explained in the following two cases.
1) Case I: Before the rising clock edge: The guard band in-
terval in [21] and the TWC in [23] are created before the rising









Fig. 4. Pass-transistor based edge detector
the warning window is generated for a rising edge from the
previous rising edge. Accordingly, the previous rising edge
acts as the reference signal for generating the warning window for
the present rising edge. In this case, the warning window width
depends on the clock frequency and is designed for a fixed clock
frequency. It requires a large number of buffers to create the
required warning window before the rising clock edge, which
in turn leads to an increase in area and power dissipation at low
clock frequency. However, the edge detector in this approach
operates on the input data directly. The detection window in [24]
is generated from the leaf clock by inserting buffer cells in the
path, which leads to large dynamic power consumption. In this
approach [24], the clock for the flip-flop also has delay cells
so that the detection window is created before the rising clock
edge for the flip-flop, which creates difficulty in balanced clock
tree synthesis.
Authors in [21] proposed two built-in aging sensors:
(1) a stability checker design and (2) a double sampling design.
Both the proposed technique and the stability checker design
take a similar approach, as both use an edge detection circuit.
However, the stability checker design in [21] is frequency
dependent whereas the proposed technique is not frequency
dependent. To make a fair comparison, we have compared the
edge detection approach of [21] with the proposed technique
in the paper. The clocking power of the stability checker
approach of [21] is large, as more buffers are needed
to create a small warning window, which also depends on the
system clock frequency.
The double sampling design in [21] and canary FF in [6]
have similar approach as both use two FFs. We have simulated
the layout extracted netlist of the double sampling approach
in [21] and the proposed warning FF. The SPICE simulation
results show that the clock-only power of the double sampling
approach in [21] and the proposed FF consumes 13 µW and












Fig. 6. Circuit for generating warning window from clock signal
TABLE I
COMPARISON OF DIFFERENT TYPES OF EDGE DETECTORS
Edge detector        Number of     Number of extra   Power dissipation   Max response   Min operating
                     transistors   buffers needed    (µW)                time (ps)      voltage (V)
Bull et al. [13]     16            2                 18.1                74             0.85
Hirose et al. [26]   9             2                 16.6                49             0.7
Rebaud et al. [24]   12            2                 17.5                52             0.65
Proposed             8             2                 9.07                12             0.5
TABLE II
QUANTITATIVE COMPARISON OF WARNING WINDOW GENERATION SCHEMES
                                                   Case I                Case II
Clock frequency                                    Dependent             Independent
Reference signal                                   Previous clock edge   Present clock edge
Number of delay buffers (at low clock frequency)   More                  Less
Area/power (at low clock frequency)                More                  Less
Edge detector operates on                          Direct data signal    Delayed data signal
Proposed in                                        [21] and [23]         This paper
more clock-only power consumption compared to the double
sampling approach in [21]. One possible solution
for reducing the clock-only power of the proposed FF is to
share the warning window generator with other warning FFs
in the design. However, this would require additional routing
resources for the warning window signal. So, a tradeoff
exists between the reduction of clock power in the warning window
generator and the additional routing resources.
2) Case II: After the rising clock edge: In our approach,
the warning window is generated after the rising edge of a
positive edge triggered FF. In this case, the warning window
is generated for a rising edge from the same rising edge.
Accordingly, the same rising edge acts as the reference signal
for creating the warning window. Hence, it requires only a
few buffers. In this case, the warning window width is
independent of the clock frequency. Having only a few buffers in the
clock path makes the warning window generator circuit area
efficient and power efficient. The warning window is generated
by a logical AND operation between the clock signal and the
delayed inverted clock signal as shown in Fig. 6. In this case,
the edge detector operates on the delayed data. The width
of the warning window is determined by the buffers inserted
in the inverted clock path as shown in Fig. 6.
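The frequency independence of Case II can be checked with a small sampled-waveform sketch of the AND between the clock and the delayed inverted clock. This is a behavioral illustration only: the helper names, the sample-based time axis and the example periods are assumptions, and the model requires the delay to be shorter than half the clock period.

```python
def warning_window(clock, delay_steps):
    """Return CLK AND delayed-inverted-CLK for a sampled clock waveform.

    clock       : list of 0/1 clock samples
    delay_steps : delay of the inverted clock path in samples (hypothetical
                  stand-in for the buffers in the inverted clock path)
    """
    window = []
    for i, clk in enumerate(clock):
        # Before the delayed path has any history, assume the clock was low.
        past = clock[i - delay_steps] if i >= delay_steps else 0
        window.append(clk & (1 - past))
    return window


def square_clock(period, cycles):
    """Clock that is low for the first half of each period, then high."""
    half = period // 2
    return ([0] * half + [1] * half) * cycles


def pulse_widths(signal):
    """Lengths of the runs of 1s in a sampled signal."""
    widths, run = [], 0
    for s in signal:
        if s:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Same delay, two different clock periods: the window width does not change.
print(pulse_widths(warning_window(square_clock(20, 3), 3)))  # [3, 3, 3]
print(pulse_widths(warning_window(square_clock(40, 3), 3)))  # [3, 3, 3]
```

Each window opens at the rising clock edge and closes after the delay of the inverted path, so doubling the period leaves the width at three samples.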
The summary of both the warning window generation
schemes is presented in Table II, which shows that
case II is superior to case I in both area and power.
C. Width of warning window
The warning FF can report a warning only if the edge of
the delayed data is moving gradually towards the clock edge. The
supply voltage and/or clock frequency should be decreased
gradually so that the delayed data will not miss the warning
window. If the direct data transition falls directly into the
erroneous window, then the warning signal cannot be detected.
The generalization of the warning window width is presented
here.
TABLE III
AREA AND POWER COMPARISON OF WARNING WINDOW GENERATOR FOR
DIFFERENT VALUES OF WARNING WINDOW WIDTH




The minimum warning window width tminwarning depends on the following factors:
1) The worst case setup time of the FF (twsetup), found by simulating the
FF at the worst process corner.
2) The maximum delay change (tVR) due to a
one step change in the voltage regulator, an
instantaneous power supply drop or fast moving
transients.
3) The maximum delay of the warning detector (tWD).
Now tminwarning is determined as follows:
$$t^{\min}_{\mathrm{warning}} \geq t_{\mathrm{wsetup}} + t_{VR} + t_{WD} \qquad (1)$$
The value of tminwarning is determined using the
above equation. A higher value of twarning is good for safe
operation of the warning FF. However, it would reduce the
savings in delay margin due to dynamic variations. A
lower value of twarning would lead to malfunctioning of the
warning FF. So, there exists a trade-off between the saving
of delay margin and the functionality of the warning FF.
A higher value of twarning would also increase the area/power
dissipation of the warning FF, as a higher value of twarning is
achieved by inserting buffers in the warning window generator
sub-circuit shown in Fig. 6. The increase in
area/power with different values of the warning
window width is presented in Table III. As the width of the warning window is
increased by 27%, the area of the warning window generator
increases by 17% and the power dissipation increases by 20%.
Table III shows that a wider warning window
requires more buffers, which increases the area and
power dissipation of the circuit.
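As a worked illustration of Eq. (1), the lower bound on the window width can be evaluated and a candidate width checked against it. The function names and all numerical values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from the test chip.

```python
def min_warning_width(t_wsetup, t_vr, t_wd):
    """Lower bound on the warning window width from Eq. (1).

    t_wsetup : worst case setup time of the FF
    t_vr     : maximum delay change from one regulator step or a supply droop
    t_wd     : maximum delay of the warning detector
    All values share one time unit (ps is assumed in the example).
    """
    return t_wsetup + t_vr + t_wd


def is_width_sufficient(t_warning, t_wsetup, t_vr, t_wd):
    """True if a chosen window width satisfies Eq. (1)."""
    return t_warning >= min_warning_width(t_wsetup, t_vr, t_wd)


# Hypothetical values in ps: 40 (setup) + 60 (regulator step) + 20 (detector).
print(min_warning_width(40, 60, 20))         # 120
print(is_width_sufficient(150, 40, 60, 20))  # True, but 30 ps of margin is given up
print(is_width_sufficient(100, 40, 60, 20))  # False, the warning may come too late
```

The two checks mirror the trade-off described above: a width above the bound is safe but recovers less delay margin, while a width below it risks malfunction.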
D. Delay Buffer and Warning Detector
The delay buffer used in the warning FF is basically a
chain of buffers. Let twsetup be the worst case setup time
considering the process, voltage and temperature variation and
clock skew and jitter, and let tminwarning be the minimum width of
the warning window considering worst case process, supply
and temperature variation. Then, the minimum delay of the
buffer tminbuffer is given by
$$t^{\min}_{\mathrm{buffer}} \geq t_{\mathrm{wsetup}} + t^{\min}_{\mathrm{warning}} \qquad (2)$$
The value of tbuffer is basically determined by the circuit
designer based on experience. In our case, a reconfigurable
delay buffer is used which can be adjusted from outside the
chip. The delay buffers are used to delay the input data to
the edge detector. The reconfigurable delay buffer provides the
opportunity to adjust the delay margin of the critical path for
the warning FF. Another reason for delaying the data is that the warning FF
generates a warning signal and not an error signal. Hence, we
need to observe the delayed data.
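A rough sizing sketch for the delay buffer follows from Eq. (2). It assumes, purely for illustration, that the chain is built from identical stages with a known per-stage delay `t_stage`; that parameter, the function names and the numbers are hypothetical and not taken from the paper. To keep the bound valid across corners, `t_stage` would have to be the smallest per-stage delay expected.

```python
import math


def min_buffer_delay(t_wsetup, t_min_warning):
    """Lower bound on the data delay buffer from Eq. (2)."""
    return t_wsetup + t_min_warning


def buffer_stages(t_wsetup, t_min_warning, t_stage):
    """Smallest number of identical stages whose total delay meets Eq. (2).

    t_stage : delay of one buffer stage (hypothetical parameter)
    """
    return math.ceil(min_buffer_delay(t_wsetup, t_min_warning) / t_stage)


# Hypothetical values in ps: 40 (setup) + 120 (window) with 25 ps per stage.
print(min_buffer_delay(40, 120))   # 160
print(buffer_stages(40, 120, 25))  # 7, since 6 stages give only 150 ps
```

A reconfigurable buffer, as used in the test chip, would let this stage count be adjusted after fabrication instead of being fixed at design time.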
The warning detector is used to monitor the edge of the delayed data
in the warning window interval. If the transition of the delayed
data happens in the warning window, then it flags a warning
signal. The transistor level circuit of the warning detector
is shown in Fig. 7. The circuit performs the logical AND
operation of the two signals to generate the warning signal.
The dynamic high impedance node of the warning detector is
susceptible to charge leakage and external coupling. Hence, it is
good to have a feedback keeper in order to avoid any external
coupling or charge leakage in the high impedance node of the
warning detector.
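The AND behavior, together with the partial pull down case mentioned in Section III, can be sketched as an interval-overlap check. This is a simplified abstraction rather than a model of the transistor level circuit in Fig. 7: it assumes that a full warning needs the edge pulse and the window to overlap for at least the detector delay, and all names and numbers are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between intervals [a_start, a_end] and [b_start, b_end]."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def detector_state(edge_start, edge_width, win_start, win_width, t_wd):
    """Classify the detector output for one edge pulse and one warning window.

    t_wd : detector delay; a shorter overlap is treated as a partial pull down
           (assumption made for this sketch).
    """
    ov = overlap(edge_start, edge_start + edge_width,
                 win_start, win_start + win_width)
    if ov == 0:
        return "no_warning"
    if ov < t_wd:
        return "partial"
    return "warning"


# Hypothetical values in ps: 30 ps edge pulse, 100 ps window at t = 0, 20 ps detector delay.
print(detector_state(-80, 30, 0, 100, 20))  # no_warning
print(detector_state(-20, 30, 0, 100, 20))  # partial (10 ps overlap)
print(detector_state(40, 30, 0, 100, 20))   # warning
```

In the partial case only the warning evaluation is uncertain; the main FF still samples the direct data, which arrives earlier than the delayed copy.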
E. Pulse Widening Circuit (Optional)
This circuit is purely optional in the proposed warning FF
design. The width of the warning signal produced is of the order of the
width of the warning window, which is typically very small. A
warning signal of such a small width cannot be
brought outside the chip. If we want to observe the warning
signal outside the chip, then this circuit is needed. However, if the
warning signal is used by a controller inside the chip, then this
circuit is not required. The pulse widening circuit is simply
a normal FF as shown in Fig. 8 with the data input connected
to “logic 1”, the clock input connected to the narrow warning
signal and the reset input connected to the system clock. The output
of the circuit is a wide warning signal. The conceptual timing
diagram of the pulse widening circuit is shown in Fig. 9.
It is important to note that in some master-slave based FFs, a
minimum clock pulse width requirement must be met for the
circuit to function properly. In some cases, a pulse-triggered
FF might be a better alternative. The area of the overall
warning FF is increased by 33% by adding the pulse widening
circuit. The power dissipation of the warning FF with and
without the pulse widening circuit is 46 µW and 40 µW
respectively. Hence, the pulse widening circuit increases the
power dissipation of the warning FF by 15%. The area and power
overhead of the pulse widening circuit can be reduced by
sharing it among many warning FFs.
Fig. 7. Circuit for generating the warning signal from the warning window
and the data edge
IV. TEST STRUCTURE AND TESTING STRATEGIES
The warning system consists of the main blocks of the test
chip and an FPGA, as shown in Fig. 10. The main blocks of the
test chip are a 16-bit Kogge-Stone adder, a set of normal FFs
(33 bits), a set of warning FFs (17 bits) and three shift
registers. The PLL inside the FPGA is used to generate all the
clocks for the test chip.
Here, we present a strategy for testing the warning FF with a
limited number of input pins on the chip. Because the number
of input pins is limited, one shift register is used to store
the inputs A and B of the adder. However, the carry input of
the adder has been assigned a dedicated input pin, because
testing the warning FF requires the data to toggle. The adder
sums the inputs A, B and Cin to produce the sum and Cout, so
toggling the Cin input is reflected at the output of the
adder. Clk0 is used as the clock of the input shift register.
The two output pins Adder sout and Warning sout are used to
serially shift the adder output and the warning signal out of
the chip, respectively. These two pins are the observable
points of the test chip.
Another limitation of our chip is that a high-frequency signal
cannot be fed to the chip. One way to deal with this problem is
to create a long critical path. However, designing a circuit with





















































































Fig. 10. Block diagram of the warning system, consisting of the main blocks of the test chip and the FPGA. The test chip contains a 16-bit Kogge-Stone adder,
33 normal FFs, 17 warning FFs and 3 shift registers; the FPGA's PLL generates the required clocks for the test chip.
signal to the chip is 1 MHz, then the delay of the critical path
should be at least 1 µs, which requires a large area. However, in
our design, the critical path delay is 2.4 ns, which requires
less area. This critical path delay was measured on a
typical-corner test chip at a supply voltage of 1.15 V and room
temperature (27 °C).
In our case, two clocks, Clk1 and Clk2, are used for sampling
the input and output data of the adder. These two clocks have
the same frequency with a phase shift of 2.4 ns or more. The
purpose of using phase-shifted clocks is to emulate a
high-frequency clock inside the chip using the low-frequency
clock available outside the chip. The phase shift between two
clocks is the fraction of a clock cycle by which one signal is
shifted relative to the other, as illustrated in Fig. 11. Clk3
is used to clock the two output shift registers, one for the
adder output and the other for the warning signal output. To
sample the warning signals, Clk3 should be either the negative
edge of Clk2 or a phase-shifted Clk2. In our measurement setup,
a phase-shifted Clk2 acts as Clk3. The three clocks Clk1, Clk2
and Clk3 have the same frequency and differ in phase, as shown
in Fig. 11. These clocks are generated off-chip using the PLL
inside the FPGA. These are our testing strategies. However, in
real design, two extra











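$\varphi = \left( \dfrac{\Delta t}{T} \right) \times 360^{\circ}$, where $\Delta t$ is the time shift between the two clocks and $T$ is the clock period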
Fig. 11. Showing three clocks Clk1, Clk2 and Clk3. The phase shift
between Clk1 and Clk2 is φ1 and phase shift between Clk1 and Clk3 is
φ2
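The conversion between a phase shift in degrees and a time shift in nanoseconds follows from the definition of phase as a fraction of the clock period. The sketch below uses the 2.5 MHz clock frequency quoted in the caption of Fig. 15 and assumes that the same frequency applies to the other phase-shift values quoted in this paper; the function names are illustrative.

```python
def phase_to_time_ns(phase_deg, clock_freq_mhz):
    """Convert a phase shift in degrees to a time shift in nanoseconds."""
    period_ns = 1000.0 / clock_freq_mhz
    return phase_deg / 360.0 * period_ns


def time_to_phase_deg(shift_ns, clock_freq_mhz):
    """Convert a time shift in nanoseconds to a phase shift in degrees."""
    period_ns = 1000.0 / clock_freq_mhz
    return shift_ns / period_ns * 360.0


# Phase shifts quoted in the text, assuming a 2.5 MHz clock (400 ns period).
for phase in (5.4, 3.5, 2.75):
    print(phase, round(phase_to_time_ns(phase, 2.5), 2))
```

With a 400 ns period, a nanosecond-scale shift corresponds to only a few degrees of phase, which is why a low-frequency external clock pair can emulate a much faster internal clock.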
V. EXPERIMENTAL RESULTS
The experimental setup consists of the test board along with
the FPGA board. Fig. 12 shows the experimental setup along
with the chip micrograph. A test chip has been fabricated in a
65 nm industrial process technology to demonstrate the
functionality of the proposed warning FF for low-power
applications. The key design parameters of the test chip are
summarized in Table IV. The experiments were carried out to
verify the proposed FF under different supply voltage,
frequency (or phase shift) and process conditions. The shift
register inside the chip was programmed to activate one of the
critical paths of the adder.
TABLE IV
KEY DESIGN PARAMETERS OF THE TEST CHIP
Key Design Parameter    Value
Process                 65 nm bulk CMOS
Supply voltage          1.2 V
Vth flavor              Nominal Vth









Fig. 12. The experimental setup consists of the FPGA board and test board.
The micrograph of the test chip of size 2.1 mm × 2.1 mm contains the warning
system of size 115.2 µm × 576 µm
The two 16-bit inputs A and B of the adder are set to
(0000)₁₆ and (FFFF)₁₆ respectively. A toggling waveform is
applied to the Cin input of the adder so that data transitions
occur at the output of the adder. The adder has two possible
outputs depending on the value of Cin. If Cin is 1, the output
of the adder is a group of zeros for 17 bits, and if Cin is 0,
the output of the adder is a group of ones for 17 bits. The
same output pattern repeats as Cin toggles, as shown in
Fig. 14(a). Seventeen warning FFs are inserted at the output of
the adder. The 17-bit adder output and the 17-bit warning
signal are shifted out using two parallel-in serial-out shift
registers. The same critical path is activated for all the
measurements below.
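The effect of toggling Cin with these operands can be checked with a plain binary model of a 16-bit adder. This is only an arithmetic sketch, not a model of the Kogge-Stone structure, and how Cout is presented in the shifted-out stream on the chip is not modelled; the function names are illustrative.

```python
def add16(a, b, cin):
    """16-bit addition returning (sum, carry_out)."""
    total = (a & 0xFFFF) + (b & 0xFFFF) + (cin & 1)
    return total & 0xFFFF, total >> 16


def toggled_output_bits(a, b):
    """Count how many of the 17 adder outputs change when Cin toggles."""
    s0, c0 = add16(a, b, 0)
    s1, c1 = add16(a, b, 1)
    word0 = (c0 << 16) | s0
    word1 = (c1 << 16) | s1
    return bin(word0 ^ word1).count("1")


print(add16(0x0000, 0xFFFF, 0))
print(add16(0x0000, 0xFFFF, 1))
print(toggled_output_bits(0x0000, 0xFFFF))
```

With A = 0000 and B = FFFF, the carry ripples through every bit position when Cin is 1, so all 17 monitored outputs see a transition on each toggle of Cin.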
A. Impact of supply voltage variation on Warning FF
The supply voltage is an important parameter for studying the
feasibility of the proposed approach. The proposed warning FF
was simulated across supply voltage and temperature variation.
Fig. 13 shows the impact of the supply voltage on the setup
time and the warning window width at different temperatures.
Both the setup time and the warning window width increase as
the supply voltage decreases, because the gate overdrive
decreases with the supply voltage; this holds at each
temperature. This is an important requirement for the warning
FF design. It also confirms that the warning window generator
works under supply voltage and temperature variation.
To verify the functionality of the warning system, the test
chip was measured at various supply voltages. The 17 warning
FFs at the output of the adder monitor timing violations at
the output of the adder. As mentioned earlier, the 17-bit
warning signal is shifted out using a parallel-in serial-out
shift register. The measured input clock, shifted output of the















































Fig. 13. Impact of supply voltage on normalized setup time and warning
window width at different temperature
adder and warning signal of the warning system are shown in
Fig. 14. When the supply voltage was above 1.09 V, no error
and no warning were observed, as shown in Fig. 14(a). At a
supply voltage of 1.08 V, one bit of the adder becomes critical
and flags the first warning signal, as shown in Fig. 14(b). If
the supply voltage is reduced further, multiple critical paths
are activated. Hence, multi-bit warning signals are flagged at
a supply voltage of 0.96 V, as shown in Fig. 14(c). If the
supply voltage is reduced further still, some of the critical
paths fail. The first error bit appears at a supply voltage of
0.94 V, as shown in Fig. 14(d). In this experiment, the phase
shift was kept constant to isolate the impact of supply voltage
on the warning FF. The operation of the warning system across
supply voltage can be divided into three zones:
1) No warning and no error zone (supply voltage above
1.09 V)
2) Warning zone (supply voltage between 0.95 V and
1.08 V)
3) Error zone (supply voltage at or below 0.94 V)
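The three zones can be expressed as a simple classification of the measured supply voltage. The thresholds below are the values reported for this particular critical path and phase shift; treating any voltage above 1.08 V as warning-free is an assumption, since no measurement between 1.08 V and 1.09 V is reported. Voltages are given in millivolts to avoid floating-point comparisons.

```python
FIRST_WARNING_MV = 1080  # first warning observed at 1.08 V
FIRST_ERROR_MV = 940     # first error observed at 0.94 V


def classify_zone(vdd_mv):
    """Map a supply voltage (in mV) to one of the three operating zones."""
    if vdd_mv <= FIRST_ERROR_MV:
        return "error"
    if vdd_mv <= FIRST_WARNING_MV:
        return "warning"
    return "no warning, no error"


for vdd in (1100, 1080, 960, 940):
    print(vdd, classify_zone(vdd))
```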
B. Impact of phase shift or clock frequency on Warning FF
As mentioned earlier, due to the limitation of input pins, a
high-frequency clock cannot be fed to the chip. Two clocks,
Clk1 and Clk2, having the same frequency and a given phase
shift were applied to the chip to study the impact of phase
shift or clock frequency on the warning FF. The waveforms of
the two clocks Clk1 and Clk2 generated by the FPGA PLL are
shown in Fig. 15. It is convenient to generate phase-shifted
clocks with shifts of the order of nanoseconds using the PLL
inside the FPGA with the Megafunction tools; the interested
reader may refer to [29]. Various phase-shifted clock waveforms
were applied to the test chip to study the impact of phase
shift on the warning system. The clock, adder output and
warning signal for different phase-shifted clocks are shown in
Fig. 16. When the phase shift between the two clocks was 5.4°
(or 6 ns), no error and no warning signal were observed, as
shown in Fig. 16(a). However, a warning signal was flagged when
the phase shift between the two clocks was 3.5° (or 3.88 ns),
as shown in Fig. 16(b). In both cases, the data is sampled
correctly up to the point where the warning signal is observed.
In this experiment, the supply voltage is kept constant (at
0.86 V) to isolate the impact of phase shift on the warning FF.







Fig. 14. Impact of supply voltage on the warning system, showing the waveforms of the clock, warning signal and 16-bit Kogge-Stone adder output. (a) Measured
no error and no warning signal at supply voltage above 1.09 V, (b) Measured first warning signal at supply voltage 1.08 V, (c) Measured multi-bit warning
signals and no error at supply voltage 0.96 V, and (d) Measured first one-bit error in the output of the adder at supply voltage 0.94 V. Waveform annotations mark groups of 17 zeros and groups of 17 ones in the adder output.
Fig. 16. Impact of phase shift on the warning system, showing the waveforms of the clock, warning signal and 16-bit Kogge-Stone adder output. (a) Measured no
warning and no error at supply voltage 0.86 V and a phase shift between Clk1 and Clk2 of 5.4° (or 6 ns), and (b) Measured warning signal at supply voltage
0.86 V and a phase shift between Clk1 and Clk2 of 3.5° (or 3.88 ns). Waveform annotations mark groups of 17 zeros in the adder output.
Depending on the process corner, operating point, workload and
switching activity of the ASIC, many paths may become
critical. By performing statistical static timing analysis on
the ASIC, a few critical paths are identified and warning FFs
are used on those paths. The first warning signal voltage is
defined as the voltage at which one of the paths becomes
critical and flags a warning signal while the system is running
at a fixed clock frequency or phase-shifted clock. Fig. 17
shows the impact of phase shift on the first warning signal
voltage. As the phase shift between the two clocks is
increased, the delay margin increases. The additional delay
margin can be traded for a lower supply voltage, as shown in
Fig. 17. Hence, low power can be achieved by running the system
at a low supply voltage and low frequency. In this case, the
warning signal acts as a reference for when to stop decreasing
the supply voltage before the system malfunctions. The phase
shift between the two clocks Clk1 and Clk2 is fundamentally




Fig. 15. The two clocks Clk1 and Clk2 generated from the PLL inside the Altera
FPGA, having the same frequency of 2.5 MHz and a phase shift of 2.75° (or 3.05 ns)




























Fig. 17. Measured first warning voltage versus phase shift between the two
clocks Clk1 and Clk2. Plot legend: measured data and cubic fit, $y = -0.043x^{3} + 0.1x^{2} - 0.12x + 0.82$
proposed circuit can track the critical path delay of 2.4 ns to
7.5 ns at warning voltage of 1.15 V to 0.72 V respectively. A
cubic polynomial is fitted to the measured warning voltages to
interpolate the intermediate points, as shown in Fig. 17.
C. Number of buffers in delayed data
In the proposed warning FF, the transition of the delayed data
is monitored to flag a warning signal. We therefore need to
quantify the number of delay buffers in the data path. The
number of buffers in the delayed data path represents the delay
margin of the critical path. The proposed design uses
configurable delay buffers to create the delayed data for
monitoring. The configurable delay buffers can be set according
to the process condition, and hence the supply voltage,
frequency and body bias can be changed adaptively based on the
process corner. The proposed technique offers full flexibility
to modify the critical path delay margin based on the process
corner. This kind of adaptive modification is not possible in a
conventional design.
Fig. 18 shows the measured first warning voltage and first
error voltage versus the number of buffers in the delayed data.
The first error voltage is the supply voltage at which a
one-bit error appears for a given critical path at a fixed clock
Fig. 18. Measured first warning voltage and first error voltage versus number
of buffers in delayed data
frequency. Since the error does not depend on the delayed data,
the error voltage is constant for a critical path at a given
clock frequency. Here, the mean first error voltage is 0.95 V,
as shown in Fig. 18. The first warning voltage depends on the
number of buffers in the data path. If the number of buffers in
the data path is larger, the delayed data is monitored earlier
in time, and hence the first warning voltage is higher and the
power saving is smaller. In this case, the delay margin is
large because the gap between the first error voltage and the
first warning voltage is large. If the number of buffers in the
data path is smaller, the delayed data is monitored later in
time, and hence the first warning voltage is lower. In this
case, the delay margin is small because the gap between the
first error voltage and the first warning voltage is small, and
the power saving is larger. Hence, there is a trade-off between
delay margin and power saving, and the designer needs to choose
the number of buffers based on the delay margin available in
the design. The warning voltage varies linearly with the number
of buffers, as shown in Fig. 18. The measured first warning
voltage and first error voltage from 10 different chips are
shown as error bars in Fig. 18. The spread is due to
manufacturing die-to-die variation of the critical path in
different chips. It shows that the proposed circuit works
properly even in the presence of die-to-die process variation.
The maximum variation of the first warning voltage and the
first error voltage is 50 mV and 35 mV respectively.
D. Power dissipation
This section compares the power dissipation of the proposed
design and the conventional design while considering only the
dynamic variations (i.e., supply and temperature) at a given
process corner. In the conventional design, we assume that a
process monitor is available to monitor the process condition
of the design. However, the power dissipation of the process
monitor is excluded from both the conventional and the
proposed design in this comparison.
In the conventional approach, the Kogge-Stone adder with
17 normal FFs at the output is simulated, whereas in the
TABLE V
COMPARISON OF PROPOSED DESIGN WITH THE CONVENTIONAL DESIGN AT DIFFERENT PROCESS AND TEMPERATURE CONDITIONS
                                        Conventional Design        Proposed Design
Process   Temperature   Min Period   Supply   Power Diss.       Supply   Power Diss.       Saving in Power Diss.
          (°C)          (ns)         (V)      P_C (mW)          (V)      P_P (mW)          (P_C − P_P)/P_P × 100%
Best      25            0.8          1.05     0.364             0.87     0.288             26.4
Best      125           0.8          1.05     0.393             0.94     0.360             9.2
Typical   25            1.3          1.05     0.218             0.94     0.204             6.8
Typical   125           1.3          1.05     0.223             0.94     0.208             7.2
Worst     25            3.2          1.05     0.088             0.94     0.082             7.3
Worst     125           3.2          1.05     0.089             0.86     0.070             27.1
proposed scheme, the Kogge-Stone adder with 16 normal FFs and
1 warning FF at the output is simulated. In the proposed
approach, one warning FF is inserted in the critical path of
the design. The conventional design requires supply margins
because it has no monitoring mechanism. The proposed warning FF
based design, however, does not require extra margin for
dynamic supply voltage and temperature variations. Hence, the
proposed design can be operated at a lower supply voltage,
down to the point where the warning signal appears, which
reduces its power dissipation. We have assumed a 10% dynamic
supply voltage variation for the conventional design,
considering the worst-case temperature at a given process
corner. In this analysis, the clock frequency of the
conventional design is varied across process corners at a fixed
supply voltage of 1.05 V. The clock frequency is determined
considering a 10% supply voltage variation at the worst-case
temperature. The maximum warning voltage at any corner is
0.94 V, which is about 10% below the 1.05 V supply, as shown in
Table V.
The minimum clock period of the conventional design is
determined considering supply voltage variation at the
worst-case temperature for a given process corner. Both the
conventional and the proposed designs are simulated in SPICE at
the minimum clock period. Table V compares the power
dissipation of the conventional design and the proposed design
at different process and temperature conditions. It shows that
a power saving of up to 27% can be achieved with the proposed
design at the worst process corner at a temperature of 125 °C.
In the typical corner, the warning voltage does not change with
temperature. This is because of the inverted temperature
dependence of the gate delay at low supply voltage, which is
explained in detail in [32]. At low supply voltage, the gate
delay can decrease as the temperature increases. This is due to
the opposing effects of mobility and threshold voltage on gate
delay. Because of this effect, the critical path delay of the
design does not change much with temperature for the typical
corner at low supply voltage. Hence, the warning voltage does
not change with temperature for the typical corner at low
supply voltage.
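The saving column of Table V can be recomputed from the two power columns using the formula in the table header, which normalizes the difference to the proposed-design power P_P. The sketch below copies the power values from Table V; the variable and function names are illustrative.

```python
# (process, temperature in deg C, P_C in mW, P_P in mW), copied from Table V.
TABLE_V = [
    ("Best", 25, 0.364, 0.288),
    ("Best", 125, 0.393, 0.360),
    ("Typical", 25, 0.218, 0.204),
    ("Typical", 125, 0.223, 0.208),
    ("Worst", 25, 0.088, 0.082),
    ("Worst", 125, 0.089, 0.070),
]


def saving_percent(p_conv_mw, p_prop_mw):
    """Power saving as defined in the Table V header: (P_C - P_P) / P_P * 100."""
    return (p_conv_mw - p_prop_mw) / p_prop_mw * 100.0


for process, temp_c, p_c, p_p in TABLE_V:
    print(process, temp_c, round(saving_percent(p_c, p_p), 1))
```

Because the table lists powers rounded to three decimals, the recomputed percentages can differ from the published column in the last digit.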
E. Impact of Area
The layout of the warning FF, consisting of a traditional
flip-flop, a delay buffer, a warning window generator, an edge
detector and a warning detector, is shown in Fig. 19. The
proposed warning FF occupies around three times the area of a
traditional FF, that is, it adds about twice the area of a
normal FF as overhead. The warning window generator can be
shared among all warning FFs in the design, which reduces the
area penalty of the proposed FF. The power dissipation of the
proposed warning FF and a conventional FF is 40 µW and 11 µW
respectively. Hence, the warning FF dissipates about 3.6× the
power of a conventional FF (2.6× more). However, the proposed
scheme can achieve a power saving of up to 27% compared with
the conventional approach, as explained in Section V-D.
F. Comparison with the existing techniques
The similarities and differences of the existing warning
detection schemes are presented in this section. Warning
detection can be implemented by two different methods: double
sampling and edge detection. The major disadvantage of warning
detection using double sampling is the metastability of the
FF, which can lead to malfunctioning of the system. The major
disadvantage of warning detection using edge detection is the
partial evaluation of the warning detector, which can lead to a
misleading adaptive response.
The proposed warning FF is based on the edge detection scheme.
The major problem of the existing edge detector based warning
FFs is the effective generation of the warning window for
monitoring the data edge. The warning window generation scheme
in the stability checker [21] depends on the input clock
frequency, as discussed in detail in Section III-B. The authors
in [24] proposed a specialized clocking circuit for warning
window generation, which may lead to an imbalanced clock
network. The warning window generation scheme in the proposed
approach is simple and independent of the input clock
frequency. It does not affect the clock network. The comparison
of the existing techniques is summarized in Table VI.
VI. DESIGN FLOW OF WARNING FF
In this section, the design flow of the warning FF is
presented. Depending on the process corner, operating point,
workload and switching activity of the ASIC, many paths may
Fig. 19. Layout of warning FF consisting of a traditional flip-flop, a delay
buffer, a warning window generator, an edge detector and a warning detector.
TABLE VI
COMPARISON OF VARIOUS WARNING DETECTION SCHEMES
Technique                Type of warning    Clock frequency   Clock network   Metastability /               Effect on system
                         detector used      dependency        requirement     partial evaluation            performance
Canary FF [6], [21]      Double sampling    No                No              Metastability in FF           Malfunctioning of the system
Stability Checker [21]   Edge detection     Yes               No              Partial evaluation in WD*     Affects adaptive response
Rebaud et al. [24]       Edge detection     No                Yes             Partial evaluation in WD      Affects adaptive response
Proposed                 Edge detection     No                No              Partial evaluation in WD      Affects adaptive response
*WD means warning detector.
be critical. Monitoring all the critical paths of the design
is not practical, as it would increase the area and power
dissipation of the design. The authors in [24] suggest using a
statistical static timing analysis flow to identify a few
important critical paths in the design. A set of equivalent
replica critical paths for these important paths needs to be
created as in [30] and guarded with warning FFs. The replica
paths should be activated all the time, so that the warning
FFs continuously monitor the critical paths as required.
Finding the best replica critical path is beyond the scope of
this paper; the interested reader may refer to [30].
The replica critical paths are created and guarded by warning
FFs. The warning signals of multiple replica critical paths
are combined using a multiple-input OR gate. The output of the
OR gate is fed to the pulse widening circuit in Fig. 8 to
generate the wide warning signal. The wide warning signal is
used as a trigger signal for the adaptive controller. We
recommend that the user not reduce the supply voltage below the
first warning voltage. Some margin between the first warning
voltage and the first error voltage should be maintained to
preserve data integrity at the lowest possible supply voltage.
The adaptive controller is used to switch the supply voltage,
clock frequency and body bias based on the warning signal. The
present test chip does not contain an adaptive controller.
However, an adaptive frequency controller has been implemented
externally inside the FPGA; it can switch to a different
phase-shifted clock based on the number of warning signals.
When a warning signal is detected, the data is still sampled
correctly by the FF. The controller therefore observes the
system over many clock cycles (for example, 10,000 cycles) as
in [31] and counts the number of occurrences of the warning
signal. If the count is low, the warning is attributed to a
very low-probability effect such as SER. If the count is high,
the controller increases the supply voltage by 10 mV after the
10,000 cycles to avoid further warnings. Some representative
controllers are presented in [27], [28], [31] for the
interested reader.
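The counting policy described above can be sketched as follows. This is a minimal illustration, not the controller implemented in the FPGA: the count threshold separating "low" from "high" is not given in the text and is a hypothetical parameter, as are the function names and the example warning patterns.

```python
def combine_warnings(per_path_flags):
    """OR together the warning flags of several replica critical paths."""
    return any(per_path_flags)


def controller_step(window_flags, vdd_mv, count_threshold, step_mv=10):
    """Decide the next supply voltage after one observation window.

    window_flags holds one combined warning flag per clock cycle
    (for example 10,000 cycles). count_threshold is hypothetical.
    """
    count = sum(1 for flag in window_flags if flag)
    if count > count_threshold:
        return vdd_mv + step_mv, count  # frequent warnings: raise the supply
    return vdd_mv, count                # rare warnings: keep the supply


WINDOW = 10_000
rare = [cycle == 1234 for cycle in range(WINDOW)]       # a single warning
frequent = [cycle % 100 == 0 for cycle in range(WINDOW)]  # 100 warnings

print(controller_step(rare, vdd_mv=900, count_threshold=5))
print(controller_step(frequent, vdd_mv=900, count_threshold=5))
```

Counting over a window rather than reacting to a single warning keeps isolated events from triggering a supply increase, at the cost of a response latency of one window.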
We have compared the design with and without the frequency
controller inside the FPGA. The design without the frequency
controller can operate down to 1.1 V with a critical path delay
of 2.03 ns. The design with the frequency controller can
operate down to 0.8 V with a critical path delay of 5.09 ns.
The controller automatically switches to a different
phase-shifted clock and operates down to a supply voltage of
0.8 V. In future, an attempt will be made to design an adaptive
controller inside the chip.
VII. CONCLUSION
This paper proposes a new metastability-immune warning
detection sequential using the concept of delayed data in the
edge detection circuit together with a traditional FF. It also
shows how to use the warning FF for dynamic voltage and
frequency scaling in an ASIC, which, unlike a processor,
typically lacks an error recovery mechanism. A test chip was
fabricated in a 65 nm technology node to show the use of the FF
in dynamic voltage and frequency scaling applications in ASICs.
The results across supply voltage, phase-shifted clocks and
number of buffers in the delayed data show the effectiveness of
our approach. The measured results demonstrate that the
proposed circuit can track the critical path delay of 2.4 ns to
7.5 ns at warning voltage of 1.15 V to 0.72 V respectively.
Future work includes the design of adaptive controllers with
the warning FF in real designs.
ACKNOWLEDGEMENT
The VLSI chip in this study has been fabricated in the
chip fabrication program of VLSI Design and Education
Center (VDEC), the University of Tokyo in collaboration with
STARC, e-shuttle, Inc., and Fujitsu Ltd.
REFERENCES
[1] L. S. Nielsen, C. Niessen, J. Spars, and K.V. Berkel, “Low power
operation using self-timed circuits and adaptive scaling of supply
voltage,” IEEE Trans. on VLSI systems, vol. 2, no. 4, pp. 391–397,
Dec. 1994.
[2] G. Wei and M. Horowitz, “A low power switching power supply
for self-clocked systems,” in Proc. of International Symposium
on Low Power Electronics Design, 1996, pp. 313–318.
[3] V. Gutnik and A. Chandrakasan , “Embedded power supply for
low power DSP,” IEEE Trans. VLSI Systems, vol. 5, pp. 425–435,
Dec. 1997.
[4] J. Tschanz et al.,“Adaptive frequency and biasing techniques for
tolerance to dynamic temperature-voltage variations and aging,”
in Proc. of IEEE ISSCC, 2007, pp 292–293.
[5] S. Das, D. Roberts, S. Lee, S. Pant, D. Blaauw, T. Austin, K.
Flautner, and T. Mudge, “A self-tuning DVS processor using
delay-error detection and correction,” IEEE Journal of Solid-State
Circuits, vol. 41, no. 4, pp 792–804, April 2006.
[6] T. Sato, Y. Kunitake, “A simple flip-Flop circuit for typical case
designs for DFM,” in IEEE Proc. of International Symposium on
Quality Electronics Design, 2007, pp. 539–544.
[7] S. Das, C. Tokunaga, S. Pant, W. Ma, S. Kalaiselvan, K. Lai, D.
M. Bull, and D. T. Blaauw, “RazorII: In situ error detection and
correction for PVT and SER tolerance,” IEEE Journal of Solid-
State Circuits, vol. 44, no.1, pp. 32–48, Jan. 2009.
[8] D. Blaauw, S. Kalaiselvan, K. Lai, W. Ma, S. Pant, C. Tokunaga,
S. Das, D. Bull, “RAZOR-II: In situ error detection and correction
for PVT and SER tolerance,” in Proc. of IEEE ISSCC, 2008 pp.
400–401.
[9] K. A. Bowman, J. W. Tschanz, N. S. Kim, J. C. Lee, C. B.
Wilkerson, S. L. Lu, T. Karnik, V. K. De, “Energy-efficient
and metastability-immune timing error detection and instruction-
replay-based recovery circuits for dynamic-variation tolerance,”
in Proc. of IEEE ISSCC, 2008, pp. 402–403.
[10] K. A. Bowman, J. W. Tschanz, N. S. Kim, J. C. Lee, C. B.
Wilkerson, S. L. Lu, T. Karnik, and V. K. De, “Energy-efficient
and metastability-immune resilient circuits for dynamic variation
tolerance,” IEEE Journal of Solid-State Circuits, vol. 44, no. 1,
pp. 49–63, Jan 2009.
[11] T. Nakura, K. Nose, and M. Mizuno, “Fine-grain redundant logic
using defect-prediction Flip-Flops,” in Proc. of IEEE ISSCC,
2007, pp. 402–403.
[12] K. A. Bowman et al., “A 45 nm Resilient Microprocessor Core
for Dynamic Variation Tolerance,” IEEE Journal of Solid-State
Circuits, vol. 46, no. 1, pp. 194–208, Jan 2011.
[13] D. Bull, S. Das, K. Shivashankar, G. Dasika, K. Flautner, and
David Blaauw, “A Power-Efficient 32 bit ARM Processor Us-
ing Timing-Error Detection and Correction for Transient-Error
Tolerance and Adaptation to PVT Variation”, IEEE Journal of
Solid-State Circuits, vol. 46, no. 1, pp. 18–31, Jan 2011.
[14] S. Baloch, T. Arslan, and A. Stoica, “Design of a Single Event
Upset (SEU) Mitigation Technique for Programmable Devices,”
in Proc. of ISQED 2006.
[15] H. K. Alidash and V. G. Oklobdzija, “Low-Power Soft Error
Hardened Latch,” in Proc. PATMOS, LNCS 5953, pp. 256–265,
2010.
[16] Y. Kunitake, T. Sato, H. Yasuura, and T. Hayashida, “Possibilities
to Miss Predicting Timing Errors in Canary Flip-flops,” in Proc.
IEEE MWSCAS, 2011.
[17] J. Blome, S. Feng, S. Gupta, and S. Mahlke, “Self-calibrating
Online Wearout Detection,” in Proc. IEEE/ACM International
Symposium on MICRO, 2007.
[18] T. Kehl, “Hardware Self-Tuning and Circuit Performance Moni-
toring,” in Proc. ICCD, 1993, pp. 188–192.
[19] B. P. Das, B. Amrutur, H. S. Jamadagni, N. V. Arvind, and
V. Visvanathan, “Voltage and Temperature-Aware SSTA Using
Neural Network Delay Model,” IEEE Trans. on Semiconductor
Manufacturing, vol. 24, no. 4, pp. 533–543, Nov 2011.
[20] M. Meijer, B. Liu, R. V. Veen and J. Gyvez, “Post-Silicon Tuning
Capabilities of 45nm Low-Power CMOS Digital Circuits,” in
Proc. of IEEE Symposium on VLSI Circuits, 2009, pp. 110–111.
[21] M. Agarwal, B. C. Paul, M. Zhang and S. Mitra, “Circuit failure
prediction and its application to transistor aging,” in Proc. of IEEE
VLSI Test Symposium, 2007, pp. 277–286.
[22] M. Agarwal, et al., “Optimized circuit failure prediction for aging:
practicality and promise,” in Proc. of IEEE International Test
Conference, 2008, pp. 1–10.
[23] M. Omana, D. Rossi, N. Bosio and C. Metra., “Novel low-cost
aging sensor,” in Proc. of the ACM International conference on
Computing frontiers, 2010, pp. 93–94.
[24] B. Rebaud, M. Belleville, E. Beigne, C. Bernard, M. Robert, P.
Maurine, N. Azemard, “Timing slack monitoring under process
and environmental variations: Application to a DSP performance
optimization,” Elsevier Microelectronics Journal, vol. 42, pp.
718–732, 2011.
[25] B. P. Das and H. Onodera, “Warning prediction sequential for
transient error prevention,” in Proc. of the IEEE 25th International
Symposium on Defect and Fault Tolerance in VLSI Systems, 2010,
pp. 382–390.
[26] K. Hirose, Y. Manzawa, M. Goshima, and S. Sakai, “Delay-
compensation flip-flop with in-situ error monitoring for low-
power and timing-error-tolerant circuit design,” Japanese Journal
of Applied Physics, vol. 47, no. 4, pp. 2779–2787, 2008.
[27] H. Fuketa, M. Hashimoto, Y. Mitsuyama, and T. Onoye, “Adap-
tive performance compensation with in-situ timing error predic-
tion for subthreshold circuits,” in Proc. of CICC, 2009, pp. 215–
218.
[28] N. Shah, R. Samanta, M. Zhang, J. Hu, D. Walker, “Built-in
proactive tuning system for circuit aging resilience,” in Proc. of
IEEE International Symposium on Defect and Fault Tolerance of
VLSI Systems, 2008, pp. 96–104.
[29] (Oct 2011) Altera PLL user guide. [Online]. Available:
www.altera.com/literature/ug/ug_altpll.pdf
[30] Q. Liu and S. S. Sapatnekar,“Capturing post-silicon variations
using a representative critical path,” IEEE Trans. on Computer-
Aided Design of Integrated Circuits and Systems, vol. 29, no. 2,
pp. 211–221, Feb 2010.
[31] U. Y. Ogras, R. Marculescu, and D. Marculescu, “Variation-
Adaptive Feedback Control for Networks-on-Chip with Multiple
Clock Domains,” in Proc of DAC, 2008, pp. 614-619.
[32] A. Dasdan and I. Hom, “Handling Inverted Temperature Depen-
dence in Static Timing Analysis,” ACM Transaction on Design
Automation of Electronic Systems, vol. 11, no. 2, pp. 306–324,
April 2006.
[33] E. Salman, R. Jakushokas, E. G. Friedman, R. M. Secareanu, and
O. L. Hartin, “Methodology For Efficient Substrate Noise Analysis
in Large Scale Mixed-Signal Circuits,” IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, vol. 17, no. 10, pp. 1405–
1418, Oct 2009.
Bishnu Prasad Das received the M.Sc. degree in
electronics from Sambalpur University, Orissa, India
in 1999, the M.Tech. degree in computer application
from ISM, Dhanbad, India, in 2002 and the Ph.D.
degree in electronics from Indian Institute of Sci-
ence, Bangalore, India, in 2009.
Bishnu Prasad Das has been a Postdoctoral Researcher in the
Department of Electrical and Computer Engineering, Carnegie Mellon
University, Pittsburgh, USA, since June 2012. He was a Postdoctoral
Researcher at Kyoto University, Kyoto, Japan, from October 2009 to
May 2012. He worked at Texas Instruments, Bangalore, India, during
his Ph.D. work under the Texas Instruments University Programme.
He was the top-ranked student and gold medalist in his M.Sc. program.
His research interests include error-tolerant circuit design, on-chip
test structures for variability measurement, automatic standard cell
library generation in scaled technology nodes, modeling under process,
voltage, and temperature variation, and design for manufacturability.
Hidetoshi Onodera (M’87–SM’12) received the
B.E., M.E., and Dr. Eng. degrees in Electronic
Engineering, all from Kyoto University, Kyoto,
Japan. He joined the Department of Electronics,
Kyoto University, in 1983, and is currently a
Professor in the Department of Communications
and Computer Engineering, Graduate School of
Informatics, Kyoto University. His research interests
include design technologies for Digital, Analog, and
RF LSIs, with particular emphasis on low-power
design, design for manufacturability, and design for
dependability.
Dr. Onodera served as the Program Chair and General Chair of
ICCAD and ASP-DAC. He was the Chairman of the IPSJ SIG-SLDM
(System LSI Design Methodology), the IEICE Technical Group on VLSI
Design Technologies, the IEEE SSCS Kansai Chapter, and the IEEE CASS
Kansai Chapter. He is currently the Chairman of the IEEE Kansai Section. He
served as the Editor-in-Chief of IEICE Transactions on Electronics and IPSJ
Transactions on System LSI Design Methodology.
