Measuring deep metastability and its effect on synchronizer performance by Kinniment DJ et al.
1028 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 15, NO. 9, SEPTEMBER 2007
Measuring Deep Metastability and Its Effect on
Synchronizer Performance
David J. Kinniment, Life Fellow, IEEE, Charles E. Dike, Keith Heron, Gordon Russell, Member, IEEE, and
Alexandre V. Yakovlev, Member, IEEE
Abstract—Present measurement techniques do not allow syn-
chronizer reliability to be measured in the region of most interest,
that is, beyond the first half cycle of the synchronizer clock. We
describe methods of extending the measurement range, in which
the number of metastable events generated is increased by four
orders of magnitude and events with long metastable times are
selected from the large number of more normal events. The rela-
tionship found between input times and the resulting output times
is dependent on accurate measurement of input time distributions
with deviations of less than 10 ps. We show how the distribution of
D to clock times at the input can be characterized in the presence
of noise and how predictions of failure rates for long synchronizer
times can be made. Anomalies such as the increased failure rates
in a master–slave synchronizer produced by the back edge of the
clock are explained and demonstrated.
Index Terms—Measurement, metastability, reliability, synchro-
nization.
I. INTRODUCTION
GLOBALLY asynchronous, locally synchronous (GALS) systems-on-chip that use intellectual property (IP) blocks
from different vendors with unrelated clocks may use many
synchronizers to resynchronize the data transmitted from one
IP block to another. Stoppable clocks have been proposed as
a way of avoiding the time penalty resulting from the need to
synchronize data, but large clock trees may make stopping the
clock impractical because of the delay involved in the clock
tree before the clock actually stops [1], [2]. For this reason,
synchronization is likely to take an increasingly large propor-
tion of the on-chip communication time in future generations of
process technology. First, the reliability required per synchronizer
is higher, since the number of synchronizers on chip is
higher. Second, process, voltage, and temperature variations affect
synchronizers disproportionately [3]. Third, clock and
data rates will increase. All of these lead to long synchronization
times.
Manuscript received May 17, 2006; revised November 11, 2006. This work
was supported in part by Intel Corporation, by the Engineering and Physical Sci-
ences Research Council (EPSRC), U.K., research under Grant EP/C007298/1,
and by Agilent Technologies through equipment support.
D. J. Kinniment, G. Russell, and A. V. Yakovlev are with the University
of Newcastle, Newcastle upon Tyne, NE1 7RU, U.K. (e-mail: david.kinni-
ment@ncl.ac.uk).
C. E. Dike is with Intel Corporation, Hillsboro, OR 97124 USA (e-mail:
charles.e.dike@intel.com).
K. Heron, deceased, was with the University of Newcastle, Newcastle upon
Tyne, NE1 7RU, U.K. (e-mail: david.kinniment@ncl.ac.uk).
Digital Object Identifier 10.1109/TVLSI.2007.902207
Synchronization of data sent using one clock and received by
another requires a minimum of 0–1 receiving processor cycles
to synchronize and this is likely to increase to several cycles
at 65 nm and below because of the time taken by a metastable
output from the synchronizer to resolve to a high or low level.
For a system with many synchronizers to run reliably for at least
a year, each synchronizer needs a reliability much greater
than a year. The mean time between failures (MTBF) can be
estimated from the time allowed for resolution of metastability
$t$, the clock frequency $f_c$, the data transition frequency $f_d$,
and two parameters specific to the synchronizer circuit. These
are $T_w$, sometimes called the metastability window, and the
resolving time constant $\tau$. We can make this estimate as follows.
If the data input goes high sufficiently far in advance of the
clock edge, the synchronizer output will always go high and if
it is significantly after the clock it will always go low. If the two
edges are close enough, the high or low outcome is affected by
circuit noise and is nondeterministic. Here, we will define the
separation between data and clock which gives an exactly equal
probability of a high or low outcome as the balance point. In
the absence of noise, an input exactly at the balance point would
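take an infinite time to resolve. The synchronizer response from
metastability is usually exponential [4]. Thus, for inputs a time
$\Delta t_{in}$ away from the balance point, where $\Delta t_{in}$ is less than the
metastability window $T_w$, the relationship between resolution time
$t$ and $\Delta t_{in}$ is given by
$$\Delta t_{in} = T_w e^{-t/\tau} \tag{1}$$
where $\tau$ is the resolving time constant.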
Customarily, is measured from the normal propagation
delay , but it may be convenient to measure it from the
balance point, in which case
and
(2)
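A small change $\delta(\Delta t_{in})$ in the input time $\Delta t_{in}$ will, therefore, cause a
change $\delta t$ in the output time
$$\delta t = -\tau \frac{\delta(\Delta t_{in})}{\Delta t_{in}} \tag{3}$$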
If the resolution time is longer than the time allowed for syn-
chronization, the synchronizer may fail as a result of an unde-
fined output level. The number of failure events caused by the
1063-8210/$25.00 © 2007 IEEE
Authorized licensed use limited to: Newcastle University. Downloaded on June 08,2010 at 12:20:55 UTC from IEEE Xplore.  Restrictions apply. 
KINNIMENT et al.: MEASURING DEEP METASTABILITY AND ITS EFFECT ON SYNCHRONIZER PERFORMANCE 1029
data edge occurring less than $\Delta t_{in}$ from the balance point in a
total time $T$ depends on the clock rate $f_c$ and the data rate $f_d$ and is
$$\Delta t_{in} f_c f_d T \tag{4}$$
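From (4), the mean time between each failure event is
$$\mathrm{MTBF} = \frac{1}{f_c f_d \Delta t_{in}} \tag{5}$$
where $f_c$ and $f_d$ are the clock and data rates and $\Delta t_{in}$ is the input
time measured from the balance point.
Using (1) and (5), MTBF can also be expressed in terms of
known system and circuit parameters
$$\mathrm{MTBF} = \frac{e^{t/\tau}}{T_w f_c f_d} \tag{6}$$
with $T_w$ the metastability window and $\tau$ the resolving time constant.
The values of the circuit parameters are found in practice by
plotting $\ln(\mathrm{MTBF})$ against $t$. From (6), the slope of this graph
is $1/\tau$. $T_w$ can be found from projecting the graph back to the
axis, where $t = 0$, since at that point
$$\mathrm{MTBF} = \frac{1}{T_w f_c f_d} \tag{7}$$
Rather than measuring MTBF, it may be easier to measure
the number of events resolving between $t$ and $t + \delta t$. From (6),
the number of events resolving after $t$ is
$$\text{Events/second} = T_w f_c f_d e^{-t/\tau} \tag{8}$$
By subtracting the number of events resolving after $t + \delta t$,
where $\delta t$ is small, we can get the number resolving between
$t$ and $t + \delta t$
$$\text{Events} = T_w f_c f_d \frac{\delta t}{\tau} e^{-t/\tau} \tag{9}$$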
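The graphical procedure for extracting the circuit parameters can be sketched in a few lines of Python. This is only an illustration: it assumes that (6) has the usual form MTBF = e^(t/τ) / (T_w f_c f_d), and the function name `fit_tau_and_tw` and all numerical values are hypothetical, chosen to generate noise-free synthetic data rather than taken from any measurement.

```python
import math

def fit_tau_and_tw(times_s, mtbf_s, f_clock_hz, f_data_hz):
    """Fit a least-squares line to (t, ln MTBF); return (tau, T_w) in seconds."""
    n = len(times_s)
    ys = [math.log(m) for m in mtbf_s]
    mean_t = sum(times_s) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, ys))
    slope = sxy / sxx                    # slope of ln(MTBF) against t is 1/tau
    intercept = mean_y - slope * mean_t  # ln(MTBF) projected back to t = 0
    tau = 1.0 / slope
    t_w = 1.0 / (math.exp(intercept) * f_clock_hz * f_data_hz)
    return tau, t_w

# Synthetic data from the assumed form of (6); all values are made up.
TAU, T_W, F_C, F_D = 350e-12, 100e-12, 10e6, 10e6
ts = [k * 1e-9 for k in range(1, 6)]
mtbfs = [math.exp(t / TAU) / (T_W * F_C * F_D) for t in ts]
tau_fit, tw_fit = fit_tau_and_tw(ts, mtbfs, F_C, F_D)
```

With real data the points would scatter around the line, and a device whose slope changes with output time would need a separate fit for each region.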
Though (6) is an approximation in the deterministic region
where noise is not significant [4], it is possible to use it to
estimate the amount of time required by the synchronizer for
long-term reliability. With $T_w$ in the order of 10 ps when $t$ is
measured from the normal propagation delay, reliabilities of
more than $10^{7}$ s (4 months) can only be achieved at clock and
data rates of 1 GHz by allowing a time of longer than for
metastability to settle, [2].
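As a rough check of the scale involved, the required settling time can be worked out by inverting the MTBF expression. The sketch below assumes the standard form MTBF = e^(t/τ) / (T_w f_c f_d) for (6) and takes a target MTBF of 10^7 s (about 4 months); the function name is hypothetical.

```python
import math

def required_resolution_time_in_tau(mtbf_s, t_w_s, f_clock_hz, f_data_hz):
    """Return t/tau needed to reach the given MTBF, from the assumed form of (6)."""
    return math.log(mtbf_s * t_w_s * f_clock_hz * f_data_hz)

# T_w = 10 ps, clock and data rates of 1 GHz, target MTBF of 1e7 s (assumed).
n_tau = required_resolution_time_in_tau(1e7, 10e-12, 1e9, 1e9)
# n_tau is ln(1e14), about 32.2 time constants
```

Under these assumptions the synchronizer must be allowed roughly 32 resolving time constants.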
The resolving time constant of a synchronizer, as measured by
the slope of events against $t$, often appears to vary as a function
of $t$ [4], [5]. Its starting value is sometimes higher than its final
value and sometimes lower, depending upon the initial conditions
of the circuit. Typically, circuits in which both the true and
the inverse outputs start low will initially take longer to
resolve to a high than to a low [4], but because long-term resolution
times are unaffected by the initial transient, the initial slope is
faster than the final $\tau$. Changes in data and clock inputs during
metastability can also affect resolution times.
Other important effects are those for circuits with multiple
time constants, whose metastability trajectories may be oscil-
latory rather than exponential [7]–[11] and the influence of the
back edge of the clock in a master–slave flip-flop, which trans-
fers a metastable state from the master to the slave latch if res-
olution times are longer than half a cycle. Present measurement
techniques do not allow these effects to be measured at the point
of interest, because at each rising clock edge there is a uniform
Fig. 1. Two-oscillator metastability measurement.
distribution of data arrival times between zero and the clock
or data period, whichever is the smaller. Most of these input
times give normal output propagation times and only an ex-
tremely small number give the very long metastable responses
that would extend into the second half cycle of the clock. This
means that impracticably long data collection times would be
needed to make reliable measurements of MTBF in this region.
Thus, all reliability projections are based on an extrapolation of
a simplified MTBF formula, which may not take into account
important effects.
In this paper, we show how measurements can be made up
to seven orders of magnitude in MTBF beyond current limits.
Instead of using uniformly spread inputs, we concentrate all the
input stimuli close to the region of interest and filter out un-
wanted responses so that most of the data collected is relevant
to very long metastable responses rather than to a very small
proportion. This enables us to measure MTBF values up to 3
years and also to demonstrate the effect of the back edge of the
clock in a synchronizer.
In Section II, we describe the most common measurement
method and present new techniques aimed at increasing the
number of metastable events recorded in a given time. In
Section III, this method is extended by selecting only those
events lasting longer than a given time; and a correction for
jitter and noise in the measurement equipment is described. In
Section IV, we show how the back edge of the clock can affect
reliability and we present data for a master–slave flip-flop that
demonstrates this effect.
II. MEASURING METASTABILITY
Normally metastability measurements make use of two
asynchronous oscillators driving the D and clock inputs of a
master–slave flip-flop under test.
If the data oscillator has a period of 99.9 ns and the clock 100
ns, the rising clock edge may or may not produce a change in
the output Q. Fig. 1 shows the situation when the D input is low
on the first clock edge and then goes high very close to the next
edge, causing a change in the Q output.
We can only observe metastability if the D input is different
on successive clock edges and even then, only if it changes
very close to the second clock edge, causing metastability to
occur. If the data and clock oscillators are not locked together,
all overlap times between 0 and the 99.9 ns cycle time for data
and clock are generated with equal probability. To observe the
delay due to metastability, the Q change from low to high is
Fig. 2. Event histogram for a 74F5074.
used to trigger the recording of each clock rising edge for a
potentially metastable event. Only those events where clock and
data overlap by less than the difference between the two
oscillator periods (100 ns − 99.9 ns) can be observed. This is because
they are the only events which generate a change in Q. These
events are then presented as a histogram of the number of events
resolved between $t$ and $t + \delta t$ and should follow the form of (9),
where in this case $t$ is the time from the clock to the Q output.
A typical histogram is shown in Fig. 2 and other examples
appear in [4] and [5]. In Fig. 2, the x-axis represents time, with
the triggering Q output as the reference. When an event is
detected by the Q output change, the trigger time (about 24 ns
after the Q change) to clock rising edge time is recorded. Since
the clock occurs before the Q change, increasing metastability
time is shown from right to left in the resulting histogram. The
y-axis is the number of events, with a peak of 80 000 at 20.5 ns;
the decreasing number of events recorded shows in the
discrete levels of 0, 1, and 2 events at around 17 ns. Here, 20.5 ns
corresponds to a delay of 24 − 20.5, or 3.5 ns.
Unfortunately, events which result in a much longer than
normal propagation delay (deep metastability) rarely occur.
Such an event will require much less than a 100 ps overlap
between data and clock. Overlaps of less than 100 ps occur
in only 1 in 1000 of the clock cycles, so truly metastable events
happen much less often than the one per 100 μs implied by 100 ns ×
1000. Even then not all of these events can be collected if a
general-purpose digital oscilloscope is used to collate the data.
Because the oscilloscope must store, process, and display the
histogram, there is a significant dead time between successive
recorded events that limits the number of events actually recorded.
In our case, the recording rate while the histogram was being
constructed was less than 1 per 100 ms, or less than 1 in 1000 of
those generated. Equation (6) shows that with
and 100 ps, an MTBF of around 5 min requires a synchro-
nization time of . In an experiment measuring the number
of failure events, only 1 in 1000 of the events may be recorded,
so more than 90 h are needed to measure MTBF values of only
minutes. Increasing the data and clock frequencies can improve
the number of low probability events recorded, but it is not
practical to characterize the synchronizer much beyond .
In our measurements, we have increased the probability of a
metastable event by ensuring that the data transition is always
within a small time, 100 ps, of the balance point. Since
there is a transition within 100 ps on every 100-ns clock cycle,
rather than only one in 100 ns/100 ps = 1000 cycles as
with the conventional arrangement, there are now 1000 times as
many metastable events. Now, either the same measurements
can be made in a much shorter time, or more deep metastable
events can be recorded in the same time. In practice, we use a
delay locked loop to hold the data input at a point that gives a
0.5 probability of a high transition in the Q output and a 0.5
probability that it stays low.
For input times within less than 0.2 ps of the balance point,
the circuit outputs are nondeterministic and whether the output
ends up high or low depends on the noise [4], [5] as well as the
input time. This technique gives many very long output times.
A schematic of the test setup is shown in Fig. 3. Here, a
10-MHz clock is passed through two closely matched paths to
the data and clock inputs of the device under test. Adjusting
the supply voltage to open collector inverters varies the delay
in the path to the data input. If the data rising edge is slightly
slow when compared with the clock edge, Q will be low on
most clocks. If it is fast, Q will be mostly high. A slave flip-flop
records whether the device under test has resolved high or low
and an analog integrator is used to average the proportion of high
to low outputs from the slave. The integrator consists of an op-
erational amplifier with its reference input held at a voltage ap-
proximately halfway between the logic high and logic low levels
of the slave flip-flop and . Each high slave output
causes the integrator output to fall slightly and each low output
causes it to rise. The arrangement forms a delay locked loop in
which the voltage supply to the inverters in the data path is in-
creased if it is too slow and reduced if it is too fast. When the
device under test is near the balance point its final output is de-
termined by noise and is random. In these circumstances the in-
tegrator output will change after each clock period by an amount
, where is the integrator time con-
stant. Altering the reference voltage enables the proportion of
highs and lows to be set manually. This works by increasing the
amount added to the integrator output when the slave output is
low and reducing it when it is high. Normally we set this input
voltage to the midpoint between the two logic levels so that the system settles to a steady
state where 50% of flip-flop outputs are high and 50% are low,
but it is possible to vary the proportions simply by increasing or
reducing the voltage. A higher input forces the loop to compen-
sate by reducing the proportion of high slave outputs and vice
versa. A separate input to the data delay path supply voltage
(not shown) enables a high-speed waveform to vary the delay
by around 100 ps giving the delay distribution with time in
picoseconds between clock and data shown in Fig. 4.
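The locking behaviour of such a loop can be illustrated with a small behavioural simulation. This is a sketch under stated assumptions, not a model of the actual circuit: the data lateness relative to the balance point stands in for the integrator output, the outcome of each clock is decided by Gaussian noise of hypothetical width `noise_ps`, and the loop gain `gain_ps` and starting offset are arbitrary. The parameter `ref_fraction` is an assumed stand-in for the position of the reference voltage between the low and high logic levels.

```python
import random

def simulate_dll(ref_fraction=0.5, noise_ps=7.6, gain_ps=0.05,
                 cycles=200_000, seed=1):
    """Return (fraction of high outputs after settling, final data lateness in ps)."""
    rng = random.Random(seed)
    lateness_ps = 50.0            # start with the data edge 50 ps too late
    highs = 0
    counted = 0
    for cycle in range(cycles):
        # Data early enough (after noise) gives a high output, else low.
        q_high = lateness_ps + rng.gauss(0.0, noise_ps) < 0.0
        if q_high:
            # A high output makes the data path slower.
            lateness_ps += gain_ps * (1.0 - ref_fraction)
        else:
            # A low output makes the data path faster.
            lateness_ps -= gain_ps * ref_fraction
        if cycle >= cycles // 2:  # only count outcomes once the loop has locked
            counted += 1
            highs += q_high
    return highs / counted, lateness_ps

fraction_high, final_lateness_ps = simulate_dll()
```

With `ref_fraction=0.5` the two step sizes are equal and the loop settles where highs and lows are equally likely; other values skew the step sizes and move the lock point away from the balance point.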
This histogram was obtained using an Agilent Technologies
54855A Infiniium oscilloscope and an approximately sinusoidal
input variation source. It shows a 100 ps variation around a bal-
ance point of 205 ps, where the time axis shown is the difference
between the data and clock rising edges.
In Fig. 5, the data collection oscilloscope was triggered from
the rising Q outputs and a histogram of the number of clock inputs
is shown against a time scale of 500 ps per major division.
To demonstrate feasibility of the method, we chose a
74F5074, a master–slave flip-flop specially designed to give
a fast, controlled metastability response. While it is an old
design, it is easily available and comparisons can be made
with published data. In Fig. 5, the oscilloscope is in color
grade mode, so that the density of traces at a particular point
is represented by the color of the pixel at that point on the
Fig. 3. DLL controlling data and clock overlap.
Fig. 4. D to clock event distribution with time in picoseconds.
Fig. 5. Display histogram for clock to Q rising events at 500 ps/div.
display. A histogram of the trace density along the horizontal
line is also shown in this figure, which represents the number
of events passing through the pixels concerned.
In the two-oscillator measurement method, the number of
input events per picosecond is constant with respect to D to
clock time. In the D to clock histogram of Fig. 4, the number
of input events per picosecond varies with the D to clock time,
so for our method it is necessary to correct for the variation.
The correction can be done by noting that for exactly half of
the input events on the D to clock histogram the final Q output
Fig. 6. Output time to input time.
is high and for half, Q remains low. When D is earlier than
the balance point Q is more likely to rise than to remain low
and when it is later it is more likely to remain low. Thus, if
the cumulative number of input events on the D to clock
histogram is plotted against time and the event axis normalized to
between −1 and 1, an exact value for the balance point time
can be found where the graph crosses zero. In this case, the
balance point is 205 ps. Because only half the input events cause
an output event, the cumulative number of events on the output
histogram must be normalized to between 0 and 1. The
correspondence between D times and Q times can now be found from
the fact that, for a large enough number of events, the number
of input events closer to the balance point than the D time must
equal the number of output events with an output time longer
than the Q time.
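The rank-matching argument above can be sketched as follows. The sketch assumes one output time per input event on the side of the balance point that produces a rising output, and it uses synthetic data generated from an exponential response; the function name `match_by_rank` and every numerical value are hypothetical.

```python
import math
import random

def match_by_rank(input_offsets_ps, output_times_ns):
    """Pair the k-th closest input offset with the k-th longest output time."""
    if len(input_offsets_ps) != len(output_times_ns):
        raise ValueError("expected one output time per input event")
    closest_first = sorted(input_offsets_ps)                  # nearest the balance point first
    longest_first = sorted(output_times_ns, reverse=True)     # most metastable first
    return list(zip(closest_first, longest_first))

# Synthetic events: output time grows as the input nears the balance point.
rng = random.Random(0)
TAU_NS, T_W_PS, T_PD_NS = 0.35, 100.0, 3.9   # made-up device parameters
offsets = [rng.uniform(0.01, 100.0) for _ in range(1000)]
times = [T_PD_NS + TAU_NS * math.log(T_W_PS / d) for d in offsets]
rng.shuffle(times)   # the two histograms are recorded independently
pairs = match_by_rank(offsets, times)
```

The pairing works only statistically: no individual event is tracked from input to output, which is why the two histograms can be collected separately.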
Fig. 6 shows the normalized cumulative number of input
events plotted against input time on the left and the normalized
cumulative number of output events against output time on the
right. In Fig. 6, the number of events where the propagation
delay is greater than 6 ns is very small, but as the propagation
delay falls to near the normal delay of 3.9 ns, the number of
events rapidly increases to include all of those measured. If we
take any proportion of events , where this proportion
can be associated with output times longer than the coor-
dinate of on the right-hand graph, and input times shorter
than the coordinate of on the left-hand graph. A small
Fig. 7. Standardized metastability characteristic showing 4.18 ns, 55-ps point.
increase in to links to input times and output
times within a very narrow range, and therefore, links the input
time to the output time. In Fig. 6, has been identified
where below 0.5 all the output events have metastability times
lasting longer than 4.18 ns and the input events occur with the
input between the balance point and 260 ps.
The 0.5 point is, therefore, associated with a unique input time
of 55 ps and a unique 4.18 ns output time.
We can now plot $\Delta t_{in}$, the input time measured from the balance
point, against output time to give Fig. 7. This
graph shows that inputs less than 55 ps away from the balance
point will normally give an output lasting longer than 4.18 ns, so
if the synchronization time is set to 4.18 ns, failures will occur
in a synchronizer for inputs less than 55 ps from the balance point.
It is important to note that the graph of Fig. 7 is statistical in
nature; it is not possible to measure any single input to much
better than 1 ps because of the presence of noise. Each point on
the graph represents more than one event and is characterized by
the number of events that occurred after the output time given by
its x coordinate. The y coordinate of the point is the input time just
greater than the same number of input events. Because input event times
are distributed uniformly in this graph over the input event time
range, typically 100 ps, random noise has no effect on the input
distribution; there is still a constant number of events per
picosecond. The accuracy of the graph depends on the number of
events: the more events in the range, the greater the effective
accuracy and the smaller the input times that can be plotted.
A significant advantage of this technique is that it allows
the results to be presented in an easily understood standardized
form, independent of oscillator frequency or number of events.
With knowledge of the clock and data frequencies any point on
the y-axis, $\Delta t_{in}$, can also be converted to give the mean time
between failures in a system using (5).
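A minimal sketch of this conversion, assuming (5) has the usual form MTBF = 1 / (f_c f_d Δt_in). The 55 ps input time is the example point quoted above; the 10 MHz figure for both clock and data rates is an assumption based on the 10-MHz clock of the test setup, and the function name is hypothetical.

```python
def mtbf_seconds(delta_t_in_s, f_clock_hz, f_data_hz):
    """Mean time between failures for inputs closer than delta_t_in to balance."""
    return 1.0 / (f_clock_hz * f_data_hz * delta_t_in_s)

# Failures if the synchronization time is 4.18 ns (inputs within 55 ps).
mtbf_55ps = mtbf_seconds(55e-12, 10e6, 10e6)
# mtbf_55ps is 1/5500 s, about 182 microseconds
```

At these rates a synchronization time of 4.18 ns would fail thousands of times per second, which is why the measurement must be pushed to much smaller input times.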
III. DEEP METASTABILITY
If the source of data delay variation is removed, the number
of input events close to the balance point will be increased and,
therefore, the ability to measure longer metastable times cor-
respondingly enhanced. Fig. 8 shows the distribution of input
times that results from this change.
Fig. 8. Event histogram with only noise variation on the input.
Here, the standard deviation from the central value as mea-
sured by the oscilloscope is about 12 ps and the distribution is
very similar to a distribution that would be produced by random
noise alone. This is partly because the flip-flop output value,
high or low, is at least partly determined by internal thermal
noise and partly because there is a significant noise element in
the oscilloscope measurements.
The measurement noise can be estimated by producing a his-
togram of the clock waveform when it is also the source of the
oscilloscope trigger. At the actual trigger point voltage the time
deviation in the histogram is very low, but at a higher or lower
voltage it spreads out to around 9 ps. The specification of the
54855A for this type of measurement is 9.2 ps.
Because of the relatively large measurement noise compo-
nent, we cannot reliably use Fig. 8 to assign input times to
output times. To overcome this problem, we changed the ref-
erence input of the integrator in the DLL of Fig. 3 to produce a
range of different proportions of high and low outputs from the
device under test. If instead of having a reference input voltage
of giving 50% high values we set the voltage
to a value of , a slave output at now
gives an integrator input of , while an output
at gives an integrator input of .
The result is that a low output reduces the delay path three
times as much as a high output increases it, so on average, there
must be three times as many high outputs as low outputs in order
to keep the delay at the balance point. With 75% high values, the
center point of the D distribution moves earlier by exactly the
time required to shift 25% of the events from the 0 side of the
balance point to the 1 side, as shown in Fig. 9.
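The balance condition described above can be written out explicitly. In this sketch the names `v_ref`, `v_low`, and `v_high` are assumed labels for the integrator reference voltage and the two slave output levels, and the voltages in the example are arbitrary.

```python
def integrator_steps(v_ref, v_low, v_high):
    """Return (integrator input for a high output, integrator input for a low output)."""
    return v_high - v_ref, v_low - v_ref

def steady_state_high_fraction(v_ref, v_low, v_high):
    """Fraction p of high outputs for which the average integrator input is zero."""
    up, down = integrator_steps(v_ref, v_low, v_high)
    # Solve p * up + (1 - p) * down == 0 for p.
    return -down / (up - down)

# Reference three quarters of the way from low (0 V) to high (4 V).
p_high = steady_state_high_fraction(v_ref=3.0, v_low=0.0, v_high=4.0)   # 0.75
p_mid = steady_state_high_fraction(v_ref=2.0, v_low=0.0, v_high=4.0)    # 0.5
```

The loop can only hold its average input at zero if high outputs occur three times as often as low outputs, which is the 75% case described in the text.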
We measured the time shifts required to give different prob-
abilities of high outputs and plotted them on a graph showing
% high points against time shift, as shown in Fig. 10. If we as-
sume that the time of input events follows a normal distribution,
we can compare this graph with distributions having different
values of standard deviation.
The line with the closest fit to the points on Fig. 10 represents
the cumulative probability of a high output for a random input
time deviation of 7.6 ps, so we can conclude that the actual dis-
tribution has a deviation of this value. The corrected input time
distribution corresponding to this is shown in Fig. 11.
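One way to carry out such a comparison numerically is to convert each measured percentage of high outputs to a standard normal quantile and fit a straight line through the origin. The sketch below is an assumed procedure, not necessarily the one used for Fig. 10, and the data points are synthetic, generated from a 7.6 ps distribution purely to show that the fit recovers the deviation.

```python
from statistics import NormalDist

def fit_sigma_ps(shifts_ps, high_fractions):
    """Least-squares sigma for shift = sigma * z, with z the normal quantile."""
    std_normal = NormalDist()
    zs = [std_normal.inv_cdf(p) for p in high_fractions]   # needs 0 < p < 1
    return sum(s * z for s, z in zip(shifts_ps, zs)) / sum(z * z for z in zs)

# Synthetic points: time shift toward "more highs" for each target fraction.
fractions = [0.1, 0.25, 0.4, 0.6, 0.75, 0.9]
shifts = [NormalDist(0.0, 7.6).inv_cdf(p) for p in fractions]
sigma_ps = fit_sigma_ps(shifts, fractions)
```

The sign convention here, positive shift meaning more high outputs, is arbitrary; measured shifts would need to follow the same convention.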
Fig. 9. Shifting the input distribution.
Fig. 10. Measurement of actual D distribution.
For comparison, the original histogram measured on the os-
cilloscope is also shown, demonstrating that measurement noise
makes a significant contribution to the raw data. Combining the
oscilloscope deviation of 9.2 ps with 7.6 ps as the root sum of
squares gives a result of 11.9 ps, very close to that observed.
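Written out, the root-sum-of-squares combination of the two independent deviations is

$$\sqrt{9.2^2 + 7.6^2} = \sqrt{84.64 + 57.76} = \sqrt{142.40} \approx 11.9\ \text{ps}$$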
This method of generating and measuring time distributions
at the picosecond level in the presence of noise allows us to take
our measurements another order of magnitude further, as shown
in Fig. 12, by the 7.6 ps line. Fig. 11 also shows the results of
triggering the oscilloscope only on Q outputs that appear after
6.5 ns. We do this by clocking the Q output into two flip-flops,
in this case one at 6.5 ns and the other at 50 ns. The oscilloscope
is triggered by a delayed clock only if the first flip-flop gives a
low and the second a high.
Since digital oscilloscopes collect data continuously, it is
possible to measure events occurring well before the actual
trigger—in this case 60 ns before. Because only a small number
of events last longer than 6.5 ns, the trigger rate is 1000 times
Fig. 11. Corrected input time distribution.
Fig. 12. 100-ps variation (as in Fig. 7), 7.6 ps deviation, and deep metastability
plots.
slower on average and almost all meaningful events are cap-
tured. There are now far more useful events which, therefore,
give greater time accuracy and enable us to go a further three
decades down in input time. Unfortunately, we do not know
exactly how many of these input events lead to output events
over 6.5 ns, but we can count the actual number of triggers to
the oscilloscope in a separate fast counter. Normalization of
the output histogram avoids the need to know precisely how
many of the triggers are converted into oscilloscope traces.
Input events causing rising output times longer than 6.5 ns
have a very low probability and the probability of any rising
clock edge leading to a trigger during the measurement period
is given by
$$\frac{\text{Output Trigger Rate}}{\text{Clock Rate}} \tag{10}$$
where the output trigger rate is that measured by the fast
counter. Around the balance point, the normalized curve of total
input events against input time on the left-hand side of Fig. 6 is
very linear, that is
Total Input Events
Input Time
(11)
Both and are constants here. By a similar argument to
that presented in Section II, any change in the number of output
events due to a change in the output time is reflected in a cor-
responding change in input time. In this case, the input curve is
linear, so it is only necessary to multiply the total output events
at any output time on the right-hand side of Fig. 6 by the con-
stant to get the equivalent input time.
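The scaling step can be illustrated with a short sketch. It rests on assumptions that go beyond the text: the input times are taken to be normally distributed about the balance point, so that the linear region has a slope equal to the peak density of that distribution, and only inputs on the side that produces a rising output are counted. The function name and the example rates are hypothetical.

```python
from statistics import NormalDist

def input_time_ps(trigger_rate_hz, clock_rate_hz, surviving_fraction, sigma_ps):
    """Input time (ps from balance) matching a fraction of the triggered events."""
    p_trigger = trigger_rate_hz / clock_rate_hz           # probability as in (10)
    # Input events per picosecond, per clock edge, at the balance point.
    density_per_ps = NormalDist(0.0, sigma_ps).pdf(0.0)
    # Linear region: fraction of clock edges = density * input time.
    return p_trigger * surviving_fraction / density_per_ps

# Example: 10 kHz trigger rate, 10 MHz clock, all triggered events, 7.6 ps deviation.
dt_ps = input_time_ps(10e3, 10e6, 1.0, 7.6)
# dt_ps is about 0.019 ps
```

Smaller values of `surviving_fraction`, read from the normalized output histogram at longer output times, scale the input time down in proportion.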
The result of changing the normalized events scale into time
is shown in the deep metastability curve of Fig. 12 that reaches
down to almost $10^{-20}$ s. The graph shows all three measurement
techniques, 100-ps deviation, 7.6-ps noise, and deep
metastability, applied to a 74F5074 device run at 10 MHz for
approximately 1000 s in each mode. The resulting range of $\Delta t_{in}$ is from
$10^{-10}$ s to $10^{-20}$ s, and $10^{-20}$ s represents an MTBF of $10^{6}$ s (11
days) at the frequencies used. Existing methods as represented
by Fig. 2 are not capable of MTBF measurements of more than
a few minutes. It is interesting to note that the value of $\tau$
for this rather complex bipolar device has at least three
different values: 350 ps between 4 and 6 ns, 120 ps between 6.2
and 6.8 ns, then 140 ps beyond 7 ns. This last value cannot be
seen using conventional measurement methods. The fairly
distinct breakpoint at 6 ns corresponds to an input of about 1 ps and
is, therefore, in the deterministic region [5].
To confirm that our results are not artifacts of the measure-
ment method, we also measured a more conventional device.
Fig. 13 shows the plots of output time against input time overlap
for a 74ACT74 from Philips. This is a CMOS master–slave
circuit, with a $\tau$ of about 350 ps. The results are fairly typical of a
CMOS device having a normal clock to Q delay of around 3.3 ns
and an initial fast slope, which is probably the result of asym-
metrical initial conditions [4]. The slope of the curve is constant
in this case over a fairly wide range.
IV. CLOCK BACK EDGE
The techniques described before also allow measurement of
the synchronizer reliability into the region after the back edge
of the clock. The theory of Section I takes no account of the in-
ternal construction of an edge triggered flip-flop, but we believe
that there is often a significant increase in the failure rate in the
second half of the clock cycle. We now extend this theory by
examining how edge triggered flip-flops are constructed. Two
similar master and slave level triggered latch circuits are usu-
ally cascaded to form a single edge triggered circuit as shown
in Fig. 14.
Fig. 13. CMOS master–slave device.
Fig. 14. Edge triggered synchronizer.
Here, the master and slave can both be reset so that both out1
and out2 are low. When the clock is low, the master is trans-
parent and any change in in1 is copied through to out1 with a
delay determined mainly by internally large signal gate de-
lays. In normal operation, in1 does not change within the setup
time before the clock rising edge, or the hold time after the rising
edge, thus out1 is steady when the master latch goes opaque and
input changes no longer have any effect. At around the same
time as the clock rising edge, the slave clock falls and so the
slave goes transparent. Now the out1 level is transferred to out2
with a delay .
If the circuit is used as a synchronizer the input can go high
at any time and so a rising edge on the input in1 which occurs
just before the clock rising edge may cause the master to go
into metastability. If the metastability is resolved well before the
falling edge of the clock, the change in out1 is copied through
the transparent slave to out2 with the normal delay and if it
is resolved after the falling edge out2 is unaffected. Because the
change in out1 and, consequently, in2 may happen very close to
the falling clock edge of the master, an input change can fall in
the metastability window of the slave. There is then a low, but
finite probability that metastability in the master can produce
metastability in the slave.
Fig. 15. Waveforms of master and slave in metastability.
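The two latch circuits are normally similar, so that the
metastability time constant, $\tau$, in both is about the same, and
it is often assumed that the total resolution time, $t$, of a
metastable output follows the exponential form described in (2):
$$\Delta t_{in} = T_w e^{-t/\tau}, \quad \text{i.e.,} \quad t = \tau \ln\left(\frac{T_w}{\Delta t_{in}}\right) \qquad (12)$$
where $\Delta t_{in}$ is the input time relative to the balance point
and $T_w$ is the metastability window. In fact, this is not quite
true, as the following analysis shows. In Fig. 15, the waveforms of
the master and slave are shown when both are close to metastability.
The rising edge of the clock occurs at time $t = 0$, and all times
are measured with reference to this time. If the slave output occurs
at a time $t_{out2}$, late in the first half cycle of the clock, its
timing depends on the master input time $\Delta t_{in1}$ before the
balance point. The closer $\Delta t_{in1}$ is to the balance point of
the master, the later is the master output time $t_{out1}$. We can
plot $t_{out1}$ against $\Delta t_{in1}$ for the master latch on its
own by observing that
$$t_{out1} = \tau \ln\left(\frac{T_{w1}}{\Delta t_{in1}}\right) \qquad (13)$$
where $T_{w1}$ is the metastability window of the master.
From this it is possible to find the output time of the slave,
$t_{out2}$. There are the following two cases.
1) The master output at $t_{out1}$ happens well before the back edge
of the clock, so that the slave is transparent when its input
changes. In this case, the constant delay of the slave, $T_d$, is
simply added to $t_{out1}$, so $t_{out2}$ is given by
$$t_{out2} = T_d + \tau \ln\left(\frac{T_{w1}}{\Delta t_{in1}}\right) \qquad (14)$$
2) The slave may go metastable. In this case, $t_{out1}$ must be
very close to the absolute slave balance point time, $t_{BP}$, such
that the slave input time is given by
$\Delta t_{in2} = t_{BP} - t_{out1}$, where $t_{BP}$ is also close to
the time of the clock back edge. This gives a metastable response
from the slave, whose metastability window is $T_{w2}$:
$$t_{out2} = t_{BP} + \tau \ln\left(\frac{T_{w2}}{\Delta t_{in2}}\right) \qquad (15)$$
Here, $\Delta t_{in2}$ is small, and small changes in $\Delta t_{in1}$
affect $t_{out1}$ according to (3). We will assume that the master
input time changes by a small amount $\delta t$, from
$\Delta t_{in1}$ to $\Delta t_{in1} + \delta t$, so that,
differentiating (13), the slave input time changes from 0 to
$$\Delta t_{in2} = \frac{\tau}{\Delta t_{in1}}\,\delta t \qquad (16)$$
The new slave output time can now be calculated from (2) and (16):
$$t_{out2} = t_{BP} + \tau \ln\left(\frac{T_{w2}\,\Delta t_{in1}}{\tau\,\delta t}\right) \qquad (17)$$
Using the fact that in this case $t_{out1}$ is very close to
$t_{BP}$, so that $\Delta t_{in1} \approx T_{w1} e^{-t_{BP}/\tau}$
from (13), we can write
$$t_{out2} = t_{BP} + \tau \ln\left(\frac{T_{w1} T_{w2}\, e^{-t_{BP}/\tau}}{\tau\,\delta t}\right) \qquad (18)$$
This can be simplified to
$$t_{out2} = \tau \ln\left(\frac{T_{w1}}{\delta t}\right) + \tau \ln\left(\frac{T_{w2}}{\tau}\right) \qquad (19)$$
Here, the balance point has changed. In case 1, the balance point
was the input that gave equal probability of high and low outputs
in the master, but in case 2, it is the point that gives equal
probability of high and low outputs in the slave. The balance point
for the master–slave combination is, therefore, the input time that
gives a slave input exactly at the balance point of the slave
$$\Delta t_{in1} = T_{w1} e^{-t_{BP}/\tau} \qquad (20)$$
When compared with a simple latch, the balance point for the
combination is shifted by this amount. Any plot of the input times
against output times must, therefore, be measured from this point.
The change makes little difference for values of
$t_{out2} < t_{BP}$, since there
$\Delta t_{in1} \gg T_{w1} e^{-t_{BP}/\tau}$, but it becomes
important when $t_{out2} > t_{BP}$.
Fig. 16 illustrates the input time against output time that may
be expected from a master–slave flip-flop. Case 1, on the left,
shows the relationship of (14) with a slope set by $\tau$, normally
seen between $\ln \Delta t_{in1}$ and $t_{out2}$ when
$t_{out2} < t_{BP}$. On the right, the slope of the curve given by
(19) when $t_{out2} > t_{BP}$ remains the same, but there is a
displacement which is given by the difference between the two cases
at the same input time,
$$\tau \ln\left(\frac{T_{w1}}{\delta t}\right) + \tau \ln\left(\frac{T_{w2}}{\tau}\right) - T_d - \tau \ln\left(\frac{T_{w1}}{\delta t}\right)$$
or simplifying
$$\text{offset} = \tau \ln\left(\frac{T_{w2}}{\tau}\right) - T_d \qquad (21)$$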
Fig. 16. (top) Case 1: Slave transparent. (bottom) Case 2: Slave metastable.
Fig. 17. Jamb latch.
Here, it can be seen that the offset depends on the circuit
constants $\tau$, $T_{w2}$, and $T_d$.
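As a numerical cross-check of the case 2 analysis, the Python sketch
below compares the linearized result of (19) with a direct evaluation
that applies (13) to the master and then (15) to the slave. It assumes
the forms $t_{out1} = \tau \ln(T_{w1}/\Delta t_{in1})$ and
$\Delta t_{in2} = t_{BP} - t_{out1}$, with the master input measured
from the shifted balance point $T_{w1} e^{-t_{BP}/\tau}$. The function
names and the parameter values (40 ps, 20 000 ps, and a slave balance
point at 1000 ps) are illustrative assumptions, not measured data.

```python
import math

def master_output_time(dt_in1, tau, tw1):
    """Master output time for an input dt_in1 from the master balance point, as in (13)."""
    return tau * math.log(tw1 / dt_in1)

def slave_output_direct(delta, tau, tw1, tw2, t_bp):
    """Case 2 without linearization: (13) for the master, then (15) for the slave."""
    dt_balance = tw1 * math.exp(-t_bp / tau)   # shifted balance point, as in (20)
    t_out1 = master_output_time(dt_balance + delta, tau, tw1)
    dt_in2 = t_bp - t_out1                     # slave input time before its balance point
    return t_bp + tau * math.log(tw2 / dt_in2)

def slave_output_linearized(delta, tau, tw1, tw2):
    """Closed form of (19), valid for delta much smaller than the balance-point shift."""
    return tau * math.log(tw1 / delta) + tau * math.log(tw2 / tau)

# Illustrative values only (ps): not taken from any measurement.
TAU, TW1, TW2, T_BP = 40.0, 20000.0, 20000.0, 1000.0
delta = 0.01 * TW1 * math.exp(-T_BP / TAU)     # 1% of the balance-point shift

direct = slave_output_direct(delta, TAU, TW1, TW2, T_BP)
approx = slave_output_linearized(delta, TAU, TW1, TW2)
print(f"direct = {direct:.2f} ps, linearized = {approx:.2f} ps")
```

For input offsets that are small compared with the balance-point
shift, the two values should agree to within a fraction of a
picosecond; the agreement degrades as the offset grows, because (16)
is only a first-order approximation.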
We can confirm this result by SPICE simulation. Fig. 17 shows a
commonly used synchronizer circuit, the jamb latch. A master–slave
flip-flop built from two jamb latches may have the latch output
taken either from Node A or Node B. If Node A is used, a single
inverter with a low threshold provides a high output when the data
is high and the clock is low, but for Node B two inverters must be
used: the first, with a high threshold, to give a low output when
the data is high and the clock is low, and the second to provide
the high output for the slave latch. We simulated this circuit
using the parameters for a 0.18-µm process and plotted the input
time versus output time characteristic for nodes A and B. This is
shown in Fig. 18.
For input times of 100 ps or larger, the two outputs in Fig. 18
show a shorter delay for outA than for outB. For example, when the
input is 113 ps before the clock, the output is 97 ps after the
clock for outA, a total delay of 210 ps, and 147 ps after the clock
for outB, so 260 ps in total. However, when the input time is less
than 10 ps, the delays are very similar. Intuitively, the
large-signal delay through the outB path, which has one more
inverter than the outA path, will be approximately one inverter
delay longer. The simulation shows that when the clock goes high
and the latch is metastable, there is very little voltage
difference between nodes A and B, so the two delays are similar
when metastability is resolved.
Fig. 18. Input time versus output time for jamb latch.
Using the method described in Section I, we can deduce the
values of $\tau$, $T_w$, and $T_d$ from the simulations.
For Node A: $\tau = 40$ ps, $T_w = 20\,000$ ps, and $T_d = 201$ ps.
For Node B: $\tau = 40$ ps, $T_w = 20\,000$ ps, and $T_d = 250$ ps.
Here, the figures assume that output times are measured from the
clock rather than the typical output time of around 250 ps.
The back edge offset (21) is, therefore, 47 ps for outA and 1 ps
for outB. 47 ps is similar to the delay expected in an inverter in
this technology. Further simulation for a master–slave flip-flop
using Node A and with a clock back edge at 1 ns gives the
input–output characteristic of Fig. 19.
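The quoted offsets can be reproduced from the simulated constants. The
sketch below assumes that (21) has the form
$\tau \ln(T_w/\tau) - T_d$, which is the form that matches the 47 ps
and 1 ps figures in magnitude; the function name is hypothetical.

```python
import math

def back_edge_offset(tau_ps, tw_ps, td_ps):
    """Back edge offset in ps, assuming the form tau * ln(Tw / tau) - Td for (21)."""
    return tau_ps * math.log(tw_ps / tau_ps) - td_ps

# Constants deduced from the jamb latch simulations (all in ps).
for node, td_ps in (("A", 201.0), ("B", 250.0)):
    print(node, round(back_edge_offset(40.0, 20000.0, td_ps), 1))
# prints:
# A 47.6
# B -1.4
```

Under this sign convention the outB value is slightly negative, that
is, about 1 ps in magnitude, so the metastable slave path and the
transparent slave path are almost equally long for Node B.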
The difference between $T_d$ and the projection of the deep
metastability slope, $\tau$, back to the point where
$\Delta t_{in} = \tau$ gives an estimate of the offset.
Though we do not know the circuit details of the 74ACT74, we
measured the effect of the back edge of the clock on this CMOS
device. At 10 MHz, on the rising edge of the clock, metastability
occurs in the master, but the slave is transparent for the first
half cycle (50 ns). When the clock high pulse width was reduced to
5.5 ns, keeping the period at 100 ns, and then to 4.5 ns (the
minimum allowed in the data sheet is 5 ns), we obtained the graphs
of Fig. 20. In Fig. 20, there is a higher probability of long
metastable times, which is linked to the low-going edge of the
clock. The effect is also observable on the output waveforms shown
in Fig. 21, where the clock pulse is about 5 ns in width and the
scale is 2 ns per division. This shows that the circuit conditions
change when the clock goes low. When the slave is transparent
(between 0 and 2.8 ns after the back edge), there is a bend in the
waveform that is not present in trajectories more than 2.8 ns after
the low-going transition of the clock. The starting point of the
rise of the output trajectories after the clock is also delayed by
the 0.4 ns seen after the back edge in Fig. 20.
Fig. 19. Jamb latch outA back edge offset.
Fig. 20. Effect of the back edge of the clock.
In the 74F5074, the effect is much greater than in the 74ACT74.
Here, the low-going clock transition is associated with a 1.5–2 ns
increase in output time for events after the back edge, even though
there are no anomalous output trajectories either before or after
the transition. This additional delay is clearly observable in the
back edge measurements of Fig. 22, where the input–output time
characteristics for a long clock high pulse (50 ns) are compared
with those for shorter pulses of 4 and 5 ns. An estimate of the
offset given by (21) for a 74F5074 is shown in Fig. 23, but it can
only be an estimate, since one of the circuit constants in (21) is
not known accurately. The reason why the 74F5074 shows such a large
increase is that the difference between the transparent slave
propagation time at 150 ps input (less than 3.6 ns) and what it
would have been had the slave been metastable (about 5.2 ns) is at
least 1.6 ns. The long pulse does not affect the metastability
resolution time, but the short ones add up to 2 ns to metastability
resolution times after the back edge. Typically, an inverter delay
in 74F technology is approximately 3 ns, but the 74F5074 has been
designed to produce a very advantageous value of $\tau$. The
additional delay of 1.6–2 ns produced by the back edge of the clock
is comparable to an inverter delay, but is equivalent to
10–15 $\tau$ and will significantly affect the projected
reliability of a 74F5074 one-clock-period synchronizer.
Fig. 21. Effect of 5-ns clock pulse width on 74ACT74.
Fig. 22. 74F5074 Input time versus output time for different back edge times.
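To give a feel for what an extra 10–15 $\tau$ means, the sketch below
assumes the usual synchronizer model in which MTBF grows as
$e^{t/\tau}$ with the resolution time $t$ allowed, so that an added
delay of $n\tau$ divides the MTBF by $e^{n}$. That model and the
function name are assumptions made for illustration; they are not
derived from the measurements above.

```python
import math

def reliability_penalty(extra_delay_in_tau):
    """Factor by which MTBF falls when metastable events resolve later by
    extra_delay_in_tau time constants (assumes MTBF proportional to exp(t / tau))."""
    return math.exp(extra_delay_in_tau)

for n_tau in (10, 15):
    print(f"{n_tau} tau -> MTBF reduced by a factor of about {reliability_penalty(n_tau):.3g}")
```

Even the lower end of the range corresponds to a reduction of more
than four orders of magnitude under this model.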
Fig. 23. Delay change in 74F5074.
V. CONCLUSION
By reducing typical input times from 100 ns to around 10 ps, a
four-order-of-magnitude increase in the probability of metastable
events has been achieved, allowing much more of the event histogram
of a synchronizer to be observed. Difficulties in measuring the
distribution of input times in the presence of picosecond-level
noise in the measuring equipment can be overcome by measuring the
shift in the distribution as a function of the proportion of high
and low outputs. This technique also allows a range of time
difference distributions to be created by altering the integrator
time constant and to be accurately measured in the presence of
noise. Limitations in the repetition rate of the measuring
equipment can reduce the rate of collection of useful output data
to as little as 1 data point collected in 1000 generated. This can
be overcome by generating only events in the region of interest
(deep metastability) so that the repetition rate required is low
and normal propagation time events do not obscure the others. Thus,
a total of seven orders of magnitude improvement over conventional
methods of measuring metastability is possible using the methods
described.
Our results show approximately double the useful range of
input times, 10 to 10 s and up to 11 days MTBF in the
case of the 74F5074. This was achieved with data collection times
of only 1000 s, and longer data collection times would allow MTBF
values of up to 3 years to be measured in days. Increasing the
range of output times has also revealed anomalies in the slope and
has enabled the effect of the back edge of the clock to be observed
for the first time. In the circuits we measured, the back edge of
the clock increased the resolution time because metastability in
the opaque mode of the slave of a master–slave flip-flop introduces
more delay than the large-signal delay of the transparent mode. Our
results show that as much as 15 $\tau$ can be added to resolution
times, reducing reliability by a factor of more than 10 000. Both
of these results are important in safety-critical applications. The
results are presented in a form, input time against output time,
which is more meaningful than the usual events scale and is easily
converted to MTBF when the clock and data rates are known.
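The conversion from input time to MTBF can be sketched as follows. The
sketch assumes that data edges are uniformly distributed with respect
to the clock, so that inputs falling within $\Delta t_{in}$ of the
balance point occur at a rate $\Delta t_{in} f_c f_d$, and that the
window is counted on one side only. The function name and the example
values (10 MHz clock and data rates, an input time of $10^{-20}$ s)
are hypothetical and chosen only to illustrate the arithmetic.

```python
def mtbf_seconds(dt_in_s, f_clock_hz, f_data_hz):
    """MTBF for events whose input time is within dt_in_s of the balance point,
    assuming uniformly distributed data edges and a one-sided window."""
    return 1.0 / (dt_in_s * f_clock_hz * f_data_hz)

# Hypothetical operating point, for illustration only.
mtbf = mtbf_seconds(1e-20, 10e6, 10e6)
print(f"{mtbf:.3g} s = {mtbf / 86400:.1f} days")
```

Because the MTBF is inversely proportional to the input time, each
decade gained on the input time axis corresponds to a decade of MTBF
at fixed clock and data rates.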
We aim now to show how these methods can be used in an on-chip
version by replacing the analog variable delay elements with a
digital version based on a current-starved inverter [12], [13], and
the integrator with a limiting counter that can be incremented and
decremented by programmable amounts. This will enable the
proportion of high and low outputs to be varied and, therefore, the
actual distribution of input times to be measured off chip.
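One way to picture the proposed limiting counter is as a feedback
loop: a high output steps the counter one way, a low output steps it
the other way, and the counter value sets the data-to-clock delay. The
loop then settles where the upward and downward steps balance, that
is, where the proportion of high outputs is down / (up + down). The
sketch below is a behavioral simulation under assumed conditions
(Gaussian timing noise of 10 ps, a 0.1 ps delay step, a 2000-count
limit, and the polarity shown); none of these values or names comes
from an actual design.

```python
import random

def run_limiting_counter(up, down, trials=100_000, sigma_ps=10.0,
                         step_ps=0.1, limit=2000, seed=1):
    """Simulate the loop and return the observed fraction of high outputs."""
    rng = random.Random(seed)
    count = limit // 2                      # start in mid-range
    highs = 0
    for _ in range(trials):
        # Data-to-clock offset set by the counter, plus Gaussian timing noise.
        offset_ps = (count - limit / 2) * step_ps + rng.gauss(0.0, sigma_ps)
        if offset_ps < 0.0:                 # data ahead of clock: output high
            highs += 1
            count = min(limit, count + up)  # move data later, saturating at the limit
        else:                               # output low
            count = max(0, count - down)    # move data earlier, saturating at zero
    return highs / trials

for up, down in ((1, 1), (1, 3), (3, 1)):
    print(up, down, round(run_limiting_counter(up, down), 3))
```

Changing the two step sizes moves the settling point along the input
time distribution, which is how varying the proportion of high and
low outputs would allow that distribution to be traced out.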
REFERENCES
[1] K. Y. Yun and A. E. Dooply, “Pausible clocking-based heterogeneous
systems,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 7, no.
4, pp. 482–487, Dec. 1999.
[2] R. Dobkin, R. Ginosar, and C. P. Sotiriou, “Data synchronization issues
in GALS SoCs,” in Proc. ASYNC, 2004, pp. 170–179.
[3] J. Zhou, D. J. Kinniment, G. Russell, and A. Yakovlev, “A robust
synchronizer circuit,” in Proc. IEEE Comput. Soc. Annu. Symp. VLSI
(ISVLSI), 2006, pp. 442–443.
[4] D. J. Kinniment, A. Bystrov, and A. V. Yakovlev, “Synchronization
circuit performance,” IEEE J. Solid-State Circuits, vol. 37, no. 2, pp.
202–209, Feb. 2002.
[5] Y. Semiat and R. Ginosar, “Timing measurements of synchronization
circuits,” in Proc. ASYNC, 2003, pp. 68–77.
[6] C. Dike and E. Burton, “Miller and noise effects in a synchronizing
flip-flop,” IEEE J. Solid-State Circuits, vol. 34, no. 6, pp. 849–855,
Jun. 1999.
[7] T. J. Chaney and C. E. Molnar, “Anomalous behavior of synchronizer
and arbiter circuits,” IEEE Trans. Comput., vol. C-22, no. 4, pp.
412–422, Apr. 1973.
[8] D. J. Kinniment and J. V. Woods, “Synchronization and arbitration
circuits in digital systems,” Proc. IEE, vol. 123, no. 10, pp. 961–966,
Oct. 1976.
[9] T. Sakurai, “Optimization of CMOS arbiter and synchronizer circuits
with submicrometer MOSFET’s,” IEEE J. Solid-State Circuits, vol. 23,
no. 8, pp. 901–906, Aug. 1988.
[10] C. H. van Berkel and C. E. Molnar, “Beware the 3-way arbiter,” IEEE
J. Solid-State Circuits, vol. 34, no. 6, pp. 840–848, Jun. 1999.
[11] O. Maevsky, D. J. Kinniment, A. Yakovlev, and A. Bystrov,
“Analysis of the oscillation problem in tri-flops,” in Proc. ISCAS,
2002, pp. 381–384.
[12] P. Dudek et al., “A high-resolution CMOS time-to-digital converter
utilizing a Vernier delay line,” IEEE J. Solid-State Circuits, vol. 35, no.
2, pp. 240–247, Feb. 2000.
[13] M. Mota and J. Christiansen, “A four-channel self-calibrating high-
resolution time to digital converter,” in Proc. IEEE Int. Conf. Electron.,
Circuits Syst., 1998, pp. 409–412.
David J. Kinniment received the M.Sc. degree in electrical
engineering and the Ph.D. degree in computer science from Manchester
University, U.K., in 1963 and 1968, respectively, where he was part
of the teams involved in the design of the asynchronous ATLAS and
MU5 computers.
He is currently an Emeritus Professor with the University of
Newcastle, Newcastle upon Tyne, U.K., where he works with the
Microelectronic Systems Design Research Group at the School of
Electrical, Electronic, and Computer Engineering. Between 1964 and
1979, he was an Assistant Lecturer, Lecturer, and Senior Lecturer in
the Computer Science Department, Manchester University. In 1979, he
was appointed to the Chair of Electronics at Newcastle University,
and was Head of the Electrical and Electronic Engineering Department
from 1982 to 1990 and from 1996 to 1998. His current research
interests include IC design and asynchronous networks on chip.
Prof. Kinniment is a member of the IET.
Charles Dike received the B.S.E.E. and M.S.E.E. degrees from Brigham
Young University, Provo, UT, in 1977 and 1984, respectively.
He joined Intel Corp., Beaverton, OR, in 1992 and presently works
for the Design Automation and Collateral Department for the
Ultra-Mobile Group, Hillsboro, OR. He was an Integrated Circuit
Design Engineer with Signetics Corp., Orem, UT, for 11 years. While
there, he developed expertise in metastability theory and
asynchronous arbitration. He worked in bipolar, BiCMOS, and CMOS
technologies. His research interests include asynchronous logic,
ideas related to synchronization, metastability, and networks on
chip. He has received 14 patents and is the author or coauthor of
several articles.
Keith Heron received the B.Sc. degree from Liverpool University,
Liverpool, U.K., in 1965, and the M.Sc. degree from Newcastle
University, Newcastle upon Tyne, U.K., in 1970. He was pursuing the
Ph.D. degree at the School of Electrical, Electronic and Computer
Engineering, University of Newcastle, when he died in June 2007.
From 1979 to 2002, he worked as a Programming Advisor for the
Computer Science Department, Newcastle University, during which time
he co-authored a number of papers on computer architecture and
design.
Gordon Russell (M’84) received the B.Sc. and Ph.D. degrees in
electrical and electronic engineering from the University of
Strathclyde, Glasgow, U.K., in 1970 and 1977, respectively.
Between 1975 and 1977, he was a Research Fellow in the Department of
Computer Science and a Post Doctoral Fellow in the Department of
Electrical Engineering, the University of Edinburgh, Edinburgh,
Scotland, involved in the design of CAD tools for LSI circuit design
and the computer modeling of charge coupled devices, respectively.
In 1979, he joined the academic staff of the School of Electrical,
Electronic, and Computer Engineering, the University of Newcastle,
Newcastle upon Tyne, U.K. His research interests include concurrent
error detection (CED) techniques, asynchronous design and test, BIST
for SoC, on-chip time measurement circuits, and the development of
dynamic burn-in systems for circuits used in hazardous environments.
He has written a considerable number of technical articles on
testing and design for testability and has been involved as a
co-author/co-editor of several books related to testing and other
aspects of CAD for VLSI. He has given a series of invited lectures
worldwide on testing and design for testability.
Dr. Russell is a member of the Institution of Electrical Engineers
(IEE), U.K.
Alexandre V. Yakovlev (M’98) was born in 1956 in Russia. He received
the D.Sc. degree from Newcastle University, Newcastle upon Tyne,
U.K., in 2006, and the M.Sc. and Ph.D. degrees from St. Petersburg
Electrical Engineering Institute, St. Petersburg, Russia, in 1979
and 1982, respectively.
Since 1991, he has been at Newcastle University, where he worked as
a Lecturer, Reader, and Professor with the Computing Science
Department until 2002, and is now heading the Microelectronic
Systems Design Research Group with the School of Electrical,
Electronic, and Computer Engineering. He was with St. Petersburg
Electrical Engineering Institute, where he worked in the area of
asynchronous and concurrent systems since 1980, and in the period
between 1982 and 1990 held the positions of an Assistant and an
Associate Professor with the Computing Science Department. He first
visited Newcastle as a Postdoctoral British Council scholar in
1984–1985 for research in VLSI and design automation. After coming
back to Britain in 1990, he worked for one year with the University
of Glamorgan, Wales, U.K. His current research interests and
publications include the field of modeling and design of
asynchronous, concurrent, real-time, and dependable
systems-on-a-chip. He has published four monographs and more than
200 papers in academic journals and conferences and has managed over
20 research contracts. He has chaired programme committees of
several international conferences and is currently a chairman of the
Steering Committee of the Conference on Application of Concurrency
to System Design.
Dr. Yakovlev is a member of the Institution of Electrical Engineers
(IEE), U.K.
