All-digital self-adaptive PVTA variation aware clock generation system for DFS by Pérez Puigdemont, Jordi et al.
All-digital self-adaptive PVTA variation aware clock generation
system for DFS
Jordi Pérez-Puigdemont, Antonio Calomarde and Francesc Moll
Dept. of Electronic Engineering, Univ. Politècnica de Catalunya, Barcelona, Spain
jordi.perez-puigdemont@upc.edu
February 23, 2015
Abstract
An all-digital self-adaptive clock generation system is presented. It adapts the clock frequency to compensate for the effects of PVTA variations on the IC propagation delay and to satisfy an externally set propagation length condition. The design uses time-to-digital converters (TDCs) to measure the propagation length and a variable length ring oscillator (VLRO) to synthesize the clock signal. The VLRO naturally adapts its frequency to the PVTA variations suffered by its logic gates, while the TDCs track these variations across the chip and modify the VLRO length in order to adapt the clock frequency to them. Measurements of the system on a 45 nm FPGA show that it adapts the VLRO length, and therefore the clock frequency, to satisfy the propagation length condition. The measurements also show that the system can act as a dynamic frequency scaling clock source, since the propagation length condition value acts as a frequency selection input and there is a strong linear relation between the input value and the resulting clock period.
1 Introduction
As transistor technology scales down, the effects of process, voltage, temperature and ageing (PVTA) variations have a substantial impact on integrated circuit (IC) performance, reliability and power consumption [1]. The growing uncertainty in transistor characteristics, added to the rising complexity of ICs, increases the effort needed to determine the main circuit parameters, such as the maximum operating frequency or the supply voltage, that result in the desired IC yield. This extra effort finally translates into higher design cost and longer design time. Until now this problem has been treated by adding extra margins to the IC parameters [2], e.g. lowering the clock frequency or raising the supply voltage, but these margins may now undermine the gains of technology scaling [3].
In recent years, to tackle the variability issue, researchers have proposed adapting the clock frequency to the PVTA variations suffered by the IC. Some propose to detect timing errors in the datapath and then correct them and/or adapt the IC performance [4], but this technique implies a high architectural complexity. Other authors propose to use PVTA or delay sensors to adapt the clock frequency [5], which results in a slow-responding system. Along the same lines, others propose to use sensors to detect fast PVTA variations and stop the clock until the perturbation ends [4] but, if the variation lasts, the system could be stopped for a long time. We propose a closed-loop adaptive system based on a variable length ring oscillator (VLRO) as clock source, which naturally adapts its frequency to dynamic or static PVTA variations, and time-to-digital converters (TDCs) as propagation delay sensors, more precisely as logic propagation length sensors, used to tune the length of the VLRO to ensure the adaptation condition: the worst TDC output should be equal to the input setpoint value [6]. In this way the proposed system can continuously adapt the clock frequency to PVTA variations, which can be homogeneous or heterogeneous across the die, and can also operate as a dynamic frequency scaling system by changing the input setpoint value.
The article is organized as follows: in Section 2 we describe the proposed design of the adaptive system in detail. In Section 3 we discuss the field programmable gate array (FPGA) implementation of the system. In Section 4 we present the measurements of the adaptive system implemented on an FPGA, analysing its response to homogeneous and heterogeneous static delay variations. Finally, conclusions are drawn in Section 5.
Figure 1: Scheme of the self-adaptive clock generation system. (Blocks shown: TDC1 to TDCN, Min, Control, Decoder, VLRO and BUFG.)
2 System overview
The operation of an adaptive system can be explained, very roughly, as a three-step process: first, sensing a physical magnitude; second, comparing the sensed output with the desired objective; and finally, performing the actions needed to minimize the difference between the measured magnitude and the objective. These steps are performed continuously in real time, either in sequence or simultaneously.
The proposed system is based on three different blocks that perform the adaptive process sequentially: a time-to-digital converter (TDC), a control block and a variable length ring oscillator (VLRO). First, the TDCs sense the propagation length, that is, how deep a transition can travel through a logic path during one clock cycle, at different parts of the die. This logic depth is directly related to the clock period and inversely related to the propagation delay of the logic gates. Next, the control block compares the worst of the TDC readings with the objective value, or setpoint, and decides the new VLRO length (which has to be decoded into the VLRO control vectors) that minimizes the adaptation error. Finally, the VLRO is the circuit that synthesises the clock signal with the desired period, which has to be distributed to the whole die through the clock tree. Fig. 1 shows the scheme of the self-adaptive clock generation system.
The different components of the adaptive system are discussed in detail in the next subsections: Sec. 2.1 for the TDC, Sec. 2.2 for the control block and Sec. 2.3 for the VLRO.
2.1 Time-to-digital converter
The TDC used in the adaptive system is a circuit that outputs the number of stages of a delay line traversed by a positive transition within one clock period [7]. The TDC sensor schematic is depicted in Fig. 2. It is made of N TDC stages, each of which has a delay stage made of k logic gates and a register that captures the state of the delay stage output and drives the Tap output of the stage. There is also an offset delay, made of m logic gates. By varying m and k we can set the bias and the resolution of the sensor. Our TDC also has a finite state machine (FSM) that controls its operation: injecting a rising transition into the delay line, enabling the capture registers in the next clock cycle and waiting a given number of cycles until the process starts again. The FSM is needed to synchronize all the TDCs scattered across the die with each other and with the control block and the VLRO. For this reason all the FSMs share the same reset signal and produce the same output signals. Finally, the TDC is completed with a binary encoder that simply outputs the number of stages traversed by the injected transition during one clock cycle.
Local PVTA variations affect the delay of each TDC in the circuit. Each TDC output will therefore be determined by the local PVTA conditions, as well as by the common clock period. This behaviour can be observed in Fig. 3, which depicts the measured output of a TDC sensor implemented on a Xilinx Spartan(R)-3E (XC3S500E-4FG320C). In this TDC the offset delay is made of 16 look-up tables (LUTs), while each delay stage consists of 8 cascaded LUTs. The propagation delay is varied by modifying the FPGA core supply voltage: a higher supply voltage implies less propagation delay.
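The dependence of the TDC output on the clock period and on the local gate delay can be illustrated with a minimal behavioural sketch. It assumes an idealized delay line in which every logic gate has the same delay and register setup time is ignored; the function name tdc_output and its parameters are hypothetical and are not part of the measured design.

```python
def tdc_output(period_ns, gate_delay_ns, offset_gates=16, stage_gates=8, n_stages=15):
    """Idealized TDC: number of delay stages crossed by an edge in one clock period.

    Assumption: all gates share the same delay and setup time is neglected.
    """
    offset_delay = offset_gates * gate_delay_ns   # delay of the m offset gates
    stage_delay = stage_gates * gate_delay_ns     # delay of the k gates of one stage
    if period_ns < offset_delay:
        return 0                                  # the edge never leaves the offset delay
    crossed = int((period_ns - offset_delay) // stage_delay)
    return min(crossed, n_stages)                 # the encoder saturates at N stages


if __name__ == "__main__":
    # A larger gate delay (e.g. a lower supply voltage) lowers the reading
    # obtained for the same clock period.
    for gate_delay in (0.5, 1.0):
        print(gate_delay, [tdc_output(t, gate_delay) for t in (20, 40, 60, 80)])
```

In this model the reading grows in steps of one for every k gate delays added to the clock period, and the offset m shifts the period at which the first stage is crossed, which is the qualitative behaviour described for Fig. 3.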
Figure 2: Time-to-digital converter (TDC) schematic. The purpose of the TDC sensor is to determine the number of delay stages crossed by a rising edge during a single clock period.
Figure 3: Output of a TDC sensor implemented on a Xilinx Spartan-3E FPGA as a function of the clock period (ns), for different propagation delays induced by reducing the FPGA core voltage (Vsup = 1.2 V, 1.1 V, 1.0 V, 0.9 V and 0.8 V).
2.2 Control block
The control block is the circuit in charge of selecting the adequate VLRO length given the worst TDC output and the setpoint value. Its schematic is shown in Fig. 4. The control block calculates the adaptation error (Err in Fig. 4) as the difference between the setpoint value (SP) and the number of stages crossed (Crs∗). If there is more than one sensor, Crs∗ is the minimum of all the TDC outputs, which represents the worst case. The error value is then added to the previous VLRO length (LPrev) to obtain the new VLRO length (LVLRO). Once LVLRO has been calculated, it has to be decoded into the VLRO configuration vectors Pass and Select.
The operation of the control block is also governed by an instance of the same FSM used in the TDC sensor. It enables the registers that hold the setpoint and the previous VLRO length at the same time as the registers of the TDC sensors capture the delay line state. A given number of clock periods later, and during one clock cycle, the FSM enables the VLRO control registers through the Econfig signal. This waiting period gives the signal enough time to travel along the worst path of the adaptive system: from the outputs of the Tap registers in the TDCs (Fig. 2) to the VLRO control registers. In this way this path operates as a multi-cycle path.
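The arithmetic performed by the control block in each adaptation cycle can be summarized with the following sketch. The function name control_step and the clamping limits min_length and max_length are assumptions made for illustration; the source only states that the error is SP minus Crs∗ and that it is added to the previous length. The example values are the ones read from the logic analyser capture of Fig. 6, where the intermediate readings of Crs 1 to Crs 3 are assumed to have already reached 15.

```python
def control_step(setpoint, tdc_outputs, prev_length, min_length=1, max_length=32):
    """One adaptation cycle: returns (new VLRO length, adaptation error).

    min_length and max_length are hypothetical limits, not taken from the paper.
    """
    crs_worst = min(tdc_outputs)        # Crs*: the slowest sensor is the worst case
    err = setpoint - crs_worst          # Err = SP - Crs*
    new_length = prev_length + err      # LVLRO = LPrev + Err
    return max(min_length, min(max_length, new_length)), err


if __name__ == "__main__":
    # Setpoint jumps from 1 to 14 while the VLRO length is 4.
    length, err = control_step(14, [1, 2, 2, 4], prev_length=4)
    print(length, err)
    # Next cycle: the worst TDC reads 13, so one more stage is added.
    length, err = control_step(14, [13, 15, 15, 15], prev_length=length)
    print(length, err)
```

A positive error means that the transition did not travel deep enough, so the oscillator is lengthened and the clock slows down; a negative error shortens it.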
Figure 4: Adaptive system control block schematic.
2.3 Variable length ring oscillator
A VLRO has been chosen as the clock generator because it naturally adapts its oscillation frequency to the PVTA variations suffered by its logic gates. Therefore, the VLRO will automatically adapt the clock period to spatially homogeneous delay variations, that is, variations that affect the whole die in the same way, statically or dynamically.
Figure 5: System internal signals, during a whole setpoint pattern, under the induced static spatial variations of the propagation delay. These signals have been measured with a logic analyser.
The VLRO used in the adaptive system is a fully digital glitch-free VLRO [8]. It is important to stress that the registers that hold the Pass and Select control vectors are triggered directly by the clock output of the VLRO, before it is injected into the global clock distribution tree. This results in two shifted clocks with the same period: the VLRO local clock, used only by the VLRO registers, and the global clock, used by the rest of the adaptive system and the other registers in the die. This clock diversity is addressed by using the Econfig enable signal for the VLRO control registers.
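The natural adaptation of the VLRO to homogeneous variations can be illustrated with an idealized model. The sketch below assumes that the oscillator period is twice the loop delay (2 x length x gate delay) and that the TDC and the VLRO are built from gates with the same delay; the real VLRO of [8] includes additional logic and routing, so the names and numbers here are illustrative assumptions only. Delays are expressed in integer picoseconds to keep the arithmetic exact.

```python
def vlro_period_ps(length, gate_delay_ps):
    """Idealized ring oscillator: an edge goes around the loop twice per period."""
    return 2 * length * gate_delay_ps


def tdc_output(period_ps, gate_delay_ps, offset_gates=16, stage_gates=8, n_stages=15):
    """Idealized TDC reading (same assumptions as for the oscillator)."""
    offset_delay = offset_gates * gate_delay_ps
    if period_ps < offset_delay:
        return 0
    return min((period_ps - offset_delay) // (stage_gates * gate_delay_ps), n_stages)


if __name__ == "__main__":
    length = 20
    # Homogeneous variation: the same gate delay applies to the VLRO and the TDC.
    for delay_ps in (250, 300, 450):
        period = vlro_period_ps(length, delay_ps)
        print(delay_ps, period, tdc_output(period, delay_ps))
    # Heterogeneous variation: the TDC gates are slower than the VLRO gates.
    print(tdc_output(vlro_period_ps(length, 300), 400))
```

Under these assumptions a homogeneous change of the gate delay stretches the period and the TDC delay line by the same factor, so the reading does not change and no length update is needed. Only when the sensor is locally slower than the oscillator does the reading drop, and the control loop then has to lengthen the VLRO.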
3 FPGA implementation
The system is implemented on a Xilinx Spartan-6(R) FPGA (XC6LX16-CS324). The adaptive system uses 4 TDC sensors. The setpoint and the number of stages crossed in the TDC are 4-bit signals. This leads to a TDC delay line of 15 stages plus the offset stage. To study the adaptation of the system to static spatial variations, the lengths of the offset and delay stages are different for every TDC in order to emulate the propagation delay variability across the chip. To analyse the response of the generated clock to different delay scenarios, the propagation delay of the TDC stages is varied in the same way in all the TDCs. In this way we are able to observe how the clock period adapts to the delay suffered by the sensors.
The VLRO length is a 5-bit signal, leading to a VLRO with 32 stages, each made of one LUT. The FSM used in the TDCs and the control block waits for 5 clock periods after the Capture signal is enabled before setting the Config signal high.
The VLRO block instances have been placed manually to ensure proper performance of the oscillator, while we have let the software tools place and route the TDC blocks, as well as the rest of the system blocks.
4 Experimental measurements
The measurements of the adaptive system focus on the system operation, on the analysis of the clock signal and its period, and on the ability of the system to adapt the clock period to a dynamically changing objective. To do so we change the setpoint value every 10 µs, starting at 14 and decreasing it by one until it reaches 1. At this point the setpoint is changed to 14, to 1, again to 14 and finally back to 1. After this the setpoint is increased by one until it reaches 14, and the cycle starts again. We deliberately avoid the setpoint values 15 and 0, so that these unused setpoints act as boundaries to the VLRO length set by the adaptive system.
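The setpoint pattern described above can be written down as a short generator. This is only a sketch of the sequence as we read it from the description: the function names are hypothetical, and whether the closing value 14 is held for one or two intervals when the cycle restarts is not stated, so the sketch simply repeats the list.

```python
STEP_US = 10  # the setpoint changes every 10 microseconds


def setpoint_pattern(high=14, low=1):
    """One cycle of the test pattern: ramp down, four abrupt jumps, ramp up."""
    ramp_down = list(range(high, low - 1, -1))   # 14, 13, ..., 1
    jumps = [high, low, high, low]               # 14, 1, 14, 1
    ramp_up = list(range(low + 1, high + 1))     # 2, 3, ..., 14
    return ramp_down + jumps + ramp_up


def setpoint_at(time_us, pattern=None):
    """Setpoint applied at a given time, assuming the list repeats unchanged."""
    pattern = pattern or setpoint_pattern()
    return pattern[(int(time_us) // STEP_US) % len(pattern)]


if __name__ == "__main__":
    print(setpoint_pattern())
    print(setpoint_at(0), setpoint_at(135), setpoint_at(145))
```

The ramps exercise small setpoint changes, while the jumps between 1 and 14 exercise the largest corrections that the control block has to make.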
Fig. 5 depicts the system signals measured with a logic analyser, showing the adaptation of the system to the induced static spatial delay variation. This variation is emulated by design, by setting different offset and delay stage lengths for each TDC: 16 LUTs and 8 LUTs for TDC0; 14 LUTs and 7 LUTs for TDC1; 12 LUTs and 6 LUTs for TDC2; and 10 LUTs and 5 LUTs for TDC3.
Fig. 5 shows the outputs of the different TDCs (Crs i), TDC0 being the slowest one. For this reason the Crs∗ value corresponds to Crs 0, and the adaptive system changes the VLRO length so that Crs∗ reaches the setpoint value. One can also observe in Fig. 5 that, in some cases, the same setpoint condition is satisfied by two different VLRO lengths (e.g. when the setpoint is equal to 7).
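Why TDC0 always provides the worst reading can be checked with an idealized model of the four sensor configurations listed above. The sketch assumes a single LUT delay shared by all sensors (300 ps is an arbitrary illustrative value, not a measurement) and ignores routing delays; the function names are hypothetical.

```python
# (offset LUTs, stage LUTs) for each sensor, as listed in the text
TDC_CONFIGS = {"TDC0": (16, 8), "TDC1": (14, 7), "TDC2": (12, 6), "TDC3": (10, 5)}


def tdc_output(period_ps, lut_delay_ps, offset_luts, stage_luts, n_stages=15):
    """Idealized TDC reading for one sensor configuration."""
    offset_delay = offset_luts * lut_delay_ps
    if period_ps < offset_delay:
        return 0
    return min((period_ps - offset_delay) // (stage_luts * lut_delay_ps), n_stages)


def sense_all(period_ps, lut_delay_ps=300):
    """Readings of the four sensors and the worst case (Crs*)."""
    readings = {
        name: tdc_output(period_ps, lut_delay_ps, m, k)
        for name, (m, k) in TDC_CONFIGS.items()
    }
    return readings, min(readings.values())


if __name__ == "__main__":
    for period_ps in (6000, 12000, 24000, 40000):
        print(period_ps, sense_all(period_ps))
```

Since TDC0 has both the longest offset and the longest stages, its reading is never above that of the other sensors in this model, so it determines Crs∗ for every clock period.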
Fig. 6 depicts the system signals during a setpoint change from 1 to 14, showing how the VLRO length is varied until the worst TDC output satisfies the setpoint condition.
To analyse how the clock period varies as a function of the propagation length sensed by the TDCs and of the setpoint value, the setpoint value is varied dynamically following the pattern described above. Three propagation delay scenarios are emulated by setting different lengths for the TDC offset and delay stages: the longer the stages, the larger the propagation delay.
Figure 6: System internal signals under the induced static spatial variations of the propagation delay, showing in detail how the system adapts the VLRO length, and hence the clock period, to an abrupt setpoint change. Logic analyser capture from 139.036 µs to 141.552 µs (samples 69518 to 70776); successive values of each signal:
setpoint: 1, 14
Crs*: 1, 13, 14
Crs_0: 1, 13, 14
Crs_1: 2, 15
Crs_2: 2, 15
Crs_3: 4, 15
Length VLRO: 4, 17, 18
Fig. 7 depicts the clock signal period for the three described configurations together with the setpoint value. The figure clearly shows the correlation between the clock period and the setpoint value, which is the expected behaviour. The measurements depicted in Fig. 7 also confirm another expected result: for the same setpoint value, the clock with the smallest period is produced by the configuration that induces the least propagation delay in the TDCs (stage = 8 LUTs and offset = 16 LUTs), while the largest clock period is produced by the slowest TDC configuration (stage = 10 LUTs and offset = 20 LUTs). The remaining configuration (stage = 9 LUTs and offset = 18 LUTs) always produces a clock signal with a period between those of the other two configurations.
Fig. 7 also shows a metastability of the clock signal period. For the fastest sensor configuration (stage = 8 LUTs and offset = 16 LUTs), when the setpoint is equal to 3 the clock period alternates between two values. This response occurs because the system cannot generate a clock signal that exactly satisfies the setpoint value, so the worst TDC output oscillates between two values, one above and one below the setpoint. This metastability is related to the relative difference between the propagation delay of a TDC stage and the change in the clock period obtained when the VLRO length is modified by one unit. It can be suppressed by implementing a control block with a no-adaptation margin, or by designing the control block so that it is capable of identifying this condition.
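The oscillation described above, and one possible form of no-adaptation margin, can be reproduced with a toy model of the loop. All numbers are illustrative assumptions chosen so that one VLRO length step (3 time units) is larger than one TDC stage (2 time units), which makes the reading 3 unreachable. The one-sided margin, which holds the length only when Crs∗ is at or slightly above the setpoint, is our own design choice for the sketch and is not taken from the paper.

```python
def tdc_output(period, offset_delay=4, stage_delay=2, n_stages=15):
    """Toy TDC reading (arbitrary integer time units)."""
    if period < offset_delay:
        return 0
    return min((period - offset_delay) // stage_delay, n_stages)


def vlro_period(length, step=3):
    """Toy VLRO: each extra stage adds `step` time units to the period."""
    return step * length


def run_loop(setpoint, start_length=1, cycles=8, margin=0, min_length=1, max_length=32):
    """Return the (length, Crs*) pair seen in each adaptation cycle."""
    length, history = start_length, []
    for _ in range(cycles):
        crs = tdc_output(vlro_period(length))
        history.append((length, crs))
        err = setpoint - crs
        # Hold the length when Crs* is in [setpoint, setpoint + margin].
        if err > 0 or err < -margin:
            length = max(min_length, min(max_length, length + err))
    return history


if __name__ == "__main__":
    print(run_loop(3))            # no margin: the length keeps alternating
    print(run_loop(3, margin=1))  # with margin: the length settles
```

Without the margin the loop alternates between the two lengths whose readings lie just below and just above the setpoint; with a margin of one stage it settles on the longer, and therefore slower and safer, of the two.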
Figure 7: Setpoint value and clock period (ns) over time (µs) for the three induced propagation delay configurations: stage = 8 LUTs and offset = 16 LUTs; stage = 9 LUTs and offset = 18 LUTs; stage = 10 LUTs and offset = 20 LUTs.
To study the generated clock period more quantitatively, the most frequent period values are plotted in Fig. 8 for every delay configuration. The data show that, for some setpoints, the system produces two clock periods. This can be explained by the period metastability and by the fulfilment of the setpoint condition with two different VLRO lengths, both mentioned previously. For the three delay configurations the period follows a linear relation as a function of the setpoint value. The period data and the derived linear equations confirm the behaviour observed in Fig. 7: sensors experiencing more propagation delay produce a larger clock period than sensors with less propagation delay.
Figure 8: Principal components of the clock period (ns) for the three delay configurations (stage = 8 LUTs and offset = 16 LUTs; stage = 9 LUTs and offset = 18 LUTs; stage = 10 LUTs and offset = 20 LUTs) as a function of the setpoint value. Linear fits shown in the figure: y = 2.678x + 7.874, y = 2.429x + 6.794 and y = 2.811x + 9.224.
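The linear relation makes it straightforward to use the setpoint as a frequency selection input. The sketch below evaluates the three fits reported in Fig. 8. The assignment of each equation to a delay configuration is an assumption: the fits are ordered by slope and intercept so that slower sensor configurations give larger periods, as the text describes. The function names are hypothetical.

```python
import math

# (slope in ns per setpoint unit, intercept in ns); the mapping to the
# configurations is assumed, not stated explicitly in the figure text.
FITS_NS = {
    "stage=8,offset=16": (2.429, 6.794),
    "stage=9,offset=18": (2.678, 7.874),
    "stage=10,offset=20": (2.811, 9.224),
}


def clock_period_ns(setpoint, fit):
    slope, intercept = fit
    return slope * setpoint + intercept


def clock_frequency_mhz(setpoint, fit):
    return 1000.0 / clock_period_ns(setpoint, fit)


def setpoint_for_max_frequency(max_mhz, fit, low=1, high=14):
    """Smallest usable setpoint whose predicted frequency does not exceed max_mhz."""
    slope, intercept = fit
    target_period_ns = 1000.0 / max_mhz
    setpoint = math.ceil((target_period_ns - intercept) / slope)
    return max(low, min(high, setpoint))


if __name__ == "__main__":
    fit = FITS_NS["stage=8,offset=16"]
    for sp in (1, 7, 14):
        print(sp, clock_period_ns(sp, fit), clock_frequency_mhz(sp, fit))
    print(setpoint_for_max_frequency(50.0, fit))
```

The prediction is only as good as the fit: for setpoints where the system produces two periods, the measured clock will alternate around the predicted value.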
5 Conclusions
In this article we have proposed an all-digital self-adaptive clock generation system that is capable of adapting the clock period to spatially inhomogeneous PVTA variations. The self-adaptive system is built using only digital blocks, including the clock synthesis circuit, which is a previously proposed VLRO. The all-digital nature of the adaptive system makes it easily portable between different technologies. The analysis of the internal signals shows that the system autonomously adapts the length of the oscillator to satisfy the setpoint condition.
Since the setpoint can be set by an external signal and the generated clock period shows a linear response to the setpoint value, the system can be used to provide dynamic frequency scaling capabilities, as shown by the measurements of the generated clock. The setpoint input can also be used to integrate the proposed adaptive system within a datapath with timing error detection capabilities. The error detection system could determine, at given intervals, the minimum setpoint value that operates the datapath without timing errors, and the system would maintain it until the next error check. This integration would yield an IC with a clocking system that determines its maximum operating frequency and modifies it to accommodate the delay variations induced by PVTA.
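The periodic search for the minimum error-free setpoint suggested above could look like the following sketch. The callback has_timing_error is hypothetical and stands for whatever error detection mechanism the datapath provides; the sketch also assumes that, once a setpoint produces errors, every lower setpoint does too.

```python
def find_min_setpoint(has_timing_error, low=1, high=14):
    """Lowest setpoint in [low, high] that runs without timing errors, or None.

    has_timing_error(setpoint) is a hypothetical callback that applies the
    setpoint, lets the clock settle and reports whether an error was detected.
    """
    best = None
    # Start from the slowest clock and speed up until the first error appears.
    for setpoint in range(high, low - 1, -1):
        if has_timing_error(setpoint):
            break
        best = setpoint
    return best


if __name__ == "__main__":
    # Toy datapath that fails whenever the setpoint is below 6.
    print(find_min_setpoint(lambda sp: sp < 6))
```

Scanning from the slowest clock keeps the datapath on the safe side during the search; the adaptive loop then holds the chosen setpoint until the next check.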
Acknowledgment
This work has been partially funded by the Spanish
MINECO and ERDF (TEC2013-45638-C3-R).
References
[1] K. Bowman, S. Duvall, and J. Meindl, “Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration,” IEEE Journal of Solid-State Circuits, vol. 37, no. 2, pp. 183–190, 2002.
[2] K. A. Bowman, S. G. Duvall, and J. D. Meindl, “Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration,” IEEE Journal of Solid-State Circuits, vol. 37, no. 2, pp. 183–190, 2002.
[3] X. Wang, A. Brown, N. Idris, S. Markov, G. Roy, and A. Asenov, “Statistical threshold-voltage variability in scaled decananometer bulk HKMG MOSFETs: A full-scale 3-D simulation scaling study,” IEEE Transactions on Electron Devices, vol. 58, no. 8, pp. 2293–2301, 2011.
[4] K. Bowman, J. Tschanz, S. Lu, P. Aseron, M. Khellah, A. Raychowdhury, B. Geuskens, C. Tokunaga, C. Wilkerson, T. Karnik, and V. De, “A 45 nm resilient microprocessor core for dynamic variation tolerance,” IEEE Journal of Solid-State Circuits, vol. 46, no. 1, pp. 194–208, Jan. 2011.
[5] C. R. Lefurgy, A. J. Drake, M. S. Floyd, M. S. Allen-Ware, B. Brock, J. A. Tierno, and J. B. Carter, “Active management of timing guardband to save energy in POWER7,” in Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, ser. MICRO-44. New York, NY, USA: ACM, 2011, pp. 1–11.
[6] J. Pérez-Puigdemont, A. Calomarde, and F. Moll, “Variation tolerant self-adaptive clock generation architecture based on a ring oscillator,” in 2012 IEEE International SOC Conference (SOCC), 2012, pp. 387–392.
[7] A. Drake, R. Senger, H. Singh, G. Carpenter, and N. James, “Dynamic measurement of critical-path timing,” in IEEE International Conference on Integrated Circuit Design and Technology and Tutorial (ICICDT 2008), June 2008, pp. 249–252.
[8] J. Pérez-Puigdemont, F. Moll, and A. Calomarde, “All-digital simple clock synthesis through a glitch-free variable-length ring oscillator,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 61, no. 2, pp. 90–94, Feb. 2014.
