Formal Verification and In-Situ Test of Analog and Mixed-Signal Circuits by Yin, Leyi 1983-
FORMAL VERIFICATION AND IN-SITU TEST




Submitted to the Office of Graduate Studies of
Texas A&M University
in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Approved by:
Chair of Committee, Peng Li
Committee Members, Gwan Choi
Jose Silva-Martinez
Duncan M. Walker
Head of Department, Costas N. Georghiades
December 2012
Major Subject: Computer Engineering
Copyright 2012 LEYI YIN
ABSTRACT
As CMOS technologies continuously scale down, designing robust analog and
mixed-signal (AMS) circuits becomes increasingly difficult. Consequently, there are
pressing needs for AMS design checking techniques, more specifically design verifi-
cation and design for testability (DfT). The purpose of verification is to ensure that
the performance of an AMS design meets its specification under process, voltage and
temperature (PVT) variations and different working conditions, while DfT techniques
aim at embedding testability into the design, by adding auxiliary circuitries for test-
ing purpose. This dissertation focuses on improving the robustness of AMS designs
in highly scaled technologies, by developing novel formal verification and in-situ test
techniques.
Compared with conventional AMS verification that relies more on heuristically
chosen simulations, formal verification provides a mathematically rigorous way of
checking the target design property. A formal verification framework is proposed
that incorporates nonlinear SMT solving techniques and simulation exploration to ef-
ficiently verify the dynamic properties of AMS designs. A powerful Bayesian inference
based technique is applied to dynamically trade off between the costs of simulation
and nonlinear SMT. The feasibility and efficacy of the proposed methodology are
demonstrated on the verification of lock time specification of a charge-pump PLL.
The powerful and low-cost digital processing capabilities of today’s CMOS tech-
nologies are enabling many new in-situ test schemes in a mixed-signal environment.
First, a novel two-level structure of GRO-PVDL is proposed for on-chip jitter test-
ing of high-speed high-resolution applications with a gated ring oscillator (GRO) at
the first level to provide a coarse measurement and a Vernier-style structure at the
second level to further measure the residue from the first level with a fine resolution.
With the feature of quantization noise shaping, an effective resolution of 0.8ps can
be achieved using a 90nm CMOS technology. Second, the reconfigurability of recent
all-digital PLL designs is exploited to provide in-situ output jitter test and diagnosis
abilities under multiple parametric variations of key analog building blocks. As an
extension, an in-situ test scheme is proposed to provide online testing for all-digital
PLL based polar transmitters.
To my wife, Yongfeng, and my parents
ACKNOWLEDGMENTS
This material is based upon work supported by the National Science Foundation
under Grant No. 1117660, and the Semiconductor Research Corporation and Texas
Analog Center of Excellence under Contract 2008-HJ-1836. The support of the C2S2
Focus Center, one of six research centers funded under the Focus Center Research
Program (FCRP), a Semiconductor Research Corporation subsidiary, is also gratefully
acknowledged.
My advisor Dr. Peng Li has been continuously tutoring and guiding me with
my research work. The first thing I learned from him is the way of thinking which
is both creative and rigorous. Simple as it sounds, it is never easy to do. Thanks
to him, I have done a list of research works, and most importantly I have built up
my confidence in research gradually in the last four years. As for detailed research
projects, I also benefited a lot from his insightful ideas and suggestions. Therefore, I
would like to express my gratitude and respect to him, not just as a graduating PhD
student to his advisor but also as a growing young man to his tutor in life.
Though students come and go, I always feel our research group is a big, warm
family, whose members help each other in work as well as in life. They made my
PhD life no longer a lonely stressful march, but an exciting joyful group trip. So I
would like to thank all the colleagues in Dr. Li’s research group, especially Yue Deng
and Yongtae Kim, who have directly cooperated with me on research.
Also, I would like to thank my wife, Yongfeng, who has sacrificed so much and
supported me so much. Her optimistic life attitude has influenced me to be courageous
when dealing with challenges in work or life.
Last but not least, thanks to my parents who made me what I am now. Now when




I INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . 1
A. Formal verification of AMS circuits . . . . . . . . . . . . . 4
B. High-resolution on-chip jitter measurement . . . . . . . . . 5
C. In-situ test of all digital PLLs and polar transmitters . . . 6
II SMT-BASED FORMAL VERIFICATION OF AMS CIRCUITS 8
A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 8
B. Hybrid systems . . . . . . . . . . . . . . . . . . . . . . . . 12
C. NL-SMT based verification . . . . . . . . . . . . . . . . . . 14
1. Formulation of NL-SMT constraints . . . . . . . . . . 15
a. Initial space constraints . . . . . . . . . . . . . . 15
b. Hybrid dynamics constraints . . . . . . . . . . . . 16
c. Other constraints . . . . . . . . . . . . . . . . . . 19
2. Basic NL-SMT approach . . . . . . . . . . . . . . . . 19
a. State-space discretization . . . . . . . . . . . . . . 19
b. Box mergence . . . . . . . . . . . . . . . . . . . . 21
c. Basic flow of invoking NL-SMT solver . . . . . . . 22
D. Simulation-assisted NL-SMT . . . . . . . . . . . . . . . . . 25
1. Simulation-assisted NL-SMT flow . . . . . . . . . . . . 26
2. Stop condition . . . . . . . . . . . . . . . . . . . . . . 28
3. Statistical framework . . . . . . . . . . . . . . . . . . 32
4. Bayesian inference . . . . . . . . . . . . . . . . . . . . 34
a. Principle of Bayesian inference . . . . . . . . . . . 34
b. Bayesian inference for θ . . . . . . . . . . . . . . 35
5. Computation of E[N_new^(k) | H] . . . . . . . . . . . . . . . 37
E. PLL lock time verification . . . . . . . . . . . . . . . . . . 40
1. Charge pump PLL . . . . . . . . . . . . . . . . . . . . 41
2. SMT constraints for lock time . . . . . . . . . . . . . 43
3. Fast forwarding . . . . . . . . . . . . . . . . . . . . . 45
F. Experimental results . . . . . . . . . . . . . . . . . . . . . 48
G. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
H. Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
CHAPTER Page
III HIGH-RESOLUTION ON-CHIP JITTER MEASUREMENT . . 59
A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 59
B. Proposed structure . . . . . . . . . . . . . . . . . . . . . . 62
1. Vernier delay line . . . . . . . . . . . . . . . . . . . . 63
2. Gated ring oscillator . . . . . . . . . . . . . . . . . . . 64
3. The proposed GRO-PVDL structure . . . . . . . . . . 68
C. Circuit implementation . . . . . . . . . . . . . . . . . . . . 71
1. GRO . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2. Counters . . . . . . . . . . . . . . . . . . . . . . . . . 76
3. PVDL . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4. DFFs . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5. GRO-PVDL . . . . . . . . . . . . . . . . . . . . . . . 82
6. DSP unit . . . . . . . . . . . . . . . . . . . . . . . . . 86
a. Coarse code generator . . . . . . . . . . . . . . . 88
b. VDL decoder . . . . . . . . . . . . . . . . . . . . 89
c. Fine code generator . . . . . . . . . . . . . . . . . 90
D. Experimental results . . . . . . . . . . . . . . . . . . . . . 93
1. Delay mismatch analysis . . . . . . . . . . . . . . . . . 97
2. Specification comparison . . . . . . . . . . . . . . . . 101
E. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
IV IN-SITU TEST OF ALL DIGITAL PLLS . . . . . . . . . . . . . 104
A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 104
B. Principle of jitter estimation and diagnosis . . . . . . . . . 106
1. Noise model . . . . . . . . . . . . . . . . . . . . . . . 106
2. Transfer function analysis . . . . . . . . . . . . . . . . 108
C. BIST scheme . . . . . . . . . . . . . . . . . . . . . . . . . 113
1. Reconfigurable loop filters . . . . . . . . . . . . . . . . 113
2. TDC calibrator . . . . . . . . . . . . . . . . . . . . . . 115
3. Hardware overhead . . . . . . . . . . . . . . . . . . . . 117
D. Simulation results . . . . . . . . . . . . . . . . . . . . . . . 118
1. Setup of simulation environment . . . . . . . . . . . . 118
2. Monte Carlo analysis . . . . . . . . . . . . . . . . . . 119
E. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
V IN-SITU TEST AND CALIBRATION OF
ALL DIGITAL POLAR TRANSMITTERS . . . . . . . . . . . . 124
A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 124
B. ADPLL . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
1. Architecture . . . . . . . . . . . . . . . . . . . . . . . 127
2. Two-point modulation . . . . . . . . . . . . . . . . . . 127
3. Requirements of WCDMA . . . . . . . . . . . . . . . 129
C. RF BIST for EVM . . . . . . . . . . . . . . . . . . . . . . 129
1. z-domain model . . . . . . . . . . . . . . . . . . . . . 130
2. Noise analysis . . . . . . . . . . . . . . . . . . . . . . 132
3. BIST principle . . . . . . . . . . . . . . . . . . . . . . 134
4. BIST scheme . . . . . . . . . . . . . . . . . . . . . . . 137
5. The optimization of the branch filter . . . . . . . . . . 138
D. DCO gain calibration . . . . . . . . . . . . . . . . . . . . . 139
1. DCO gain mismatch . . . . . . . . . . . . . . . . . . . 140
2. DCO gain calibration . . . . . . . . . . . . . . . . . . 141
E. Simulation results . . . . . . . . . . . . . . . . . . . . . . . 143
1. Simulation platform . . . . . . . . . . . . . . . . . . . 143
2. Simulation results for DCO gain calibration . . . . . . 144
3. Simulation results for EVM BIST . . . . . . . . . . . 145
F. Implementation issues . . . . . . . . . . . . . . . . . . . . 148
G. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
VI CONCLUSIONS AND FUTURE DIRECTIONS . . . . . . . . . 150
A. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 150
B. Future directions . . . . . . . . . . . . . . . . . . . . . . . 150




I Valid locations of state variables . . . . . . . . . . . . . . . . . . . . 43
II Effective resolutions for different kmis. . . . . . . . . . . . . . . . . . 101
III Comparison of Specifications. . . . . . . . . . . . . . . . . . . . . . . 102
IV Transfer functions from noise sources to output phase noise and
digital signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
V An example of configuration setup. (LF1 is bypassed for Config. 3) . 115
VI Hardware overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
VII Transfer functions from noise sources to output phase noise and
digital signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
VIII The comparison of the noise contributions (low frequency range:
100Hz-100kHz; high frequency range: 100kHz-13MHz). . . . . . . . . 134
IX The systematic errors and the EVM sensitivities to the digital
signatures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
X EVM degradation due to DCO gain mismatch (PM path only, no
random noise). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141




1 Power supply voltage of future CMOS technologies predicted by ITRS. 2
2 Product stage vs. checking techniques. . . . . . . . . . . . . . . . . . 2
3 Organization of the subtopics. . . . . . . . . . . . . . . . . . . . . . . 3
4 1-bit ∆-Σ ADC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
5 Hybrid automaton of 1-bit ∆-Σ ADC. . . . . . . . . . . . . . . . . 13
6 Transient trajectories of 1-bit ∆-Σ ADC. . . . . . . . . . . . . . . 14
7 Step-by-step evolution of hybrid automaton. . . . . . . . . . . . . . 16
8 The split of NL-SMT problem. . . . . . . . . . . . . . . . . . . . . . 20
9 Box mergence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
10 Search of reachable boxes. . . . . . . . . . . . . . . . . . . . . . . . 26
11 Flow of simulation-assisted approach. . . . . . . . . . . . . . . . . . 27
12 Simulation exploration and NL-SMT check. . . . . . . . . . . . . . . 27
13 Statistical view of simulation exploration. . . . . . . . . . . . . . . . 33
14 E[N_new^(k) | H] vs. k (mB∗=100, m=1,000, α0=0.001): (a) n=100 (b)
n=100,000. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
15 Block diagram of charge pump based PLL. . . . . . . . . . . . . . . 41
16 Timing diagram of PFD. . . . . . . . . . . . . . . . . . . . . . . . . 42
17 PFD hybrid automaton. . . . . . . . . . . . . . . . . . . . . . . . . 42
18 Timing diagram of fast forwarding. . . . . . . . . . . . . . . . . . . 46
FIGURE Page
19 Flow of fast forwarding. . . . . . . . . . . . . . . . . . . . . . . . . . 47
20 From flow to constraints. . . . . . . . . . . . . . . . . . . . . . . . . 48
21 Reachable space of φd computed by the proposed reachability
analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
22 Runtime vs sampling density. . . . . . . . . . . . . . . . . . . . . . . 50
23 The numbers of samples of simulations for dynamic and static
Stop conditions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
24 The numbers of samples of simulations for dynamic Stop condition
with different k. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
25 Runtime vs initial space (with simulation assistance and Bayesian
inference). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
26 Block diagram of GRO-PVDL structure. . . . . . . . . . . . . . . . 62
27 VDL: (a) structure, (b) timing diagram.
*START(i): START delayed by i·τ1, STOP(i): STOP delayed by
i·τ2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
28 Equivalence of VDL: (a) structure, (b) timing diagram. . . . . . . . 64
29 GRO: (a) structure, (b) timing diagram. . . . . . . . . . . . . . . . 66
30 Equivalent timing diagram of GRO. . . . . . . . . . . . . . . . . . . 67
31 Frequency spectra without/with quantization noise shaping. . . . 68
32 The proposed GRO-PVDL structure. . . . . . . . . . . . . . . . . . 69
33 The timing diagram of GRO-PVDL. . . . . . . . . . . . . . . . . . . 69
34 GRO phase shift: (a) gated inverter, (b) simplified model, (c)
waveform. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
35 Simulated phase shift vs ϕdisab of a three-stage inverter-based
GRO. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
36 Gating phase shift of three-stage inverter-based GRO. . . . . . . . . 73
37 Gated delay cell with CCI: (a) symbol, (b) schematic. . . . . . . . . 74
38 Simulated phase shift vs ϕdisab of a three-stage CCI-cell-based
GRO. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
39 Simulated phase shift vs EN /nEN rising/falling time of a three-
stage inverter-based GRO. . . . . . . . . . . . . . . . . . . . . . . . 75
40 Asynchronous counter. . . . . . . . . . . . . . . . . . . . . . . . . . 76
41 Phase tracking based counting structure (single-ended version). . . . 77
42 Phase transition of three-stage GRO. . . . . . . . . . . . . . . . . . 78
43 Over/under-counting due to counter/decoder input mismatch. . . . 79
44 GRO counting structure with latch sharing. . . . . . . . . . . . . . . 79
45 Glitch illustration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
46 Differential delay cell: (a) symbol and gate-level schematic (b)
schematic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
47 (a) Gate-level schematic of differential DFFs (b) schematic of dif-
ferential latch. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
48 Schematic of PVDL and DFFs. . . . . . . . . . . . . . . . . . . . . 83
49 Uneven stage delays of the GRO, and the effective timing diagram. . 84
50 Timing diagram of the GRO-PVDL with dV. . . . . . . . . . . . . . 85
51 Block diagram of the DSP unit. . . . . . . . . . . . . . . . . . . . . 87
52 Implementation of coarse code generator. . . . . . . . . . . . . . . . 88
53 VDL decoder: (a) implementation (b) bubble suppression. . . . . . 91
54 Implementation of fine code calibration. . . . . . . . . . . . . . . . . 93
55 Layout of the entire GRO-PVDL structure. . . . . . . . . . . . . . . 94
56 GRO-PVDL measurement for 0.5pspp sin. input: (a) PSD (b)
transient view (after low-pass filtering). . . . . . . . . . . . . . . . . 96
57 PSD of the measurement for 0.5pspp sin. input (transistor noise
and power supply noise are NOT included in the simulation). . . . . 97
58 The histograms of (a) input jitter (b) measurement result (after
low-pass filtering). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
59 Delay mismatch of the GRO and the PVDL. . . . . . . . . . . . . . 99
60 Simulated PSD with different delay mismatches (kmis= 3%, 5%,
10%). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
61 All-digital PLL block diagram including BIST. . . . . . . . . . . . . 105
62 The phase noise spectrum of a typical oscillator. . . . . . . . . . . . 108
63 s-domain model of ADPLL including noise sources. . . . . . . . . . 109
64 BIST block diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . 114
65 The composition of the signature power spectral density. . . . . . . 116
66 TDC resolution calibration. . . . . . . . . . . . . . . . . . . . . . . . 117
67 BIST estimation vs. direct measurement. . . . . . . . . . . . . . . 120
68 Estimated jitter compared with measured jitter . . . . . . . . . . . . 121
69 Relative estimation error. Error is averaged in each 1ps interval
of the output jitter. . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
70 The proposed BIST scheme vs. the BIST scheme in [1]. . . . . . . . 122
71 Average error of the diagnosis of the four main noise sources. . . . . 122
72 Diagram of an all-digital polar RF modulator. . . . . . . . . . . . . 125
73 The ADPLL architecture with two-point modulation. . . . . . . . . 128
74 The frequency deviation of a typical WCDMA modulation in one
WCDMA slot (667µs) with the bandwidth reduction technique. . . . 130
75 The z-domain model of the ADPLL including noise sources. . . . . . 131
76 The composition of the noise. . . . . . . . . . . . . . . . . . . . . . 133
77 The relationship between the EVM and the phase noise. . . . . . . . 135
78 BIST block diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . 138
79 DCO core with LC tank and biasing network in [12]. . . . . . . . . . 140
80 The block diagram of the calibration scheme. . . . . . . . . . . . . . 142
81 The relationship between ∆FCW/∆NDTW and ∆DTW. . . . . . . 143
82 The event-driven simulation platform for the ADPLL. . . . . . . . . 144
83 The constellation graphs for the worst mismatch case. . . . . . . . . 145
84 The output EVM versus the sample step. . . . . . . . . . . . . . . . 146
85 Estimated EVM vs. simulated EVM. . . . . . . . . . . . . . . . . . . 147




As CMOS technologies continuously scale down, designing robust analog and mixed-
signal (AMS) circuits becomes increasingly difficult. Consequently, there are pressing
needs for AMS design checking techniques.
Although CMOS technology scaling is beneficial to achieving higher speed and
lower power for digital circuits, it is decreasing the reliability of AMS circuits. Because
the uncertainties of circuit electrical characteristics are increasing along with the
technology scaling, circuit performances are more likely to be statistically distributed
than to be deterministic values [2]. Also, reduced power supply voltage shrinks the
dynamic range of AMS circuits and makes circuit design more challenging. Fig. 1
illustrates the future power supply voltage predicted by the International Technology
Roadmap for Semiconductors (ITRS) [3]. Moreover, the ongoing design trends, such
as the proliferation of consumer electronic systems, move towards integration of more
functionalities on the same chip, requiring AMS modules to work in multiple modes,
and have more complex control interface and lower power consumption [4].
Despite the above design challenges, most of today’s AMS circuits are still fully
custom designed by experienced designers. Design checking techniques have therefore
become a pressing need for helping AMS designers develop robust AMS circuits. Existing
design checking techniques can be divided into two categories, design verification
and design for testability (DfT), as illustrated in Fig. 2. The purpose of verification
is to ensure that the performance of an AMS design meets its specification under pro-
cess, voltage and temperature (PVT) variations and different working conditions [5].
On the other hand, DfT techniques aim at embedding testability into the design by
adding auxiliary circuitry for testing purposes [6]. Instead of directly checking AMS
Fig. 1. Power supply voltage of future CMOS technologies predicted by ITRS.
designs, DfT provides the option of checking a chip after it is manufactured, and
thus faults can be discovered in product test or in field. More specifically, those DfT
techniques for in-field test are also known as in-situ test.
[Figure 2: product stages (design, fabrication, product test, in-field test) and the checking techniques that apply to them (verification, design for testability, in-situ test).]
Fig. 2. Product stage vs. checking techniques.
This dissertation focuses on improving the robustness of AMS designs in highly
scaled technologies, by developing novel formal verification and in-situ test techniques.
Compared with conventional AMS verification that relies more on heuristically chosen
simulations, formal verification provides a mathematically rigorous way of checking the
target performance [7]. Moreover, formal verification techniques are better suited to
implementation as automatic verification tools. On the other hand, though the
idea of in-situ test has been applied for many years, the powerful and low-cost digital
processing capabilities of today’s CMOS technologies are enabling many new in-situ
test schemes in a mixed-signal environment.
The dissertation is composed of three subtopics: formal verification of AMS circuits
(Chapter 2), high-resolution on-chip jitter measurement (Chapter 3) and in-situ test of
all-digital phase-locked loops (PLLs) and polar transmitters (Chapter 4 and Chapter
5). Their relationships are shown in Fig. 3. Note that Chapter 3 gives a general
solution for the jitter testing in AMS circuits, while Chapter 4 and Chapter 5 are
applicable to specific types of AMS systems.
[Figure 3: AMS design checking divides into formal verification (simulation-assisted nonlinear SMT, Chapter 2) and in-situ test; in-situ test covers high-resolution on-chip jitter test (GRO-PVDL structure, Chapter 3) and system-specific techniques (ADPLL jitter test, Chapter 4; EVM test of all-digital polar transmitters, Chapter 5).]
Fig. 3. Organization of the subtopics.
A. Formal verification of AMS circuits
Traditionally, errors in hardware are discovered empirically in the design stage by
exercising the design under different conditions. The most popular method for verifying an
IC design is simulation. The disadvantage of simulation based verification is that it
is difficult to obtain total confidence in the correctness of a design of any complexity.
For example, the initial states and inputs of an analog circuit are continuous in their
values, while simulations can only sample discrete points in that continuous space. In
contrast, formal verification is an alternative that mathematically proves whether a design
functions as required. More specifically, formal verification carries out a decision
procedure to check whether a mathematical model for the design satisfies some given
properties in the specification.
Formal verification of digital systems has found great success in practice [8]. In
contrast, AMS circuits operate in continuous or hybrid state spaces and have far
more complex analog characteristics and performances. As such, formal verification
of complex AMS circuits remains a significant challenge. Nevertheless, the success
of its digital counterpart has made formal analog verification a subject of growing
research interest.
This dissertation proposes a methodology that leverages satisfiability modulo
theories (SMT) solving techniques to tackle the challenges arising from the inherent
analog and/or hybrid nature of AMS systems. This work is largely motivated by
recent advances in nonlinear SMT (NL-SMT) solvers capable of solving the SAT
problem with large Boolean combinations of nonlinear arithmetic constraints involv-
ing transcendental functions [9,10]. The NL-SMT-based technique can be applied to
yield a conservative check of dynamic design properties. To accelerate the technique, a
simulation-assisted SAT approach is also proposed that simultaneously exploits the
efficiency of simulation and the conservativeness of SAT. A Bayesian-inference-based
technique is developed to dynamically trade off the costs of simulation
and NL-SMT. This allows on-the-fly determination of the number
of simulation runs that minimizes the total runtime of the simulation-assisted
SAT approach. The feasibility and efficacy of the proposed methodology are
demonstrated on conservative verification of dynamic properties of a charge-pump
PLL.
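To make the idea of a conservative check concrete, the following minimal Python sketch verifies a toy property over a continuous initial range. It is not the NL-SMT flow of this dissertation: a hand-written interval evaluation with box splitting stands in for the solver, and a midpoint evaluation stands in for a single simulation run. The property x^2 - x + c > 0 on [0, 1], the class `Interval`, and the functions `margin_point`, `margin_box` and `check` are all hypothetical. Floating-point rounding is not directed outward, so the sketch is illustrative rather than sound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] with naive (not outward-rounded) arithmetic."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        p = (self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi)
        return Interval(min(p), max(p))


def margin_point(x, c):
    """Property margin at one point; plays the role of a single simulation."""
    return x * x - x + c


def margin_box(box, c):
    """Interval enclosure of the margin over a whole box (conservative)."""
    return box * box - box + Interval(c, c)


def check(box, c, min_width=1e-6):
    """Try to prove margin > 0 for every x in box.

    Returns ("proved", None), ("violated", x) or ("unknown", x).
    """
    stack = [box]
    while stack:
        b = stack.pop()
        mid = 0.5 * (b.lo + b.hi)
        if margin_point(mid, c) <= 0.0:      # cheap point check finds a witness
            return "violated", mid
        if margin_box(b, c).lo > 0.0:        # enclosure proves the whole box
            continue
        if b.hi - b.lo < min_width:          # give up instead of splitting forever
            return "unknown", mid
        stack.append(Interval(b.lo, mid))    # inconclusive: split and retry
        stack.append(Interval(mid, b.hi))
    return "proved", None


if __name__ == "__main__":
    print(check(Interval(0.0, 1.0), 0.3))
    print(check(Interval(0.0, 1.0), 0.2))
```

The point evaluation can only refute the property, while the interval enclosure can only prove it; using both on every box is one simple way to combine the speed of sampling with the conservativeness of a bounding argument.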
B. High-resolution on-chip jitter measurement
Timing precision, measured in the form of jitter, is extremely crucial for a broad
range of high-speed high-precision digital and analog ICs. Jitter is one of the most
important performance metrics for clock and data recovery (CDR) in I/O circuitry as well
as the clock generation in high-speed digital signal processing circuitry, where phase
locked loops (PLLs) or delay locked loops (DLLs) are employed [11]. As an example,
today’s on-chip serial-links operate at a data rate of multi-Gb/s [12]. The clock jitter
in serial-link transceivers degrades the transmitted and received data margin. It may
also cause the received data to fall outside the design boundary. Moreover, from the
perspective of RF applications, jitter performance is also a key concern because clock
jitter will turn into the phase noise of wireless signals [1]. Hence, high-resolution
jitter characterization is an important way to detect performance degradations or
even malfunctions.
Traditionally, jitter is measured using external testing equipment. State-of-the-art
time interval analyzers (TIAs) provide femtosecond resolution. In practice, however, the
achievable resolution is limited by the distortion and noise injected along the on-chip to
off-chip signal propagation path. In this regard, low-cost on-chip solutions with high
resolution are particularly appealing, because signal distortion/noise can be largely
alleviated by measuring the jitter right on the chip. More importantly, without the
need for any expensive external equipment, in-situ jitter characterization allows built-
in test and monitoring of design performance, and provides the option of self healing
and correction in the event of jitter-induced failures.
In this dissertation, a novel GRO-PVDL structure is proposed for on-chip jitter
measurement. The GRO-PVDL is a two-level structure: the first level is a gated
ring oscillator (GRO) providing a coarse measurement; and the second level further
measures the residue from the first level with a fine resolution. The raw resolution
of the GRO is improved through a Vernier-style structure at the second level. With
the feature of quantization noise shaping, an even finer effective resolution can be
achieved. Implemented with a commercial 90nm CMOS technology, the GRO-PVDL
can achieve a sampling frequency of 200MHz and an effective resolution of 0.8ps.
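The two ideas in the paragraph above, a coarse measurement refined by a fine measurement of the residue, and quantization noise shaping, can be sketched with an idealized behavioral model. This is a minimal Python sketch under stated assumptions, not the circuit of this dissertation: the function names `two_level_measure` and `gated_counts` are hypothetical, the step sizes (in ps) are arbitrary, and mismatch, device noise and the actual Vernier implementation are ignored.

```python
import math


def two_level_measure(t, coarse_step, fine_step):
    """Quantize t with a coarse step, then quantize the residue with a fine step."""
    coarse = math.floor(t / coarse_step)
    residue = t - coarse * coarse_step
    fine = math.floor(residue / fine_step)
    return coarse, fine, coarse * coarse_step + fine * fine_step


def gated_counts(intervals, step):
    """Count delay-element transitions of an oscillator gated on for each interval.

    The oscillator phase is held between intervals, so the quantization
    residue of one measurement is carried into the next one.
    """
    phase = 0.0
    counts = []
    for t in intervals:
        before = math.floor(phase / step)
        phase += t
        counts.append(math.floor(phase / step) - before)
    return counts


if __name__ == "__main__":
    # Hypothetical 30 ps coarse step and 5 ps fine step.
    print(two_level_measure(137.0, coarse_step=30.0, fine_step=5.0))
    # 64 repeated measurements of a 12.3 ps interval with a 5 ps step.
    counts = gated_counts([12.3] * 64, step=5.0)
    print(sum(counts) * 5.0 / len(counts))
```

Because the residue is carried over rather than discarded, the individual quantization errors cancel in the running sum, and the error of the averaged result is bounded by the step divided by the number of measurements. This is the first-order noise-shaping behavior that lets the effective resolution be finer than the raw step.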
C. In-situ test of all digital PLLs and polar transmitters
While digital testing targets catastrophic and processing/manufacturing errors,
AMS testing targets functionality within acceptable upper and lower
performance limits. AMS circuits have a nominal behavior and an uncertainty range
due to PVT variations. The error or deviation from the nominal behavior must be
measured with an extremely high precision to meet the requirements of today’s high-
resolution applications. The test cost is further raised when AMS circuits under test
are part of a complex SoC rather than stand-alone components.
In this dissertation, the reconfigurability of recent all-digital PLL designs is ex-
ploited to provide novel in-situ output jitter test and diagnosis abilities under multiple
parametric variations of key analog building blocks. Digital signatures are collected
and processed under specifically designed loop filter configurations to facilitate low-
cost high-accuracy performance prediction and diagnosis, by systematically analyzing
the interaction between the analog blocks and the digital blocks.
As an extension, an in-situ test scheme is proposed to provide online testing for
all-digital PLL based polar transmitters. Multiple digital signatures are collected by
adding a branch digital filter optimized for the maximum sensitivities to nonidealities,
which provides testing results on the fly. The test signatures are processed using sim-
ple digital processing to provide an estimate for error vector magnitude (EVM), a key
RF performance measure for the transmitter. Additionally, a digital self-calibration
scheme is proposed to eliminate the EVM degradation due to large wide-band digi-
tally controlled oscillator (DCO) gain mismatch. It is shown that a proper exploration
of digital implementation style is instrumental for facilitating novel low-cost built-in
test and calibration solutions for mixed-signal and RF applications.
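As background for why phase-domain signatures can say something about EVM, the following minimal Python sketch checks the small-angle relation between the two for a signal corrupted by phase error only: the error vector of a unit-magnitude symbol is e^{jφ} - 1, whose magnitude is approximately |φ| for small φ. This is not the estimator proposed in this dissertation; the function names and the deterministic phase-error sequence are hypothetical, and amplitude errors are ignored.

```python
import cmath
import math


def evm_rms(phase_errors):
    """RMS EVM of unit-magnitude symbols corrupted only by phase error (radians)."""
    err_power = sum(abs(cmath.exp(1j * p) - 1.0) ** 2 for p in phase_errors)
    return math.sqrt(err_power / len(phase_errors))


def rms(values):
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    # Hypothetical deterministic phase-error sequence, about 0.01 rad peak.
    phases = [0.01 * math.sin(0.37 * k) for k in range(1000)]
    print(evm_rms(phases), rms(phases))
```

Under these assumptions the RMS EVM and the RMS phase error in radians agree closely, which is why an accurate estimate of the phase noise is a useful proxy for the EVM of the phase-modulation path.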
CHAPTER II
SMT-BASED FORMAL VERIFICATION OF AMS CIRCUITS
As introduced in the first chapter, formal verification techniques for AMS circuits are
a subject of growing research interest. In this chapter, a formal verification framework
is presented that integrates nonlinear satisfiability modulo theories (SMT) solvers and Bayesian
inference guided simulation exploration, aiming at the verification of AMS transient
behaviors.
A. Introduction
The ongoing technology and design trends move towards integration of more func-
tionality on the same chip, leading to the development of mixed-signal SoCs. Coupled
with the increasing complexity of analog and mixed-signal (AMS) ICs, these trends
have made the efficient verification of AMS circuits a pressing need. Formal verifica-
tion of digital systems has found great success in practice. In contrast, analog and
mixed-signal circuits operate in a continuous state space and have far more complex
analog characteristics and performances. As such, verification of complex analog and
mixed-signal ICs remains a significant challenge. The success of its digital counterpart
has nevertheless made formal analog verification a subject of growing research
interest.
A number of approaches have been proposed for formal verification of analog
circuits, and a survey can be found in [13]. Among these techniques, theorem-proving
based methods such as [14] check the design properties by applying proof
rules, while equivalence checking compares the outputs of two different models (e.g. SPICE
vs. behavioral) for a given set of input conditions [15,16]. There also exist techniques
that perform state-space exploration by converting continuous dynamics to approximated
discrete models [17, 18]. State-space exploration can also be accomplished by
using a popular class of reachability analysis techniques originating from the verification
of hybrid systems [7,19,20]. These methods overapproximate the reachable states using a
geometrical representation such as polyhedra in the multi-dimensional space. Recently,
an elegant reachability analysis technique was developed specifically for phase-locked
loops (PLLs) [21]. One of the key ideas in that approach is to overapproximate the switching
times of the charge pump and perform reachability analysis using linear continuous
models with uncertain parameters. In a somewhat different direction, the monotonic
property of MOSFET devices and numerical computation are used to find all DC solutions
of a ring oscillator for verifying start-up conditions [22]. Boolean satisfiability
(SAT) based circuit-level analog verification has also been demonstrated [23].
The presented work is largely motivated by recent advances in automated
reasoning over large Boolean combinations of nonlinear arithmetic constraints involving
transcendental functions [9, 10]. Unlike the SAT engine employed in [23],
which can only operate in the Boolean and linear domains, these techniques
are built upon a tight integration of Davis-Putnam-Logemann-Loveland (DPLL)-style
SAT solving with interval-based arithmetic constraint solving within a satisfiability
modulo theories (SMT) framework. They have the potential to process large constraint
systems with Boolean combinations of several thousand nonlinear arithmetic constraints
over thousands of variables [10].
For convenience, these types of techniques are referred to as NL-SMT.
The aforementioned NL-SMT techniques can handle nonlinear device/circuit
characteristics, an inherent property of analog operation, making them a potentially
appealing choice for analog and mixed-signal (a.k.a. hybrid) circuit verification.
However, practical limitations of solving capability still exist when such SMT solvers
are employed for some challenging AMS verification tasks.
For dynamic properties of nonlinear circuits, it is envisioned that modeling abstraction
is required to render transient verification (through reachability analysis)
practical. Techniques such as [15, 16] may be used to build conservative behavioral
models that account for factors such as modeling error and parameter variations in a
large AMS circuit. NL-SMT can then be applied to the behavioral models to yield
a conservative check of dynamic design properties. However, acceleration techniques
are still desired. To this end, a simulation-assisted SAT approach is proposed that
simultaneously exploits the efficiency of simulation and the conservativeness of SAT.
Simulation-assisted SAT can dramatically reduce the number of NL-SMT
calls, leading to large verification speedups. A Bayesian inference based
technique is developed to learn from the simulation history and dynamically trade off
the cost of simulation against that of NL-SMT. This allows on-the-fly
determination of the number of simulation runs that minimizes the
total runtime of the simulation-assisted SAT approach. To flexibly model
arbitrary nonlinear dynamics, the resulting reachable
state space is tracked using a collection of hypercubes with adjustable discretization
resolution.
Compared with the verification tool fSPICE in [23], which also employs SAT solving
techniques, the proposed approach has two clear advantages. First, because fSPICE
relies on linear SAT solvers, nonlinear models have to be conservatively represented by
interval combinations. To achieve a given accuracy while keeping
the runtime scalable, fSPICE has to introduce heuristic techniques of abstraction
refinement and non-uniform splitting [23] and repeatedly invoke linear SAT solvers to
find one solution. In the simulation-assisted NL-SMT approach, by contrast, the NL-SMT
solver needs to be invoked only once to find one solution with a given accuracy,
and abstraction refinement and non-uniform splitting are handled more efficiently inside
the NL-SMT solver, whose nonlinear solving algorithms are optimized for this purpose.
Moreover, the proposed approach is especially efficient for verifying transient
behaviors. fSPICE and other SAT-based hybrid verification techniques simply treat
the transient verification problem as a huge DC verification problem by unrolling
the transient behavior over time, which may be very computationally intensive
or infeasible. In contrast, the proposed approach provides the option to split the
problem into small subproblems so that the scale of the transient verification can
be controlled. More importantly, with the assistance of random simulation to explore
the reachable space, the efficiency of verification can be greatly improved. In other
words, SAT solvers are applied in a more appropriate way: to check conservativeness
rather than to find solutions.
The basic NL-SMT based reachability analysis approach is general in the sense
that it can be applied to any analog or mixed-signal (hybrid) circuit modeled using
many different types of nonlinear dynamic models (albeit with practical capacity
limitations). The application of this approach is demonstrated on a very challenging
example, lock time verification of a charge-pump PLL. The generality of the approach
forces explicit tracking of discrete switching events, avoiding potential overapproxi-
mations incurred otherwise. With additional PLL-specific speedup techniques, we
demonstrate the successful lock time verification through the use of NL-SMT that
explicitly tracks the nonlinear dynamics of the circuit over a large number of time
steps and discrete switching events.
B. Hybrid systems
The hybrid system is a concept originally used in control theory. A hybrid system is a
dynamic system that exhibits both continuous dynamics (flow) and discrete dynamics
(jump). A hybrid automaton is a mathematical model for precisely
describing hybrid systems [24].
Definition 1. A hybrid automaton is a tuple $H = (X, M, J, F, I)$, where:
• $X \subseteq \mathbb{R}^n$ is an ordered finite set of continuous variables;
• $M$ is a finite set of discrete states (modes);
• $F : M \times \mathbb{R}^n \to \mathbb{R}^n$ assigns a vector field to each mode; the continuous dynamics
in mode $m$ are $\dot{x} = f_m(x)$;
• $J : M \times \mathbb{R}^n \to M \times \mathbb{R}^n$ is the jump relation; a jump is triggered by a guard
condition and followed by a reset action;
• $I \subseteq M \times \mathbb{R}^n$ is the initial condition.
More general forms of hybrid automata also include inputs, nondeterministic
evolutions and even stochastic effects. AMS circuits clearly fall within the category
of hybrid systems; an example is the 1-bit analog-to-digital converter (ADC) shown in
Fig. 4.
If we assume that the input level is held constant at vin, then the ADC's transient
behavior can be described by the automaton shown in Fig. 5. Here t is time,
q = L or H represents the 2 discrete states of the ADC, v1 is the integrator output (a
continuous variable), k is the integrator gain and vc is the threshold voltage of the
comparator. When q = L (H), the voltage level of the 1-bit DAC output is vL (vH).
For vL = 0, vH = 1 V, vc = 0.5 V and a sampling clock frequency of 10 MHz, the
transient trajectories are given in Fig. 6. The dark solid line shows the v1 trajectory
corresponding to an initial value of 0.9 V, while the light solid lines are a family of v1
trajectories for initial values ranging from 0.8 V to 1 V.
Fig. 4. 1-bit ∆−Σ ADC.
Fig. 5. Hybrid automaton of 1-bit ∆−Σ ADC.
Fig. 6. Transient trajectories of 1-bit ∆− Σ ADC.
Note that in this example the occurrence of discrete transitions is synchronized
with the sampling clock with a period of Tclk. Generally speaking, both synchronous
and asynchronous discrete transitions may exist for AMS circuits, with the former
triggered by clock edges and the latter caused by sudden switches of analog signals.
C. NL-SMT based verification
As previously mentioned, the recent NL-SMT solver [10] is able to solve satisfiability
problems composed of Boolean combinations of multiple nonlinear arithmetic constraints.
In this section, an NL-SMT based framework is introduced for verifying the
transient specifications of AMS circuits.
1. Formulation of NL-SMT constraints
Transient verification of hybrid systems such as AMS circuits can be conducted by
performing reachability analysis. Reachability analysis determines the reachable space
(both discrete and continuous) of the system when the system starts from an uncertain
initial state I and/or follows uncertain dynamics F and J (e.g. for the
above ADC, the integrator gain k is uncertain due to process variation). A typical
goal of the analysis is to ensure the safety of a hybrid system by checking whether any
trajectory can enter a predefined bad/dangerous region within a given time. In order
to perform reachability analysis using an NL-SMT solver, the hybrid automaton needs to be
represented by NL-SMT constraints.
a. Initial space constraints
The entire initial space I is distributed among different discrete modes. In the ith
discrete mode mi, the continuous variables x are initially constrained within a region
defined by $S_i(x) \bowtie 0$, where $\bowtie$ stands for =, <, >, ≤ or ≥, and $S_i$ is a vector of
functions of x. The initial space constraints are then
\[ \bigvee_i \{ m(t)|_{t=0} = m_i \wedge S_i(x(t)|_{t=0}) \bowtie 0 \}, \tag{2.1} \]
where m(t) and x(t) are the discrete modes and continuous variable space that are
reachable at time t.
Referring back to the previous ∆−Σ ADC, its discrete states are M = {m1, m2}
(m1 represents q = L and m2 represents q = H), and the integrator output voltage
v1 is the continuous variable. An example of its initial state could be:
\[ \{ m(t)|_{t=0} = m_1 \wedge v_1 \ge 0.1 \wedge v_1 \le 0.2 \} \vee \{ m(t)|_{t=0} = m_2 \wedge v_1^2 - 0.3 v_1 + 0.02 \le 0 \}, \tag{2.2} \]
representing an initial space that is distributed over both discrete mode m1 and discrete
mode m2. For m1, the initial continuous space is the integrator output voltage v1
between 0.1 V and 0.2 V, while the initial continuous space for m2 is constrained by
$v_1^2 - 0.3 v_1 + 0.02 \le 0$.
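As a small illustration, the example initial space (2.2) can be written as a membership test. This is only a sketch: the function name `in_initial_space` and the string labels for the modes are assumptions made here for readability. Note that the quadratic constraint for m2 factors as (v1 − 0.1)(v1 − 0.2) ≤ 0, so in this particular example both modes share the same voltage interval.

```python
def in_initial_space(mode: str, v1: float) -> bool:
    """Return True if (mode, v1) satisfies the example initial-space constraint (2.2).

    mode is "m1" (q = L) or "m2" (q = H); v1 is the integrator output in volts.
    """
    if mode == "m1":
        # Linear constraints: 0.1 <= v1 <= 0.2
        return 0.1 <= v1 <= 0.2
    if mode == "m2":
        # Nonlinear constraint: v1^2 - 0.3*v1 + 0.02 <= 0
        return v1 * v1 - 0.3 * v1 + 0.02 <= 0.0
    return False


print(in_initial_space("m1", 0.15))  # inside the m1 interval
print(in_initial_space("m2", 0.30))  # outside the m2 region
```

A nonlinear constraint such as the one for m2 is exactly what a linear SAT engine cannot accept directly, which is why it appears in the example.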
b. Hybrid dynamics constraints
First let us consider transient simulation of hybrid systems, where the discrete mode m(t)
and the continuous variables x(t) are calculated with a time step of ∆t. ∆t should be small
enough to track both continuous dynamics and discrete dynamics. For simplicity, ∆t
is fixed in this work. As illustrated in Fig. 7, for each ∆t, the hybrid dynamics are
separated into 2 steps: first continuous flow and then discrete jump. According to
the continuous dynamics, x evolves from $x(t_0)$ to $x^*(t_0 + \Delta t)$. If $x^*(t_0 + \Delta t)$ hits any
guard condition, the corresponding reset action is taken.
Fig. 7. Step-by-step evolution of hybrid automaton.
The derivatives of continuous variables are determined by the value of x as well
as the current discrete mode m:
\[ \dot{x}(t) = F_{\dot{x}}(m(t), x(t)), \tag{2.3} \]









For simulation, numerical integration is used at this point to solve for $x^*(t_0 + \Delta t)$. In
the NL-SMT approach, we instead formulate a set of constraints as:
∨_i { m(t0) = mi −→ F_ẋ(mi, x …
where A → B is equivalent to ¬A ∨ B.
For example, if the previous ADC is currently in discrete mode m1 (m2), meaning
its output is q = L (q = H), then the continuous flow is described by the derivative
of the continuous variable, $\dot{v}_1 = k(v_{in} - v_L)$ ($\dot{v}_1 = k(v_{in} - v_H)$). Therefore the
corresponding SMT constraints are:









After the continuous flow, there may also be discrete jumps. We denote by $G_i^{(j)}$
the jth guard condition for the ith discrete mode mi, and by $R_i^{(j)}$ the corresponding
reset action for the discrete jump. $G_i^{(j)}$ is a region of the continuous space, and $R_i^{(j)}$
is a mapping from $x^*(t_0 + \Delta t)$ to $m(t_0 + \Delta t)$ and $x(t_0 + \Delta t)$. In particular, $G_i^{(0)}$
denotes the condition of no discrete jump in mi, and therefore $R_i^{(0)}$ does not change
anything. Discrete jumps can then be represented by the following constraints:
\[ \bigvee_{i,j:\, i \ge 1,\, j \ge 0} \{ m(t_0) = m_i \wedge x^*(t_0 + \Delta t) \in G_i^{(j)} \longrightarrow (m(t_0 + \Delta t), x(t_0 + \Delta t)) = R_i^{(j)}(x^*(t_0 + \Delta t)) \}. \tag{2.7} \]
Taking the ADC as an example again, its discrete jumps only happen at its
sampling clock edges. Therefore its guard conditions include t ≥ Tclk for
all discrete jumps, where Tclk is the sampling clock period. If any discrete
jump is triggered, time t is reset to 0, indicating the start of a new sampling clock
cycle. Meanwhile, the discrete jump is also determined by the integrator output
voltage v1 at the clock edges. If the ADC is currently in discrete mode m1 (m2), a
mode change is only triggered when t ≥ Tclk and v1 ≥ vc (v1 < vc), where vc is the
threshold voltage of the comparator. So the constraints for discrete jumps are the
disjunctions of the following constraints:
\[ \{ m(t_0) = m_1 \wedge t^*(t_0 + \Delta t) < T_{clk} \longrightarrow m(t_0 + \Delta t) = m_1 \wedge t(t_0 + \Delta t) = t^*(t_0 + \Delta t) \wedge v_1(t_0 + \Delta t) = v_1^*(t_0 + \Delta t) \}, \tag{2.8} \]
\[ \{ m(t_0) = m_1 \wedge t^*(t_0 + \Delta t) \ge T_{clk} \wedge v_1^*(t_0 + \Delta t) < v_c \longrightarrow m(t_0 + \Delta t) = m_1 \wedge t(t_0 + \Delta t) = 0 \wedge v_1(t_0 + \Delta t) = v_1^*(t_0 + \Delta t) \}, \tag{2.9} \]
\[ \{ m(t_0) = m_1 \wedge t^*(t_0 + \Delta t) \ge T_{clk} \wedge v_1^*(t_0 + \Delta t) \ge v_c \longrightarrow m(t_0 + \Delta t) = m_2 \wedge t(t_0 + \Delta t) = 0 \wedge v_1(t_0 + \Delta t) = v_1^*(t_0 + \Delta t) \}, \tag{2.10} \]
\[ \{ m(t_0) = m_2 \wedge t^*(t_0 + \Delta t) < T_{clk} \longrightarrow m(t_0 + \Delta t) = m_2 \wedge t(t_0 + \Delta t) = t^*(t_0 + \Delta t) \wedge v_1(t_0 + \Delta t) = v_1^*(t_0 + \Delta t) \}, \tag{2.11} \]
\[ \{ m(t_0) = m_2 \wedge t^*(t_0 + \Delta t) \ge T_{clk} \wedge v_1^*(t_0 + \Delta t) \ge v_c \longrightarrow m(t_0 + \Delta t) = m_2 \wedge t(t_0 + \Delta t) = 0 \wedge v_1(t_0 + \Delta t) = v_1^*(t_0 + \Delta t) \}, \tag{2.12} \]
\[ \{ m(t_0) = m_2 \wedge t^*(t_0 + \Delta t) \ge T_{clk} \wedge v_1^*(t_0 + \Delta t) < v_c \longrightarrow m(t_0 + \Delta t) = m_1 \wedge t(t_0 + \Delta t) = 0 \wedge v_1(t_0 + \Delta t) = v_1^*(t_0 + \Delta t) \}. \tag{2.13} \]
Note that time t is a special continuous variable, because its continuous derivative is
always ṫ = 1, and the corresponding constraint is t∗(t0 +∆t) = t(t0) + ∆t.
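The flow-then-jump update expressed by constraints (2.8) to (2.13) can be sketched as a plain simulation step. This is a minimal illustration rather than the implementation used in this work: the forward-Euler flow step (exact here, since the derivative is constant within a mode), the names `AdcState`, `step` and `simulate`, and the values of k, vin and the normalized Tclk in the usage lines are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class AdcState:
    mode: str   # "m1" (q = L) or "m2" (q = H)
    t: float    # time elapsed since the last sampling clock edge
    v1: float   # integrator output voltage


def step(s, dt, k, vin, vL=0.0, vH=1.0, vc=0.5, Tclk=1e-7):
    """Advance the ADC automaton by one time step: continuous flow, then discrete jump."""
    # Continuous flow: dv1/dt = k*(vin - vL) in m1, k*(vin - vH) in m2; dt/dt = 1.
    v_dac = vL if s.mode == "m1" else vH
    v1_star = s.v1 + k * (vin - v_dac) * dt
    t_star = s.t + dt
    # No clock edge yet: mode, timer and voltage carry over, as in (2.8) and (2.11).
    if t_star < Tclk:
        return AdcState(s.mode, t_star, v1_star)
    # Clock edge: timer resets and the comparator decides the next mode,
    # as in (2.9), (2.10), (2.12) and (2.13).
    next_mode = "m2" if v1_star >= vc else "m1"
    return AdcState(next_mode, 0.0, v1_star)


def simulate(s, n_steps, dt, **params):
    """Return the list of states visited over n_steps steps, including the start."""
    trace = [s]
    for _ in range(n_steps):
        s = step(s, dt, **params)
        trace.append(s)
    return trace


# Usage with normalized, assumed parameters (Tclk = 1, four steps per clock period).
trace = simulate(AdcState("m1", 0.0, 0.4), 12, 0.25, k=0.2, vin=0.3, Tclk=1.0)
print([s.mode for s in trace[::4]])
```

In the NL-SMT formulation the same update is not executed but stated as constraints relating the state at t0 to the state at t0 + ∆t, so that the solver can reason about whole sets of initial states at once.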
c. Other constraints
The verification target can also be formulated as NL-SMT constraints, depending
on its form. For typical targets such as avoiding a bad region, the corresponding NL-SMT
constraints are similar to those for initial conditions.
In addition, to include the effect of process variation in verification, device
parameters p are considered as a special kind of continuous variable, and
NL-SMT constraints can be formulated for them accordingly. Their special property is that
p is independent of discrete states and continuous variables, and does not change over
time: p(t0 +∆t) = p(t0).
2. Basic NL-SMT approach
For the purpose of transient verification, an NL-SMT based approach is proposed to
find all reachable space conservatively.
a. State-space discretization
As described above, reachability analysis can be formulated as NL-SMT constraints.
If transient verification is bounded over a time duration tmax, then an obvious way to
generate the NL-SMT problem is to form the conjunction of (1) initial space constraints,
(2) verification target constraints and (3) hybrid dynamics constraints, which are
“unrolled” from time 0 to tmax with a step of ∆t. This idea has been applied in
[23] using a SAT solver that accepts Boolean and linear constraints. However, the
worst-case cost of solving NL-SMT problems increases exponentially with the problem
dimension. As a result, it would be infeasible to run the verification over a large
number of steps. A compromise is to split a big NL-SMT problem into a
series of NL-SMT subproblems, as illustrated in Fig. 8. Each subproblem is limited
to a few time steps so that the problem scale can be handled by the NL-SMT solver. This
strategy reduces the total cost to linear in tmax. For simplicity, every subproblem is
limited to 1 time step in this work.
Fig. 8. The split of NL-SMT problem.
The reachable space at the end of each subproblem needs to be saved as the
initial condition of the subsequent subproblem. To save the reachable space, the
entire state space is discretized into fixed-grid boxes. A box that contains a reachable
point is considered a reachable box. The reachable space can then be conservatively
saved as a set of reachable boxes. Note that this over-approximation of the reachable space
is introduced to ensure conservativeness. The size of the boxes can be adjusted on the
fly to trade off computational cost against over-approximation. Although other
shapes have been shown to be effective for some specific dynamics, e.g. zonotopes
have been successfully applied to linear dynamics [21], boxes (or high-dimensional cubes)
are still the most flexible choice for general nonlinear dynamics. The condition, that











i (t)) is the lower(upper) bound of the jth reachable box in the ith
discrete mode mi at time t.
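The fixed-grid discretization can be sketched as a mapping from a state-space point to an integer box index. This is an illustration under stated assumptions: a uniform grid of side `delta` anchored at the origin, half-open boxes, and a single discrete mode; the function names are hypothetical.

```python
import math


def box_index(x, delta):
    """Integer grid index (one entry per dimension) of the box containing point x."""
    return tuple(math.floor(xi / delta) for xi in x)


def box_bounds(idx, delta):
    """Lower and upper corners of the box with the given grid index."""
    lower = tuple(i * delta for i in idx)
    upper = tuple((i + 1) * delta for i in idx)
    return lower, upper


def reachable_boxes(points, delta):
    """Conservatively store a set of reachable points as the set of boxes they fall in."""
    return {box_index(p, delta) for p in points}


points = [(0.26, -0.1), (0.3, -0.2), (0.6, 0.1)]
print(reachable_boxes(points, 0.25))
```

Storing indices rather than bounds makes the "is this point already covered" test a set lookup; the bounds are only needed when the boxes are turned back into constraints for the next subproblem. A smaller `delta` reduces over-approximation at the cost of more boxes.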
b. Box mergence
Mergence of boxes, as an auxiliary technique, can help accelerate the solving of NL-SMT
problems. Through box mergence, as illustrated in Fig. 9, the number of
conjunction/disjunction clauses in the constraints is reduced, and thus solver runtime is saved.
In the implementation, a simple greedy algorithm is adopted for box mergence, as
shown in Alg. 1. Note that different merging algorithms may lead to different sets of
merged boxes.
Fig. 9. Box mergence.
The greedy algorithm involves 2 lists of boxes, one storing unmerged boxes (Lum)
and the other storing merged boxes (Lm). Initially, Lum stores the set of boxes to be
merged and Lm is empty. The boxes bi in Lum are visited one by one. If bi can be
merged with some box b′j in Lm, then b′j is replaced by bi ∪ b′j, the box
merged from bi and b′j. If not, bi is inserted into Lm. After all boxes in Lum have been
visited, the algorithm compares the number of boxes in Lm and in Lum. If the numbers are
the same, the last pass did not result in any mergence,
and the algorithm stops. Otherwise, Lum discards its content and takes over the
content of Lm, after which Lm is cleared to an empty list again, and another
pass over the boxes in Lum is performed. This process repeats until the size of
Lm equals the size of Lum after a pass.
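The greedy procedure described above can be sketched as follows. The merge criterion is an assumption made for this illustration, since the text does not spell it out: two boxes are merged only when their union is itself a box, i.e. they coincide in all dimensions but one, where their intervals touch or overlap. Boxes are tuples of (lower, upper) pairs in integer grid coordinates, and the function names are hypothetical.

```python
def try_merge(a, b):
    """Return the union of boxes a and b if that union is a box, else None."""
    diff = [d for d in range(len(a)) if a[d] != b[d]]
    if not diff:
        return a  # identical boxes
    if len(diff) == 1:
        d = diff[0]
        (alo, ahi), (blo, bhi) = a[d], b[d]
        if alo <= bhi and blo <= ahi:  # intervals touch or overlap
            merged = list(a)
            merged[d] = (min(alo, blo), max(ahi, bhi))
            return tuple(merged)
    return None


def greedy_merge(boxes):
    """Repeat passes over the unmerged list until a pass produces no mergence."""
    l_um = list(boxes)
    while True:
        l_m = []
        for b in l_um:
            for j, bm in enumerate(l_m):
                u = try_merge(b, bm)
                if u is not None:
                    l_m[j] = u  # replace b'_j by b_i ∪ b'_j
                    break
            else:
                l_m.append(b)  # no partner found: keep b as its own box
        if len(l_m) == len(l_um):
            return l_m
        l_um = l_m


row = [((0, 1), (0, 1)), ((1, 2), (0, 1)), ((2, 3), (0, 1))]
pair = [((0, 1), (2, 3)), ((1, 2), (2, 3))]
print(greedy_merge(row + pair))
```

Because each box is merged into the first compatible entry of the merged list, the result depends on the visiting order, which is one reason different merging algorithms can yield different box sets.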
c. Basic flow of invoking NL-SMT solver
The goal of each subproblem is to find all the reachable space at its ending time.
Starting from an empty set of reachable boxes Brch, the procedure should return Brch
filled with boxes that conservatively cover the reachable space. This task is
accomplished by means of the NL-SMT solver.
The iSAT NL-SMT solver [10] employed in this work is based on a tight integration
of the DPLL algorithm and the interval constraint propagation (ICP) technique.
Based on interval arithmetic, interval constraint propagation locates the intervals
containing all solutions to the problem constraints. To force the solver to
terminate in practice, we set a threshold δ corresponding to the discretization resolution
(i.e. box size) such that a real variable is no longer considered in the decision process
once the interval length of the variable is less than δ. When the solver returns, it returns
either one possible solution or UNSAT. Note that the solver provides a guarantee on
unsatisfiability, i.e. an UNSAT result indicates that there is indeed no solution to the
problem constraints.
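The roles of the threshold δ and of the UNSAT guarantee can be illustrated with a very small interval branch-and-prune search. This is not the iSAT algorithm, which combines DPLL with interval constraint propagation over many variables; it is a one-variable sketch for the assumed constraint x·x − 2 = 0, written with ordinary floating-point arithmetic and no outward rounding, so it is not rigorous in the way a real interval solver is.

```python
def f_range(lo, hi):
    """Interval enclosure of f(x) = x*x - 2 over [lo, hi], assuming lo >= 0."""
    return lo * lo - 2.0, hi * hi - 2.0


def solve(lo, hi, delta):
    """Return an interval narrower than delta that may contain a root, or None (UNSAT)."""
    stack = [(lo, hi)]
    while stack:
        a, b = stack.pop()
        f_lo, f_hi = f_range(a, b)
        if f_lo > 0.0 or f_hi < 0.0:
            continue  # f cannot be zero anywhere in [a, b]: prune this interval
        if b - a < delta:
            return (a, b)  # narrow enough: report as a candidate solution
        mid = 0.5 * (a + b)
        stack.append((mid, b))
        stack.append((a, mid))
    return None  # every interval was pruned: no solution exists in [lo, hi]


print(solve(0.0, 2.0, 1e-3))  # a narrow interval around sqrt(2)
print(solve(2.0, 3.0, 1e-3))  # None
```

An interval is discarded only when the enclosure proves that it holds no solution, which is why a `None` result is a guarantee, whereas a returned interval is only a candidate of width below δ.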
Algorithm 1 Greedy box mergence
while TRUE do
  Lm = ∅;
  for each box bi ∈ Lum do
    if there is a box b′j ∈ Lm that can be merged with bi then
      b′j = b′j ∪ bi; //merge
    else
      insert bi into Lm;
    end if
  end for
  if |Lm| = |Lum| then
    break; //no mergence in the last pass
  end if
  Lum = Lm;
end while
Algorithm 2 Basic NL-SMT approach
Brch = ∅;
repeat
  formulate constraints for initial space;
  formulate constraints for hybrid dynamics;
  formulate constraints for srch ∉ Brch;
  invoke NL-SMT solver;
  if a reachable solution srch is found then
    locate box b that contains srch;
    Brch = Brch ∪ {b};
  end if
until NL-SMT solver returns UNSAT
Given these properties of the solver, the flow for invoking the NL-SMT solver is given
in Alg. 2. This algorithm is invoked for every subproblem, or at each time step if
1 subproblem covers 1 time step. At first, the set of reachable boxes Brch is empty.
Then 3 types of NL-SMT constraints are formulated: for the initial space,
for the hybrid dynamics, and for srch ∉ Brch. The initial space is the reachable
space of the previous subproblem. srch ∉ Brch means that srch is not inside the reachable
space Brch found so far. Note that these 3 types of constraints are implicitly conjoined.
With the initial space and hybrid dynamics formulated as constraints, the solver
returns a solution srch that is in the reachable space. The box that contains
srch is then added to Brch. As the solver is invoked repeatedly, more reachable points
are found and Brch grows. Most importantly, in each round constraints
must also be formulated for srch ∉ Brch, which is the negation of Eq.(2.14). This
keeps each new solution out of the reachable boxes already found. In Fig. 10, for
example, supposing boxes 1 and 2 have already been found, they are blocked for the next
NL-SMT invocation so that a new solution can only appear in boxes 3, 4 and 5. Finally,
when all the reachable boxes have been found, the last solver invocation returns UNSAT,
providing a guarantee of conservativeness. Note that for this flow the number of
NL-SMT invocations is equal to the total number of reachable boxes plus one, because
every invocation returns a new reachable box, except the last one, which confirms
conservativeness.
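The enumerate-and-block loop of the basic flow can be sketched with a stand-in for the solver. Everything specific here is an assumption for illustration: the one-dimensional toy map x → 0.5x + 1 replaces the hybrid dynamics, `toy_solver` replaces an NL-SMT call (it is exact only because the toy map is increasing), and the box size is 1.

```python
import math

DELTA = 1.0  # box size (assumed)


def toy_solver(x_lo, x_hi, blocked):
    """Stand-in for one NL-SMT call on the toy map x -> 0.5*x + 1.

    Returns a reachable point lying outside all blocked boxes, or None for UNSAT.
    """
    y_lo, y_hi = 0.5 * x_lo + 1.0, 0.5 * x_hi + 1.0  # exact image of [x_lo, x_hi]
    for idx in range(math.floor(y_lo / DELTA), math.floor(y_hi / DELTA) + 1):
        if idx not in blocked:
            return max(y_lo, idx * DELTA)  # a reachable point inside box idx
    return None


def basic_flow(x_lo, x_hi):
    """Collect all reachable boxes, blocking each found box before the next call."""
    b_rch, calls = set(), 0
    while True:
        calls += 1
        s_rch = toy_solver(x_lo, x_hi, b_rch)
        if s_rch is None:
            return b_rch, calls  # UNSAT: no reachable point outside b_rch
        b_rch.add(math.floor(s_rch / DELTA))


boxes, calls = basic_flow(0.0, 3.5)
print(boxes, calls)
```

For the initial interval [0, 3.5] the image is [1, 2.75], which covers two boxes, so the loop makes three solver calls: one per reachable box and a final one that returns UNSAT.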
D. Simulation-assisted NL-SMT
Though the above NL-SMT based approach is capable of conservatively finding all
reachable boxes, the number of NL-SMT invocations is equal to the number of reachable
boxes plus one, which can be very large. Since it is still very costly to
invoke the NL-SMT solver that frequently, the basic NL-SMT approach is not
efficient for transient verification. This motivates the search for additional assistance.
In this section, simulation-assisted NL-SMT is proposed, which dramatically increases
verification speed while preserving conservativeness.
Fig. 10. Search of reachable boxes.
1. Simulation-assisted NL-SMT flow
A key observation here is that transient simulation can find a solution at much
lower cost than the NL-SMT solver. If the solution is located in a box that has not been
visited before, then it is just as valuable as a solution given by the NL-SMT solver.
Therefore we can quickly explore the reachable state space by means of simulation
from samples within the initial space. The sampling strategy can be
simply uniformly random; more sophisticated sampling techniques could be adopted
to further increase the coverage of simulation-based exploration. On the other hand,
no matter how dense or how sophisticated the sampling strategy is, the coverage can
hardly be 100% for complex dynamics, i.e. there is no conservativeness guarantee for
simulation-based exploration.
Fig. 11. Flow of simulation-assisted approach.
Fig. 11 compares the pros and cons of the above 2 approaches, and suggests a
simulation-assisted NL-SMT approach. The majority of reachable boxes are found in the
first stage, through simulations starting from points xini that are randomly sampled
in the initial boxes Bini. In the next stage, the NL-SMT solver takes over to find the
remaining reachable boxes that were missed in the first stage. In Fig. 12, for example, if random




4, then when NL-SMT check begins,






4. Consequently, the NL-SMT solver needs to be invoked only twice: first to find
box b′2, and then to confirm conservativeness.
Fig. 12. Simulation exploration and NL-SMT check.
The benefit of simulation assistance is that most of the reachable space can be found
through simulations, and therefore the number of NL-SMT solver invocations can be
remarkably decreased. Simulations may involve a lot of “waste”, i.e. simulations
from different samples repeatedly visit boxes that have already been
found, yet the cost of a simulation is so much smaller than that of an NL-SMT invocation
that the total time consumption is still much smaller than for the basic NL-SMT approach.
The simulation-assisted NL-SMT approach is summarized in Alg.3. Note that
simulation-based exploration stops when the condition for stopping simulation, namely
Stop, is true. In practice, the runtime cost of simulation can be so small that Stop
need only be checked once every hundred or more simulation samples. Moreover, the
choice of the stop condition Stop is very important to runtime performance.
A systematic way of defining Stop is given in the next section.
The fundamental principle of simulation-assisted NL-SMT verification is to find
the reachable space quickly by simulation, and then rely on the NL-SMT solver to ensure
conservativeness. In the best-case scenario, all the reachable space
from one starting box is found by simulation, and the NL-SMT solver needs to
be invoked only once to confirm conservativeness. In this sense, the proposed
simulation-assisted NL-SMT flow combines the best of two worlds: simulation
and verification.
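The two-stage flow can be sketched by placing a random simulation phase in front of an enumerate-and-block solver loop. As before, the specifics are assumptions made for illustration: `dynamics` is a toy one-step map standing in for a transient simulation, `toy_solver` stands in for an NL-SMT call and is exact only for this increasing map, and the stop condition is simply a fixed number of samples `n_sim`.

```python
import math
import random

DELTA = 1.0  # box size (assumed)


def dynamics(x):
    """Toy one-step map standing in for one transient simulation (assumed)."""
    return 0.5 * x + 1.0


def toy_solver(x_lo, x_hi, blocked):
    """Stand-in for one NL-SMT call: a reachable point outside `blocked`, or None."""
    y_lo, y_hi = dynamics(x_lo), dynamics(x_hi)  # exact image, the map is increasing
    for idx in range(math.floor(y_lo / DELTA), math.floor(y_hi / DELTA) + 1):
        if idx not in blocked:
            return max(y_lo, idx * DELTA)
    return None


def simulation_assisted(x_lo, x_hi, n_sim, seed=0):
    """Stage 1: random simulation exploration. Stage 2: conservativeness check."""
    rng = random.Random(seed)
    b_rch = set()
    for _ in range(n_sim):
        x_ini = rng.uniform(x_lo, x_hi)
        b_rch.add(math.floor(dynamics(x_ini) / DELTA))
    calls = 0
    while True:
        calls += 1
        s_rch = toy_solver(x_lo, x_hi, b_rch)
        if s_rch is None:
            return b_rch, calls
        b_rch.add(math.floor(s_rch / DELTA))


print(simulation_assisted(0.0, 3.5, n_sim=0))   # solver does all the work
print(simulation_assisted(0.0, 3.5, n_sim=50))  # solver only confirms
```

Both runs return the same box set, because the solver stage always runs until UNSAT; simulation changes only how many solver calls are needed to get there.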
Algorithm 3 Simulation-assisted NL-SMT approach
Brch = ∅;
/*random simulation exploration*/
repeat
  randomly sample xini in Bini;
  run simulation from xini and visit xrch;
  locate box b that contains xrch;
  Brch = Brch ∪ {b};
until condition of stopping simulation Stop is true
/*NL-SMT conservativeness check*/
repeat
  formulate constraints for initial space;
  formulate constraints for hybrid dynamics;
  formulate constraints for srch ∉ Brch;
  invoke NL-SMT solver;
  if a reachable solution srch is found then
    locate box b that contains srch;
    Brch = Brch ∪ {b};
  end if
until NL-SMT solver returns UNSAT
2. Stop condition
There is an important question related to the above simulation-assisted NL-SMT flow:
when should we stop simulations and start the NL-SMT based conservativeness check?
One intuitive answer is: if we have a larger (smaller) initial space or a higher (lower)
state-space dimension, then we should run more (fewer) simulation samples. This
answer suggests a definition of Stop, which can be mathematically expressed as:
\[ N_s > N_{BI}\,\rho^{d} \tag{2.15} \]
where Ns is the number of simulation samples that have already been run for the current
subproblem or time step, NBI is the number of boxes in the initial space, d is the
state-space dimension and ρ is the sampling density, a user-chosen parameter. This gives
a condition for stopping simulation that changes between time steps, since in
reachability analysis NBI changes over time. However, Eq.(2.15) is still
called the static Stop condition, because it is non-adaptive and hence non-optimal:
it does not track the evolution of the hybrid dynamics within a subproblem.
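The static condition (2.15) amounts to a one-line test. The function and argument names below are hypothetical; only the inequality itself comes from the text.

```python
def static_stop(n_s: int, n_bi: int, rho: float, d: int) -> bool:
    """Static Stop condition (2.15): stop once N_s exceeds N_BI * rho^d."""
    return n_s > n_bi * rho ** d


# Example: 4 initial boxes, 2 state dimensions, sampling density 5 per dimension.
print(static_stop(100, 4, 5, 2))  # budget of 100 samples not yet exceeded
print(static_stop(101, 4, 5, 2))  # budget exceeded
```

The sample budget grows exponentially with the dimension d, which reflects the intuition above but ignores how many new boxes the samples are actually finding.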
To seek a more powerful dynamic scheme for stopping simulation, we make
the following observation. New reachable boxes are likely to be found in the first several
simulation samples, while this becomes more difficult in the later phase of simulation
sampling, as the bulk of the reachable boxes have already been found. This prompts us to
monitor the number of new boxes found in recent samples and stop simulation when
only a small number of new boxes have been discovered in the recent history.
We make this intuition rigorous by developing the following statistical
learning approach. The algorithm for the simulation-assisted NL-SMT approach is
rewritten as Alg.4. The NL-SMT conservativeness check part of Alg.4 is not detailed
since it is the same as that of Alg.3. Note that the Stop condition is re-evaluated after
every k random simulation samples. So first of all, we need to estimate the number
of new boxes that will be found if k additional simulation samples are run, namely
$N_{new}^{(k)}$, based on the history of simulation results. More specifically, the objective is to
calculate $E[N_{new}^{(k)} \mid H]$, the expectation of $N_{new}^{(k)}$ given an observation H that represents
the history of previous samples. Note that H only covers the previous samples for
the current subproblem or time step.
Algorithm 4 Simulation-assisted NL-SMT approach with dynamic stop condition
Brch = ∅;
/*random simulation exploration*/
H = ∅; //sampling history
repeat
  /*randomly sample k points*/
  for i = 1 → k do
    randomly sample xini in Bini;
    run simulation from xini and visit xrch;
    locate box b that contains xrch;
    Brch = Brch ∪ {b};
    update H with xini and b;
  end for
  evaluate Stop with the sampling history H ;
until Stop is true
/*NL-SMT conservativeness check, same as in Alg. 3*/
If $E[N_{new}^{(k)} \mid H]$ is already obtained, then the Stop condition can be derived
from it. If simulation is stopped at this point, the boxes that the next k samples would
have found must instead be covered by running the NL-SMT solver $E[N_{new}^{(k)} \mid H]$ times.
Therefore, stopping simulation is only beneficial if the runtime cost of k simulations
exceeds that of the corresponding NL-SMT runs:
\[ k \cdot \tau_{sim} > E[N_{new}^{(k)} \mid H] \cdot \tau_{smt} \tag{2.16} \]
where the left side is the runtime of k simulation samples, the right side is
the total runtime of finding $E[N_{new}^{(k)} \mid H]$ new boxes by NL-SMT, and $\tau_{sim}$ and $\tau_{smt}$ are
the estimated runtimes of one simulation run and one NL-SMT solver invocation,
respectively. Hence, a practical stopping condition is as follows:
\[ E[N_{new}^{(k)} \mid H] < \kappa, \tag{2.17} \]
where $\kappa = k\tau_{sim}/\tau_{smt}$.
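Conditions (2.16) and (2.17) translate into a simple cost comparison. In the sketch below the expected number of new boxes is taken as an input, since its estimation is the subject of the following section; the function name and the example runtimes are assumptions.

```python
def dynamic_stop(expected_new: float, k: int, tau_sim: float, tau_smt: float) -> bool:
    """Dynamic Stop condition (2.17): stop when E[N_new^(k) | H] < kappa.

    expected_new -- estimate of E[N_new^(k) | H] for the next k samples
    tau_sim      -- estimated runtime of one simulation run
    tau_smt      -- estimated runtime of one NL-SMT solver invocation
    """
    kappa = k * tau_sim / tau_smt
    return expected_new < kappa


# Assumed costs: a simulation takes 1 ms, an NL-SMT call takes 1 s, k = 100.
print(dynamic_stop(0.05, 100, 1e-3, 1.0))  # few new boxes expected: stop simulating
print(dynamic_stop(0.50, 100, 1e-3, 1.0))  # simulation still pays off: continue
```

With these assumed costs κ = 0.1, so simulation continues as long as a batch of 100 samples is expected to uncover at least a tenth of a new box on average.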
In the flow described above, the key is the link from the sampling history H to
$E[N_{new}^{(k)} \mid H]$, i.e. calculating the posterior expectation of the number of new boxes
to be found in the next k simulation samples. By using the term
“posterior expectation”, we are implicitly treating this as a statistical problem
within the framework of Bayesian inference.
3. Statistical framework
In order to apply the standard flow of Bayesian inference, the process of random
sampling and simulation needs to be modeled in a statistical framework. When a point
xini is randomly sampled in the initial space as the starting point of the simulation, the
simulation result is a point xrch in the state space, representing the state of the hybrid
system at the end of the current subproblem or time step. Given that the whole valid space
is discretized into m boxes b1, ..., bm, xrch will fall in one box (brch) in the candidate
box set B = {b1, ..., bm}: xrch ∈ brch ∈ B. Note that the candidate box set B can
be conservatively pruned to be smaller than the entire valid space, according to the
system dynamics. Because xini is randomly sampled, the visited box brch is actually
a random variable whose value can be any element in the candidate box set B. For a
fixed initial space and fixed system dynamics, P(brch = bi), the probability that the
visited box is bi, is also fixed as long as the starting point xini is sampled with a fixed
probability distribution in the initial space. For simplicity, xini is uniformly sampled
in the initial space in our implementation. The fixed probability distribution of brch
is denoted as:
$$\theta = \{\theta_1, \dots, \theta_m\}, \tag{2.18}$$
where θi is the probability of visiting bi, i.e. brch = bi, with θi ≥ 0 and $\sum_{i=1}^{m} \theta_i = 1$.
This statistical view of the sampling and simulation process is illustrated in Fig. 13.
Fig. 13. Statistical view of simulation exploration.
Note also that every sample of xini is drawn from the same uniform distribution,
independently of the other samples, so the distribution θ of the visited box brch is
fixed and the outcomes of different samples are independent of each other. Therefore,
the sequence of visited boxes brch is a sequence of independent and identically
distributed (IID) random variables. From a statistical point of view, the sampling and
simulation process is analogous to rolling an m-sided die with an uneven probability
distribution θ = {θ1, ..., θm}, where θi is the probability that the ith side lands on top.
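The die-rolling analogy can be made concrete with a few lines of Python. The following is only a sketch under stated assumptions: the vector `theta` is a made-up example (in the verification flow θ is unknown), and `sample_history` is a hypothetical helper that replaces the actual simulator by direct sampling of box indices. It returns the count vector that the later discussion calls the sampling history H.

```python
import random
from collections import Counter


def sample_history(theta, n, seed=0):
    """Draw n IID box visits distributed as theta and return the count vector H."""
    rng = random.Random(seed)
    m = len(theta)
    # Each draw plays the role of one simulation run landing in one box.
    visits = rng.choices(range(m), weights=theta, k=n)
    counts = Counter(visits)
    return [counts.get(i, 0) for i in range(m)]


# Hypothetical visiting probabilities for m = 5 boxes; the last box is unreachable.
theta = [0.5, 0.3, 0.15, 0.05, 0.0]
H = sample_history(theta, n=1000)
print(H, sum(H))
```

The order of the draws is discarded on purpose: only the counts per box are kept.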
In our context, the probability distribution θ, which describes the probability of
a simulation sample visiting each candidate box, is an unknown parameter, and
our target is to make Bayesian inference about it from the history of sampling and
simulation.



a. Principle of Bayesian inference
The principle of Bayesian inference [25] is briefly given as follows. If we assume
that A is one explanation for the observed history B, and C summarizes the prior
assumptions, then Bayes’ rule states:
$$P(A \mid BC) = \frac{P(A \mid C)\,P(B \mid AC)}{P(B \mid C)}. \tag{2.19}$$
Before having the observation B, only C is known, but afterwards BC (B and C) is
known. In response to the observation B, Bayes’ rule suggests updating the
probability of the explanation A being true from P(A|C) to P(A|BC). Here P(A|C) is
called the prior probability and P(A|BC) is the posterior probability.
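Supposing that A1, A2, ... are exhaustive and mutually exclusive explanations
(exactly one of Ai is true while the rest are false), then the posterior probability of Ai
is given by:
$$P(A_i \mid BC) = \frac{P(A_i \mid C)\,P(B \mid A_i C)}{\sum_j P(A_j \mid C)\,P(B \mid A_j C)}. \tag{2.20}$$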
If the exhaustive and mutually exclusive explanations are no longer countable,
but are infinitely many, then the discrete probability distribution of A turns into
the probability density of a, where a is one point in the continuous range of the
explanation. And Bayes’ rule leads to:
$$p(a \mid BC) = \frac{p(a \mid C)\,P(B \mid aC)}{\int p(a \mid C)\,P(B \mid aC)\,da}, \tag{2.21}$$
where p(a|C) and p(a|BC) are the prior and posterior probability densities of the
explanation a, respectively. For convenience, the prior assumption C is usually not
explicitly written, so Bayesian inference can be summarized by:
$$p(a \mid B) = \frac{p(a)\,P(B \mid a)}{\int p(a)\,P(B \mid a)\,da}. \tag{2.22}$$
b. Bayesian inference for θ
Recalling that our target is to make inference about θ, through Eq. (2.22), we can
start from an initial guess of θ, and include information from the observed history,
to make a new guess of θ. Ideally speaking, if the observed history is infinitely long,
then the new guess of θ will be infinitely close to its true value, no matter how off
the initial guess is. Here the initial and new guesses of θ are the prior and posterior
probability densities of the parameter θ, respectively noted as p(θ) and p(θ|H),
where H is the observed history. For the outcome of n samples, the history H can
be defined as:
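$$H = \{h_1, \dots, h_m\}, \tag{2.23}$$
where hi is the count of visiting box bi within the n samples and $\sum_{i=1}^{m} h_i = n$. Note
that the order of the sampling outcomes is not reflected in H, because the sampling
outcomes are IID. Applying Eq. (2.22) to θ and H gives:
$$p(\theta \mid H) = \frac{p(\theta)\,P(H \mid \theta)}{\int p(\theta)\,P(H \mid \theta)\,d\theta}. \tag{2.24}$$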
The probability densities and distributions in Eq. (2.24) should be clearly distinguished.
The prior and posterior probability densities in the Bayesian inference refer to the
probability densities of the parameter θ, while the parameter θ itself is the probability
distribution of a single sampling outcome brch, i.e. P(brch = bi|θ) = θi.
To conduct Bayesian inference, P(H|θ) in Eq. (2.24) should also be known. It is
the probability distribution of the history H conditional on θ
(the probability distribution of a single outcome brch). For the statistical model
of the sampling and simulation process, the probability of observing H from the
outcomes of n samples, conditional on θ, is given by the multinomial distribution:
$$P(H \mid \theta) = \begin{cases} \dfrac{n!}{\prod_{i=1}^{m} h_i!} \prod_{i=1}^{m} \theta_i^{h_i}, & \sum_{i=1}^{m} h_i = n; \\ 0, & \text{otherwise.} \end{cases} \tag{2.25}$$
Theoretically, the probability density of θ can now be updated from any reasonable
initial guess with the sampling history, by means of Bayesian inference. However, the
computation of Eq. (2.24) is usually too difficult to implement, or very costly
even if it can be implemented. To ease the computation, we choose a specific type of
probability density as the prior probability density of θ, namely the Dirichlet probability
density [26]. For the Bayesian inference of Eq. (2.24) with P(H|θ) being a multinomial
distribution, the Dirichlet probability density has the property of conjugacy: if the prior
probability density of θ is Dirichlet, then the posterior probability density of θ is also
Dirichlet. This greatly simplifies the computation.
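Supposing the parameter θ has a Dirichlet probability density, then its probability
density function is:
$$p(\theta) = \frac{\Gamma\!\left(\sum_{i=1}^{m} \alpha_i\right)}{\prod_{i=1}^{m} \Gamma(\alpha_i)} \prod_{i=1}^{m} \theta_i^{\alpha_i - 1}, \tag{2.26}$$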
which is also noted as θ ∼ Diri(α), where α = {α1, ..., αm} (αi > 0 for any i) is a
parameter vector and Γ(·) denotes the gamma function.
Since the Dirichlet probability density is a function of α, choosing a Dirichlet
probability density as the prior probability density of θ amounts to choosing the value of
α. For the prior probability density θ ∼ Diri(α), the corresponding posterior
probability density of θ is θ|H ∼ Diri(α+H), which can be derived through Bayesian
inference with the sampling history H [26]. The posterior probability density θ|H ∼
Diri(α+H) can be easily obtained by replacing αi in Eq. (2.26) with αi + hi:
$$p(\theta \mid H) = \frac{\Gamma\!\left(\sum_{i=1}^{m} (\alpha_i + h_i)\right)}{\prod_{i=1}^{m} \Gamma(\alpha_i + h_i)} \prod_{i=1}^{m} \theta_i^{\alpha_i + h_i - 1}. \tag{2.28}$$
Note that the parameter α and the history H together determine the posterior
probability density of θ, so α is called the prior strength, representing the prior
assumptions about θ. Intuitively, the entries of H grow with the length of the observation
history because there are more counts of visiting candidate boxes, and thus H becomes
more dominant in α+H, reducing the influence of the prior strength α on the posterior
probability density of θ. This intuition fits the learning mechanism of Bayesian
inference.
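The conjugate update is simple enough to sketch in Python. The helper names below (`dirichlet_posterior`, `dirichlet_mean`) and the numbers are illustrative assumptions, not part of the proposed flow; the only facts used are that the posterior parameter is α + H and that the mean of Diri(α) is αi divided by the sum of all αj.

```python
def dirichlet_posterior(alpha, H):
    """Conjugate update: prior Diri(alpha) plus counts H gives Diri(alpha + H)."""
    return [a + h for a, h in zip(alpha, H)]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet density: E[theta_i] = alpha_i / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical example: 4 boxes, a weak uniform prior, 10 observed samples.
alpha = [0.5, 0.5, 0.5, 0.5]
H = [6, 3, 1, 0]
post = dirichlet_posterior(alpha, H)
print(post)                  # [6.5, 3.5, 1.5, 0.5]
print(dirichlet_mean(post))  # posterior estimate of theta
```

With only 10 samples the never-visited box still keeps a posterior mean of 0.5/12; as the history grows, the counts in H dominate and the influence of α fades, as described above.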
5. Computation of $E[N^{(k)}_{\mathrm{new}} \mid H]$
By conducting Bayesian inference starting from a Dirichlet prior probability density,
the posterior probability density of θ can be obtained. But recall that our final
target is to compute the posterior expectation of the number of new boxes that will
be visited in the next k samples (i.e. $E[N^{(k)}_{\mathrm{new}}]$) from the probability distribution of
each single sampling outcome (i.e. θ). If k is small, then the calculation of $E[N^{(k)}_{\mathrm{new}}]$
is trivial. Taking k = 1 as an example, we have:
$$E[N^{(1)}_{\mathrm{new}}] = 1 \cdot \sum_{i: b_i \in B^*} \theta_i + 0 \cdot \sum_{i: b_i \notin B^*} \theta_i = \sum_{i: b_i \in B^*} \theta_i, \tag{2.29}$$
where B∗ is the set of unvisited boxes, which can be derived by removing the visited
boxes from the candidate box set B. However, the direct calculation of $E[N^{(k)}_{\mathrm{new}}]$
becomes infeasible when k gets large. Fortunately, the following theorem provides a more
practical solution.
Theorem 1.
$$E[N^{(k)}_{\mathrm{new}}] = \sum_{i: b_i \in B^*} V_i^{(k)}, \tag{2.30}$$
where $V_i^{(k)}$ is the probability that box bi will be visited in the next k samples at
least once, and the right side of Eq. (2.30) is the sum of $V_i^{(k)}$ for all the boxes in the
unvisited box set B∗.
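The proof of Thm. 1 is provided in the Appendix. Thm. 1 suggests that $E[N^{(k)}_{\mathrm{new}}]$ can
be calculated by first calculating $V_i^{(k)}$, which is much easier to compute.
Given that θ is the probability distribution of a single sampling outcome, the probability
$V_i^{(k)}$ and the expectation $E[N^{(k)}_{\mathrm{new}}]$ can be written as functions of θ:
$$V_i^{(k)} = 1 - (1 - \theta_i)^k, \tag{2.31}$$
$$E[N^{(k)}_{\mathrm{new}}] = \sum_{i: b_i \in B^*} \left[1 - (1 - \theta_i)^k\right]. \tag{2.32}$$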
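For a known θ, the closed forms 1 − (1 − θi)^k for the visiting probability and its sum over the unvisited boxes for the expected number of new boxes are straightforward to evaluate. The sketch below uses hypothetical function names and a made-up θ; in the actual flow θ is unknown and only its posterior density is available.

```python
def visit_prob(theta_i, k):
    """Probability that a box with visiting probability theta_i is hit
    at least once in k IID samples."""
    return 1.0 - (1.0 - theta_i) ** k


def expected_new_boxes(theta, unvisited, k):
    """Expected number of unvisited boxes hit in the next k samples,
    summed over the indices listed in `unvisited`."""
    return sum(visit_prob(theta[i], k) for i in unvisited)


theta = [0.5, 0.3, 0.15, 0.05]   # hypothetical distribution over 4 boxes
unvisited = [2, 3]               # indices of the boxes in B*
for k in (1, 2, 10):
    print(k, expected_new_boxes(theta, unvisited, k))
```

For k = 1 the result reduces to the sum of θi over the unvisited boxes, and for large k it saturates at the number of unvisited boxes with nonzero θi.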
Finally we can compute the posterior expectation $E[N^{(k)}_{\mathrm{new}} \mid H]$. Supposing the
prior probability density of θ is θ ∼ Diri(α), and given a sampling history H, then
the posterior probability density of θ is θ|H ∼ Diri(α + H). So the posterior
expectation $E[N^{(k)}_{\mathrm{new}} \mid H]$ can be given by:
$$E[N^{(k)}_{\mathrm{new}} \mid H] = \int E[N^{(k)}_{\mathrm{new}}](\theta)\, p(\theta \mid H)\, d\theta. \tag{2.33}$$







































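Substituting Eq. (2.32) into Eq. (2.33) and moving the integral inside the sum gives:
$$E[N^{(k)}_{\mathrm{new}} \mid H] = \sum_{i: b_i \in B^*} \left[1 - \int (1 - \theta_i)^k\, p(\theta \mid H)\, d\theta\right], \tag{2.35}$$
where the integral runs over all the components of θ:
$$\int (1 - \theta_i)^k\, p(\theta \mid H)\, d\theta = \int \!\cdots\! \int (1 - \theta_i)^k\, p(\theta \mid H)\, d\theta_1 d\theta_2 \cdots d\theta_m. \tag{2.36}$$
Substituting the expression of p(θ|H) in Eq. (2.28) into Eq. (2.35), and after a
lengthy but straightforward derivation, Eq. (2.35) can be transformed into: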





























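$$E[N^{(k)}_{\mathrm{new}} \mid H] = \sum_{i: b_i \in B^*} \left[1 - \prod_{j=1}^{k} \frac{\sum_{l=1}^{m} (\alpha_l + h_l) - (\alpha_i + h_i) + j - 1}{\sum_{l=1}^{m} (\alpha_l + h_l) + j - 1}\right]. \tag{2.37}$$
Note that $E[N^{(k)}_{\mathrm{new}} \mid H]$ is actually a function of α, H and B∗, because the posterior
expectation of the number of new boxes that will be visited in the next k samples
is determined by (1) the prior assumption about the probability distribution of the
outcome of a single sample, (2) the history of sampling outcomes and (3) the set of
unvisited boxes.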
Based on Eq. (2.37), to further ease the computation, we choose α = {α0, α0, ...}
with all its elements the same. Considering $\sum_j h_j = n$ (n is the total number of
previous samples), $E[N^{(k)}_{\mathrm{new}} \mid H]$ can be given by:
$$E[N^{(k)}_{\mathrm{new}} \mid H] = m_{B^*} \left[1 - \prod_{i=1}^{k} \frac{(m-1)\alpha_0 + n + i - 1}{m\alpha_0 + n + i - 1}\right], \tag{2.38}$$
where mB∗ is the number of unvisited boxes and m is the number of all the candidate
boxes.
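Eq. (2.38) and the stopping condition of Eq. (2.17) can be evaluated with a short loop. The following Python sketch is an illustration only: the function names are hypothetical, the box counts and α0 reuse the values quoted in this chapter (mB∗ = 100, m = 1,000, α0 = 0.001), and the two runtimes `tau_sim` and `tau_smt` are made-up numbers chosen so that the example is easy to follow.

```python
def posterior_expected_new_boxes(m, m_unvisited, n, k, alpha0):
    """Eq. (2.38): posterior expected number of new boxes in the next k samples,
    for a symmetric Dirichlet prior with every alpha_i equal to alpha0."""
    miss = 1.0  # posterior probability that a given unvisited box is missed k times
    for i in range(1, k + 1):
        miss *= ((m - 1) * alpha0 + n + i - 1) / (m * alpha0 + n + i - 1)
    return m_unvisited * (1.0 - miss)


def should_stop(m, m_unvisited, n, k, alpha0, tau_sim, tau_smt):
    """Eq. (2.17): stop simulating once E[N_new^(k) | H] < kappa = k*tau_sim/tau_smt."""
    kappa = k * tau_sim / tau_smt
    return posterior_expected_new_boxes(m, m_unvisited, n, k, alpha0) < kappa


# tau_sim and tau_smt below are assumed values, not measured ones.
for n in (100, 100_000):
    e = posterior_expected_new_boxes(1000, 100, n, 1000, 0.001)
    stop = should_stop(1000, 100, n, 1000, 0.001, tau_sim=1e-3, tau_smt=10.0)
    print(n, e, stop)
```

With a short history (n = 100) the next 1,000 samples are still expected to pay off, so simulation continues; with a long history (n = 100,000) the expected gain drops below κ and the remaining boxes are left to NL-SMT.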










































Fig. 14. $E[N^{(k)}_{\mathrm{new}} \mid H]$ vs k (mB∗=100, m=1,000, α0=0.001): (a) n=100 (b) n=100,000.
E. PLL lock time verification
In this section, the proposed NL-SMT verification methodology is applied to the
verification of PLL lock time.
1. Charge pump PLL
The charge pump based PLL studied in this work is shown in Fig. 15. The voltage-
controlled oscillator (VCO) output is fed back through a 1/N frequency divider. The
phase/frequency difference between ref and div is detected by a phase frequency
detector (PFD), whose output controls the current of charge pumps (CPs). Through
negative feedback, the PLL output frequency should be finally locked to around N
times the reference frequency.
Fig. 15. Block diagram of charge pump based PLL.
A typical implementation of the PFD is composed of two D flip-flops (DFFs). If
ref (div) is leading, ref (div) gives a rising edge first. Consequently, the DFF1 (DFF2)
output up (dn) becomes ’H’ and turns on the upper (lower) CP to pump icp
into (out of) the loop filter. The CP is turned off when div (ref) catches up by also
giving a rising edge, which resets both DFF outputs to ’L’. The corresponding timing
diagram is shown in Fig. 16. In practice, the duration when both up and dn are ’H’ is
so short that this situation can be safely neglected. The discrete transitions of the PFD
can be described by the hybrid automaton in Fig. 17, where φr and φd are the phases
of the reference clock and the divided clock, respectively.
Fig. 16. Timing diagram of PFD.
Fig. 17. PFD hybrid automaton.
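The discrete behavior of the PFD described above can be sketched as a small transition function. This is a hedged illustration rather than the automaton of Fig. 17 itself: the names `pfd_step` and `cp_current` are hypothetical, the events "ref" and "div" stand for rising edges of the two clocks (i.e. φr or φd crossing 2π), and the brief state with both outputs high is neglected as in the text.

```python
def pfd_step(state, event):
    """Next (up, dn) state of a tri-state PFD after a rising edge on 'ref' or 'div'."""
    up, dn = state
    if event == "ref":
        # A ref edge resets the PFD if div was leading, otherwise it raises up.
        return ("L", "L") if dn == "H" else ("H", "L")
    if event == "div":
        # A div edge resets the PFD if ref was leading, otherwise it raises dn.
        return ("L", "L") if up == "H" else ("L", "H")
    raise ValueError("event must be 'ref' or 'div'")


def cp_current(state, icp):
    """Charge pump current into the loop filter for a given PFD state."""
    up, dn = state
    if up == "H":
        return icp      # upper CP sources current
    if dn == "H":
        return -icp     # lower CP sinks current
    return 0.0


state = ("L", "L")
for ev in ["ref", "div", "div", "ref"]:
    state = pfd_step(state, ev)
    print(ev, state, cp_current(state, 1e-5))
```

Only the three states {H,L}, {L,L} and {L,H} are reachable, which matches the three discrete states of the hybrid automaton.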
The loop filter is a linear RC network including 2 internal node voltages v1 and
v2. v1 is also the control signal of VCO. Nonlinearity of the VCO control curve is







$\cdots + c_f^{(1)} v_1 + c_f^{(0)}$ (2.39)






f are polynomial coefficients.






























































where Icp is the charge pump current, fr is the reference frequency.
Table I lists the valid value/range of the discrete states and the continuous
variables. Nevertheless, in the implementation, the continuous space has only 3
dimensions (v1, v2, φd) for simplicity, because φr has no uncertainty and is independent of
the other continuous variables.




{up,dn} {H,L} {L,L} {L,H}
2. SMT constraints for lock time
The proposed NL-SMT based verification flow is applied to verify a key performance
metric of the PLL, the lock time. The PLL is considered locked if the phase difference between
div and ref remains within a small interval $[\underline{\Delta\phi}_L, \overline{\Delta\phi}_L]$ for at least time tmin (or
at least kmin successive time steps). A lock time specification requires the PLL to get
locked in less than time Tmax after a change in division ratio or a phase/frequency
perturbation. A specific set of SMT constraints should be formulated to check the
lock time specification.
A set of auxiliary variables is introduced: ksuc (integer), Lock (boolean), Pass
(boolean). ksuc serves as a counter that records the number of successive time steps
in which the phase difference remains in the small interval, and its initial value ksuc(0) is
set to 0. If the current phase difference is in $[\underline{\Delta\phi}_L, \overline{\Delta\phi}_L]$, then ksuc is incremented
by one; otherwise, ksuc is reset to 0. Once ksuc reaches kmin, Lock is set to true,
indicating that the PLL is currently locked. If Lock is true and the current time t has not
reached Tmax, then Pass is set to true, indicating that the lock time specification is
successfully verified. The initial values of Lock and Pass are both false. Obviously, it is
meaningless to continue reachability analysis after Tmax. Therefore, the reachability
analysis finishes either with “pass” when Pass(t) = true, or with “fail” for t > Tmax.
To update these variables as the reachability analysis moves forward, the following
SMT constraints are added as part of the dynamic constraints:
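$$[\mathrm{mod}_{[-\pi,\pi]}(\phi_d(t_0+\Delta t) - \phi_r(t_0+\Delta t)) \ge \underline{\Delta\phi}_L \;\wedge\; \mathrm{mod}_{[-\pi,\pi]}(\phi_d(t_0+\Delta t) - \phi_r(t_0+\Delta t)) \le \overline{\Delta\phi}_L] \longrightarrow k_{suc}(t_0+\Delta t) = k_{suc}(t_0) + 1, \tag{2.42}$$
$$[\mathrm{mod}_{[-\pi,\pi]}(\phi_d(t_0+\Delta t) - \phi_r(t_0+\Delta t)) < \underline{\Delta\phi}_L \;\vee\; \mathrm{mod}_{[-\pi,\pi]}(\phi_d(t_0+\Delta t) - \phi_r(t_0+\Delta t)) > \overline{\Delta\phi}_L] \longrightarrow k_{suc}(t_0+\Delta t) = 0, \tag{2.43}$$
$$k_{suc}(t_0+\Delta t) \ge k_{min} \longleftrightarrow Lock(t_0+\Delta t), \tag{2.44}$$
$$Lock(t_0+\Delta t) \wedge (t_0+\Delta t < T_{max}) \longleftrightarrow Pass(t_0+\Delta t), \tag{2.45}$$
where $\mathrm{mod}_{[-\pi,\pi]}$ is the modulo function whose output ranges from −π to π, A ↔ B
stands for (A → B) ∧ (B → A), and A → B is equivalent to ¬A ∨ B.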
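The update rules of Eq. (2.42)-(2.45) can also be read as an ordinary state update on concrete values, which is how a simulation trace would be checked against the same specification. The sketch below is such a concrete-value reading, with hypothetical names (`wrap_phase`, `update_monitor`) and made-up bounds; it is not the SMT encoding itself.

```python
import math


def wrap_phase(x):
    """Map a phase difference into [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi


def update_monitor(k_suc, phi_d, phi_r, t, dphi_lo, dphi_hi, k_min, t_max):
    """One time-step update of (k_suc, Lock, Pass), mirroring Eq. (2.42)-(2.45)."""
    err = wrap_phase(phi_d - phi_r)
    # Eq. (2.42)/(2.43): count successive in-band steps, reset otherwise.
    k_suc = k_suc + 1 if dphi_lo <= err <= dphi_hi else 0
    lock = k_suc >= k_min              # Eq. (2.44)
    passed = lock and (t < t_max)      # Eq. (2.45)
    return k_suc, lock, passed


# Hypothetical trace of phase errors (rad), one per time step.
k_suc = 0
for t, err in enumerate([0.5, 0.01, 0.0, -0.01, 0.02]):
    k_suc, lock, passed = update_monitor(k_suc, err, 0.0, t, -0.05, 0.05, 3, 10)
    print(t, k_suc, lock, passed)
```

In this trace the counter reaches kmin = 3 at the fourth step, where Lock and Pass both become true.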
3. Fast forwarding
Although the hybrid automaton of the charge pump PLL has only 3 discrete states and
4 continuous variables, its reachability analysis is not trivial for existing verification
techniques [21]. This is because most reachability analysis techniques focus on improving the
efficiency of handling continuous dynamics, whereas for lock time verification, hundreds
or thousands of discrete switches occur before the charge pump PLL gets locked,
which implies a large number of splits of the reachable space. The situation gets even
more challenging when nonlinear dynamics are considered. To ease the computational
complexity, a fast forwarding technique is exploited specifically for the reachability
analysis of the charge pump PLL.
Notice that continuous variables can be analytically expressed between any two
discrete switches. For example, the analytical forms of v1 and v2 can be obtained by











































where t0 is the starting time of the current continuous behavior. The analytical form
of φd can also be derived using symbolic analysis engine:
φd(t) = Fφd( t, v1(t0), v2(t0), φd(t0),








which is too long to be listed here.
Based on this observation, instead of moving forward with a small time step,
fast forwarding can be applied over one reference period Tref, as illustrated in Fig.
18. Each fast forwarding starts right after a discrete switch due to φr crossing
2π (named a φr switch) and stops when the next φr switch is finished, covering a
duration of Tref. Discrete switches due to φd crossing 2π (named φd switches) are
scattered between two neighboring φr switches. Since analytical forms of the continuous
variables do not exist across discrete switches, symbolic analysis is only applied between
discrete switches. Several segments of symbolic analysis might be carried out during
a fast forwarding, depending on the number of φd switches. The ending values of the
continuous variables in one segment become the starting values for the next segment.
Fig. 18. Timing diagram of fast forwarding.
While φr switches are synchronous, φd switches are asynchronous. The number
and moments of the asynchronous φd switches need to be explored on the fly. The
exploration flow is shown in Fig. 19. The period starts right after the last φr switch;
the initial time is saved as ti, and t0 is initialized to ti. At the beginning of the
loop, it is checked whether any φd switch occurs before the end of the current period.
This is done by substituting t = ti + Tref into Eq. (2.48) and checking whether φd(t)
reaches 2π. φd(t) < 2π means that no φd switch will happen in the rest of this
period. In this case, the flow jumps out of the loop and fast forwards to ti + Tref
using Eq. (2.46)-(2.48), and finally a φr switch is applied. On the other hand, if the
check gives φd(t) ≥ 2π, the flow finds the moment of the φd switch
by solving Eq. (2.48) for t with φd(t) = 2π, and the solution is saved as $t^{(d)}_{cross}$. Note
that this solving process is handled implicitly by the NL-SMT solver. The
flow then fast forwards to $t^{(d)}_{cross}$ and applies a φd switch. After updating t0 with $t^{(d)}_{cross}$,
the loop restarts from the beginning. Note that in this flow, the value of Icp is
always updated according to the current values of up and dn.
Fig. 19. Flow of fast forwarding.
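The control structure of this loop can be sketched numerically. The Python code below is only an illustration under strong assumptions: the divided clock phase is modeled as advancing at a constant, hypothetical frequency `f_div` within one period (a stand-in for the closed form of Eq. (2.48), which depends on v1 and v2), the crossing time is found by bisection instead of by the NL-SMT solver, and the update of Icp is omitted.

```python
import math

TWO_PI = 2.0 * math.pi


def solve_crossing(phi, t_lo, t_hi, iters=60):
    """Bisection for phi(t) = 2*pi, assuming phi is increasing on [t_lo, t_hi]."""
    for _ in range(iters):
        mid = 0.5 * (t_lo + t_hi)
        if phi(mid) < TWO_PI:
            t_lo = mid
        else:
            t_hi = mid
    return t_hi


def fast_forward(phi_d0, f_div, t_i, t_ref, max_rounds=100):
    """Fast forward over one reference period; return the phi_d switch times
    and the value of phi_d at t_i + t_ref."""
    t0, phi0 = t_i, phi_d0
    t_end = t_i + t_ref
    switches = []
    for _ in range(max_rounds):
        # Stand-in analytic form of phi_d between two discrete switches.
        phi = lambda t, t0=t0, phi0=phi0: phi0 + TWO_PI * f_div * (t - t0)
        if phi(t_end) < TWO_PI:
            return switches, phi(t_end)      # no more phi_d switches in this period
        t_cross = solve_crossing(phi, t0, t_end)
        switches.append(t_cross)
        t0, phi0 = t_cross, 0.0              # apply the phi_d switch: phase wraps to 0
    raise RuntimeError("more phi_d switches than max_rounds")


# Hypothetical numbers: 10 MHz reference, divided clock running at 25 MHz.
switches, phi_end = fast_forward(0.0, 25e6, 0.0, 1e-7)
print(switches, phi_end)
```

The bound `max_rounds` plays the same role as the upper bound on loop rounds that is needed when the flow is unrolled into constraints.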
The above flow should be converted into SMT constraints, so that the dynamics
of fast forwarding can fit into the proposed NL-SMT-based verification framework. A
methodology for the conversion is given in Fig. 20. First, a loop statement is unrolled
into a tree of if-else statements. Given an upper bound R on the number of rounds within
the loop, the if-else tree can be cut down to a depth of R. In our case, R should be no less than
the number of φd switches. In the proposed simulation-assisted NL-SMT approach,
the value of R can be estimated from simulation results. To keep conservativeness,
a guard band could be added. Next, each “if A then B, else C” statement is transformed
into the constraint (A → B) ∧ (¬A → C), so that B must hold when A is true and C must
hold when A is false. This transformation is applied to all the “if
A then B, else C” statements, and the resulting constraints are conjoined.
This way the SMT constraints for the fast forwarding dynamics are generated.
Fig. 20. From flow to constraints.
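The if-then-else encoding can be checked exhaustively on Boolean values. The sketch below, with hypothetical helper names, encodes one statement as the conjunction of the two implications, (A → B) ∧ (¬A → C), compares it with a direct if-else evaluation over the full truth table, and then nests the encoding to mimic a loop unrolled to a fixed number of rounds.

```python
from itertools import product


def implies(a, b):
    """A -> B, i.e. (not A) or B."""
    return (not a) or b


def ite_constraint(a, b, c):
    """Constraint form of 'if A then B, else C'."""
    return implies(a, b) and implies(not a, c)


def ite_reference(a, b, c):
    """Direct evaluation of 'if A then B, else C'."""
    return b if a else c


def unrolled_constraint(guards, steps, exits):
    """Loop unrolled to len(guards) rounds: in round r, if guards[r] holds then
    steps[r] and all later rounds must hold, otherwise exits[r] must hold."""
    rest = True
    for g, s, e in reversed(list(zip(guards, steps, exits))):
        rest = ite_constraint(g, s and rest, e)
    return rest


ok = all(ite_constraint(a, b, c) == ite_reference(a, b, c)
         for a, b, c in product([False, True], repeat=3))
print(ok)
print(unrolled_constraint([True, False], [True, True], [False, True]))
```

In the two-round example the first guard holds and the second does not, so the constraint is satisfied exactly when the first step and the second exit branch hold.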
F. Experimental results
The simulation-assisted NL-SMT based reachability analysis is applied to check the
frequency hopping of the PLL: the feedback frequency division ratio N is changed
and the PLL lock time is then verified. Given that the reference clock frequency is 10 MHz, if N
is changed from 36 to 100, the VCO frequency is expected to start from 360 MHz
and finally get locked to 1 GHz. The initial condition for v1 and v2 is set to v1 ∈
[0.7, 0.71], v2 ∈ [0.6, 0.61] to model their uncertainty. All the following experimental
results are obtained using a 4-core Intel Q9450 processor running at 2.66 GHz
with 8 GB memory.
Fig. 21. Reachable space of φd computed by the proposed reachability analysis.
Starting from φd ∈ [0, 2π), Fig. 21 shows the reachable space of φd obtained by
reachability analysis and, as a reference, the results of a 1000-sample Monte Carlo
simulation that starts randomly from the initial state space. If the lock condition is
set to |φd − φr| ≤ 0.01 × 2π lasting for at least 5 reference clock cycles, then the PLL is
verified to get locked before 2.1 µs.
To demonstrate the effectiveness of the Bayesian inference technique, we first run,
as a reference, the reachability analysis with the static Stop condition defined in
Eq. (2.15). Fig. 22 lists the runtime contributions of simulation and NL-SMT,
as well as the number of NL-SMT solver invocations, versus different sampling densities
ρ. It can be seen that the minimum runtime is around 1000 seconds. On the other hand,
the runtime of reachability analysis with the dynamic Stop condition is 873 seconds,
indicating its effectiveness in online learning and in reducing verification runtime.
Also note that for the Bayesian-based case, the number of NL-SMT solver invocations
is 26. Considering the verification time is 25 reference cycles, most of the reachable space is
discovered by the Bayesian-based random simulation, and only one box is explored by
the NL-SMT solver.
Fig. 22. Runtime vs sampling density.
For the dynamic Stop condition based case, the Stop condition is
checked every 1,000 samples of simulation (k = 1,000), and Eq. (2.38) is used for
computing $E[N^{(k)}_{\mathrm{new}} \mid H]$. Although they could be updated on the fly, the runtime of one
simulation run τsim and the runtime of one NL-SMT solver invocation τsmt are estimated
as fixed values for simplicity. According to the average runtime of 10,000 random
samples of simulation and the average runtime of 100 invocations of the NL-SMT
solver for the PLL case, the ratio between τsim and τsmt is set to be $\kappa = 1.4 \times 10^{-7}$.
Fig. 23 compares the number of simulation samples per time step between the
dynamic and static Stop conditions over time. At first, the
numbers of simulation samples are the same for the two Stop conditions. As time
goes on, the PLL starts to get locked and all trajectories start to converge to a confined
region of the state space.
Fig. 23. The numbers of samples of simulations for dynamic and static Stop conditions.
Corresponding to the shrinking reachable space, the static
Stop condition blindly uses fewer simulation samples and
stops simulation earlier. In contrast, the number of simulation samples decreases only
slightly for the dynamic Stop condition, which is based on Bayesian inference that learns
from the sampling history. Finally, the total number of SMT invocations is 26 for the
dynamic Stop condition, versus 41 for the static Stop condition. Correspondingly,
the dynamic Stop condition leads to a total runtime (including simulation and SMT)
of 873 seconds, while, due to the high runtime cost of SMT solving, the total runtime
is 5,780 seconds for the static Stop condition, which demonstrates the effectiveness of
the dynamic Stop condition in reducing the total runtime.
To find out the influence of k, the number of simulation samples between two
dynamic Stop condition checks, Fig. 24 compares the number of simulation samples
per time step over time for different values of k, from 1 to $10^5$. The corresponding
total runtime ranges from 864 seconds to 891 seconds. Although a smaller k means
that the dynamic Stop condition is checked more often, Fig. 24 shows that the variation
of k has little impact on the number of simulation samples and the total runtime,
meaning that the dynamic Stop condition is insensitive to k.
Fig. 24. The numbers of samples of simulations for dynamic Stop condition with different k.
Referring back to Eq. (2.16),
which mathematically expresses the dynamic Stop condition, its insensitivity to k
suggests that, given the same sampling history H, the posterior expected number of
new boxes that will be discovered in the next k samples of simulation ($E[N^{(k)}_{\mathrm{new}} \mid H]$)
is proportional to k. This proportional relationship fits our earlier observation that
$E[N^{(k)}_{\mathrm{new}} \mid H] \approx k\,E[N^{(1)}_{\mathrm{new}} \mid H]$ for a long sampling history. On the other hand, if k is
so large that it is comparable to the number of simulation samples per time step,
then the stop moment of simulation may be delayed unnecessarily, leading to wasteful
simulation. In Fig. 24, the largest k = $10^5$ is still much smaller than the number of
simulation samples per time step, which ranges from $5.3 \times 10^6$ to $6.4 \times 10^6$, and thus
little negative impact can be observed. Therefore, as a general strategy for choosing
k, a smaller k is preferred as long as the computation of Eq. (2.38) is much less costly
than k samples of simulation, meaning that it does no harm to check the dynamic Stop
condition as often as possible if the checking cost is affordable.
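The near-proportionality to k for a long sampling history can be checked numerically from Eq. (2.38). The sketch below is an illustration with hypothetical function names; it reuses the values mB∗ = 100, m = 1,000 and α0 = 0.001 quoted earlier and compares the expectation for k samples with k times the expectation for a single sample.

```python
def expected_new(m, m_unvisited, n, k, alpha0):
    """Eq. (2.38) for a symmetric Dirichlet prior with parameter alpha0."""
    miss = 1.0
    for i in range(1, k + 1):
        miss *= ((m - 1) * alpha0 + n + i - 1) / (m * alpha0 + n + i - 1)
    return m_unvisited * (1.0 - miss)


def proportionality_ratio(m, m_unvisited, n, k, alpha0):
    """E[N^(k)|H] divided by k * E[N^(1)|H]; close to 1 means proportional to k."""
    return expected_new(m, m_unvisited, n, k, alpha0) / (
        k * expected_new(m, m_unvisited, n, 1, alpha0))


for n in (100, 100_000):
    print(n, proportionality_ratio(1000, 100, n, 1000, 0.001))
```

For the long history (n = 100,000) the ratio stays within one percent of 1, while for the short history (n = 100) it is far below 1, so the proportional relationship should only be relied on once many samples have been accumulated.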
Fig. 25. Runtime vs initial space (with simulation assistance and Bayesian inference).
Fig. 25 shows the runtime scalability with respect to the size of the initial state
space, with the Bayesian inference based simulation stopping scheme. The initial state
space grows with the initial uncertainty of the divided clock phase φd. Note that
the reachability analysis is performed for 2.5 µs (25 reference clock cycles) and the PLL
design parameters are C1 = 2.5 pF, C2 = 0.6 pF and R1 = 160 kΩ. In contrast to the
runtime results in Fig. 25, the runtime of reachability analysis using the basic NL-SMT
flow (without simulation assistance) is more than 5 hours even for the smallest initial
state space.
G. Summary
A nonlinear SMT based approach is presented for verification of dynamic properties of
nonlinear AMS designs. Towards enabling practical NL-SMT based AMS verification,
simulation-assisted SAT and Bayesian inference are leveraged to accelerate the veri-
fication. The feasibility and efficacy of the proposed methodology are demonstrated
on conservative verification of the lock time of a charge pump based PLL.
H. Appendix
For clarity, Thm. 1 is restated here with all the conditions clearly stated. Given
a set of boxes B = {b1, ..., bM}, we randomly visit a box brch in B. The random
visit is repeated k times, and each time the probability distribution is a
vector θ = {θ1, ..., θM}, where θi is the probability of visiting box bi. Each visit
is also independent of the other visits. Therefore, the sequence of visited boxes brch is a
sequence of IID random variables. For a subset of boxes B∗ ⊆ B, the theorem states:
$$E[N^{(k)}_{B^*}] = \sum_{i: b_i \in B^*} V_i^{(k)},$$
where $E[N^{(k)}_{B^*}]$ is the expected number of boxes in B∗ that will be visited in k visits,
i.e. the expected number of boxes of B∗ appearing in the sequence brch, without counting
the repeated appearance of previously visited boxes; $V_i^{(k)}$ is the probability that box bi
is visited in k visits at least once; and $\sum_{i: b_i \in B^*} V_i^{(k)}$ means the sum of $V_i^{(k)}$ for all
the boxes bi in B∗.
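Proof. The theorem can be proven by induction.
First, for k = 1, the expectation is:
$$E[N^{(1)}_{B^*}] = \sum_{i=1}^{M} \theta_i N^{(1)}_{B^*}(i),$$
where $N^{(1)}_{B^*}(i)$ is the number of visited boxes that are in B∗, if the first and the only
visited box is bi. Obviously,
$$N^{(1)}_{B^*}(i) = \begin{cases} 1, & b_i \in B^*; \\ 0, & b_i \notin B^*. \end{cases} \tag{2.51}$$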




















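Therefore:
$$E[N^{(1)}_{B^*}] = \sum_{i: b_i \in B^*} \theta_i \cdot 1 + \sum_{i: b_i \notin B^*} \theta_i \cdot 0 = \sum_{i: b_i \in B^*} \theta_i.$$
Meanwhile, $V_i^{(1)}$ is the same as the probability that box bi is hit by 1 visit, which
is θi. Hence:
$$E[N^{(1)}_{B^*}] = \sum_{i: b_i \in B^*} V_i^{(1)},$$
which states that the theorem is proven for k = 1.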















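Second, assuming that the theorem holds for k visits, i.e.
$$E[N^{(k)}_{B^*}] = \sum_{i: b_i \in B^*} V_i^{(k)},$$
we need to prove that $E[N^{(k+1)}_{B^*}] = \sum_{i: b_i \in B^*} V_i^{(k+1)}$ is also true.
The expectation for k + 1 visits can be decomposed as:
$$E[N^{(k+1)}_{B^*}] = E[N^{(k)}_{B^*}] + E[\Delta N^{(k+1)}_{B^*}],$$
where $\Delta N^{(k+1)}_{B^*}$ is the increment of the number of visited boxes that
are in B∗ due to the (k+1)th visit, and $N^{(k)}_{B^*}$ and $N^{(k+1)}_{B^*}$ are the numbers
of visited boxes that are in B∗ by k visits and k + 1 visits, respectively. The
expectation of the increment is:
$$E[\Delta N^{(k+1)}_{B^*}] = \sum_{i_1=1}^{M} \cdots \sum_{i_{k+1}=1}^{M} \theta_{i_1} \cdots \theta_{i_{k+1}}\, \Delta N^{(k+1)}_{B^*}(i_1, \dots, i_{k+1}),$$
where $\Delta N^{(k+1)}_{B^*}(i_1, \dots, i_{k+1})$ is the increment of the number of visited boxes that are
in B∗ due to the (k+1)th visit, if the k+1 visits hit $b_{i_1}, \dots, b_{i_{k+1}}$ in order. Obviously,
$\Delta N^{(k+1)}_{B^*}(i_1, \dots, i_{k+1})$ can only be 0 or 1. Depending on the result of $b_{i_1}, \dots, b_{i_{k+1}}$,
however, there are three possible situations. The first is that $b_{i_{k+1}}$ is in B∗ and $b_{i_{k+1}}$
is not hit by the previous k visits. In this case, $\Delta N^{(k+1)}_{B^*}(i_1, \dots, i_{k+1}) = 1$. The second
situation is that $b_{i_{k+1}}$ is in B∗ but $b_{i_{k+1}}$ is already visited in the previous k visits. In
this case, $\Delta N^{(k+1)}_{B^*}(i_1, \dots, i_{k+1}) = 0$. The third situation is that $b_{i_{k+1}}$ is not in B∗. In
this case, $\Delta N^{(k+1)}_{B^*}(i_1, \dots, i_{k+1}) = 0$. These three situations are summarized by:
$$\Delta N^{(k+1)}_{B^*}(i_1, \dots, i_{k+1}) = \begin{cases} 1, & b_{i_{k+1}} \in B^*,\ i_{k+1} \notin \{i_1, \dots, i_k\}; \\ 0, & b_{i_{k+1}} \in B^*,\ i_{k+1} \in \{i_1, \dots, i_k\}; \\ 0, & b_{i_{k+1}} \notin B^*. \end{cases} \tag{2.58}$$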










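Because the increment is 0 whenever $b_{i_{k+1}}$ is not in B∗, only the terms with
$b_{i_{k+1}} \in B^*$ remain in the expectation of the increment:
$$E[\Delta N^{(k+1)}_{B^*}] = \sum_{i_{k+1}: b_{i_{k+1}} \in B^*} \theta_{i_{k+1}} \sum_{i_1, \dots, i_k} \theta_{i_1} \cdots \theta_{i_k}\, \Delta N^{(k+1)}_{B^*}(i_1, \dots, i_{k+1}), \tag{2.59}$$
Within the inner sum, the increment equals 1 only if $i_{k+1} \notin \{i_1, \dots, i_k\}$, so the inner
sum reduces to
$$\sum_{(i_1, \dots, i_k):\ i_{k+1} \notin \{i_1, \dots, i_k\}} \theta_{i_1} \cdots \theta_{i_k},$$
where the sum of $\theta_{i_1} \cdots \theta_{i_k}$ is the probability that box $b_{i_{k+1}}$ is not
visited in the previous k visits. Therefore:
$$E[\Delta N^{(k+1)}_{B^*}] = \sum_{i_{k+1}: b_{i_{k+1}} \in B^*} \theta_{i_{k+1}} \left(1 - P(i_{k+1} \in \{i_1, \dots, i_k\})\right), \tag{2.61}$$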
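where P(A) is the probability of event A, and it is obvious that the
probabilities of $i_{k+1} \in \{i_1, \dots, i_k\}$ and $i_{k+1} \notin \{i_1, \dots, i_k\}$ sum to 1. Because
$P(i_{k+1} \in \{i_1, \dots, i_k\})$ is the probability that box $b_{i_{k+1}}$ is visited in the previous k visits at least
once, which is $V^{(k)}_{i_{k+1}}$, we have:
$$E[\Delta N^{(k+1)}_{B^*}] = \sum_{i: b_i \in B^*} \theta_i \left(1 - V_i^{(k)}\right).$$
Substituting the above expression of $E[\Delta N^{(k+1)}_{B^*}]$ and the induction hypothesis into
the decomposition of $E[N^{(k+1)}_{B^*}]$ gives:
$$E[N^{(k+1)}_{B^*}] = \sum_{i: b_i \in B^*} \left[V_i^{(k)} + \theta_i \left(1 - V_i^{(k)}\right)\right]. \tag{2.63}$$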
On the other hand, $V_i^{(k+1)}$, the probability that box bi is visited in the k + 1
visits at least once, can be expressed by conditional probabilities as:
$$V_i^{(k+1)} = P(b_{i_{k+1}} = b_i \mid \{b_{i_1}, \dots, b_{i_k}\} \not\ni b_i) \cdot P(\{b_{i_1}, \dots, b_{i_k}\} \not\ni b_i) + P(b_{i_{k+1}} \in B \mid \{b_{i_1}, \dots, b_{i_k}\} \ni b_i) \cdot P(\{b_{i_1}, \dots, b_{i_k}\} \ni b_i), \tag{2.64}$$
which is based on the observation that the event that box bi is hit by the k+1 visits
at least once means either (1) $b_{i_{k+1}}$ (the box that is hit by the (k + 1)th visit) is bi if
the previous k visits have not hit bi; or (2) $b_{i_{k+1}}$ can be any box in B if the previous k
visits have already hit bi. Since the sequence of visited boxes is IID, we have:
$$P(b_{i_{k+1}} = b_i \mid \{b_{i_1}, \dots, b_{i_k}\} \not\ni b_i) = P(b_{i_{k+1}} = b_i) = \theta_i \tag{2.65}$$
and
$$P(b_{i_{k+1}} \in B \mid \{b_{i_1}, \dots, b_{i_k}\} \ni b_i) = P(b_{i_{k+1}} \in B) = 1. \tag{2.66}$$
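Substituting them back into Eq. (2.64), and considering that the probabilities that the
previous k visits have already hit bi and that the previous k visits have not hit bi are
$V_i^{(k)}$ and $1 - V_i^{(k)}$, respectively, we have:
$$V_i^{(k+1)} = \theta_i \left(1 - V_i^{(k)}\right) + 1 \cdot V_i^{(k)} = V_i^{(k)} + \theta_i \left(1 - V_i^{(k)}\right). \tag{2.67}$$
Comparing Eq. (2.63) with Eq. (2.67) gives:
$$E[N^{(k+1)}_{B^*}] = \sum_{i: b_i \in B^*} V_i^{(k+1)},$$
and therefore the theorem is proven for k + 1 visits.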
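As a sanity check that is independent of the induction argument, the theorem can be verified by brute force for small cases. The Python sketch below, with hypothetical function names and a made-up θ, enumerates all M^k visit sequences, weights each by its probability, counts the distinct boxes of B∗ that it hits, and compares the resulting expectation with the sum of 1 − (1 − θi)^k over B∗.

```python
from itertools import product
from math import prod


def expected_by_enumeration(theta, b_star, k):
    """Exact E[N_{B*}^(k)] by summing over all length-k visit sequences."""
    total = 0.0
    for seq in product(range(len(theta)), repeat=k):
        p = prod(theta[i] for i in seq)        # probability of this IID sequence
        total += p * len(set(seq) & b_star)    # distinct boxes of B* that were hit
    return total


def expected_by_theorem(theta, b_star, k):
    """Right-hand side of Thm. 1 with V_i^(k) = 1 - (1 - theta_i)^k."""
    return sum(1.0 - (1.0 - theta[i]) ** k for i in b_star)


theta = [0.5, 0.3, 0.15, 0.05]   # hypothetical visiting probabilities
b_star = {2, 3}                  # indices of the boxes in B*
for k in range(1, 5):
    print(k, expected_by_enumeration(theta, b_star, k),
          expected_by_theorem(theta, b_star, k))
```

The enumeration grows as M^k, so this check is only practical for very small M and k, which is exactly why the closed form is useful.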
CHAPTER III
HIGH-RESOLUTION ON-CHIP JITTER MEASUREMENT
In-situ test takes a different approach from verification to enhance the design robust-
ness, by integrating testability into hardware designs. This chapter focuses on the
in-situ test of jitter, which is needed by today’s high-speed high-precision digital and
analog ICs. Note that the proposed in-situ testing structure can be applied to char-
acterize different types of jitter-related analog performance. A preliminary work of
the proposed structure is published in [27].
A. Introduction
Timing precision, measured in the form of jitter, is crucial for a broad
range of high-speed high-precision digital and analog ICs. Jitter is one of the most
important performance metrics for the clock data recovery (CDR) in I/O circuitry as well
as the clock generation in high-speed digital signal processing circuitry, where phase
locked loops (PLLs) or delay locked loops (DLLs) are employed [11]. As an example,
today’s on-chip serial links operate at data rates of multiple Gb/s [12]. The clock jitter
in serial-link transceivers degrades the transmitted and received data margin. It may
also cause the received data to fall outside the design boundary. Moreover, from the
perspective of RF applications, jitter performance is also a key concern because clock
jitter will turn into the phase noise of wireless signals [1]. Hence, high-resolution jitter
characterization is essential.
Traditionally, jitter is measured using external test equipment. State-of-the-art
time interval analyzers (TIAs) provide femtosecond accuracy. However, not only
are they expensive, but their achievable resolution is also limited by the noise injected
along the on-chip to off-chip signal propagation path. In this regard, low-cost
high-resolution on-chip solutions are particularly appealing. More importantly, in-situ
jitter characterization allows online monitoring of design performance, and further
provides the possibility of self-diagnosis and healing in the event of jitter-induced
failures.
Nevertheless, providing low-cost high-resolution on-chip jitter characterization
remains a significant challenge to date. In [28] [29], an analog approach is
developed that converts the jittered time duration into the voltage of a capacitor, and
an analog-to-digital converter (ADC) is used to convert the voltage into digital code.
However, due to its analog operation, this approach is highly sensitive to process
variation and vulnerable to noise interference such as power supply noise. In contrast,
the delay-line-based technique directly converts the jittered time duration into digital
code by means of a line of delay cells and D flip-flops (DFFs) [30]. Its time resolution
is determined by the delay of a single delay cell. To further improve the
resolution, the Vernier delay line (VDL) technique employs two delay lines whose cell
delays are slightly different from each other, and the delay difference is its time
resolution [31]. The drawback of VDL is its small measuring range. A similar technique,
the Vernier ring oscillator (VRO), employs two ring oscillators whose oscillation periods
are slightly different from each other [32] [33]. Compared with VDL, VRO provides
a larger measuring range, but its sampling frequency is much lower. Moreover,
a sophisticated calibration scheme is usually required for Vernier-style techniques to
compensate for the mismatch between cell delays. Recently, the gated ring oscillator (GRO)
has been proposed to achieve an effective resolution finer than the delay of a single delay cell,
through over-sampling and quantization noise shaping [34]. Merging the principles
of VRO and GRO, [35] developed a technique named Vernier GRO, which involves
two Vernier-style GROs, so that the already reduced quantization noise of the VRO is
further shaped by the GRO operation. However, its highest sampling frequency is
largely limited by the VRO sampling frequency, which makes it unsuitable for
high-speed applications.
In this work, a novel built-in jitter measurement technique is presented that is well
suited to the jitter measurement of high-speed signals. The proposed structure
is composed of a GRO, a delay line and a digital signal processing (DSP) unit. The
GRO can provide a coarse measurement of the input time duration, and the resolution
is equal to the stage delay of the GRO. With the assistance of the delay line, the time
residue which is not converted in the coarse measurement will be measured with a finer
resolution. The cell delay of the delay line is slightly different from the stage delay
of the GRO. The combination of the delay line and the GRO forms a Vernier-style
structure, which converts the time residue in the coarse measurement into digital code
with a resolution much finer than the GRO stage delay. On top of that, the GRO’s
benefit of quantization noise shaping is inherited by the fine measurement result, and
therefore an even higher effective resolution can be achieved. Finally, the DSP unit is
responsible for decoding the outputs of the GRO and the delay line into digital codes,
as well as converting the fine code into the fractional part of the coarse code, where
the fine resolution is implicitly calibrated with respect to the coarse resolution on
the fly. Moreover, pipelined structures are applied in the DSP unit to enable a high
sampling frequency.
Note that the proposed technique is different from the Vernier GRO in [35].
Because the fine measurement is only applied for the time residue from the coarse
level in the two-level measurement structure, the proposed technique can achieve a
sampling frequency much higher than Vernier GRO, which makes it a better candidate
for high speed applications. Moreover, the proposed structure has a high tolerance for
the typical delay mismatch of 90nm technologies, which is demonstrated by behavioral
model simulations.
This chapter proceeds as follows. Sec.B introduces the architecture of the proposed
built-in jitter measurement technique. Sec.C describes the circuit implementation.
In Sec.D, the experimental results for the proposed technique are presented using a
commercial 90nm CMOS process, and the influence of delay mismatch is analyzed.
The final section draws a brief summary.
B. Proposed structure
The block diagram of the proposed on-chip jitter measurement scheme is shown in
Fig. 26. Its target is to convert the input time into digital code. The input time is
given as the time difference between the rising edges of START and STOP.
Fig. 26. Block diagram of GRO-PVDL structure.
The proposed structure is named GRO-PVDL because it has two levels: the
first level is a GRO that provides coarse measurement; the second level provides fine
measurement by a VDL-style structure. Unlike a standard VDL, which requires two
delay lines, only one delay line is needed at the second level, and the GRO at the
first level serves as the other delay line. The delay line at the second level is therefore called
a partial Vernier delay line (PVDL). This section first introduces VDL and GRO, and
then the principle of the proposed GRO-PVDL is explained.
1. Vernier delay line
Fig. 27. VDL: (a) structure, (b) timing diagram.
*START(i): START delayed by i·τ1, STOP(i): STOP delayed by i·τ2
Fig. 27(a) illustrates a typical VDL that is composed of two tapped delay lines,
whose cell delays are τ1 and τ2. Note that τ1 is slightly larger than τ2: τ1 − τ2 = τ∆.
The taps of line I are connected to the clock inputs of a series of DFFs. The data
inputs of the DFFs come from the corresponding taps of line II. The input time
duration is given as the time interval (tin) between the rising edges of the START and
STOP signals, which are the inputs of line I and line II, respectively.
Fig. 27(b) illustrates an example of VDL timing diagram. The output of DFFs,
digital code D, has the pattern of 0..01..1: 0’s followed by 1’s. With the increase of
tin, the number of 0’s in D will increase proportionally. Therefore, the time duration
can be digitized through decoding D (counting the number of 0’s in D).
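The VDL digitization described above can be sketched behaviorally as follows. This is a minimal idealized model rather than the circuit of Fig. 27: the function names, the integer-picosecond time unit and the example delays (30 ps and 25 ps) are assumptions made for illustration, and DFF setup/hold effects are ignored.

```python
def vdl_outputs(t_in, tau1, tau2, n_cells):
    """Idealized VDL: DFF i is clocked by START(i) and samples STOP(i).

    All times are integers in picoseconds (an assumption of this sketch).
    """
    d = []
    for i in range(1, n_cells + 1):
        start_edge = i * tau1          # rising edge of START delayed by i*tau1
        stop_edge = t_in + i * tau2    # rising edge of STOP delayed by i*tau2
        # The DFF captures 1 only if STOP(i) has already risen at its clock edge.
        d.append(1 if stop_edge < start_edge else 0)
    return d


def decode_zeros(d):
    """Count the leading 0's of the code D (0's followed by 1's)."""
    count = 0
    for bit in d:
        if bit:
            break
        count += 1
    return count


if __name__ == "__main__":
    code = vdl_outputs(t_in=12, tau1=30, tau2=25, n_cells=8)
    print(code, decode_zeros(code))
```

In this model the number of leading 0's equals the input time divided by the delay difference tau1 - tau2, rounded down, which is the proportional relationship stated in the text.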
An equivalent D value can also be obtained by the one-line structure in Fig. 28(a).
Note that it is just an imaginary design for convenient explanation, since a single cell
with the delay of τ∆ is usually too small to implement. From its corresponding timing
diagram in Fig. 28(b), we can directly see the proportional relationship between the
number of 0’s in D and tin, and that the resolution is τ∆. Since the resolution of VDL
is the difference between two cell delays, it is not limited to the smallest delay of a
single cell.
Fig. 28. Equivalence of VDL: (a) structure, (b) timing diagram.
2. Gated ring oscillator
The GRO structure was first proposed in [34] as part of a time-to-digital converter (TDC)
for all-digital phase-locked loops (ADPLLs). In the two-level structure proposed in
this chapter, the GRO serves at the first level to provide the coarse measurement. Its
principle is illustrated in Fig. 29. The time difference between the rising edges of
the periodic signals START and STOP is converted into a periodic pulse signal EN,
with jitter on its pulse width. When EN is high, the GRO oscillates like a normal
ring oscillator, triggering the subsequent counters that capture the signal transitions
at the GRO taps GA, GB and GC. As EN transits from high to low, the GRO stops
oscillating abruptly, with the voltage at each tap frozen. The GRO then resumes
oscillating from its frozen state when EN becomes high again. This way, the counters
are actually counting the number of GRO stage delays within the pulse width of EN.
So the resolution of GRO equals its stage delay τG, which is the delay of a single
inverter. Also note that the clock to sample ADD is generated by delaying STOP.
The delay dCK is inserted to ensure that ADD is sampled while satisfying the setup
time constraint of the registers, which means that the GRO has been frozen and the
counters and the adder have finished their operations.
Because GRO phase does not change when EN is low, we can make an equivalent
timing diagram by removing the frozen intervals and linking the oscillating intervals
together, as shown in Fig. 30. For convenient explanation, we assume that ADD
changes instantaneously with the GRO phase. At the beginning time t[i] and the
ending time t[i+ 1] of the ith EN pulse width tEN[i], the values of ADD are ADD [i]
and ADD [i+ 1], respectively. Then the coarse measurement of tEN[i] is given by:
DIFF[i] = ADD[i+ 1]−ADD[i], (3.1)
with the unit of τG.
The quantization errors for ADD [i] and ADD [i+ 1] are:
q[i] = t[i]− tADD[i], (3.2)
q[i+ 1] = t[i+ 1]− tADD[i+ 1], (3.3)
where tADD[i] (tADD[i+1]) is the time when ADD transits from ADD [i]−1 (ADD [i+1]−1)
to ADD [i] (ADD [i + 1]), and q[i] (q[i + 1]) uniformly ranges within [0, τG).
Combining Eq. (3.1)-(3.3), and also considering:
tADD[i+ 1]− tADD[i] = (ADD[i+ 1]−ADD[i]) · τG, (3.4)
we have:
tEN[i] = t[i+ 1]− t[i] = DIFF[i] · τG + (q[i+ 1]− q[i]), (3.5)
where the first term on the right side is the coarse measurement times the unit τG,
and the second term is the overall quantization error of the coarse measurement.
Fig. 29. GRO: (a) structure, (b) timing diagram.
Fig. 30. Equivalent timing diagram of GRO.
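The relationship in Eq. (3.1)-(3.5) can be checked numerically with a small behavioral model of the equivalent timing diagram. This is only a sketch: the function name `gro_coarse`, the initial offset `t0` and the integer-picosecond time unit are assumptions, and q[i] is taken as the time elapsed since the most recent ADD transition so that it lies in [0, τG).

```python
def gro_coarse(pulse_widths, tau_g, t0=0):
    """Behavioral model of the equivalent GRO timing diagram (frozen intervals removed).

    Returns the coarse codes DIFF[i] and the quantization errors q[i].
    Times are integers in picoseconds; t0 is an assumed initial phase offset.
    """
    t = t0
    add = [t // tau_g]       # ADD[i]: tap transitions counted so far
    q = [t % tau_g]          # q[i]: time elapsed since the last ADD transition
    for width in pulse_widths:
        t += width           # the GRO phase only advances while EN is high
        add.append(t // tau_g)
        q.append(t % tau_g)
    diff = [add[i + 1] - add[i] for i in range(len(pulse_widths))]
    return diff, q


if __name__ == "__main__":
    widths = [110, 97, 103]
    diff, q = gro_coarse(widths, tau_g=25, t0=7)
    for i, width in enumerate(widths):
        # Eq. (3.5): tEN[i] = DIFF[i] * tau_G + (q[i+1] - q[i])
        print(width, diff[i] * 25 + (q[i + 1] - q[i]))
```

Each printed pair is identical, because the leftover error of one pulse is carried into the next pulse instead of being discarded.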
As explained above, thanks to the phase freezing during the disabled state, the
quantization error of the GRO is qns[i] = q[i+ 1]− q[i], instead of q[i]. For general cases,
q[i] can be treated as white noise [34]. Note that qns[i] can be obtained by filtering
q[i] with a first-order high-pass filter. Therefore, compared with the flat (white)
spectrum of q, the frequency spectrum of qns is shaped to be small at lower frequencies
and large at higher frequencies, as illustrated in Fig. 31. The benefit of quantization
noise shaping lies at the low frequencies, where qns is smaller than q. If the sampling
frequency is much higher than the bandwidth of the measured signal, then by
low-pass filtering the measurement result, the high-frequency power of qns is
removed and therefore an effective resolution finer than τG can be achieved. To avoid
confusion, τG will be called the GRO’s raw resolution. The principle of GRO quantization
noise shaping is similar to that of a ∆Σ ADC [36].
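The benefit of first-order shaping can be illustrated with the simplest low-pass filter, a plain average over n measurements. The sketch below is an illustration under assumptions (uniform white q, a hypothetical helper name `mean_errors`, an arbitrary seed): the shaped errors telescope to (q[n] − q[0])/n, so their average is bounded by τG/n, whereas the average of unshaped white errors only shrinks statistically.

```python
import random


def mean_errors(n, tau_g, seed=0):
    """Average n unshaped and n first-order-shaped quantization errors.

    q[i] is drawn uniformly from [0, tau_g], an assumed white-noise model.
    """
    rng = random.Random(seed)
    q = [rng.uniform(0.0, tau_g) for _ in range(n + 1)]
    # Unshaped case: each measurement carries its own (zero-mean) error.
    unshaped = sum(q[i] - tau_g / 2 for i in range(n)) / n
    # Shaped case: each measurement carries q[i+1] - q[i]; the sum telescopes.
    shaped = sum(q[i + 1] - q[i] for i in range(n)) / n
    return unshaped, shaped


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, mean_errors(n, tau_g=25.0))
```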
Fig. 31. Frequency spectra of without/with quantization noise shaping.
3. The proposed GRO-PVDL structure
To achieve higher resolution for jitter measurement, a novel GRO-PVDL structure
is proposed that improves on the GRO’s raw resolution while keeping the feature of
first-order quantization noise shaping. The GRO-PVDL structure includes a GRO,
a delay line, a group of DFFs and a subsequent DSP unit. The first level is a
standard GRO. At the second level, the PVDL cell delay τP is designed to be slightly
longer than the GRO stage delay τG: τP − τG = τ∆, so that the PVDL and the GRO
together compose a VDL-style structure.
The input to the PVDL is also the pulse signal EN. Each tap of the PVDL
triggers the clock input of a DFF. The data inputs to all the DFFs come from Gxor, the
XOR of all GRO taps. As shown in Fig. 32, Gxor switches its polarity every GRO
stage delay τG while the GRO is shifting its phase. Note that here we temporarily
assume that Gxor changes instantaneously with the GRO phase. Referring back
to the GRO timing diagram in Fig. 30, the GRO has a phase residue of τG − q[i] (q[i] is
the quantization error for ADD [i]). Therefore, Gxor will switch its polarity at τG − q[i]
after the rising edge of the ith EN pulse EN [i]. Thanks to the VDL-style structure
at the second level, τG − q[i] can be further digitized with a resolution of τ∆.
Fig. 32. The proposed GRO-PVDL structure.
Fig. 33. The timing diagram of GRO-PVDL.
As shown in Fig. 33, the EN rising edge is delayed by the PVDL with its single
cell delay τP, to trigger the clock inputs of the DFFs one by one. On the other hand,
Gxor will switch every τG. The first Gxor switch is later than the EN rising edge by
τG − q. According to the principle of the standard VDL in Fig. 27, τG − q can be
digitized by decoding the outputs of the DFFs D. Note that the pattern of D is 1010...
or 0101..., and the position of the first double 1’s or 0’s indicates the time duration
of τG − q. Because the resolution of the second level is τ∆, we have:
τG − q[i] = Cv[i] · τ∆ + qf [i], (3.6)
where Cv is the output of the VDL-style structure that is decoded from D, and qf is
the fine quantization error ranging within [0, τ∆) which is also a white noise. Similarly,
for the next measurement, we have:
τG − q[i+ 1] = Cv[i+ 1] · τ∆ + qf [i+ 1]. (3.7)
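The decoding of D into Cv can be sketched with an idealized model of the PVDL sampling Gxor. Several things here are assumptions for illustration only: the function names, integer-picosecond times, the example delays (τG = 25 ps, τP = 30 ps), and the convention that the value of Gxor just before the EN edge (`g0`) is known and used as the reference when looking for the first repeated bit.

```python
def sample_gxor(residue, tau_g, tau_p, n_cells, g0=0):
    """Sample an ideal Gxor with DFFs clocked by the PVDL taps.

    Gxor toggles at residue, residue + tau_g, ... after the EN rising edge,
    and tap k of the PVDL clocks its DFF at k * tau_p.
    """
    d = []
    for k in range(1, n_cells + 1):
        t_clk = k * tau_p
        if t_clk > residue:
            # number of Gxor toggles strictly before the sampling instant
            toggles = -(-(t_clk - residue) // tau_g)
        else:
            toggles = 0
        d.append(g0 ^ (toggles & 1))
    return d


def decode_cv(d, g0=0):
    """Return the number of alternating bits before the first repeated bit."""
    prev = g0
    for k, bit in enumerate(d):
        if bit == prev:
            return k
        prev = bit
    raise ValueError("no repeated bit found; the delay line is too short")


if __name__ == "__main__":
    d = sample_gxor(residue=18, tau_g=25, tau_p=30, n_cells=10)
    print(d, decode_cv(d))
```

In this model the decoded value equals the residue divided by τ∆ = τP − τG, rounded down, in agreement with Eq. (3.6).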
Through the differentiation of the fine measurement, i.e. subtracting Eq. (3.7)
from Eq. (3.6), we have:
q[i+ 1]− q[i] = DIFFf [i] · τ∆ + (qf [i]− qf [i+ 1]), (3.8)
where DIFFf [i] = Cv[i] − Cv[i + 1] is the fine measurement at the second level. It is
seen that the quantization error of the coarse measurement is further converted into
digital code DIFFf with a finer resolution τ∆. Substituting Eq. (3.8) into Eq. (3.5),
we can write:
tEN[i] = DIFFc[i] · τG + DIFFf [i] · τ∆ + (qf [i]− qf [i+ 1]), (3.9)
where DIFFc represents the coarse measurement (the replacement of DIFF in Eq. (3.5)
for clarity), DIFFf represents the fine measurement, and the last term on the right
side is the overall quantization error.
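The two-level arithmetic of Eq. (3.5)-(3.9) can be exercised with a short behavioral sketch. It assumes ideal components, integer-picosecond times, and hypothetical names (`gro_pvdl_measure`, `t0`); it is not the DSP implementation described later in the chapter.

```python
def gro_pvdl_measure(pulse_widths, tau_g, tau_delta, t0=0):
    """Combine the coarse (GRO) and fine (VDL-style) codes into time estimates."""
    edges = [t0]
    for width in pulse_widths:
        edges.append(edges[-1] + width)      # accumulated oscillation time

    add = [t // tau_g for t in edges]                        # coarse code ADD[i]
    cv = [(tau_g - t % tau_g) // tau_delta for t in edges]   # fine code of tau_G - q[i]

    estimates = []
    for i in range(len(pulse_widths)):
        diff_c = add[i + 1] - add[i]         # Eq. (3.1)
        diff_f = cv[i] - cv[i + 1]           # definition below Eq. (3.8)
        estimates.append(diff_c * tau_g + diff_f * tau_delta)
    return estimates


if __name__ == "__main__":
    widths = [110, 97, 103]
    print(gro_pvdl_measure(widths, tau_g=25, tau_delta=5, t0=7))
```

For these inputs the estimates differ from the true pulse widths by less than τ∆ = 5 ps, compared with up to τG = 25 ps for the coarse measurement alone.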
Eq. (3.9) tells us that not only is the residue of the coarse measurement digitized
with a finer resolution, but the feature of first-order quantization noise shaping
is also kept for the fine measurement. Similar to the GRO, by low-pass filtering the
measurement result, an effective resolution finer than the raw resolution can be achieved.
Last but not least, the calibration between the fine resolution and the coarse res-
olution is required for combining the fine and coarse measurements to provide an
overall measurement. This calibration is implemented in the DSP unit, and will be
introduced in the next section.
C. Circuit implementation
The proposed GRO-PVDL architecture is implemented using a commercial 90nm
CMOS technology. All the circuits except for the DSP unit are designed with an analog
design flow, because the analog properties of the GRO, the PVDL and the DFFs
have a significant influence on measurement accuracy. By contrast, the DSP unit
is designed with a digital design flow. In this section, the circuit implementation of
GRO-PVDL is presented, including practical issues and circuit optimizations.
1. GRO
To save hardware cost, the GRO at the first level is implemented with three stages,
and the stage delay is designed to be 25ps. A critical issue of the GRO design is gating
phase shift, which is also called gating skew in [37]. As mentioned earlier, first-order
quantization noise shaping relies on the GRO perfectly freezing its phase while
disabled. In a practical implementation, however, the GRO phase
shifts due to the charge redistribution within “floating” delay cells.
To illustrate the issue of GRO gating phase shift, we take the inverter-based
delay cell in Fig. 34(a) as an example. Fig. 34(b) is a simplified model of the gated
inverter, which is valid when its output voltage Vo is falling and the PMOS transistors
are already in the cutoff region. The waveforms in Fig. 34(c) show that once EN turns
low, Vo drops due to the charge redistribution between Co and Cp. When EN turns
back to high, Vo continues falling from this level. Due to the drop of Vo,
the GRO phase is not fully frozen when EN is low; instead, an extra phase shift is
introduced. Moreover, the amount of gating phase shift is determined by the GRO
phase when EN switches from high to low, as shown by the comparison of case 1 and
case 2 in Fig. 34(c).
Fig. 34. GRO phase shift: (a) gated inverter, (b) simplified model, (c) waveform.
Fig. 35. Simulated phase shift vs ϕdisab of a three-stage inverter-based GRO.
Fig. 35 shows the simulated gating phase shift as a function of the GRO phase at
the disabling moment (ϕdisab) for a three-stage inverter-based GRO. The gating phase
shift is given as absolute time, and it varies within a range of around 8ps. Given that ϕdisab
is random in general, the variation of the gating phase shift eventually turns
into noise in the measurement result. Also note that the DC value of the gating phase
shift is usually not critical for jitter measurement, because the measurement result
can be easily calibrated by adding a DC offset. Therefore it is desirable to minimize
the range of gating phase shift. One possible solution is to increase the
capacitive loading of each stage by inserting dummy transistors, so that the effect of
charge redistribution is reduced. But this will increase the GRO stage delay.
Fig. 36. Gating phase shift of three-stage inverter-based GRO.
Another solution comes from the observation that the gating phase shifts of the
three-stage inverter-based GRO are complementary to each other for rising Vo and
falling Vo, as illustrated in Fig. 36. The gating phase shift is minimum when Vo
is rising and maximum when Vo is falling. So it is possible to cancel the minimum
and maximum phase shifts against each other by using differential delay cells instead of
single-ended inverters, because a differential structure has two output voltages Vo and
nVo, and the rising/falling of Vo is always accompanied by the falling/rising of nVo.
In the implementation, the differential delay cell with cross-coupled inverters
(CCI) [38] is chosen to build the GRO. As shown in Fig. 37, it is composed of
two large inverters (P1/N1 and P2/N2), whose outputs are coupled with each other
through two small inverters (P3/N3 and P4/N4). Two transistors are inserted to
enable oscillation gating, one PMOS above and one NMOS beneath. Thanks to the
cross coupling, the gating phase shifts due to the charge redistribution of the two
larger inverters mostly cancel each other. Fig. 38 shows the simulated
range of gating phase shift of the three-stage CCI-cell-based GRO. Note that the phase
shift range is only around 1.5ps, which is much lower than that of the inverter-based
GRO.
Fig. 37. Gated delay cell with CCI: (a) symbol, (b) schematic.
Fig. 38. Simulated phase shift vs ϕdisab of a three-stage CCI-cell-based GRO.
Apart from reducing the gating phase shift, the CCI delay cell has other merits
that make it a good candidate for building the GRO. The differential structure
achieves higher power supply rejection and provides rail-to-rail output voltage
swing [39]. Moreover, the delay cell with CCI has higher immunity to rising/falling
delay mismatch, and thus the GRO has more even tap delays.
Fig. 39. Simulated phase shift vs EN /nEN rising/falling time of a three-stage inverter-based GRO.
Finally, the range of gating phase shift can be further reduced by increasing the
rising/falling time of EN /nEN. Fig. 39 demonstrates this phenomenon for the three-stage
inverter-based GRO. An intuitive explanation is that the GRO is “weakly”
oscillating while EN /nEN is rising/falling, and as a result the GRO phase when
disabled, ϕdisab, is more like a continuous variable than a fixed value, so the shifted
phases are effectively averaged over the range of ϕdisab. In our implementation,
by adding dummies to increase the load capacitance of EN /nEN, the rising/falling
time of EN /nEN is enlarged to 80ps (more than three GRO stage delays), and the
simulated peak-to-peak phase shift is reduced to below 1ps.
Referring back to Fig. 30, the noise due to gating phase shift, similar to the
quantization noise, is added to both t[i] and t[i + 1] because it is injected at the
beginning of each EN pulse. Therefore, the noise due to gating phase shift is also
shaped like the quantization noise.
2. Counters
To count every stage delay of the GRO, two counters are needed for each tap, one
to count rising edges and one to count falling edges. One challenge of the counter
design is its high input frequency. For the three-stage GRO with 25ps stage delay,
each tap completes one cycle every six stage delays (150ps), so each counter sees an
input frequency of about 6.7GHz.
Asynchronous counters, whose structure is shown in Fig. 40, are therefore adopted to
handle such a high frequency. Compared with synchronous counters, asynchronous
counters can operate at a much higher input frequency, and the highest operating
frequency will not decrease for larger bit width [40]. Simulation shows that an asyn-
chronous counter made of standard cell DFFs is already able to handle over 8GHz
input. For asynchronous counters using true single phase clocked (TSPC) DFFs [41],
an operating frequency as high as 20GHz can be achieved in simulation. And the
measuring range of the input time duration is 2^k · 6τG, where τG=25ps and k is the
bit width of the counters. In the implementation, the counter bit width is designed
to be 4, which allows a measuring range of 2.4ns. Also note that the asynchronous
counter will automatically switch its output from all 1’s to all 0’s when overflow
occurs, and a flag signal OF is output to indicate the occurrence of overflow.
Fig. 40. Asynchronous counter.
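The measuring range and the overflow behavior stated above can be reproduced with a few lines of arithmetic. The helper names below (`measuring_range_ps`, `WrappingCounter`) are hypothetical, and the class is only a behavioral stand-in for the counter of Fig. 40, not a model of its ripple timing.

```python
def measuring_range_ps(bit_width, n_stages, tau_g_ps):
    """Measuring range 2^k * (2 * n_stages) * tau_G for a k-bit cycle counter."""
    stage_delays_per_cycle = 2 * n_stages    # 6 for a three-stage GRO
    return (2 ** bit_width) * stage_delays_per_cycle * tau_g_ps


class WrappingCounter:
    """Behavioral k-bit counter that wraps to 0 and raises an overflow flag (OF)."""

    def __init__(self, bit_width):
        self.modulus = 1 << bit_width
        self.value = 0
        self.overflow = False

    def clock(self):
        self.value = (self.value + 1) % self.modulus
        if self.value == 0:                  # all 1's rolled over to all 0's
            self.overflow = True


if __name__ == "__main__":
    print(measuring_range_ps(bit_width=4, n_stages=3, tau_g_ps=25))  # 2400 ps = 2.4 ns
```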
For the three-stage GRO, six such asynchronous counters would be needed.
To reduce the hardware cost, however, the implementation requires only one asynchronous
counter, using a phase tracking technique [37]. Fig. 41 shows the counter
implementation, where a single-ended design is shown for simplicity, while the real
circuits are implemented with differential structures. Instead of six counters, only
one counter is used to count the number of rising edges of the GRO tap GC. For
the three-stage GRO, GC has a rising edge every 6 GRO stage delays, i.e. every GRO
cycle. Therefore, the counter output Ncycle is multiplied by 6 to represent the number
of GRO tap transitions.
Fig. 41. Phase tracking based counting structure (single-ended version).
On the other hand, the GRO phase can be translated into a number of GRO
tap transitions smaller than 6. Between any two successive rising edges of GC, the
GRO phase shifts in a fixed pattern with a length of 6, as illustrated in Fig. 42. So
by chronologically decoding the GRO phases into 0, 1, ..., 5, the number of GRO tap
transitions within the current GRO cycle, Nstage, is obtained from the GRO phase
GAGBGC. GAGBGC is sampled when it is frozen, and the phase decoder is
implemented in the DSP unit. Finally, the number of total tap transitions ADD is
given by 6Ncycle +Nstage.
Fig. 42. Phase transition of three-stage GRO.
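The phase-tracking count can be sketched as follows. The six-state pattern is derived here from an ideal three-inverter ring, with state 0 taken to be the phase right after a rising edge of GC; this labeling, the tap ordering (GA, GB, GC) and the function names are assumptions of the sketch and may not match the exact encoding of Fig. 42.

```python
def ring_phase_sequence(start=(1, 0, 1)):
    """Step an ideal three-inverter ring (GA, GB, GC) through one full cycle."""
    state = list(start)
    sequence = []
    for _ in range(6):
        sequence.append(tuple(state))
        for k in range(3):
            # Stage k inverts the previous stage's output (GA is driven by GC),
            # so an output equal to its input is the one about to switch.
            if state[k] == state[k - 1]:
                state[k] ^= 1
                break
    return sequence


# Decode table: sampled phase -> number of tap transitions since GC last rose.
PHASE_TO_NSTAGE = {phase: n for n, phase in enumerate(ring_phase_sequence())}


def total_transitions(n_cycle, sampled_phase):
    """ADD = 6 * Ncycle + Nstage."""
    return 6 * n_cycle + PHASE_TO_NSTAGE[sampled_phase]


if __name__ == "__main__":
    print(ring_phase_sequence())
    print(total_transitions(3, (0, 1, 0)))
```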
In spite of saving hardware cost, the above counting scheme has an issue that
would result in over/under-counting. The timing mismatch between Nstage and Ncycle
might occur when the GRO phase is frozen around the rising edge of GC. As is shown
in Fig. 43, if Nstage has not been switched by the rising edge of GC but Ncycle has,
then ADD is over-counted by 5; if Nstage has been switched by the rising edge of GC
but Ncycle has not, then ADD is under-counted by 5. A latch sharing technique can
be applied to solve the issue [37]. As shown in Fig. 44, the register connected to GC
is separated into two latches, and the first latch is shared with the counter. This way,
the counter input and the decoder input are guaranteed to be synchronized to each
other.
Fig. 43. Over/under-counting due to counter/decoder input mismatch.
Fig. 44. GRO counting structure with latch sharing.
Another issue of the counter is glitches at its input, which also cause over-counting.
When the counter input is frozen close to the middle level between
high and low, a glitch might occur due to noise, as illustrated in Fig. 45. The latch
between GC and the counter largely reduces the chance of a glitch. Moreover, a pair
of small cross-coupled inverters is inserted at the differential output of the latch,
in order to force the output to a logic level and ensure it will not be overturned by
noise.
Fig. 45. Glitch illustration.
3. PVDL
The PVDL is composed of a line of differential delay cells with CCI, and its schematic
is shown in Fig. 46. The cell delay τP is designed to be 30ps by proper dummy
transistor loading. Considering the GRO stage delay τG=25ps, a raw resolution of
τ∆=5ps can be achieved with the proposed GRO-PVDL structure. As mentioned
before, the residue of the coarse measurement ranges from 0 to τG. Consequently, at
least τG/τ∆=5 delay cells are needed in order to cover the range of the coarse residue.
To ensure safety under process variation, the PVDL has a length of 10 delay cells in
the implementation.
Note that the delay cell of the PVDL is similar to that of the GRO, but with no
gating transistors. Compared with single-ended delay cells, the CCI-based delay cells
not only have higher immunity to power supply noise, but also have smaller delay
mismatch due to process variation, because the delay variations of the two large
inverters in Fig. 46(a) partially cancel each other. If we simply assume that the delay
of a CCI-based delay cell is the arithmetic average of the delays of its two inverters,
and that the two inverter delays vary independently,
then the delay variance of the entire cell is half of that of a single inverter.
Fig. 46. Differential delay cell: (a) symbol and gate-level schematic (b) schematic.
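The variance-halving argument can be checked with a quick Monte Carlo sketch. The Gaussian delay model, the 30 ps mean, the 1 ps standard deviation, the independence of the two inverter delays and the function name are all assumptions chosen for illustration, not values from the implementation.

```python
import random
import statistics


def cci_delay_variance_ratio(n_samples=20000, mean_ps=30.0, sigma_ps=1.0, seed=1):
    """Ratio of the cell-delay variance to the single-inverter delay variance.

    The cell delay is modeled as the average of two independent inverter delays.
    """
    rng = random.Random(seed)
    single = [rng.gauss(mean_ps, sigma_ps) for _ in range(n_samples)]
    cell = [
        (rng.gauss(mean_ps, sigma_ps) + rng.gauss(mean_ps, sigma_ps)) / 2
        for _ in range(n_samples)
    ]
    return statistics.pvariance(cell) / statistics.pvariance(single)


if __name__ == "__main__":
    print(cci_delay_variance_ratio())  # expected to be close to 0.5
```

If the two inverter delays were correlated, the ratio would be larger than one half, so the result depends on the independence assumption.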
4. DFFs
A differential structure is adopted for the DFF design. As shown in Fig. 47, the
differential DFF is composed of two differential latches with CCI. Such a
structure was first proposed in [42] to achieve minimum power as well as propagation
delay. Note that the clock input is also differential so that it can be triggered by
the differential output of the PVDL. The DFF structure has very small setup/hold
time (less than 0.3ps in simulation). Thanks to the cross-coupled inverters,
the outputs are quickly forced to a high/low level, even when the setup/hold time
condition is not met. Therefore it is suitable for use as a lead/lag detector.
Fig. 47. (a) Gate-level schematic of differential DFFs (b) schematic of differential latch.
5. GRO-PVDL
Referring back to Fig. 32, within the proposed GRO-PVDL structure, the DFFs are
supposed to sample the XOR of the GRO taps. Since the stage delay of the GRO is
close to the minimum, the XOR gate would have to operate at an input switching frequency of
1/τG=40GHz. However, it is impossible to implement an XOR gate operating at such a
high frequency. Our solution is to directly sample the GRO taps and then XOR the
sampled values of the DFFs in the subsequent DSP unit. As shown in Fig. 48, each
PVDL tap triggers the clock inputs of three DFFs, whose data inputs are connected
to the GRO taps GA, GB and GC, respectively.
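Moving the XOR after the sampling DFFs works because the XOR of the three sampled tap values flips at every GRO phase step. A minimal sketch, assuming the ideal three-inverter phase sequence (GA, GB, GC) used as an illustration here and hypothetical helper names:

```python
from functools import reduce
from operator import xor

# Assumed ideal phase sequence of a three-stage ring, ordered as (GA, GB, GC).
GRO_PHASES = [(1, 0, 1), (0, 0, 1), (0, 1, 1), (0, 1, 0), (1, 1, 0), (1, 0, 0)]


def gxor(sampled_taps):
    """XOR of the tap values captured by the three DFFs of one PVDL tap."""
    return reduce(xor, sampled_taps)


def toggle_rate_ghz(tau_g_ps):
    """Rate at which Gxor would switch if it were generated on chip: 1 / tau_G."""
    return 1e3 / tau_g_ps


if __name__ == "__main__":
    print([gxor(phase) for phase in GRO_PHASES])
    print(toggle_rate_ghz(25))
```

The XOR sequence alternates between 0 and 1 at every phase step, so computing it on the sampled values in the DSP unit gives the same D pattern without a 40GHz gate.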








An issue of the GRO-PVDL structure is the uneven stage delays of the GRO at
the moments of enabling and disabling, which is caused by the charge redistribution
inside the delay cells. The unevenness of stage delays is more prominent when the
rising/falling time of EN /nEN is intentionally increased to achieve smaller variation
of the gating phase shift, as illustrated in Fig. 49. This may lead to inaccuracy or
even malfunction of the fine VDL-style measurement.
Fig. 48. Schematic of PVDL and DFFs.
Fig. 49. Uneven stage delays of the GRO, and the effective timing diagram.
The issue of uneven stage delays can be solved by inserting a delay of dV at the
PVDL input, as in Fig. 48. dV should be large enough to ensure that the sampling clocks
from the PVDL avoid the period of uneven stage delays, i.e. the sampling clock
from the first PVDL tap rises after the GRO has entered the period when its stage
delays are even.
To explain the solution in detail, we first draw an effective
timing diagram for the GRO enabling and disabling process, as in Fig. 49. With the
effect of uneven stage delays reflected by the gating phase shift tps, conceptually we
consider an effective EN that is steep and an effective GRO phase that has even
stage delays. As stated previously in the GRO design, the variation of the
gating phase shift can be made small enough, so here tps is treated as invariant.
Compared with Fig. 33, it is seen that τG − q + tps, rather than τG − q, needs to be
measured at the second level of the GRO-PVDL.
Fig. 50. Timing diagram of the GRO-PVDL with dV.
Due to the uneven stage delays, however, τG−q+tps cannot be measured directly.
Thanks to the inserted delay dV, the time between the rising edge of the first PVDL
tap and the next GRO phase switch, tf , can be correctly measured with the fine
resolution τ∆ by the VDL-style structure:
tf [i] = CV[i] · τ∆ + qf [i], (3.11)
where CV is output of the VDL-style structure. On the other hand, from the timing
diagram in Fig. 50, the relationship between τG − q + tps and tf is:
(τG − q[i] + tps) +NV[i] · τG = dV + tf [i], (3.12)
where NV is the number of the GRO phase switches during the inserted delay time
dV. From Eq. (3.11) and Eq. (3.12), we can derive the expression of q :
q[i] = (NV[i] + 1) · τG − CV[i] · τ∆ − qf [i]− dV + tps. (3.13)
Substituting Eq. (3.13) into Eq. (3.5), we finally have:
tEN[i] = (DIFFc[i] + ∆NV[i]) · τG + DIFFf [i] · τ∆ + (qf [i]− qf [i+ 1]), (3.14)
where ∆NV[i] = NV[i + 1] − NV[i]. Compared with the measurement result for the
ideal GRO-PVDL (Eq. (3.9)), the coarse measurement has an extra term ∆NV. NV


























C is the GRO phase sampled by the first clock from the PVDL.
Because ∆NV is part of the fine measurement but has the unit of τG, it can be
treated as the carry from the fine measurement to the coarse measurement.
Also note that fixed offsets such as dV and tps cancel in the expression of tEN,
which means that the exact value of the inserted delay dV does not influence the final
measurement, as long as dV is large enough for the fine measurement to avoid the uneven
GRO stage delays.
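The cancellation of dV and tps can be checked numerically. The following Python sketch is a minimal behavioral model and not the circuit implementation: it assumes ideal, uniform stage delays, ideal quantization by rounding down, and hypothetical values for the delays; the function names are illustrative only. It generates NV, CV and qf according to Eq. (3.11) and Eq. (3.12), recovers q with Eq. (3.13), and expresses the difference of two residues with the terms that appear in Eq. (3.14).

```python
import math


def second_level(q, tau_g, tau_d, d_v, t_ps):
    """Ideal second-level measurement of one residue q (all times in ps)."""
    x = tau_g - q + t_ps                  # quantity to be measured
    # Smallest integer N_V that keeps 0 <= t_f < tau_g in Eq. (3.12).
    n_v = math.ceil((d_v - x) / tau_g)
    t_f = x + n_v * tau_g - d_v           # Eq. (3.12) solved for t_f
    c_v = math.floor(t_f / tau_d)         # fine code C_V
    q_f = t_f - c_v * tau_d               # fine residue q_f, Eq. (3.11)
    return n_v, c_v, q_f


def recover_q(n_v, c_v, q_f, tau_g, tau_d, d_v, t_ps):
    """Eq. (3.13): residue q expressed with the second-level quantities."""
    return (n_v + 1) * tau_g - c_v * tau_d - q_f - d_v + t_ps


def residue_difference(q0, q1, tau_g, tau_d, d_v, t_ps):
    """q1 - q0 written as dN_V*tau_G + DIFF_f*tau_D + (q_f0 - q_f1)."""
    n0, c0, f0 = second_level(q0, tau_g, tau_d, d_v, t_ps)
    n1, c1, f1 = second_level(q1, tau_g, tau_d, d_v, t_ps)
    return (n1 - n0) * tau_g + (c0 - c1) * tau_d + (f0 - f1)


if __name__ == "__main__":
    # Hypothetical values: tau_G = 25 ps, tau_Delta = 5 ps, t_ps = 3 ps.
    for d_v in (110.0, 140.0):
        print(d_v, residue_difference(7.5, 19.0, 25.0, 5.0, d_v, 3.0))
```

Under these assumptions the printed difference is the same for both values of d_v, which illustrates why only a lower bound on dV matters.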
6. DSP unit
The DSP unit is responsible for translating the “raw” digital information generated
by the GRO-PVDL structure into an output digital code that is the measurement
of the input time duration. As shown in Fig. 51, there are three modules: the
coarse code generator, the VDL decoder and the fine code generator. Note that an
extra low-pass filter is needed to remove the high-frequency quantization noise in the
output Moa. Since a standard low-pass filter can be implemented with low hardware
cost, its design is not detailed here. The final output is the overall measurement Moa,
which has 15 bits: 5 bits for the integer part and 10 bits for the fractional part.
One LSB of the integer part represents the coarse resolution τG.
Fig. 51. Block diagram of the DSP unit.
Ever-scaling CMOS technology provides more digital resources at lower
hardware cost. This allows us to make high speed the first design priority, which
helps increase the sampling frequency of the measurement. Therefore, pipelined
structures are adopted to achieve a higher data throughput. The entire DSP unit
is composed of synchronous sequential logic triggered by a single clock source whose
frequency is the same as the sampling frequency of the GRO-PVDL, and the largest
pipeline latency is 10 clock cycles. Using 90nm CMOS technology, the circuit is
successfully synthesized at a clock frequency of 500MHz, with the smallest timing
slack of 0.13ns.
a. Coarse code generator
The block diagram of the coarse code generator is shown in Fig. 52. For convenient
explanation, it is divided into three parts.
Fig. 52. Implementation of coarse code generator.
The coarse measurement DIFFc is the digitization of EN pulse width tEN with
a resolution of GRO stage delay τG. Referring back to the GRO principle, DIFFc
is the differentiation of ADD which is the number of the total GRO tap transitions
during tEN. As introduced for the counter design, ADD is given by 6Ncycle +Nstage,














first part of the coarse code generator implements the above function.
Nevertheless, DIFFc should also include the overflow of the asynchronous counter.
When overflow happens, the counter output Ncycle automatically switches from all 1’s
to all 0’s, and at the same time the overflow flag OF becomes 1 (OF is 0 when there
is no overflow). Therefore, the second part of the coarse code generator includes the
effect of overflow by adding OF × 2^k × 6 to DIFFc, where k is the bit width of the
asynchronous counter (k = 4 in the implementation).
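The first two parts of the coarse code generator can be sketched as follows. This is a minimal Python illustration, assuming that DIFFc is the difference between two consecutive ADD values and that the overflow flag applies to the later sample; the function names are hypothetical and the pipelining of the hardware is not modeled.

```python
TAPS_PER_CYCLE = 6  # from ADD = 6 * Ncycle + Nstage


def add_count(n_cycle, n_stage):
    """Total number of GRO tap transitions, ADD = 6 * Ncycle + Nstage."""
    return TAPS_PER_CYCLE * n_cycle + n_stage


def diff_c(add_prev, add_curr, overflow, k=4):
    """Coarse code: differentiate ADD and compensate one counter overflow.

    overflow is the OF flag (0 or 1); k is the counter bit width.
    """
    return add_curr - add_prev + overflow * (2 ** k) * TAPS_PER_CYCLE


if __name__ == "__main__":
    # The 4-bit counter wraps from Ncycle = 15 to Ncycle = 1 with OF = 1.
    print(diff_c(add_count(15, 3), add_count(1, 2), overflow=1))
```

With k = 4, one overflow contributes 2^4 × 6 = 96 tap transitions, which restores the true count difference after the counter wraps.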
The third part of the coarse code generator deals with ∆NV, the carry from the
fine measurement to the coarse measurement. Referring back to the delay insertion
technique that solves the issue of the uneven GRO stage delays, ∆NV is the differen-
tiation of NV, and NV is the number of the GRO phase switches during the inserted

























C is the GRO phase sampled by the first
clock from the PVDL. Therefore, ∆NV can be given by:











where Diff{} means differentiation. Since the inserted delay dV is fixed, NV that




floor and the ceiling of dV/τG). Consequently, ∆NV ranges within {−1, 0, 1}. However,
due to the periodicity of the GRO phase, Fch can only provide a range of {0, 1, .., 5},
which causes ∆NV ∈{−6,−5, .., 5, 6}. To solve this contradiction, another mapping













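$$F_V(\Delta N_V) = \begin{cases} \Delta N_V - 6, & \Delta N_V \ge 2 \\ \Delta N_V + 6, & \Delta N_V \le -2 \\ \Delta N_V, & \text{otherwise} \end{cases} \qquad (3.16)$$
Finally, the coarse measurement Mc is obtained by combining DIFFc and FV(∆NV).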
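The mapping of Eq. (3.16) is small enough to state directly in code. The Python sketch below is illustrative only; the function name f_v is hypothetical, and the example inputs assume that a wrap of the periodic GRO phase shows up as a raw difference of ±5.

```python
def f_v(delta_n_v):
    """Eq. (3.16): fold a raw phase difference back into {-1, 0, 1}."""
    if delta_n_v >= 2:
        return delta_n_v - 6
    if delta_n_v <= -2:
        return delta_n_v + 6
    return delta_n_v


if __name__ == "__main__":
    for raw in (-5, -1, 0, 1, 5):
        print(raw, f_v(raw))
```

A raw difference of −5 is therefore interpreted as a carry of +1, and a raw difference of 5 as a carry of −1.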
b. VDL decoder
The VDL decoder generates the output of the VDL-style structure at the second level



















C represents the GRO phase sampled by the ith clock generated by the










a through an XOR gate. Referring back to the principle of the GRO-PVDL, the




a · · · .




d · · · can be finally generated,
which is a sequence of 1’s followed by 0’s, and the boundary between 1’s or 0’s is at the




a · · · . Therefore, this position can




d · · · . In order to achieve a higher operating
frequency, the combinational logic of the VDL decoder in Fig. 53(a) is divided into
8 smaller combinational logics, and implemented using a pipeline structure with a
propagation latency of 8 clock cycles.
A good feature of the proposed VDL decoder is bubble suppression. In practice,
due to noise and delay mismatch, there may be more than one sequence of double 0’s




a · · · , a.k.a. bubbles. Fortunately, the proposed logics are capable of




d · · · is always bubble-free. An example
of bubble suppression is demonstrated in Fig. 53(b).
c. Fine code generator
From the VDL output CV, the fine measurement DIFFf can be generated simply by
DIFFf [i] =CV[i] − CV[i + 1]. But considering the fine measurement and the coarse
measurement have different resolutions (τ∆ and τG respectively), the question is how
to combine them together to get an overall measurement Moa. To solve it, the fine
measurement should change its unit to the same as the coarse resolution. Let Mf








DIFFf . Unfortunately, the ratio between τ∆ and τG is unknown due to
process variation, and thus needs to be calibrated.
In the proposed fine code generator, the VDL output CV is first transformed into
$C_V^{(c)}$, whose unit is τG. Then $C_V^{(c)}$ is differentiated to generate Mf, whose
unit is also τG. Since the fine measurement Mf and the coarse measurement Mc
now have the same unit of τG, we have:

[Fig. 53: (a) combinational logic of the VDL decoder; (b) example of bubble suppression.]
$$t_{EN} = (M_c + M_f) \cdot \tau_G + q_f^{(sn)}, \qquad (3.17)$$
where $q_f^{(sn)}$ is the shaped quantization noise.
To transform CV into $C_V^{(c)}$, we start from the observation that max(CV) · τ∆ = τG,
where max(CV) is the maximum value of CV. This is because the VDL-style structure
digitizes the residue of the coarse measurement, which lies within [0, τG). Also note that
avg(CV) = max(CV)/2, where avg(CV) is the average value of CV, if the residue is
uniformly distributed. In practice, avg(CV) is preferred because it is
statistically more stable than max(CV). Therefore,
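$$\frac{\tau_\Delta}{\tau_G} = \frac{1}{\max(C_V)} = \frac{1}{2\,\mathrm{avg}(C_V)}. \qquad (3.18)$$
Because the unit of $C_V^{(c)}$ is τG, we have:
$$C_V^{(c)} \cdot \tau_G = C_V \cdot \tau_\Delta. \qquad (3.19)$$
Combining Eq. (3.18) and Eq. (3.19) gives $C_V^{(c)} = C_V / (2\,\mathrm{avg}(C_V))$.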
The block diagram of this operation is shown in Fig. 54. The averager is implemented
using an infinite impulse response (IIR) filter: y = αx / (1 + (α − 1)z^(−1)), where α = 2^(−8). The
divider is implemented in a pipelined structure to increase the operating frequency.
Fig. 54. Implementation of fine code calibration.
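The calibration path of Fig. 54 can be illustrated with a short floating-point sketch in Python. It is not the fixed-point, pipelined hardware: the averager follows the difference equation y[n] = αx[n] + (1 − α)y[n−1] implied by the IIR transfer function, the averager state is seeded with the first sample (an assumption made here to avoid a start-up transient), and the function names are hypothetical.

```python
ALPHA = 2.0 ** -8  # averager coefficient given in the text


def calibrate_fine_codes(c_v_samples, alpha=ALPHA):
    """Convert raw VDL codes C_V into codes in units of tau_G.

    Each output is C_V / (2 * avg(C_V)), with avg(C_V) tracked by the
    one-pole IIR averager y[n] = alpha * x[n] + (1 - alpha) * y[n-1].
    """
    avg = float(c_v_samples[0])  # assumed initial state of the averager
    calibrated = []
    for c_v in c_v_samples:
        avg = alpha * c_v + (1.0 - alpha) * avg
        calibrated.append(c_v / (2.0 * avg) if avg > 0 else 0.0)
    return calibrated


def fine_measurement(c_vc):
    """Differentiate the calibrated codes: M_f[i] = C[i] - C[i + 1]."""
    return [c_vc[i] - c_vc[i + 1] for i in range(len(c_vc) - 1)]


if __name__ == "__main__":
    codes = [3, 1, 4, 0, 2, 3, 1, 4, 0, 2]
    print(fine_measurement(calibrate_fine_codes(codes)))
```

Because the ratio is estimated from the measured codes themselves, no external reference is needed, at the cost of an averaging time on the order of 1/α samples.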
D. Experimental results
The proposed GRO-PVDL structure is implemented using a commercial 90nm CMOS
technology. The GRO, the PVDL and the DFFs are designed with an analog flow and
their layouts are drawn manually. The DSP unit is designed with a digital design flow
and synthesized with a 200MHz clock frequency constraint. The digital part passes the
timing check with back annotations extracted from the automatically
generated layout, with the smallest timing slack of 2.97ns. The layout of the entire
system takes an area of 0.013mm², as shown in Fig. 55. At a 1.2V power supply and
a sampling frequency of 200MHz, the power consumption is 2.05mW for a 400ps
EN pulse width (0.92mW for the digital part and 1.13mW for the analog part).
Post-layout simulation is run for the proposed structure. Note that the simulation
is a mixed-signal simulation: SPICE simulation for the analog-designed parts with
R and C extracted from the layout, and Verilog simulation for the digital part. The
inputs in the simulation are two pulse signals of 200MHz, one as START input
and the other as STOP input. So the sampling frequency of the measurement is
also 200MHz. Besides the resistance and capacitance extracted from the layout, the
SPICE simulation for the analog part will include the effects of (1) the across-chip
process variation, with the process variation models included in the foundry provided
process design kit (PDK); (2) the thermal and flicker noise of transistors, using the
Fig. 55. Layout of the entire GRO-PVDL structure.
transient noise models included in the design PDK; (3) a manually injected power
supply noise, using a Verilog-AMS module that models a white Gaussian noise whose
standard deviation is 0.012V, i.e. 1% of the power supply voltage.
To find out the effective resolution, a single tone jitter input is first applied. The
time difference between the rising edges of the two input signals is sinusoidal with a
frequency of 500kHz and a peak-to-peak amplitude of 0.5ps, in addition to a DC level
of 400ps. Fig. 56 shows the corresponding measurement result in both frequency and
time domains. Note the DC offset is removed from the power spectral density (PSD)
for clear observation of the quantization noise, and the PSD is generated from 16,384
samples of measurement, using Welch’s averaged modified periodogram method of
spectral estimation [43] with Hanning window. As a reference, the ideal PSD of the
shaped quantization noise without the delay mismatch is given by the dashed line, which









where fs=200MHz is the sampling frequency, and τ∆ is the fine raw resolution, i.e. the
difference of the average GRO cell delay and the average PVDL cell delay (τ∆ ≈5ps).
As illustrated in Fig. 56(a), most of the quantization noise is pushed towards high
frequencies (the Nyquist frequency =100MHz is half of the sampling frequency). At
frequencies between 50kHz and 5MHz, the noise is comparable to the ideal quanti-
zation noise (without noise shaping) that is produced by a classical quantizer with
0.8ps steps and a sampling frequency of 200MHz, which is represented by the straight
thick line. Therefore, the proposed on-chip jitter measurement can achieve an effec-
tive resolution of 0.8ps. Alternatively, if a classical quantizer with 5MHz sampling
frequency is used as reference, then its quantization step should be reduced to around
130fs to achieve the equivalent noise level.
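The equivalence between the two reference quantizers follows from the standard white-noise model of an unshaped uniform quantizer, which is assumed here: the total noise power step²/12 is spread evenly from DC to half the sampling frequency, so the one-sided PSD is step²/(6·fs). The Python sketch below uses hypothetical function names and only checks the arithmetic of that comparison.

```python
import math


def quantizer_psd(step, f_s):
    """One-sided PSD of white quantization noise: step^2 / (6 * f_s)."""
    return step ** 2 / (6.0 * f_s)


def equivalent_step(step, f_s, f_s_new):
    """Step size that gives the same PSD at a different sampling rate."""
    return step * math.sqrt(f_s_new / f_s)


if __name__ == "__main__":
    # 0.8 ps steps at 200 MHz, referred to a 5 MHz classical quantizer.
    step_ps = equivalent_step(0.8, 200e6, 5e6)
    print(round(step_ps * 1000.0, 1), "fs")
```

Under this model the equivalent step at 5MHz evaluates to roughly 126fs, in line with the figure of around 130fs quoted above.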
To see the noise contribution from different noise sources, a post-layout simula-
tion with the same input jitter is also run as a reference, but without the device noise
of transistors and the power supply noise. Comparing Fig. 56(a) with Fig. 57, we
can see that among all the measurement noise, the noise at the lower frequencies is
dominated by the flicker noise while the noise at the higher frequencies is dominated
by the shaped quantization noise, and the frequency range that is most sensitive to
the input jitter is from 50kHz to 5MHz, where the thermal noise and the power supply
noise dominate.
A random jitter input is also applied to emulate the jitter of high speed signals.
The jitter under measurement is composed of a random jitter and a 1MHz sinusoidal
jitter. The random jitter is generated by filtering a white Gaussian noise using a low-






































































































































Fig. 57. PSD of the measurement for 0.5pspp sin. input (transistor noise and power
supply noise are NOT included in the simulation).
measurement results are generated by low-pass filtering the GRO-PVDL output. In
Fig. 58, the histogram of the measurement result within 50µs is compared with that
of the input jitter. Given the input jitter is 7.31ps (RMS) and the measured result
through simulation is 7.34ps (RMS) (the DC level is removed when calculating the
jitter RMS), a relative error of 0.41% is obtained.
1. Delay mismatch analysis
Due to process variation, the cell delays of the GRO and the PVDL vary for dif-
ferent chips and different cells on one chip. For the proposed high-resolution jitter
measurement technique, the effect of process variation has to be considered.
Chip-to-chip variations cause the average cell delays to deviate from the designed
values (25ps for the GRO and 30ps for the PVDL). The deviations of the average
delays τ∆ and τG can be calibrated. As introduced for the DSP unit, the fine code






































Fig. 58. The histograms of (a) input jitter (b) measurement result (after low-pass
filtering).
generator implicitly calibrates the ratio between τ∆ and τG on the fly, and requires
no external assistance or reconfiguration. As for the absolute value of τG, it can be
calibrated off-line, by measuring a periodic signal whose pulse width is already known.
According to [44], the calibration accuracy can be raised to an acceptable level by
increasing the calibration time and averaging the calibration result.
On the other hand, across-chip variations cause the mismatch between the delays
of the same type of cells, as is illustrated in Fig. 59. Sophisticated calibration schemes
[45] can be applied, in order to calibrate the delay mismatch of the delay cells of the
GRO/PVDL. However, such calibration schemes are usually very costly. Fortunately,
the following analysis will show that the proposed GRO-PVDL structure has tolerance
to the typical delay mismatch so that the accuracy of the measurement will not be
remarkably degraded.
Fig. 59. Delay mismatch of the GRO and the PVDL.
The cell delay mismatches of the GRO and the PVDL lead to the unevenness
and uncertainty of the resolution, and finally result in noise in the measurement.
Given the discreteness and high nonlinearity of the system, it is difficult to analyze
the influence of the delay mismatch analytically. Instead, simulations based on
the behavioral model of the GRO-PVDL structure are carried out to analyze the
degradation of the measurement accuracy due to the delay mismatch.
The behavioral model is built up using a Matlab program, in which the delay
mismatch is modeled as normal distributions. And to focus on the influence of the





P ) is the delay of the ith cell of the GRO (the PVDL), and its deviation from the




P ). In the following simulations, the average delay





P ) is generated with a normal distribution whose mean is 0, and standard
deviation is kmisτG (kmisτP). Note that the standard deviation is proportional to
the cell delay with a ratio of kmis, which represents the level of delay mismatch.
Through SPICE-level Monte Carlo simulation with the nominal process variation
given in the PDK, we can find the typical value of kmis is 3.0%. Therefore, the PSD
of the measurement is obtained using the behavioral simulation with kmis=3.0%, as
in Fig. 60. In order to explore the tolerance of the proposed GRO-PVDL structure
to the delay mismatch, behavioral simulations are also run for mismatch levels larger
than the nominal value: kmis=5% and 10%, and the corresponding PSDs are shown
in Fig. 60, too. In the simulation, the sampling frequency of the measurement is
200MHz. The time difference between the rising edges of the two input signals has a
DC value of 400ps, plus a sinusoidal signal with a frequency of 500kHz and a peak-to-


























































Sin. input of 0.5ps
(peak-to-peak)
Fig. 60. Simulated PSD with different delay mismatches (kmis= 3%, 5%, 10%).
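The mismatch model used in the behavioral simulations can be sketched in a few lines. The original model is a Matlab program; the Python fragment below is only an illustration of the stated assumption that each cell delay deviates from its average by a zero-mean normal variable with standard deviation kmis times the average delay. The number of cells and the function names are hypothetical.

```python
import random
import statistics


def mismatched_delays(tau_avg, k_mis, n_cells, rng):
    """Draw cell delays: tau_avg plus N(0, (k_mis * tau_avg)^2) deviations."""
    return [tau_avg + rng.gauss(0.0, k_mis * tau_avg) for _ in range(n_cells)]


def relative_spread(delays):
    """Sample standard deviation divided by the mean delay."""
    return statistics.pstdev(delays) / statistics.fmean(delays)


if __name__ == "__main__":
    rng = random.Random(2012)  # fixed seed for repeatability
    for k_mis in (0.03, 0.05, 0.10):
        gro = mismatched_delays(25.0, k_mis, 10000, rng)
        print(k_mis, round(relative_spread(gro), 4))
```

A full behavioral simulation would feed such delay sets into a model of the GRO-PVDL measurement itself, which is beyond this sketch.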
Compared with the ideal level of the shaped quantization noise, the noise increase
due to delay mismatch lies mostly at high frequencies. This is because the
position of the delay cell that is hit at the beginning and end of each measurement
is randomly distributed over the GRO and the PVDL, and thus the resulting mismatch
errors cancel between different measurements, especially for a long measurement
time. Since the high-frequency noise is filtered by the subsequent
low-pass filter, the degradation of the effective resolution is very limited. The
effective resolutions corresponding to different levels of delay mismatch can be obtained
by comparing the noise at frequencies lower than 5MHz with the ideal quantization
noise produced by a classical quantizer with a sampling frequency of 200MHz, as
listed in Table II.
Table II. Effective resolutions for different kmis.





Table III compares the specifications of this work with those in earlier studies. Here
we focus on the comparison with two previous techniques that also utilize the quan-
tization noise shaping through the GRO principle: the multi-path GRO in [34] and
the Vernier GRO in [35].
[34] proposes a multi-path structure that improves the raw resolution of GRO
from 30-35ps to 6ps using a 47-stage GRO connected by multiple paths. A multi-path
GRO has tens of delay stages, and each stage has multiple inputs and one output. The
signal paths that connect the GRO taps need to be carefully designed, otherwise the
complex path connection may easily lead to the malfunction of the multi-path GRO,
such as oscillating at a wrong frequency due to the domination of small oscillation
loops inside the GRO. The large number of delay stages in multi-path GROs also
increases the hardware overhead. As listed in Table III, the multi-path GRO in [34]
Table III. Comparison of Specifications.
Ref.                    [34]              [35]           This work
Process (nm)            130               90             90
Raw resol. (ps)         6                 5.8            5
Effective resol. (ps)   1@50MHz           3.2@25MHz      0.8@200MHz
Sampling freq. (MHz)    50                25             200
Area (mm²)              0.04              0.027          0.013
Power (mW)              2.2 (1.5V)        3.6 (1.2V)     2.05 (1.2V)
Technique               multi-path GRO    Vernier GRO    GRO-PVDL
takes an area of 0.04mm², while the GRO-PVDL in this work only takes 0.013mm².
After applying an ideal scaling of (130/90)² to account for the process difference (130nm
for [34] vs. 90nm for this work), the GRO-PVDL still takes 32% less area than the
multi-path GRO.
On the other hand, the Vernier GRO achieves a raw resolution of 5.8ps by means
of two Vernier-style GROs. However, its drawback is that the measurement requires a
time much longer than the EN pulse width, which largely limits its highest sampling
frequency. Considering the advantage of noise shaping can only be achieved with large
over-sampling rate, the limited sampling frequency of the Vernier GRO (25MHz) does
not allow much room for the improvement of the effective resolution. Therefore, the
improvement from the raw resolution 5.8ps to the effective resolution 3.2ps is only
45%. In contrast, the proposed GRO-PVDL in our work can achieve a sampling
frequency of 200MHz, and thus the raw resolution of 5ps is improved by 84% to
achieve an effective resolution of 0.8ps.
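The percentages quoted in this comparison can be reproduced from the entries of Table III. The Python sketch below is a simple arithmetic check with hypothetical helper names; the area scaling assumes ideal quadratic scaling with the feature size, as stated in the text.

```python
def scaled_area(area_mm2, node_from_nm, node_to_nm):
    """Ideal area scaling between process nodes: area * (to / from)^2."""
    return area_mm2 * (node_to_nm / node_from_nm) ** 2


def relative_improvement(raw, effective):
    """Fractional improvement from raw to effective resolution."""
    return (raw - effective) / raw


if __name__ == "__main__":
    ref_area = scaled_area(0.04, 130, 90)  # [34] scaled to 90 nm
    print("area saving:", round(100 * (1 - 0.013 / ref_area)), "%")
    print("Vernier GRO:", round(100 * relative_improvement(5.8, 3.2)), "%")
    print("GRO-PVDL:", round(100 * relative_improvement(5.0, 0.8)), "%")
```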
Compared with these two previous structures, the GRO-PVDL structure not only
has lower hardware overhead, but also has a higher sampling frequency. Moreover,
the digital circuits in the GRO-PVDL take a larger proportion of the total hardware,
which improves the overall robustness of the system.
E. Summary
A novel GRO-PVDL structure is proposed for on-chip jitter measurement
of high-speed signals. The structure is composed of two levels: the first level
is the GRO, which provides the coarse measurement, and the second level further measures
the residue from the first level with the fine resolution. The GRO-PVDL structure
improves the raw resolution of the GRO through the Vernier-style structure at the
second level, which reuses the GRO of the first level in addition to a PVDL. At the same
time, the quantization noise shaping of the GRO is preserved by the GRO-PVDL,
and thus an even finer effective resolution can be achieved. The proposed
structure also includes a pipelined DSP unit with online calibration between the fine
resolution and the coarse resolution. In addition, analysis based on the behavioral
model shows that the proposed GRO-PVDL is highly tolerant of delay mismatch.
Implemented in a commercial 90nm CMOS technology, the GRO-PVDL
achieves a sampling frequency of 200MHz and an effective resolution of 0.8ps.
CHAPTER IV
IN-SITU TEST OF ALL DIGITAL PLLS
Unlike the in-situ jitter measurement technique proposed in Chapter III, this chapter
introduces an in-situ test scheme that is designed for one specific type of
AMS circuit: all-digital PLLs (ADPLLs). The proposed in-situ test scheme is based
on the loop reconfiguration of ADPLLs, which takes advantage of the close interaction
between the key analog building blocks and the digital loop filter. The work in this
chapter has also been published in [46].
A. Introduction
Ensuring analog/mixed-signal design robustness and providing low-cost built-in test
solutions remain a significant challenge due to the complex analog nature of circuit
operation [47] [48] [49]. The performance improvement of digital transistors via scaling
has stimulated wide interest in digitally intensive analog implementations [50] [51].
This has not only provided appealing new design tradeoffs, but also motivated us to
exploit such implementation style for novel analog built-in self test (BIST) solutions.
A digital-like BIST approach is proposed for the test and diagnosis of the output
jitter, a key complex RF analog performance metric, of recent all-digital PLL (ADPLL)
designs [51]. The block diagram is shown in Fig. 61, where the phase of the input
(φR, the reference phase) and the phase of the output (φV, the variable phase) are
normalized by their own periods. N is the frequency control word, which defines the
frequency ratio between the output and the input and may be a fractional number.
An accumulator generates the integer part of φV and a time-to-digital converter
(TDC) provides its fractional part. The phase error between φV and N · φR is
filtered by a loop filter and adjusts a digitally-controlled oscillator (DCO) in a negative
feedback manner such that φV ≈ N · φR.
Fig. 61. All-digital PLL block diagram including BIST.
Since the digital processing and control blocks are implemented in robust digital
logic, we target the jitter performance degradation introduced by parametric varia-
tions of key analog blocks including the TDC and the DCO, as well as the reference
jitter. The prediction of the output jitter is based upon processing low-frequency
phase error signals, the test signatures, in digital form. Unlike prior work [1] that
also utilizes digital signatures for jitter testing, the novel use of loop filter
reconfiguration and an on-chip TDC calibrator enables the BIST scheme proposed
in this chapter to provide reliable diagnosis and test under multiple analog
performance perturbations. The digital-like design implementation enables easy
reconfiguration and leads to the low cost of the proposed approach.
In the proposed BIST scheme, multiple digital signatures are extracted for ob-
serving the “syndromes” under different loop filter configurations. By means of the
transfer function analysis, the mapping from the signatures to the output jitter is
precalculated and stored in the BIST scheme. Moreover, for the purpose of diagno-
sis, the signatures can also be mapped to the levels of different noise sources, with
the assistance of the TDC calibrator. The hardware overhead of the BIST is mainly
from the digital signal processing on the signatures and additional filters, which is
relatively small compared to the whole ADPLL system, and could be further reduced
through the reuse of an on-chip processor.
B. Principle of jitter estimation and diagnosis
This section presents the noise models used in this work and the signal analysis
that leads to the proposed BIST and diagnosis scheme.
1. Noise model
The three noise sources in the ADPLL, the reference clock jitter, the TDC quantiza-
tion noise and the DCO phase noise, are mathematically modeled in the frequency
domain, to help analyze the output jitter.
The reference clock is usually generated by a crystal oscillator, and thus provides
a single-tone spectrum with little spectral spread. Since the reference phase noise has
a relatively flat spectrum, it can be treated as constant from DC to half of the sampling
frequency, which is the reference frequency fREF. Its power spectral density (PSD) is:
ΦREF(∆f) = LR, (4.1)
where ∆f is the frequency offset ranging from −fREF/2 to fREF/2, and LR is a
constant describing the noise level of the reference phase noise.
Although in the realistic situation its low frequency components have higher
slopes, their bandwidth is so small that the corresponding frequency drifts hardly
show up in the concerned time, such as one GSM burst: 577 µs or one WCDMA slot:
667 µs.
The second noise source is the TDC quantization noise due to its time resolu-
tion. Similar to the quantization noise of analog-to-digital converter (ADC), the TDC
quantization noise can be modeled as an additive random variable with uniform dis-
tribution and white noise spectral characteristic. Its effective time jitter JTDC (RMS)











where LTDC is the constant noise level for ∆f from −fREF/2 to fREF/2, and fDCO is
the DCO frequency.
The assumption of uniform distribution means that the TDC generates different
quantization levels with equal probabilities, which is true except for some special
situations: e.g. an integral N that results in a bang-bang phase detection.
Besides the noise from the reference clock and the TDC, the DCO is another
major noise source. The phase noise spectrum of an oscillator can be divided into three
segments [53], as shown in Fig. 62. The 1/∆f² segment is called the wander noise,
generally referred to as the thermal noise and caused by the white-noise fluctuation
of the oscillating frequency. The DCO quantization noise and the DCO power supply
noise also cause wander noise [54] [55]. The 1/∆f³ segment at lower offset
frequencies is called the flicker noise, and the flat segment is the thermal electronic










Fig. 62. The phase noise spectrum of a typical oscillator.
Given the flicker noise and the wander noise are the dominant noise mechanisms
of the DCO within the frequency range concerned, the PSD of the DCO phase noise




where LF and LW are the noise levels of the flicker noise and the wander noise,
respectively.
2. Transfer function analysis
When the ADPLL is locked in the tracking mode, linear frequency-domain transfer
functions are applicable under the small-signal assumption. The s-domain model of an
ADPLL system is shown in Fig. 63, where φn,REF is the phase noise from the reference
input, φn,TDC is the phase noise from the TDC and φn,DCO is the phase noise of the DCO. In the
proposed BIST scheme, the loop filter is separated into two cascaded filters, LF1 and
LF2. φE and φ̂E are the potential signatures. The DCO gain calibration gives K̂DCO
as an estimate of the DCO gain KDCO. The coefficient r indicates the calibration
error of the TDC resolution.
Fig. 63. s-domain model of ADPLL including noise sources.





where F1(s) and F2(s) are the transfer functions of LF1 and LF2. The z-domain
transfer functions of the digital filters can be converted to the s-domain by substituting
z with e^(s/fR). KDCO/K̂DCO ≈ 1 and r ≈ 1 are assumed because of the DCO gain
calibration.
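The substitution of z with e^(s/fR) can be illustrated numerically. The Python sketch below evaluates a generic rational z-domain transfer function at s = j2πf; the filter coefficients and the 26MHz reference frequency in the example are hypothetical and do not correspond to the loop filters LF1 and LF2 of this work.

```python
import cmath
import math


def s_domain_response(b, a, f, f_r):
    """Evaluate H(z) = sum(b[k] z^-k) / sum(a[k] z^-k) at z = exp(s / f_r).

    s = j * 2 * pi * f, with f the offset frequency in Hz and f_r the
    reference (sampling) frequency of the digital filter in Hz.
    """
    z = cmath.exp(1j * 2.0 * math.pi * f / f_r)
    num = sum(bk * z ** (-k) for k, bk in enumerate(b))
    den = sum(ak * z ** (-k) for k, ak in enumerate(a))
    return num / den


if __name__ == "__main__":
    # Hypothetical one-pole low-pass filter: H(z) = a0 / (1 - (1 - a0) z^-1).
    a0 = 0.125
    for f in (0.0, 1e5, 1e6, 13e6):
        h = s_domain_response([a0], [1.0, a0 - 1.0], f, 26e6)
        print(f, round(abs(h), 4))
```

The response is only meaningful up to fR/2, since the digital filter output is sampled at the reference frequency.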













































The closed-loop transfer functions from the noise sources to the output phase
noise φn,O and the potential signatures are listed in Table IV. For typical noise levels
and loop settings, the TDC noise has the least influence among the three noise sources,
and the noise components at frequencies higher than fR/2 are tens of dB smaller, so
their spectral aliasing is negligible.







Referring back to Eq. (5.5), (5.4), (5.3), Eq. (4.6) can be transformed as:
$$S_{\phi_{n,O}}(f) = (N L_R + L_T)\,|H_{T2O}(2\pi j f)|^2 + L_W\,\frac{|H_{D2O}(2\pi j f)|^2}{f^2} + L_F\,\frac{|H_{D2O}(2\pi j f)|^2}{f^3} \qquad (4.7)$$




SφO(f)df . Because signals are sampled at the reference frequency fR in digital
blocks, the integration range is from 0 to fR/2 to meet Nyquist theorem. The power
of the output phase noise in the normal working mode can be written as:














It can be seen from Eq. (4.8) that the power of the output phase noise is a linear
combination of the noise levels of the noise sources.
In the proposed BIST scheme, three signatures under different configurations are
collected and processed. Similar to the derivation of φ2n,O, the power of each signature
can also be approximated with a linear combination of the noise levels of the noise







































































































































































Substituting Eq. (4.13) back into Eq. (4.8), and the power of phase noise at the














































































In fact, Eq. (4.14) gives the mapping from the signatures to the output jitter, and
Eq. (4.13) gives the mapping from the signatures to the noise-related parameters
of the analog blocks. Note that the coefficients in Eq. (4.13), (4.14) are determined
by transfer functions; CT2O, CW2O, CF2O are calculated from the transfer functions






F need to be calculated from
the transfer functions for the three BIST configurations. It is important to note
that these transfer functions are fully determined by digital logic that is assumed
to be robust and hence independent of analog block variations. This implies that
all the information required by the proposed scheme, i.e. the coefficient matrices in
Eq. (4.13) and Eq. (4.14), can be precomputed and stored on-chip in the form of
constants. Moreover, the noise level of the TDC can be directly calculated by Eq. (4.2)
and Eq. (5.4), given the TDC resolution provided by the TDC calibrator.
C. BIST scheme
The block diagram of the proposed BIST is shown in Fig. 64. In the BIST mode, the
TDC calibrator reconfigures the TDC delay chain into a ring oscillator to provide online
calibration of the resolution. The loop filter is separated into two cascaded filters
and provides two internal signals, φE and φ̂E, as potential signatures. The loop filter
characteristics can be switched among three pre-stored configurations. Reconfiguration
exposes the parametric fluctuations of the TDC, the DCO and the reference signal to
the digital test signatures with varying sensitivities, so as to provide sufficient information
for test and diagnosis. Each pre-stored configuration is selected by setting config_num
and forcing the loop to settle. The reconfiguration controller also designates the digital
signature through sig_sel. The estimation mapper receives the designated signature,
processes it and stores the result. After the ADPLL has run under the
three configurations, all three signatures have been collected by the estimation mapper.
Together with the TDC resolution provided by the TDC calibrator, they are used to output
the estimated output jitter, the TDC resolution, and the noise performance of the DCO
and the reference that cause the output jitter level (i.e. the diagnosis). Note that the loop
filter is configured as two cascaded filters only in BIST mode; it can be configured
in any other form in the normal working state.
1. Reconfigurable loop filters
As mentioned before, the loop filter is separated into two cascaded filters, LF1 and
LF2. LF2 is a first-order IIR filter for building a type-II loop. LF1 provides the option
of a higher-order loop through a cascade of single-pole IIR filters, which is unconditionally
stable. Any of the IIR filters can be bypassed to adjust the loop order. The z-domain
transfer functions of LF1 and LF2 are:
Fig. 64. BIST block diagram.






z − (1− λi)
(4.15)
$F_2(z) = \alpha + \frac{\rho}{z - 1}$ (4.16)
where λi, α and ρ are filter coefficients. LF1 and LF2 can be easily reconfigured by
changing their coefficients. These coefficients are set to integer powers of two, so
that the multiplications can be implemented with bit shifters.
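As a rough illustration of why power-of-two coefficients are convenient, the sketch below runs one single-pole IIR stage in integer arithmetic, with a right shift in place of the multiplication. It assumes the recurrence y[n] = y[n-1] + λ(x[n] − y[n-1]) with λ = 2^-k, which places the pole at 1 − λ; the function name and the integer word format are hypothetical and are not taken from the design.

```python
def iir_stage_shift(samples, k, y0=0):
    """One single-pole IIR stage with lambda = 2**-k, using a shift instead of a multiply.

    Assumed recurrence (illustrative): y[n] = y[n-1] + ((x[n] - y[n-1]) >> k).
    Inputs are plain Python integers standing in for fixed-point words.
    """
    y = y0
    out = []
    for x in samples:
        # Arithmetic right shift by k approximates multiplication by 2**-k.
        y += (x - y) >> k
        out.append(y)
    return out


# Step response: a constant input of 2**15 with lambda = 2**-3.
step = iir_stage_shift([1 << 15] * 200, k=3)
```

With a truncating shift the output settles within 2^k − 1 LSBs of the input, which is one reason a hardware implementation would normally carry extra fractional bits.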
The purpose of loop filter reconfiguration is to distinguish the contribution of
each noise source to different signatures as much as possible. In order to optimize the
loop configurations for BIST purpose, the sensitivity of the signatures to the objective















































which can be calculated from Eq. (4.14).
An example of configuration setup is given in Table V. The reference frequency
is 26MHz and N = 96.15. Under such configuration setup and assuming the noise
sources have typical noise levels, the power spectra of the three signatures are drawn
in Fig. 65. For Config. 1, contributions from the noise sources to the signature power
are well balanced. The signature power mainly comes from the DCO noise for Config.
2, while for Config. 3 the noise from the reference input and the TDC dominates the
signature power.
Table V. An example of configuration setup (LF1 is bypassed for Config. 3).
Config.   λ1      λ2      λ3      λ4      α        ρ        Sig.
1         2^-3    2^-3    2^-3    2^-4    2^-7     2^-15    φ̂E
2         2^-6    2^-6    2^-6    2^-7    2^-10    2^-20    φ̂E
3         −       −       −       −       2^-4     2^-10    φE
2. TDC calibrator
The TDC resolution changes with process variations, so it needs to be
calibrated for each individual chip. To achieve this, the TDC delay cells can
be reconfigured into a ring oscillator by inserting inverters that connect the head and
the tail of the TDC delay line. The total number of inverters in the ring
oscillator is odd to ensure oscillation. The oscillating frequency is calculated




























































































































(c) Config.3 in Table V.
Fig. 65. The composition of the signature power spectral density.
Fig. 66.
Fig. 66. TDC resolution calibration.
Supposing there are Neqn equivalent inverters in the configured oscillation loop,
and C oscillation cycles are counted during K reference clock periods, the TDC resolution is given by





where TR is the reference clock period.
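The calibration arithmetic can be sketched as follows. This is only an illustration: it assumes the usual ring-oscillator relation in which one oscillation period equals 2·Neqn inverter delays, so the dissertation's own expression may differ by a constant factor. The function name and all numeric values are hypothetical.

```python
def tdc_resolution(k_ref_periods, cycle_count, n_eqn, t_ref):
    """Estimate the per-inverter delay from a ring-oscillator cycle count.

    Assumption: one oscillation period equals 2 * n_eqn inverter delays,
    the usual ring-oscillator relation (not quoted from the text).
    """
    observation_time = k_ref_periods * t_ref      # counting window in seconds
    ring_period = observation_time / cycle_count  # average oscillation period
    return ring_period / (2 * n_eqn)


# Hypothetical numbers: 26 MHz reference, 41 equivalent inverters, 20 ps true delay.
T_REF = 1 / 26e6
TRUE_DELAY = 20e-12
K = 1000
C = int(K * T_REF / (2 * 41 * TRUE_DELAY))  # whole cycles seen by the counter
estimate = tdc_resolution(K, C, 41, T_REF)
```

Because the counter only sees whole cycles, the relative error of the estimate is bounded by roughly 1/C, so a longer counting window K gives a finer calibration.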
3. Hardware overhead
The hardware overhead of the proposed BIST scheme is mainly from the digital signal
processing on the signatures and additional filters. Table VI compares the areas of
the BIST scheme and the ADPLL system. It can be seen that the BIST area is 9.3%
of the ADPLL area. Moreover, [51] provides a system-on-chip solution that includes
both the ADPLL and an on-chip digital signal processor (DSP). The cost of
signature processing can therefore be reduced further by reusing the on-chip DSP, in which
case the area ratio of the BIST to the ADPLL drops to 1.3%.










In this section, the proposed BIST scheme is evaluated in a behavioral modeling and
simulation environment, and Monte Carlo analysis is carried out to show the accuracy
of the BIST results.
1. Setup of simulation environment
The simulation environment is based on a standard event-driven Verilog simulator.
The whole ADPLL system, including the digital logic and control as well as analog circuits
such as the TDC and the DCO, is integrated in this Verilog-based environment. The
Verilog RTL description of the digital circuits simulates them without modeling
error. The behavioral models of the
analog circuits and input signals, on the other hand, need to be built carefully to include the factors
that influence the noise performance of the ADPLL.
The signals at the interfaces of the IIR filters are 23 bits wide, with 8 bits for the
integer part and 15 bits for the fractional part. Most digital parts are synchronized to
the reference clock. The accumulator works at the frequency of the output clock.
A Σ∆ dithering block working at 1/4 of the output clock frequency, which removes spurs
in the PSD of the output phase noise, is also modeled.
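The 23-bit word format can be mimicked in a few lines. The sketch below is an assumption-laden illustration: the text gives only the widths (8 integer bits, 15 fractional bits), so the two's-complement signed format, round-to-nearest, and saturation behavior are all assumed here, and the helper names are hypothetical.

```python
FRAC_BITS = 15   # fractional bits, from the text
TOTAL_BITS = 23  # total width, from the text


def to_fixed(value):
    """Quantize a real number to a 23-bit word with 15 fractional bits.

    Assumptions (not stated in the text): two's-complement signed format,
    round-to-nearest, and saturation on overflow.
    """
    scaled = round(value * (1 << FRAC_BITS))
    lo = -(1 << (TOTAL_BITS - 1))
    hi = (1 << (TOTAL_BITS - 1)) - 1
    return max(lo, min(hi, scaled))


def to_real(word):
    """Convert a fixed-point word back to a real number."""
    return word / (1 << FRAC_BITS)
```

Under these assumptions the quantization step is 2^-15 and the worst-case rounding error is 2^-16, which is the quantization effect an RTL-level simulation captures.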
Time-domain phase noise is simulated in the models of both the DCO and
the reference input. The flat segment is modeled as white Gaussian noise. The
wander noise is modeled as accumulated jitter, i.e. the integral of white
Gaussian noise. The flicker noise is modeled by a weighted sum of low-pass filters
applied to white noise in the time domain [56]. The effect of the TDC nonlinearity
is included in the TDC model, with its differential nonlinearity (DNL) and integral
nonlinearity (INL) both lower than 0.7 LSB. The TDC resolution is assumed
to have a random error of 5% due to the accuracy of the TDC calibrator. The center
frequency of the reference clock is set to 10 MHz and the feedback division ratio N is set
to 240.1875, so that the output clock is about 2.4 GHz.
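A minimal sketch of the first two time-domain noise models is given below: the flat segment as independent Gaussian samples and the wander as their running sum. The flicker model (a weighted sum of filtered white noise) is omitted. The function names, the seed and the sigma values are made up for illustration and do not come from the simulation environment.

```python
import random


def white_noise(n, sigma, rng):
    """Flat segment: independent Gaussian samples."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def wander_noise(n, sigma, rng):
    """Wander: running sum (discrete integration) of white Gaussian noise."""
    total = 0.0
    out = []
    for step in white_noise(n, sigma, rng):
        total += step
        out.append(total)
    return out


rng = random.Random(0)  # fixed seed so the run is repeatable
flat = white_noise(10_000, 1e-12, rng)
wander = wander_noise(10_000, 1e-14, rng)
```

Integrating white noise gives the 1/f² spectral shape associated with wander, whereas the un-integrated samples have a flat spectrum.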
2. Monte Carlo analysis
Based on the described simulation environment, Monte Carlo analysis is carried out
to evaluate the BIST scheme. In the analysis, the variational parameters are the
phase noise levels of the DCO and the reference input, the TDC resolution and its
DNL and INL. For each variational parameter, the variation is set to 3σ = 10% for a
90nm CMOS technology [57].
2,000 Monte-Carlo simulation samples are generated by conducting the event-
driven simulation for the ADPLL system. The RMS jitter of the output clock during
10ms is estimated by the proposed BIST method. In the BIST mode, the ADPLL
runs 10ms for each configuration.
In order to evaluate the accuracy of test and diagnosis results, the output jitter,
the phase noise level of the reference input and the DCO are also directly measured
in Monte Carlo analysis. For the output jitter, the distributions of direct measure-
ment and BIST estimation are compared in Fig. 67. According to the pass/fail line

















(a) Directly measured output jitter.

















(b) BIST estimated output jitter.
Fig. 67. BIST estimation vs. direct measurement.
(jitter = 4 ps) in the figures, the defect escape rate is 1.5% and the yield loss rate is
2%. Fig. 68 and Fig. 69 show the accuracy of the output phase noise estimation. As can
be seen, the overall relative error is roughly 5%. The highest relative error, 20%,
occurs when the output jitter is smaller than 1 ps. The average relative error reaches its
minimum when the output jitter is around 5 ps. This estimation accuracy is not good
enough for marginal production test. However, the BIST scheme can effectively
detect large deviations of the analog modules from their nominal values, as well as
diagnose the noise sources in the ADPLL.
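The two test-quality figures quoted above can be computed from paired measurements as sketched below. The definitions are assumptions made for illustration: an escape is taken to be a sample whose measured jitter exceeds the limit while its BIST estimate passes, yield loss is the opposite case, and both are expressed as fractions of all samples. The data values are made up.

```python
def escape_and_yield_loss(measured, estimated, limit):
    """Return (defect_escape_rate, yield_loss_rate) for a pass/fail limit.

    Assumed definitions: an escape is a part whose measured jitter exceeds
    the limit but whose estimate passes; yield loss is the opposite case.
    Both rates are fractions of all samples.
    """
    pairs = list(zip(measured, estimated))
    escapes = sum(1 for m, e in pairs if m > limit and e <= limit)
    losses = sum(1 for m, e in pairs if m <= limit and e > limit)
    return escapes / len(pairs), losses / len(pairs)


# Toy data in picoseconds (made up for illustration only).
measured = [3.0, 3.9, 4.1, 5.0]
estimated = [3.1, 4.2, 3.8, 5.2]
rates = escape_and_yield_loss(measured, estimated, limit=4.0)
```

Both kinds of misclassification occur only for samples near the pass/fail line, which is consistent with the observation that the scheme is better suited to detecting large deviations than to marginal testing.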
For each test case, the BIST scheme proposed in [1], which uses neither the TDC resolution
nor loop filter reconfiguration, is also simulated. Fig. 70 compares
the results of the proposed BIST scheme and the previous one. For the previous
BIST scheme, the estimation error increases in proportion to the contribution of the
reference noise to the output, while for the proposed BIST scheme the error always
remains low. This is because the proposed scheme is able to separate the
influences of the reference noise and the DCO noise.
Fig. 68. Estimated jitter compared with measured jitter.
Fig. 69. Relative estimation error. The error is averaged over each 1 ps interval of the output jitter.
Fig. 70. The proposed BIST scheme vs. the BIST scheme in [1].
The diagnosis results are presented in Fig. 71, versus the percentage contribution
of the corresponding noise source to the output jitter. For the diagnosis of the
reference noise and the TDC noise, the relative estimation error is below 5%. The
DCO noise diagnosis has a high relative error only when its contribution to the output
jitter is low, in which case the phase noise of the DCO is not critical to
the output jitter performance. The relative error drops to about 10% for large DCO
contributions.

































Fig. 71. Average error of the diagnosis of the four main noise sources.
E. Summary
A BIST approach is proposed that targets the complex RF jitter performance of ADPLLs.
Digital signatures are collected and processed under specifically designed loop filter
configurations, and a signature-to-performance mapping is derived based on simplified
noise models and transfer function analysis. Monte Carlo analysis is carried out within
a behavioral modeling and simulation environment to evaluate the accuracy of the




IN-SITU TEST AND CALIBRATION OF
ALL DIGITAL POLAR TRANSMITTERS
This chapter extends the in-situ test scheme in Chapter IV to measure the error vector
magnitude performance of all-digital polar transmitters. Unlike the BIST scheme
in Chapter IV, this test scheme can provide measurements on the fly. The in-situ
calibration of a key analog block, the digitally controlled oscillator, is also implemented.
The work in this chapter is also published in [58].
A. Introduction
With the continuing technology scaling, the performance improvement of digital tran-
sistors has stimulated wide interest in digital intensive analog implementation [59]
[50]. However, low-cost self-adaptation and built-in self-test (BIST) solutions for
analog/mixed-signal designs, especially those for RF wireless applications, remain
a significant challenge due to the complex analog nature of circuit operation [48] [1].
In this chapter, the interaction between the analog and digital domains is exploited
for a recent digital polar transmitter architecture, to provide a novel BIST solution
aiming at its key performance measure for modulation quality, error vector magnitude
(EVM).
For mobile communication with high data rates, the polar transmission [60] can
solve the contradiction between the spectral efficiency of modulation schemes and the
power efficiency of power amplifiers (PA). An all-digital polar transmitter (ADPT)
architecture is proposed [59], as shown in Fig. 72. The coordinate rotation digital
computer (CORDIC) transforms the baseband data streams, I and Q, to their po-
lar coordinates. In the digital-to-RF-amplitude converter (DRAC), the amplitude
modulation (AM) is realized by the digitally controlled PA (DPA). In the phase
modulation (PM) path, the frequency deviation ∆f modulates the frequency of the
digitally-controlled oscillator (DCO) through an all-digital phase-locked loop (ADPLL).

Fig. 72. Diagram of an all-digital polar RF modulator.
For polar transmission, there are three noise sources [61]: the AM path, the PM
path and the delay mismatch between the AM and PM paths. For the ADPT, the
third source is a minor issue because the delay matching is guaranteed by the control
clock cycle of digital circuits [59]. In the AM and PM paths, the nonlinear map-
ping from digital control words to analog outputs will cause distortion in modulation.
Therefore, previous works have aimed at calibrating/compensating these nonlineari-
ties. For the AM path, [62] utilizes the on-chip receiver to conduct the adaptive digital
linearization of the DPA, whereas in [63] this is achieved by coupling the RF signal
to the reference clock of the ADPLL. For the PM path, [64] proposes a least-mean-square
based gain calibration technique for the DCO. Given that the nonlinear distortion
can be mostly eliminated, random noise from the analog blocks (thermal noise, shot
noise, flicker noise, etc.) becomes the main noise source.
Focusing on the PM path, i.e. the ADPLL, an RF BIST scheme is proposed to
estimate the modulation quality degradation due to the random noises from the analog
blocks in the PM path and the reference clock jitter. The proposed BIST scheme
directly aims at the EVM performance, and can provide accurate EVM estimate
under multiple parametric process, voltage, and temperature (PVT) variations.
While the bulk of the transmitter, namely digital signal processing and control
blocks, is implemented using robust digital logic, the main sources of performance
degradations are key analog blocks and their parametric variations, and the jitter
of the reference clock. The proposed RF BIST scheme specifically targets the DCO
phase noise, finite resolution of the time-to-digital converter (TDC) in addition to
the reference clock jitter. By introducing an optimized digital filter, we collect mul-
tiple realtime low-frequency phase error signals in the digital form as test signatures.
We conduct in-depth noise analysis to elucidate the correspondence between the selected
digital test signatures and the EVM. This correspondence makes it possible to
adopt simple digital processing and look-up tables (LUT) to accurately predict the
complex EVM performance of the RF transmitter.
The proposed BIST scheme is based on linear system analysis. To ensure
the linear operation of the ADPLL, the nonlinearity of the DCO that is caused by
the DCO gain mismatch needs to be calibrated. The DCO gain calibration in [64]
targets a narrowband application, EDGE (200 kHz). For the WCDMA application
(5 MHz), however, a much wider frequency tuning range leads to significant PM
path distortion due to the DCO gain mismatch. Therefore, a wide-band DCO self-calibration
is also proposed in this chapter.
B. ADPLL
In this section, the ADPLL architecture and the principle of two-point modulation
are introduced, and the requirements of WCDMA polar transmission on
the ADPLL are discussed.
1. Architecture
The ADPLL architecture is shown in Fig. 73. The phase detection is accomplished by
a TDC and a digital phase accumulator, where the phase of reference clock (FREF) is
multiplied by the frequency control word (FCW) and then compared to the phase of
the DCO signal (FDCO). The purely digital signal, phase error (PHE), is filtered by
the loop filter, and the filtered DCO tuning word (DTW) is used to tune the FDCO.
When the loop is stable, the DCO frequency fDCO can be expressed as:
fDCO ≈ N × fREF, (5.1)
where fREF is the frequency of the FREF and N is the value of the FCW and can be
a fractional number.
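As a quick numeric check of Eq. (5.1), the sketch below evaluates the locked DCO frequency for two settings that appear in this work: the 26 MHz reference with N = 150.7, and the 10 MHz reference with N = 240.1875 from the Chapter IV simulations. The function name is hypothetical.

```python
def dco_frequency(fcw, f_ref):
    """Locked DCO frequency from Eq. (5.1): f_DCO ~= N * f_REF."""
    return fcw * f_ref


f_wcdma = dco_frequency(150.7, 26e6)     # 26 MHz reference, N = 150.7
f_adpll = dco_frequency(240.1875, 10e6)  # 10 MHz reference, N = 240.1875
```

The second case gives 2.401875 GHz, matching the "about 2.4 GHz" output clock quoted for the Chapter IV setup; the fractional part of N is what allows output frequencies between integer multiples of the reference.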
2. Two-point modulation
Two-point modulation is a common technique for polar transmitters [61]. Its objective
is to obtain a quick response at the oscillator frequency without changing the lock-in
state of the loop, by synchronously modulating both the frequency divider ratio and
the oscillator frequency control signal.
For the ADPLL, two-point modulation is realized by changing the FCW and the
DTW at the same time. In order to quantitatively control the DTW offset to match
























































































































































































is achieved through adding the FCW offset to the normalized DTW (NDTW).
3. Requirements of WCDMA
Eq. (5.1) shows that the DCO frequency is controlled by the FCW. Here we focus
on the range of the FCW, which is the summation of the channel FCW that is set
by the channel frequency and the data FCW that is fed by the modulating frequency
deviation ∆f .
For a discrete-time system like the ADPLL, the modulation will be distorted
by spectral aliasing if the rate of change of the phase signal θ exceeds half of the
sampling frequency. In the ADPLL, the reference clock is also used as the sampling
clock, which is usually tens of megahertz. For the rest of the chapter, a reference
frequency of 26 MHz is assumed. On the other hand, although the I/Q signals of
WCDMA are bandlimited (5 MHz), the CORDIC transforms them to θ with an unlimited
bandwidth by performing arctan on the ratio of Q to I. This undesired
bandwidth growth can be alleviated using the time-domain signal processing method
in [65]. Fig. 74 shows the frequency deviation of a typical WCDMA modulation after
the bandwidth reduction. Using this technique, the rate of change of θ, i.e. ∆f, stays
within the range of ±13 MHz. Although some EVM degradation is inevitable,
it is carefully kept to a minimum.
C. RF BIST for EVM
In this section, a BIST scheme is proposed that builds an accurate mapping from
the digital signatures, PHE and PHE1, to the EVM degradation caused by the random noise sources described above.
PHE is the phase detection output. PHE1 is produced by filtering PHE with an
additional branch filter. The signatures have different sensitivities to different noise













Fig. 74. The frequency deviation of a typical WCDMA modulation in one WCDMA
slot (667µs) with the bandwidth reduction technique.
sources, which are pre-calculated from the filter configuration and stored in the BIST
scheme. The branch filter is optimized to distinguish the contributions of different
noise sources to the signatures from each other. Note that the branch filter does not
influence the normal functioning of the ADPLL, which enables online testing.
1. z-domain model
Given the assumption that the DCO gain can be calibrated, the ADPLL can be
treated as a linear system. The influence of the noise sources in the ADPLL on the
EVM degradation can be analyzed based on the z domain model in Fig. 75. The
open-loop transfer function is defined as:
HOL(z) = r ·HLF(z)/(z − 1), (5.2)
where HLF(z) is the transfer function of the loop filter. A common first-order loop filter
(HLF(z) = α + ρ/(z − 1)) is used in the following discussion. The factor r = KDCO/K̂DCO
is the ratio of the DCO gain KDCO to its estimate K̂DCO. Because of the DCO gain
calibration that will be introduced later, r = 1 is safely assumed. z-domain expressions
can be transformed to the frequency domain by substituting $z = e^{2\pi j f / f_{REF}}$,
where fREF is the sampling rate.
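The substitution can be tried numerically. The sketch below evaluates Eq. (5.2) on the unit circle for the first-order loop filter, using the loop settings quoted in this chapter (fREF = 26 MHz, α = 2^-7, ρ = 2^-15) and r = 1. The helper names are hypothetical, and only the open-loop response is computed.

```python
import cmath
import math

F_REF = 26e6      # reference (sampling) frequency in Hz
ALPHA = 2 ** -7   # proportional coefficient
RHO = 2 ** -15    # integral coefficient


def z_at(f, f_ref=F_REF):
    """Point on the unit circle corresponding to frequency f."""
    return cmath.exp(2j * math.pi * f / f_ref)


def h_lf(z, alpha=ALPHA, rho=RHO):
    """First-order loop filter: H_LF(z) = alpha + rho / (z - 1)."""
    return alpha + rho / (z - 1)


def h_ol(z, r=1.0):
    """Open-loop transfer function of Eq. (5.2)."""
    return r * h_lf(z) / (z - 1)


gain_100hz = abs(h_ol(z_at(100.0)))
gain_1mhz = abs(h_ol(z_at(1e6)))
```

The open-loop gain is very large at 100 Hz and well below one at 1 MHz, which is the behavior expected of a type-II loop with two poles at z = 1.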
Fig. 75. The z-domain model of the ADPLL including noise sources.
Again, because of the assumption that the DCO gain can be calibrated, the PM
path distortion will be mostly contributed by the reference clock jitter, the TDC
quantization noise and the DCO phase noise, rather than the DCO gain mismatch.
The three sources of noise are included in the z domain model of the ADPLL: φN,REF
from the reference clock, φN,DCO from the DCO and φN,TDC from the TDC. The
closed-loop transfer functions from the noise sources to the output phase noise and
the digital signatures are listed in Table VII, where HIIR(z) is the transfer function
of the branch low-pass IIR filter.














































The three noise sources are mathematically modeled in the frequency domain to help
analyze the relationship between PHE and the output phase noise.
The reference clock is usually generated by a crystal oscillator, and thus provides
a single-tone spectrum with little spectral spread. The relatively flat spectrum can
be simplified as constant, whose power spectral density (PSD) is:
ΦREF(∆f) = LREF, (5.3)
where ∆f is the frequency offset (from −fREF/2 to fREF/2). LREF is a constant
describing the noise level of the reference phase noise.
The second noise source is from the TDC due to its time resolution. Similar
to the quantization noise of analog-to-digital converters, the TDC quantization noise
can be modeled as an additive random variable with uniform distribution and white
noise spectral characteristic:
ΦTDC(∆f) = LTDC, (5.4)
where LTDC is the constant noise level.
Apart from the reference clock and the TDC, the DCO is another major noise
source. The wander noise is the dominant noise mechanism of the DCO within the
frequency range concerned. It is generally caused by the white-noise fluctuation of
the oscillating frequency. Besides, the DCO quantization noise and the DCO power
supply noise will also cause the wander noise [54] [55]. The PSD of the DCO wander
noise can be modeled as:
$\Phi_{DCO}(\Delta f) = L_{DCO}/\Delta f^{2}$, (5.5)























































(b) The PSD of PHE.
Fig. 76. The composition of the noise.
The frequency-domain noise models can be instantiated according to typical noise
levels. Together with the transfer functions in Table VII, the spectra of the output
phase noise and PHE are shown in Fig. 76, under the loop settings of fREF = 26 MHz,
N = 150.7, $\alpha = 2^{-7}$ and $\rho = 2^{-15}$. The DCO noise contribution to
PHE decreases dramatically as the frequency increases. Table VIII shows that within
the high frequency range, only 0.09% of the PHE power comes from the DCO phase noise.
This large difference between the noise contributions makes it possible to distinguish
the DCO phase noise from the other two noise sources, and therefore enables an accurate
estimation of the output phase noise.
Table VIII. The comparison of the noise contributions (low frequency range:
100Hz-100kHz; high frequency range: 100kHz-13MHz).
                                    ref.      TDC       DCO
Output phase noise   @all freq.     42.87%    14.29%    42.83%
                     @low freq.     44.33%    14.78%    40.89%
                     @high freq.    35.60%    11.87%    52.53%
PHE                  @all freq.     74.67%    24.89%    0.44%
                     @low freq.     46.75%    15.58%    37.67%
                     @high freq.    74.93%    24.98%    0.09%
3. BIST principle
The relationship between the phase noise of the ADPLL output and the EVM performance
is depicted in Fig. 77. Assuming the error vector $\vec{v}_e$ is much shorter than
the ideal vector $\vec{v}_i$, and there is no noise in the amplitude, we can write:
$|\vec{v}_e| \approx |\vec{v}_i| \cdot \sin\phi_{OUT} \approx |\vec{v}_i| \cdot \phi_{OUT}$, (5.6)








$EVM_{RMS} \approx \phi_{OUT,RMS}$, (5.8)
which means that the EVM (RMS) is equal to the phase noise (RMS) of the ADPLL









Fig. 77. The relationship between the EVM and the phase noise.
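The small-angle relation between EVM and phase noise can be checked numerically. In the sketch below the ideal vector has unit length and only its phase is perturbed; the error-vector length is computed exactly as |e^{jφ} − 1| and compared with the RMS phase error. The phase values are made up, and the function names are hypothetical.

```python
import cmath
import math


def rms(values):
    """Root-mean-square of a sequence of real numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def evm_from_phase(phase_errors):
    """RMS error-vector magnitude for a unit-amplitude signal with pure phase error."""
    return rms([abs(cmath.exp(1j * p) - 1) for p in phase_errors])


# Made-up phase errors in radians, small enough for sin(x) ~= x to hold.
phases = [0.01, -0.02, 0.015, -0.005, 0.02]
evm = evm_from_phase(phases)
phi_rms = rms(phases)
```

For phase errors of a few hundredths of a radian the two quantities agree to well within 0.1%, which is why the RMS phase noise can stand in for the RMS EVM when amplitude noise is ignored.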






where ΦOUT(f) is the PSD of the output phase noise, which can be further expressed
in terms of the noise levels:
$\Phi_{OUT}(f) = L_{REF}|H_{R2O}(f)|^2 + L_{TDC}|H_{T2O}(f)|^2 + L_{DCO}|H_{D2O}(f)|^2/f^2$, (5.10)
in which the transfer functions are defined in Table VII. Noting that HR2O = N · HT2O
for any frequency, a new noise level can be defined to treat the reference noise and
the TDC noise as a whole:
$L_{R\&T} = N L_{REF} + L_{TDC}$. (5.11)
Combining Eq. (5.8) to Eq. (5.10), we can write:
$EVM_{RMS}^2 \approx L_{R\&T} C_{T2O} + L_{DCO} C_{D2O}$, (5.12)
where CT2O and CD2O are power transfer coefficients, i.e. the power transfer coefficient

















All possible instances of HX2Y are listed in Table VII.
In Eq. (5.12), the power transfer coefficients can be pre-calculated according to
the loop settings, while the noise levels are unknown because of the PVT variations.
The proposed BIST scheme provides a way to calculate these noise levels by processing
PHE and PHE1.
Similar to Eq. (5.12), the power difference between PHE and PHE1 can be
written as:
$PHE_{RMS}^2 - PHE1_{RMS}^2 = L_{R\&T}(C_{T2P} - C_{T2P1}) + L_{DCO}(C_{D2P} - C_{D2P1})$. (5.14)
As mentioned before, the high frequency components of PHE are dominated
by the reference and TDC noise contributions. In Eq. (5.14), the second term
on the right-hand side is therefore much smaller than the first term, and Eq. (5.14) can be
approximated as:
$PHE_{RMS}^2 - PHE1_{RMS}^2 \approx L_{R\&T}(C_{T2P} - C_{T2P1})$, (5.15)
which can be used to calculate LR&T. LDCO can then be calculated from
$PHE1_{RMS}^2 = L_{R\&T} C_{T2P1} + L_{DCO} C_{D2P1}$. (5.16)
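The chain from signature powers to an EVM estimate can be sketched in a few lines: Eq. (5.15) gives L_R&T, Eq. (5.16) then gives L_DCO, and Eq. (5.12) turns both into EVM². The coefficient values below are hypothetical placeholders chosen only to exercise the arithmetic; in the scheme they would be precomputed from the transfer functions. The dictionary layout and function names are likewise assumptions.

```python
def estimate_noise_levels(phe_pow, phe1_pow, c):
    """Recover L_R&T and L_DCO from the two signature powers.

    `c` maps coefficient names to precomputed power transfer coefficients.
    Uses Eq. (5.15) for L_R&T, then Eq. (5.16) for L_DCO.
    """
    l_rt = (phe_pow - phe1_pow) / (c["T2P"] - c["T2P1"])
    l_dco = (phe1_pow - l_rt * c["T2P1"]) / c["D2P1"]
    return l_rt, l_dco


def evm_squared(l_rt, l_dco, c):
    """Eq. (5.12): EVM_RMS^2 ~= L_R&T * C_T2O + L_DCO * C_D2O."""
    return l_rt * c["T2O"] + l_dco * c["D2O"]


# Hypothetical coefficients; C_D2P and C_D2P1 are set equal so Eq. (5.15) is exact here.
coeffs = {"T2P": 1.0, "T2P1": 0.4, "D2P": 0.5, "D2P1": 0.5, "T2O": 2.0, "D2O": 0.25}
true_l_rt, true_l_dco = 2.0, 3.0
phe_pow = true_l_rt * coeffs["T2P"] + true_l_dco * coeffs["D2P"]
phe1_pow = true_l_rt * coeffs["T2P1"] + true_l_dco * coeffs["D2P1"]
l_rt, l_dco = estimate_noise_levels(phe_pow, phe1_pow, coeffs)
```

When C_D2P and C_D2P1 differ, the neglected DCO term in Eq. (5.14) shows up as a systematic error in L_R&T, which is the quantity the branch filter optimization tries to keep small.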















The block diagram of the proposed BIST scheme is shown in Fig. 78. The average
powers of both PHE and PHE1 are calculated within one WCDMA time slot (667 µs).
Then $PHE_{RMS}^2$ and $PHE1_{RMS}^2$







where KPHE and KPHE1 are the coefficients of the linear combination. Compared to










Similarly, for the sake of diagnosis, the noise estimates can be generated by
linearly combining $PHE_{RMS}^2$ and $PHE1_{RMS}^2$






























Fig. 78. BIST block diagram.
5. The optimization of the branch filter
When optimizing the structure of the additional branch filter, 3 objectives are consid-
ered. The first is to minimize the systematic error εsys introduced by the approxima-
tion of Eq. (5.15). The second is to minimize the influence of the random fluctuation
of the signature observation due to the randomness of noises. This objective can
























≈ |KPHE|+ |KPHE1| .
(5.20)
Last but not least, the hardware cost of the additional branch filter should also be
minimized.
The low-pass filter is implemented by a cascade of single-pole IIR filters, whose
Table IX. The systematic errors and the EVM sensitivities to the digital signatures.
            λ = 2^-1                    λ = 2^-2                    λ = 2^-3
NIIR = 1    εsys = 0.20%, S = 1.9953    εsys = 0.17%, S = 1.3494    εsys = 0.16%, S = 1.1928
NIIR = 2    εsys = 0.14%, S = 1.4572    εsys = 0.12%, S = 1.1864    εsys = 0.10%, S = 1.1494
NIIR = 3    εsys = 0.12%, S = 1.3215    εsys = 0.11%, S = 1.1546    εsys = 0.09%, S = 1.1559




z − (1− λ)
)NIIR
, (5.21)
where λ is usually a negative integer power of two for ease of hardware implementation,
and NIIR is the number of cascaded filters. This structure ensures
that the low-pass filter has unity gain at DC, so that only the high
frequency power of PHE is filtered. The filter bandwidth is proportional to λ, and NIIR
can be increased when a steeper roll-off is needed.
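The effect of NIIR on the roll-off can be illustrated with a short numerical sketch. It assumes each stage has the form λz/(z − (1 − λ)), one common single-pole form with unity DC gain; only the denominator and the unity DC gain are taken from the text, so the numerator is an assumption. The function name and the 5 MHz probe frequency are hypothetical.

```python
import cmath
import math


def h_iir(f, lam, n_iir, f_ref=26e6):
    """Cascade of n_iir identical single-pole low-pass stages.

    Assumed stage: lam * z / (z - (1 - lam)); only the denominator and the
    unity DC gain are taken from the text.
    """
    z = cmath.exp(2j * math.pi * f / f_ref)
    stage = lam * z / (z - (1 - lam))
    return stage ** n_iir


dc_gain = abs(h_iir(0.0, 2 ** -3, 3))
gain_1 = abs(h_iir(5e6, 2 ** -3, 1))
gain_3 = abs(h_iir(5e6, 2 ** -3, 3))
```

Under this assumption the DC gain is exactly one for any NIIR, while the attenuation at 5 MHz grows as the cube of the single-stage attenuation when three stages are cascaded.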
Table IX lists εsys and S corresponding to different groups of λ and NIIR. It is
obtained by the transfer function analysis under typical noise levels. Based on the 3
objectives of the filter optimization, the low-pass filter with λ = 2−3 and NIIR = 3
should be one of the optimized choices.
D. DCO gain calibration
The above BIST scheme is based on the assumption that noise propagates through a
linearized system, which requires it to be combined with the wide-band DCO nonlinearity
calibration.
1. DCO gain mismatch
The DCO for polar WCDMA transmitters in 90nm CMOS is shown in Fig. 79 [66].
The DCO core is composed of a cross coupled gm core, an LC tank, a current source
and a 2nd harmonic trap. The output frequency is tuned by switching on/off the MOS
capacitors according to the DTW. Note that the proposed DCO gain calibration can










Fig. 79. DCO core with LC tank and biasing network in [12].
The DCO gain KDCO is defined as:
KDCO = ∆fDCO/∆DTW, (5.22)
where ∆DTW is the increment of the DTW, and ∆fDCO is the corresponding frequency change.
Table X. EVM degradation due to DCO gain mismatch (PM path only, no random noise).
Case                                       EVM
No DCO gain mismatch                       2.04%
DCO gain mismatch with 1σ cap mismatch     3.51%
DCO gain mismatch with 3σ cap mismatch     6.44%









where L is the inductance, C is the total capacitance in parallel with the inductor, and ∆C
is the change of the total capacitance due to ∆DTW. Eq. (5.23) suggests that cap
mismatch leads to variation of the DCO gain.
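The dependence of the DCO gain on the switched capacitance can be illustrated with the ideal LC-tank resonance formula f = 1/(2π√(LC)) and its first-order approximation ∆f ≈ −f·∆C/(2C). These are standard relations used here as assumptions rather than a transcription of Eq. (5.23), and the component values are hypothetical.

```python
import math


def lc_frequency(inductance, capacitance):
    """Resonant frequency of an ideal LC tank: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))


def frequency_step(inductance, capacitance, delta_c):
    """Exact frequency change when the tank capacitance changes by delta_c."""
    return lc_frequency(inductance, capacitance + delta_c) - lc_frequency(inductance, capacitance)


# Hypothetical tank: 1 nH and 1.6 pF (about 3.98 GHz), with a 1 fF unit capacitor.
L_TANK, C_TANK, C_UNIT = 1e-9, 1.6e-12, 1e-15
nominal = frequency_step(L_TANK, C_TANK, C_UNIT)
mismatched = frequency_step(L_TANK, C_TANK, 1.05 * C_UNIT)  # 5% larger unit cap
first_order = -lc_frequency(L_TANK, C_TANK) * C_UNIT / (2 * C_TANK)
```

A unit capacitor that is 5% too large produces a frequency step that is almost exactly 5% too large, so cap mismatch translates nearly one-to-one into DCO gain error.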
The DCO gain mismatch causes frequency errors in the feed-forward path of
the two-point modulation scheme. Since this frequency error directly modulates the
DCO frequency, it results in considerable modulation distortion. Based on
the cap mismatch information from a commercial 90nm technology and the foundry-provided
design kit, Table X lists the EVM degradation for different DCO cap
mismatch deviations, where σ is the typical deviation of cap mismatch (around 5%). Although
no random noise from analog blocks is considered, there is still some EVM
degradation even with no DCO gain mismatch, because of the bandwidth reduction
technique.
2. DCO gain calibration
Given that the DCO mismatch pushes the EVM performance to the pass/fail margin
(typically around 5% for the PM path), a DCO gain calibration scheme is proposed
for WCDMA. Its objective is to compensate the DCO gain mismatch caused by the
cap mismatch. The calibration scheme has two modes: (i) DCO mismatch detection
and (ii) DCO gain compensation. Mode (i) runs at the system power-on reset or
during the time interval between transmission windows, whereas Mode (ii) takes effect
throughout the modulation process.
The entire structure of the calibration is shown in Fig. 80. If mode_sel is set to
0, Mode (i) is chosen, in which the sampled DCO gains are detected and stored
into a lookup table (LUT), as in Fig. 81(a). If mode_sel is set to 1, Mode (ii)
is chosen, in which the contents of the LUT are linearly interpolated to compensate the





































Fig. 81. The relationship between ∆FCW/∆NDTW and ∆DTW.
E. Simulation results
1. Simulation platform
A SPICE simulation of the entire ADPLL-based transmitter takes days for 1 µs
of simulated time, and is therefore not practical [68]. An event-driven simulation
platform using VHDL is proposed in [68] to analyze the ADPLL noise performance,
and is validated by real chip measurements. This approach is adopted in this
chapter, but using Verilog and SystemVerilog.
The simulation platform is composed of the test environment (Env.), written in SystemVerilog,
and the design under test (DUT), written in Verilog, as shown in Fig. 82.
According to the WCDMA Release 5 standard, the 8-PSK modulation scheme is adopted.
The stimulus module contains the random transmission data generation, the subsequent
CORDIC, and the calibration control. The synthesizable Verilog describes
the RTL implementation of the DUT, including the digital circuits of the ADPLL
and the additional calibration and BIST circuits. The noise models, though part
of the DUT, are written in SystemVerilog, and their noise levels can be configured
by the configuration module. The ADPLL output is imported to the score board,

Fig. 82. The event-driven simulation platform for the ADPLL.
The digital circuits are simulated at the RTL level, e.g. the digital signals use 6 bits
for the integer part and 15 bits for the fractional part, so that the quantization
effect is modeled. The noise models closely depict the real performance of the
analog circuits, e.g. both the reference jitter and the DCO phase noise are modeled by
three segments [68]. Also, the TDC nonlinearity is included in the TDC model, with
its differential nonlinearity and integral nonlinearity both lower than 0.7 LSB [69].
Both the noise levels and the TDC resolution can be configured through the Env.
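The quantization effect of the 6-bit integer and 15-bit fractional word can be illustrated with a small Python sketch. The word widths come from the text above; the unsigned format, round-to-nearest behavior, and saturation at the range limits are assumptions, since the rounding and overflow handling of the RTL are not stated here.

```python
INT_BITS = 6    # integer-part width from the text
FRAC_BITS = 15  # fractional-part width from the text


def quantize(x, int_bits=INT_BITS, frac_bits=FRAC_BITS):
    """Round x to an unsigned fixed-point grid and saturate at the range limits.

    Unsigned format, round-to-nearest and saturation are assumptions.
    """
    scale = 1 << frac_bits
    max_code = (1 << (int_bits + frac_bits)) - 1
    code = round(x * scale)
    code = max(0, min(max_code, code))
    return code / scale


lsb = 1 / (1 << FRAC_BITS)          # weight of one fractional LSB
q_error = abs(quantize(0.3) - 0.3)  # quantization error for one sample value
```

Under these assumptions the error of any in-range value is bounded by half an LSB, which is the quantization noise that the RTL-level simulation captures.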
2. Simulation results for DCO gain calibration
The proposed DCO gain calibration is run for two DCO instances obtained from a
commercial 90nm technology and the foundry-provided design kit: one with the 1σ
cap mismatch and the other with the 3σ cap mismatch. Here σ is the typical deviation
of cap mismatch for the technology, which is around 5%. The constellation graphs
for the 3σ case are shown in Fig. 83.
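For reference, the EVM figure used throughout these results can be computed from a constellation as sketched below. This uses one common definition of RMS EVM (RMS error vector normalized by the RMS reference magnitude); the exact normalization used in the simulation platform is not stated in this chapter, so it is an assumption, as are the helper names psk8_symbol and evm_rms.

```python
import cmath
import math


def psk8_symbol(k):
    """Ideal unit-magnitude 8-PSK constellation point for index k (0..7)."""
    return cmath.exp(1j * 2 * math.pi * k / 8)


def evm_rms(measured, reference):
    """RMS EVM: RMS error-vector magnitude over RMS reference magnitude."""
    err = sum(abs(m - r) ** 2 for m, r in zip(measured, reference))
    ref = sum(abs(r) ** 2 for r in reference)
    return math.sqrt(err / ref)


reference = [psk8_symbol(k) for k in range(8)]
# A pure phase error of 0.05 rad on every symbol, mimicking PM path distortion.
rotated = [r * cmath.exp(1j * 0.05) for r in reference]
phase_only_evm = evm_rms(rotated, reference)
```

For a pure phase rotation by θ the error vector has magnitude 2 sin(θ/2), so a small phase error on the PM path maps almost linearly onto the EVM.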
Fig. 83. The constellation graphs for the worst mismatch case: (a) w/o calibration; (b) w/ calibration (sample step = 1MSB).
The output EVM versus the sample step (the gap between two adjacent samples
of the DCO gain) is shown in Fig. 84(a), with the noise models configured to the
typical noise levels, whereas Fig. 84(b) shows the results at twice the typical noise levels.
The EVM results with no DCO gain mismatch are also shown for reference.
It is clearly seen that a denser distribution of samples leads to better DCO gain
curve fitting when detecting the DCO mismatch, and therefore results in better EVM
performance. With a sample step of 1MSB, the proposed calibration improves
the EVM performance to the reference level, which means that the remaining EVM
degradation results mainly from noise sources such as the reference noise, the TDC noise and the
DCO phase noise, rather than the DCO mismatch. This observation is in accordance
with the assumption in the BIST principle analysis.
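The dependence of the curve-fitting quality on the sample step can be reproduced with a toy experiment. The gain curve below is purely hypothetical (a smooth quadratic chosen for illustration, not extracted from the design kit), and the step is expressed in tuning-word units rather than MSBs; only the trend is meant to carry over.

```python
def gain_curve(word):
    """Hypothetical smooth DCO gain curve (NOT from the source)."""
    return 1.0 + 0.05 * (word / 640.0) ** 2


def max_interp_error(step, n_words=640):
    """Worst-case error of linear interpolation between samples spaced by step."""
    worst = 0.0
    for w in range(n_words + 1):
        w0 = (w // step) * step
        w1 = min(w0 + step, n_words)
        if w1 == w0:
            est = gain_curve(w0)
        else:
            g0, g1 = gain_curve(w0), gain_curve(w1)
            est = g0 + (g1 - g0) * (w - w0) / (w1 - w0)
        worst = max(worst, abs(est - gain_curve(w)))
    return worst


# Worst-case interpolation error for progressively denser sampling.
errors = {step: max_interp_error(step) for step in (64, 16, 4, 1)}
```

For a smooth curve the linear-interpolation error shrinks roughly with the square of the sample step, which is consistent with the EVM trend observed in Fig. 84.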
Fig. 84. The output EVM versus the sample step: (a) typical noise levels; (b) twice the typical noise levels.
3. Simulation results for EVM BIST
Monte Carlo analysis is carried out to evaluate the proposed BIST scheme. 2,000
Monte Carlo simulation samples are generated by conducting the event-driven
simulation of the ADPLL-based polar transmitter. For each sample, the configuration
control in the Env. generates random noise levels for the noise models. The variation
of each noise level is set such that 3σ = 10% of the mean value.
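The randomized noise configuration can be sketched as follows. The sample count and the 3σ = 10% spread come from the text; the Gaussian distribution, the noise source names, and the unit mean values are assumptions introduced only for this illustration.

```python
import random

N_SAMPLES = 2000  # number of Monte Carlo samples from the text


def draw_noise_level(mean, rng):
    """Draw one noise level with 3*sigma equal to 10% of the mean (Gaussian assumed)."""
    sigma = 0.10 * mean / 3.0
    return rng.gauss(mean, sigma)


def draw_configurations(typical_levels, n=N_SAMPLES, seed=0):
    """Return n dicts, each mapping a noise source name to a randomized level."""
    rng = random.Random(seed)
    return [
        {name: draw_noise_level(mean, rng) for name, mean in typical_levels.items()}
        for _ in range(n)
    ]


# Placeholder typical levels in arbitrary units (hypothetical values).
typical = {"reference": 1.0, "tdc": 1.0, "dco_phase": 1.0}
configs = draw_configurations(typical)
```

Each entry of configs would then parameterize one event-driven simulation run; fixing the seed keeps the sample set reproducible.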
The EVM of the ADPLL output clock during one WCDMA time slot is estimated
by the proposed BIST method. Fig. 85 compares the measured EVM and the
estimation by the BIST. As can be seen, the BIST provides an accurate EVM estimation
(with errors smaller than 3% for 95% of the samples) when working together with the DCO
gain calibration. This verifies our earlier assumption that the PM path distortion
caused by the DCO gain mismatch is negligible after DCO gain calibration. On the
other hand, the estimation error is large without the DCO gain calibration. This is
because the DCO mismatch contributes a significant part of the EVM degradation that
cannot be handled by the BIST scheme. Therefore, the EVM BIST is only effective
when the DCO gain calibration is in effect. In addition, when the EVM is lower
than 3%, the EVM estimation tends to be smaller than the measured value. The
reason is that the proposed BIST cannot detect the EVM degradation caused by the
bandwidth reduction technique mentioned before. This is not a problem,
since such cases are far away from the pass/fail margin.
Fig. 85. Estimated EVM vs. simulated EVM: (a) with the DCO calibration; (b) w/o the DCO calibration.
A pass/fail test is also carried out, as shown in Fig. 86. The 3GPP standard
requires the EVM not to exceed 17.5% for a WCDMA transmitter. Since only the
PM path distortion is considered here, we set the pass/fail line as EVM_RMS = 5%.
If the pass/fail test is used for production test, then the defect escape rate is 0.35%,
and the yield loss rate is 0.75%.
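The two production-test figures can be computed from paired EVM values as in the sketch below. The 5% limit comes from the text; the function name pass_fail_rates and the choice to normalize both rates by the total number of samples are assumptions, since the normalization is not stated.

```python
def pass_fail_rates(true_evm, est_evm, limit=0.05):
    """Return (defect escape rate, yield loss rate) over all samples.

    Escape: the BIST estimate passes but the true EVM exceeds the limit.
    Yield loss: the BIST estimate fails but the true EVM meets the limit.
    Both rates are normalized by the total sample count (an assumption).
    """
    n = len(true_evm)
    escapes = sum(1 for t, e in zip(true_evm, est_evm) if t > limit and e <= limit)
    losses = sum(1 for t, e in zip(true_evm, est_evm) if t <= limit and e > limit)
    return escapes / n, losses / n


# Four illustrative samples: one escape and one yield-loss case.
true_vals = [0.040, 0.060, 0.049, 0.051]
est_vals = [0.045, 0.048, 0.052, 0.055]
escape_rate, yield_loss_rate = pass_fail_rates(true_vals, est_vals)
```

Under this normalization, rates of 0.35% and 0.75% over 2,000 Monte Carlo samples would correspond to 7 and 15 misclassified samples, respectively.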
Fig. 86. Pass/fail test: (a) estimated EVM distribution; (b) simulated EVM distribution.
In Table XI, the areas are estimated for the ADPLL (from existing layouts) and the
proposed self-calibration and BIST (from synthesized results), using a commercial
90nm technology. The hardware overhead is 9.1% of the ADPLL area, which is
acceptable. Most of these costs can be further saved in the system-on-chip solution
proposed in [59]. Both the ARM7 MPU and the C54 DSP run at a clock rate
of 104MHz, four times f_REF of the ADPLL. This implies that their processing
speed is not a restriction to their reuse for digital processing. For the
DCO gain calibration, around 640 samples are needed to cover the FCW tuning
range of WCDMA, with a sample step of 1MSB. Supposing the LUT takes 2 bytes
per sample, it takes 10kb of memory in total, which is relatively small compared to
the 2.5Mb on-chip SRAM. If the digital processing and the LUT can be realized by reusing
the on-chip resources, the only area overhead will be from the additional branch IIR
filter, which is less than 1% of the ADPLL area.
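The LUT memory estimate follows from simple arithmetic, spelled out below. The sample count, bytes per sample, and SRAM size are taken from the text; interpreting kb and Mb as binary kilobits and megabits (1024-based) is an assumption.

```python
SAMPLES = 640          # LUT entries from the text
BYTES_PER_SAMPLE = 2   # storage per entry from the text
SRAM_MBIT = 2.5        # on-chip SRAM size from the text

lut_bits = SAMPLES * BYTES_PER_SAMPLE * 8   # total LUT size in bits
lut_kbit = lut_bits / 1024                  # taking 1 kb = 1024 bits (assumption)
sram_bits = SRAM_MBIT * 1024 * 1024         # taking 1 Mb = 1024 * 1024 bits (assumption)
lut_fraction = lut_bits / sram_bits         # share of the SRAM used by the LUT
```

With these conventions the LUT occupies 10 kb, a fraction of a percent of the on-chip SRAM.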
G. Summary
A self-calibration scheme that compensates the DCO gain mismatch and a
BIST scheme targeting the EVM performance are proposed for an ADPLL-based WCDMA
polar transmitter; both are validated on an event-driven simulation platform. The
digital-like implementation keeps the cost of the proposed approaches low. The proposed
BIST scheme focuses on the EVM degradation from the PM path. Together with
the self-testing of the AM path, it is possible to provide a complete solution for the
BIST of EVM in the future.
CHAPTER VI
CONCLUSIONS AND FUTURE DIRECTIONS
A. Conclusions
This dissertation focuses on developing novel verification and test techniques for
improving the robustness of AMS designs in highly scaled CMOS technologies. A
formal verification framework is proposed that incorporates nonlinear SMT solving
techniques and simulation exploration, with a Bayesian inference based approach to
balance the costs of simulation and SMT solving. The feasibility and efficacy of the
proposed methodology are demonstrated on the verification of the lock time specification
of a charge-pump PLL. In addition, in-situ test techniques are proposed for
AMS designs for error detection after fabrication. First, a novel two-level structure
of GRO-PVDL is proposed to measure the jitter performance on chip for high-speed
high-resolution applications. Taking advantage of quantization noise shaping, an
effective resolution of 0.8ps is achieved using 90nm CMOS technology. Second, the
reconfigurability of recent ADPLL designs is exploited to provide novel in-situ output
jitter test and diagnosis abilities under multiple parametric variations of key analog
building blocks. As an extension, an in-situ test scheme is proposed to provide online
testing for ADPLL based polar transmitters.
B. Future directions
First, the verification of AMS designs, especially those with complex nonlinear
dynamics, may take a middle way between formal verification and conventional
simulation-based verification. Although formal verification techniques have found great success
in digital designs and linear analog designs, the huge cost of formally verifying
nonlinear properties limits their feasibility for nonlinear AMS designs. Therefore, it is
desirable to combine formal checking techniques and simulations, to achieve both high
coverage and efficiency. It is also anticipated that a statistical framework might be the
“glue” in such a combination, because statistical methods can potentially provide a
rigorous view of uncertain circuit properties and a statistically defined coverage for the
verification that is composed of formal methods and simulations.
Second, in-situ test designs may become an essential part of future AMS designs.
Analog designs now have increasing digital content that tests or calibrates the variable
analog performance in highly scaled technologies. The future development of in-situ
test techniques may emphasize two directions. The first is to develop innovative
techniques that transform analog properties into digital signals to achieve both high
observability and low interference to analog circuits. The second is to increase the
testability of analog circuits by exploiting the interaction between the analog and
digital circuits in a mixed-signal environment.
Last but not least, an even higher level picture in the future is the combination
of verification and DfT techniques. Considering that verification and DfT techniques
both aim at error detection, the tradeoff between verification and DfT techniques
can be leveraged to minimize the overall implementation cost as well as maximize
the overall error coverage. For example, we can trade off between the runtime cost
of verification and the hardware cost of DfT. If, due to runtime limitations, 100%
error coverage is too costly to achieve in verification, then some error checks can be
intentionally skipped in verification. At the same time, the accompanying in-situ test
will emphasize the errors that are not covered by the verification, given such errors

[1] R. B. Staszewski, I. Bashir, and O. Eliezer, “RF built-in self test of a wireless
transmitter,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 54,
no. 2, pp. 186–190, February 2007.
[2] S. Nassif, “Process variability at the 65nm node and beyond,” in Proceedings
of IEEE Custom Integrated Circuits Conference, September 2008, pp. 1–8.
[3] “The international technology roadmap for semiconductors,”
http://public.itrs.net/, accessed in April 2012.
[4] B.C. Lim, J. Kim, and M.A. Horowitz, “An efficient test vector generation for
checking analog/mixed-signal functional models,” in Proceedings of the 47th
Design Automation Conference, June 2010, pp. 767–772.
[5] H. Chang and K. Kundert, “Verification of complex analog and RF IC designs,”
Proceedings of the IEEE, vol. 95, no. 3, pp. 622–639, March 2007.
[6] M. Jarwala and S.J. Tsai, “A framework for design for testability of mixed
analog/digital circuits,” in Proceedings of the IEEE Custom Integrated Circuits
Conference, May 1991, pp. 13–15.
[7] S. Gupta, B.H. Krogh, and R.A. Rutenbar, “Towards formal verification of
analog designs,” in Proceedings of International Conference on Computer Aided
Design, November 2004, pp. 210–217.
[8] W.K. Lam, Hardware Design Verification: Simulation and Formal Method-
Based Approaches (Prentice Hall Modern Semiconductor Design Series), Pren-
tice Hall PTR, Upper Saddle River, 2005.
[9] M. Franzle, C. Herde, T. Teige, S. Ratschan, and T. Schubert, “Efficient solving
of large non-linear arithmetic constraint systems with complex boolean struc-
ture,” Journal on Satisfiability, Boolean Modeling, and Computation, vol. 1, pp.
209–236, 2007.
[10] “ISAT: Tight integration of satisfiability & constraint solving,”
http://isat.gforge.avacs.org/.
[11] M. Wang, Z. Wen, L. Chen, Y. Zhang, and Z. Zhang, “A novel BIST approach
for testing DLLs of SoC,” in IEEE Circuits and Systems International Conference
on Testing and Diagnosis, April 2009, pp. 1–4.
[12] J. F. Bulzacchelli et al., “A 10-Gb/s 5-tap DFE/4-tap FFE transceiver in
90-nm CMOS technology,” IEEE Journal of Solid-State Circuits, vol. 41, no. 12,
pp. 2885–2900, December 2006.
[13] M. H. Zaki, S. Tahar, and G. Bois, “Formal verification of analog and mixed
signal designs: A survey,” Microelectronics Journal, vol. 39, pp. 1395–1404,
2008.
[14] W. Denman, B. Akbarpour, S. Tahar, M.H. Zaki, and L.C. Paulson, “Formal
verification of analog designs using MetiTarski,” in Formal Methods in
Computer-Aided Design, November 2009, pp. 93–100.
[15] S. Steinhorst and L. Hedrich, “Advanced methods for equivalence checking of
analog circuits with strong nonlinearities,” Formal Methods in System Design,
vol. 36, pp. 131–147, 2010.
[16] A. Singh and P. Li, “On behavioral model equivalence checking for large ana-
log/mixed signal systems,” in IEEE/ACM ICCAD, November 2010, pp. 55–61.
[17] W. Hartong, L. Hedrich, and E. Barke, “Model checking algorithms for analog
verification,” in Proceedings of the 39th Design Automation Conference, 2002,
pp. 542–547.
[18] S. Little, D. Walter, K. Jones, C. J. Myers, and A. Sen, “Analog/mixed-signal
circuit verification using models generated from simulation traces,” International
Journal of Foundations of Computer Science, vol. 21, no. 2, pp. 191–210, 2010.
[19] G. Frehse, B.H. Krogh, and R.A. Rutenbar, “Verifying analog oscillator circuits
using forward/backward abstraction refinement,” in Proceedings of Design, Au-
tomation and Test in Europe, March 2006, vol. 1.
[20] B.I. Silva and B.H. Krogh, “Formal verification of hybrid systems using
CheckMate: a case study,” in American Control Conference, 2000, vol. 3, pp.
1679–1683.
[21] M. Althoff, A. Rajhans, B. H. Krogh, S. Yaldiz, X. Li, and L. Pileggi, “Formal
verification of phase-locked loops using reachability analysis and continuization,”
in Proceedings of International Conference on Computer Aided Design, Novem-
ber 2011.
[22] M. R. Greenstreet and S. Yang, “Verifying start-up conditions for a ring
oscillator,” in Proceedings of the Great Lakes Symposium on VLSI, February
2008, pp. 201–206.
[23] S.K. Tiwary, A. Gupta, J.R. Phillips, C. Pinello, and R. Zlatanovici, “First steps
towards SAT-based formal analog verification,” in Proceedings of International
Conference on Computer Aided Design, November 2009, pp. 1–8.
[24] T.A. Henzinger, “The theory of hybrid automata,” in Proceedings of the
Eleventh Annual IEEE Symposium on Logic in Computer Science, July 1996,
pp. 278–292.
[25] A. Gelman, J. Carlin, H. Stern, and D. Rubin, Bayesian Data Analysis, Chap-
man & Hall, London, 1995.
[26] J.-M. Bernard, “An introduction to the imprecise Dirichlet model for multinomial
data,” International Journal of Approximate Reasoning, vol. 39, no. 2-3, pp.
123–150, 2005.
[27] L. Yin, Y. Kim, and P. Li, “High effective-resolution built-in jitter character-
ization with quantization noise shaping,” in Proceedings of the 48th Design
Automation Conference, June 2011, pp. 765–770.
[28] M. Ishida, K. Ichiyama, et al., “On-chip circuit for measuring data jitter in
the time or frequency domain,” in IEEE Radio Frequency Integrated Circuits
Symposium, June 2007, pp. 347–350.
[29] M. Hsiao, J. Huang, and T. Chang, “A built-in parametric timing measurement
unit,” IEEE Design and Test of Computers, vol. 21, no. 4, pp. 322–330, July
2004.
[30] R. B. Staszewski, S. Vemulapalli, et al., “1.3 V 20 ps time-to-digital converter
for frequency synthesis in 90-nm CMOS,” IEEE Transactions on Circuits and
Systems II: Express Briefs, vol. 53, no. 3, pp. 220–224, March 2006.
[31] A. H. Chan and G. W. Roberts, “A jitter characterization system using a
component-invariant Vernier delay line,” IEEE Transactions on Very Large Scale
Integration Systems, vol. 12, no. 1, pp. 79–95, January 2004.
[32] A.H. Chan and G.W. Roberts, “A synthesizable, fast and high-resolution timing
measurement device using a component-invariant vernier delay line,” in Pro-
ceedings of International Test Conference, 2001, pp. 858–867.
[33] S. Sattler et al., “PLL built-in self-test jitter measurement integration into
0.18u CMOS technology,” in Proceedings of Test Methods and Reliability of
Circuits and Systems Workshop, 2001.
[34] M. Z. Straayer and M. H. Perrott, “A multi-path gated ring oscillator TDC with
first-order noise shaping,” IEEE Journal of Solid-State Circuits, vol. 44, no. 4,
pp. 1089–1098, April 2009.
[35] P. Lu, P. Andreani, and A. Liscidini, “A 90nm CMOS
gated-ring-oscillator-based vernier time-to-digital converter for DPLLs,” in
Proceedings of ESSCIRC, September 2011, pp. 459–462.
[36] J. C. Candy and G. C. Temes, Oversampling Methods for A/D D/A Conversion,
Oversampling Delta-Sigma Converters, IEEE Press, New Jersey, 1992.
[37] M. Z. Straayer, Noise Shaping Techniques for Analog and Time to Digital Con-
verters using Voltage Controlled Oscillators, MIT, Cambridge, MA, 2008.
[38] I. Hwang, C. Kim, and S. Kang, “A CMOS self-regulating VCO with low supply
sensitivity,” IEEE Journal of Solid-State Circuits, vol. 39, no. 1, pp. 42–48,
January 2004.
[39] T. Kwasniewski, M. Abou-Seido, et al., “Inductorless oscillator design for
personal communication devices: a 1.2 µm CMOS process case study,” in Proceedings
of IEEE Custom Integrated Circuits Conference, May 1995, pp. 327–330.
[40] K. Hwang and L. Kim, “An area efficient asynchronous gated ring oscillator TDC
with minimum GRO stages,” in Proceedings of IEEE International Symposium
on Circuits and Systems, May 2010, pp. 3973–3976.
[41] Y. Ji-Ren, I. Karlsson, and C. Svensson, “A true single-phase-clock dynamic CMOS
circuit technique,” IEEE Journal of Solid-State Circuits, vol. 22, no. 5, pp.
899–901, October 1987.
[42] J. Yuan and C. Svensson, “New TSPC latches and flipflops minimizing delay
and power,” in Digest of Technical Papers of Symposium on VLSI Circuits, June
1996, pp. 160–161.
[43] P. Welch, “The use of fast Fourier transform for the estimation of power spectra:
A method based on time averaging over short, modified periodograms,” IEEE
Transactions on Audio and Electroacoustics, vol. 15, no. 2, pp. 70–73, 1967.
[44] O. Petre and H. G. Kerkhoff, “On-chip tap-delay measurements for a digital
delay-line used in high-speed inter-chip data communications,” in Proceedings
of Asian Test Symposium, November 2002, pp. 122–127.
[45] T. Hashimoto, H. Yamazaki, A. Muramatsu, T. Sato, and A. Inoue, “Time-to-
digital converter with vernier delay mismatch compensation for high resolution
on-die clock jitter measurement,” in IEEE Symposium on VLSI Circuits, 2008,
pp. 166–167.
[46] L. Yin and P. Li, “Exploiting reconfigurability for low-cost in-situ test and moni-
toring of digital plls,” in Proceedings of the 47th Design Automation Conference,
June 2010, pp. 929–934.
[47] B. Provost and E. Sanchez-Sinencio, “On-chip ramp generators for mixed-signal
BIST and ADC self-test,” IEEE Journal of Solid-State Circuits, vol. 38, pp. 263–
273, February 2003.
[48] S. R. Das, J. Zakizadeh, S. Biswas, M. H. Assaf, A. R. Nayak, E. M. Petriu,
W. B. Jone, and M. Sahinoglu, “Testing analog and mixed-signal circuits with
built-in hardware - a new approach,” IEEE Transactions on Instrumentation and
Measurement, vol. 56, no. 3, pp. 840–855, February 2007.
[49] B. Dufort and G. W. Roberts, “On-chip analog signal generation for mixed-signal
built-in self-test,” IEEE Journal of Solid-State Circuits, vol. 34, pp. 318–330,
March 1999.
[50] B. Murmann and B. E. Boser, “A 12 b 75 MS/s pipelined ADC using open-loop
residue amplification,” Digest of Technical Papers of IEEE International Solid-
State Circuits Conference, pp. 328–329, February 2003.
[51] R. Staszewski and P. Balsara, All-Digital Frequency Synthesizer in Deepsubmi-
cron CMOS, Wiley-Interscience, Hoboken, 2008.
[52] W. M. Siebert, “Circuits, signals, and systems,” 1986.
[53] T. H. Lee and A. Hajimiri, “Oscillator phase noise: A tutorial,” IEEE Journal
of Solid-State Circuits, vol. 35, pp. 326–336, March 2000.
[54] R. B. Staszewski, C. M. Hung, N. Barton, M. C. Lee, and D. Leipold, “A digitally
controlled oscillator in a 90nm digital CMOS process for mobile phones,” IEEE
Journal of Solid-State Circuits, vol. 40, pp. 2203–2211, November 2005.
[55] F. Herzel and B. Razavi, “A study of oscillator jitter due to supply and substrate
noise,” IEEE Transactions on Circuits and Systems II, vol. 46, pp. 56–62, January
1999.
[56] J. Kasdin, “Discrete simulation of colored noise and stochastic processes and
1/fα power law noise generation,” Proceedings of the IEEE, vol. 83, no. 5, pp.
802–827, May 1995.
[57] Y. Xu, K. Hsiung, X. Li, I. Vausieda, S. Boyd, and L. Pileggi, “OPERA:
Optimization with ellipsoidal uncertainty for robust analog IC design,” Proceedings
of IEEE/ACM Design Automation Conference, pp. 632–637, June 2005.
[58] L. Yin and P. Li, “RF BIST for ADPLL-based polar transmitters with wide-band
DCO gain calibration,” in Proceedings of the 12th International Symposium on
Quality Electronic Design, March 2011, pp. 1–8.
[59] R.B. Staszewski, J.L. Wallberg, S. Rezeq, C.M. Hung, O.E. Eliezer, S.K. Vem-
ulapalli, C. Fernando, K. Maggio, R. Staszewski, N. Barton, et al., “All-digital
PLL and transmitter for mobile phones,” IEEE Journal of Solid-State Circuits,
vol. 40, no. 12, pp. 2469–2482, 2005.
[60] L.R. Kahn, “Single-sideband transmission by envelope elimination and restora-
tion,” Proceedings of the IRE, vol. 40, no. 7, pp. 803–806, 1952.
[61] J. Groe, “Polar transmitters for wireless communications,” IEEE Communica-
tions Magazine, vol. 45, no. 9, pp. 58–63, 2007.
[62] K. Waheed and S.N. Ba, “Adaptive digital linearization of a DRP based EDGE
transmitter for cellular handsets,” in 50th Midwest Symposium on Circuits and
Systems, 2007, pp. 706–709.
[63] R.B. Staszewski, D. Leipold, O. Eliezer, M. Entezari, K. Muhammad, I. Bashir,
C.M. Hung, J. Wallberg, R. Staszewski, P. Cruise, et al., “A 24mm² quad-band
single-chip GSM radio with transmitter calibration in 90nm digital CMOS,” in
Digest of Technical Papers of IEEE International Solid-State Circuits
Conference, 2008, pp. 208–607.
[64] R.B. Staszewski, J. Wallberg, C.M. Hung, G. Feygin, M. Entezari, and
D. Leipold, “LMS-based calibration of an RF digitally controlled oscillator for
mobile phones,” IEEE Transactions on Circuits and Systems II: Express Briefs,
vol. 53, no. 3, pp. 225–229, 2006.
[65] J. Zhuang, K. Waheed, and R.B. Staszewski, “A technique to reduce
phase/frequency modulation bandwidth in a polar RF transmitter,” IEEE
Transactions on Circuits and Systems I: Regular Papers, vol. 57, no. 8, pp. 2196–2207,
2010.
[66] S. Akhtar, M. Ipek, J. Lin, R.B. Staszewski, and P. Litmanen, “Quad band
digitally controlled oscillator for WCDMA transmitter in 90nm CMOS,” in IEEE
Custom Integrated Circuits Conference, 2006, pp. 129–132.
[67] K. Waheed and R.B. Staszewski, “Characterization of deep-submicron varactor
mismatches in a digitally controlled oscillator,” in Proceedings of the IEEE 2005
Custom Integrated Circuits Conference, 2005, pp. 605–608.
[68] R.B. Staszewski, C. Fernando, and P.T. Balsara, “Event-driven simulation and
modeling of an RF oscillator,” in Proceedings of the 2004 International
Symposium on Circuits and Systems, 2004, vol. 4, pp. IV–641.
[69] R.B. Staszewski, S. Vemulapalli, P. Vallur, J. Wallberg, and P.T. Balsara, “1.3
V 20 ps time-to-digital converter for frequency synthesis in 90-nm CMOS,” IEEE
Transactions on Circuits and Systems II: Express Briefs, vol. 53, no. 3, pp. 220–
224, 2006.
