Open-Source Synthesizable Analog Blocks for High-Speed Link Designs:
  20-GS/s 5b ENOB Analog-to-Digital Converter and 5-GHz Phase Interpolator by Kim, Sung-Jin et al.
 
Sung-Jin Kim, Zachary Myers, Steven Herbst, ByongChan Lim, Mark Horowitz 
Stanford University, CA, U.S.A. 
sjkim85@stanford.edu 
 
Abstract 
Using digital standard cells and digital place-and-route (PnR) 
tools, we created a 20 GS/s, 8-bit analog-to-digital converter 
(ADC) for use in high-speed serial link applications. It achieves an 
ENOB of 5.6, a DNL of 0.96 LSB, and an INL of 2.39 LSB, and 
dissipates 175 mW in 0.102 mm² in a 16 nm technology. The 
design is described entirely in HDL so that it can be ported to 
other processes with minimal effort and shared as open source. 
Introduction 
In recent years, many open-source digital circuit generators 
have been developed to make it easier to create ASICs for 
demanding applications such as machine learning.  Unfortunately, 
high-speed transceivers, which traditionally require precise analog 
circuits, are harder to open-source.  As a result, designers still have 
to obtain proprietary transceiver IPs, which are expensive, hard to 
verify, and inflexible. To break this bottleneck, this paper 
describes the design and measurement of our open-source, 
portable generator for a high-speed link transceiver [1]. 
We can open-source this design because it uses digital standard 
cells and digital PnR tools, and does not rely on precision analog 
design. In fact, it takes advantage of circuit variations in its novel 
stochastic time-to-digital converter (STDC) that, in conjunction 
with a synthesizable phase interpolator (PI), is used to build a 
time-interleaved ADC that can run to 20 GS/s. While the idea of 
taking advantage of variation in an ADC is not new [2,3,4], our 
approach is much more power efficient than previous work. 
The proposed ADC 
Fig. 1 shows the architecture of the proposed ADC. It consists 
of two voltage-to-time (V2T) converters, a V2T clock generator, 
a phase folder (PF), and an STDC. Each V2T samples its input 
voltage onto CS at Φ1. At Φ2, the sampled voltage (VC) is discharged 
toward ground at a constant rate by a current source; a digital 
buffer generates an edge when VC crosses its logic threshold 
voltage. The first gate of that buffer is a 3-input NOR to position 
the threshold below VDD/2. For the current source, a thick-gate 
cascode device is used to achieve sufficient output resistance and 
eliminate the need for a second bias voltage. The V2T clock 
generator provides a slightly early phase (Φ1e) for bottom-plate 
sampling and a slightly late phase (Φ2l) for stable settling. The 
V2Ts effectively encode the input signal as the time difference 
between their outputs TINP and TINN, which the PF folds into an 
unsigned pulse (PIN), along with a sign bit.  PIN includes a small 
offset delay Doffset so as to achieve a minimum width of about 100 
ps. 
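
As a rough illustration of the V2T and PF behavior described above, the following Python sketch models the constant-current discharge and the folding step. The capacitor, current, and threshold values, the function names, and the polarity of the sign bit are assumptions chosen for illustration; they are not taken from the design.

```python
# Idealized model of a V2T pair and the phase folder.
# All component values below are assumed, not taken from the paper.
C_S = 20e-15        # sampling capacitor in farads (assumed)
I_DIS = 100e-6      # discharge current in amperes (assumed)
V_TH = 0.3          # buffer logic threshold in volts, below VDD/2 (assumed)
D_OFFSET = 100e-12  # offset delay giving the ~100 ps minimum pulse width


def v2t_delay(v_sampled):
    """Time for the sampled voltage to ramp down to the threshold."""
    return C_S * (v_sampled - V_TH) / I_DIS


def phase_fold(t_inp, t_inn):
    """Fold a signed time difference into (unsigned pulse width, sign bit)."""
    dt = t_inp - t_inn
    sign = 1 if dt >= 0 else 0  # polarity is an assumption
    return abs(dt) + D_OFFSET, sign


def convert(v_inp, v_inn):
    """Differential input voltages to (PIN width in seconds, sign bit)."""
    return phase_fold(v2t_delay(v_inp), v2t_delay(v_inn))
```

With these assumed values, a 100 mV differential input gives a 20 ps time difference and therefore a 120 ps pulse. The threshold voltage cancels in the difference, so only the ratio CS/I sets the voltage-to-time gain in this idealized model.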
The STDC quantizes the width of PIN by counting the number 
of delayed clock edges it contains.  These clock edges are 
generated from 255 non-precise unit inverters, and an adder tree 
and unfolder generate the signed binary result, as shown in Fig. 2. 
A divided clock from the V2T clock generator propagates through 
the inverter chain, generating a distribution of clock edges that is 
quasi-uniform in time because the period of the clock is 
uncorrelated to the delay of each inverter [5].  Due to the inherent 
immunity of the STDC against device mismatch, jitter, and PVT 
variation, it is a good fit for an automated PnR flow. The unfolder 
removes offset and converts the unsigned value back to a signed 
value according to the sign decision from the PF. The absolute 
value of the offset that should be subtracted is calculated through 
a background adaptation loop based on a histogram of output 
codes from the ADC (ADCOUT).  
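
The counting and unfolding steps can be sketched in a few lines of Python. This is a behavioral toy model, not the implemented circuit: the edge positions are drawn from a uniform random distribution as a stand-in for the quasi-uniform distribution of the inverter chain, and the function names, the seed, and the clock period used in the note below are hypothetical.

```python
import random

N_EDGES = 255  # number of unit inverters named in the text


def make_edge_phases(t_clk, seed=0):
    """Assumed model: edge positions modulo the clock period are uniform."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, t_clk) for _ in range(N_EDGES)]


def stdc_count(edge_phases, pulse_start, pulse_width, t_clk):
    """Count edges that fall inside [start, start + width), modulo t_clk."""
    return sum(1 for e in edge_phases
               if (e - pulse_start) % t_clk < pulse_width)


def unfold(count, sign, offset_code):
    """Remove the offset code and restore the sign from the phase folder."""
    magnitude = count - offset_code
    return magnitude if sign else -magnitude
```

Under the uniform assumption, the expected count is about 255 × TPW/TCLK, so the count grows linearly with the pulse width on average. For example, with a hypothetical 800 ps clock period, a 200 ps pulse would be expected to capture roughly a quarter of the edges.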
Fig. 3(a) shows the PI used to generate and adjust sampling 
clocks for the multi-channel ADC. The delay chain generates 32 
delayed phases of the input clock (φ_n). The mux network selects one 
out of 16 odd phases (φ^B_{2n−1}) for the first input of the phase blender 
(phSEL1) and one out of 16 even phases (φ^B_{2n}) for the second input 
of the phase blender (phSEL2). Additional circuitry is added before 
the muxes to make the operation robust to variations. Arbiters are 
used to find the number of delays in a clock cycle, and this 
information is used to control the phase mixers (1-bit phase 
blenders) connected to every node of the delay chain. The one 
mixer at the clock boundary blends its input edge with the input 
clock edge to generate an average edge between them, while the 
rest of the mixers act as buffers to achieve monotonic phase 
rotation across this edge as shown in Fig. 3(b)(c). The encoder 
controls the phase mixers, the mux network, and the phase blender 
based on the input control code and the quantized period from the 
arbiters. The phase blender is implemented by 16 output-shorted 
muxes. Unfortunately, the monotonicity of the PI may be 
jeopardized by automated PnR when the mismatch among 
accumulated path delays exceeds the unit delay of the delay chain 
(TD). This situation is detected and corrected after fabrication by 
adding an arbiter to the input of the phase blender and adjusting 
the strength of the delay buffer for the corresponding path. 
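
The monotonicity condition can be illustrated with a small Python sketch. It assumes ideal linear blending between each pair of adjacent phases and ignores the odd/even mux split and the clock-boundary mixer; the unit delay value and all names are hypothetical.

```python
def pi_transfer(path_delays, blend_steps=16):
    """Output delay per control code, assuming ideal linear blending
    between each pair of adjacent phases."""
    out = []
    for k in range(len(path_delays) - 1):
        a, b = path_delays[k], path_delays[k + 1]
        for step in range(blend_steps):
            w = step / blend_steps  # blend weight toward the later phase
            out.append((1 - w) * a + w * b)
    return out


def is_monotonic(delays):
    """True if the delay never decreases as the code increases."""
    return all(later >= earlier
               for earlier, later in zip(delays, delays[1:]))


T_D = 10e-12  # unit delay of the delay chain (assumed value)
ideal = [k * T_D for k in range(8)]

skewed = list(ideal)
skewed[4] += 1.5 * T_D  # path mismatch larger than T_D on one phase
```

In this model `is_monotonic(pi_transfer(ideal))` holds, while the `skewed` case fails because one path arrives later than its successor, so the blended delay runs backward over that segment. A mismatch smaller than TD only makes the steps unequal; it does not reverse their order.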
   The overall architecture of the proposed multi-channel ADC is 
shown in Fig. 4. Sixteen ADCs are time-interleaved to support a 
20 GS/s conversion rate. Four groups of ADCs, each of which 
consists of four ADC slices, are connected to individual switches 
(SW0-SW3) without inter-stage buffers, forming a two-stage 
passive track and hold (T&H) structure (front-end switches in each 
ADC slice act as the second-stage). Four PIs take a quarter-rate 
input clock and generate 5 GHz quadrature sampling clock phases 
for the first T&H switches. Each PI has an independent control 
dedicated for canceling clock skew among sampling phases. 
Quantized data from each ADC slice are synchronized by a 
double-flop aligner to form the final output. Static nonlinearity is 
calibrated with a lookup table supported by an on-chip SRAM. 
An analog voltage generator based on logic gates provides bias 
voltages for the current source cells in all V2Ts. 
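
The rate arithmetic of the two-stage interleaving can be sketched as follows. The round-robin ordering of samples across switches and slices is an assumption made for illustration, as are the function and constant names; the text specifies only the group structure and the clock rates.

```python
F_TOTAL = 20e9        # aggregate conversion rate in samples per second
N_GROUPS = 4          # first-stage switches SW0 to SW3
SLICES_PER_GROUP = 4  # ADC slices behind each switch


def route_sample(n):
    """Map sample index n to (first-stage switch, slice within the group),
    assuming simple round-robin ordering."""
    group = n % N_GROUPS
    slice_in_group = (n // N_GROUPS) % SLICES_PER_GROUP
    return group, slice_in_group


group_rate = F_TOTAL / N_GROUPS              # rate seen by each switch
slice_rate = group_rate / SLICES_PER_GROUP   # rate seen by each slice
```

Each first-stage switch therefore samples at 5 GS/s, matching the 5 GHz quadrature phases, and each slice converts at 1.25 GS/s. Under the assumed ordering, 16 consecutive samples visit each of the 16 slices exactly once.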
It is worth noting that the design is described entirely in 
SystemVerilog and was verified through Verilog simulation. For 
the few building blocks that could not be directly simulated by the 
Verilog simulator, such as the phase blender, V2T, and bias 
generator, event-driven functional models were generated through 
DaVE, an automated model generation flow from Stanford [6]. 
These models simulate over 100,000 times faster than conventional 
SPICE simulation while giving sufficiently accurate results. The 
physical design of the proposed multi-channel 
ADC was done with a standard digital PnR flow with no manual 
layout, aside from two custom cells (a switch cell and the current 
source cell in the V2T). 
Measurement results 
    A prototype chip was fabricated in the TSMC 16nm FinFET 
technology. Fig. 5 illustrates measurement results of the proposed 
PI. Its transfer function, measured by an on-chip delay monitoring 
circuit based on uncorrelated sampling [7], is monotonic with a 
resolution of 0.7 ps. The measured performance of the ADC is 
shown in Fig. 6, demonstrating a DNL of 0.95 LSB and an INL of 
2.39 LSB.  The ENOB is 5.6 at low frequencies, falling to 2.7 at 
Nyquist. The ENOB frequency dependence is mainly due to input 
power loss in the package network (676-pin FCBGA), which 
is not optimized for high-frequency testing. An ADC slice occupies 
80 μm × 40 μm and consumes 8.6 mW; a PI occupies 25 μm × 80 μm 
and consumes 9.6 mW. The 16-channel ADC, including the four PIs, 
occupies 300 μm × 340 μm and consumes a total of 175 mW from a 
0.9 V supply. The performance of 
our ADC is summarized and compared with published papers in 
Fig. 7. 
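
The reported numbers can be cross-checked with a short calculation. The sketch below assumes that the figure of merit in Fig. 7 is the Walden form, P / (2^ENOB × fs), which is not stated in the text, and that the 5.9 ENOB and 1.25 GHz entries in Fig. 7 refer to a single slice; the function and variable names are hypothetical.

```python
def walden_fom_pj(power_w, enob, fs_hz):
    """Walden figure of merit, P / (2**ENOB * fs), in pJ per conversion step.
    Using this definition for Fig. 7 is an assumption."""
    return power_w / (2 ** enob * fs_hz) * 1e12


slice_power_mw, pi_power_mw = 8.6, 9.6
budget_mw = 16 * slice_power_mw + 4 * pi_power_mw  # slices plus PIs only
area_mm2 = 0.300 * 0.340                           # 300 um x 340 um

fom_slice = walden_fom_pj(8.6e-3, 5.9, 1.25e9)  # single slice (assumed reading)
fom_full = walden_fom_pj(175e-3, 5.6, 20e9)     # 16-channel ADC

print(round(budget_mw, 1), round(area_mm2, 3),
      round(fom_slice, 2), round(fom_full, 2))
# prints: 176.0 0.102 0.12 0.18
```

The slice and PI powers sum to about 176 mW, close to the reported 175 mW total, and the product of the die dimensions reproduces the 0.102 mm² area. Under the assumed definition, the computed values of 0.12 and 0.18 agree with the FoM entries listed in Fig. 7.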
Conclusions 
In this paper, we presented work towards a portable analog 
generator for a high-speed link transceiver: namely, open-source, 
synthesizable designs for a 20-GS/s 8-bit ADC and a four-phase 5 
GHz PI.  By using circuit topologies that are insensitive to exact 
device parameters and whose analog constraints can be expressed 
in the time domain, we were able to make extensive use of digital 
tools for synthesis, layout, and simulation.  Finally, we validated 
these techniques by fabricating a prototype chip in a 16nm 
technology. 
Acknowledgements 
This work is supported by Stanford SystemX, DARPA POSH, a 
Hertz Fellowship, and a Stanford Graduate Fellowship. 
References 
[1] DragonPHY project, https://github.com/StanfordVLSI/dragonphy 
[2] T. Sundstrom, et al., “Utilizing Process Variations for Reference 
Generation in a Flash ADC,” TCAS-II, Vol. 56, May 2009. 
[3] S. Weaver, et al., “Digitally Synthesized Stochastic Flash ADC Using 
Only Standard Digital Cells,” Symposium on VLSI Circuits, 2011. 
[4] A. Fahmy, et al., “An All-Digital Reconfigurable Wide-Input Range 
Stochastic ADC Using Only Standard Cells,” TCAS-II, Vol. 62, Aug 2015. 
[5] S. Kim, et al., “A 0.6V 1.17ps PVT-tolerant and synthesizable 
time-to-digital converter using stochastic phase interpolation with 16× 
spatial redundancy in 14nm FinFET technology,” ISSCC, 2015. 
[6] Stanford’s DaVE project, http://github.com/StanfordVLSI/DaVE. 
[7] M. Mansuri, et al., “An On-Die All-Digital Delay Measurement Circuit 
with 250fs Accuracy,” Symposium on VLSI Circuits, 2012. 
 
[Figure omitted. Labeled blocks: V2T (switch cell; current cell with thick-gate device and Vbias), V2T clkgen (Φ1e, Φ1, Φ1l from CLK and CLKDIV), Phase Folder (arbiter, sign, Doffset), stochastic TDC, 8-bit ADCOUT.] 
Fig. 1: Proposed fully synthesizable ADC slice based on a stochastic TDC. 
 
[Figure omitted. Labeled elements: inverter chain generating Φ1 to Φ255 from CLK (period TCLK), edge distribution between t0 and t0+TCLK, PIN of width TPW, adder, unfolder using sign and Doffset, histogram of ADCOUT.] 
Fig. 2: Stochastic time-to-digital converter and unfolder. 
 
[Figure omitted. Panel (a): PI block diagram with delay cells (D), arbiters (A), phase mixers (M), adjustable buffers (B), mux network, encoder driven by the 9-bit PI_ctrl, and phase blender producing CLKOUT from phSEL1 and phSEL2. Panels (b), (c): phase rotation across the clock boundary, with delays TD, TF, and (TD+TF)/2.] 
Fig. 3: Proposed fully synthesizable PI slice based on a digital phase blender. 
 
[Figure omitted. Labeled blocks: four PIs (ph0 to ph3) generating Φ0, Φ90, Φ180, and Φ270 from CLK under PI_ctrl; switches SW0 to SW3 on VIN; 16 ADC slices; aligner; SRAM; static correction (off chip); biasgen providing Vbias; 8-bit ADCOUT.] 
Fig. 4: Proposed 16-channel time-interleaved ADC architecture. 
 
[Figure omitted. Panels (a), (b): delay measurement setup (PI, CLKIN, CLKOUT, PI_ctrl, CLKASYNC, PMOUT) and measured PI transfer function, max step = 0.66 ps.] 
Fig. 5: Measured results of the PI. 
 
[Figure omitted. Plots: INL and DNL (LSB) versus ADC code; SNDR (dB) versus frequency (MHz) for 1-chan. and 16-chan. operation.] 
Fig. 6: Measured results of the ADC. 
 
[Layout image omitted. Labeled blocks: ADC, biasgen, PI, clock buffer; dimensions 300 μm × 340 μm.] 
 
                              [2]                   [3]                   [4]                   This Work (1-chan. / 16-chan.)
Type                          Stochastic Flash ADC  Stochastic Flash ADC  Stochastic Flash ADC  Stochastic Flash TDC
Process (nm)                  90                    90                    130                   16
Supply (V)                    1.2                   1.2                   1.0                   0.9
Normalized Input Range (VDD)  0.04                  0.12                  0.4                   0.5
Speed (GHz)                   1.5                   0.21                  0.32                  1.25 / 20
ENOB                          3.7                   5.69                  5.17                  5.9 / 5.6
Power (mW)                    23                    34.8                  87                    8.6 / 175
FoM (pJ/b)                    1.18                  3.2                   7.55                  0.12 / 0.18
Area (mm²)                    0.04                  0.18                  0.51                  0.003 / 0.1
Synthesizable                 No                    Yes                   Yes                   Yes
 
Fig. 7: Test chip layout and performance comparison. 
