A 0.5GHz 0.35mW LDO-Powered Constant-Slope Phase Interpolator with
  0.22% INL by Elnaqib, Ahmed et al.
arXiv:2007.07654v1 [eess.SP] 15 Jul 2020
Ahmed Elnaqib, Hayate Okuhara, Taekwang Jang, Davide Rossi and Luca Benini
Abstract—Clock generators are an essential and critical build-
ing block of any communication link, whether it be wired or
wireless, and they are increasingly critical given the push for
lower I/O power and higher bandwidth in Systems-on-Chip
(SoCs) for the Internet-of-Things (IoT). One recurrent issue with
clock generators is multiple-phase generation, especially for low-
power applications; several methods of phase generation have
been proposed, one of which is phase interpolation. We propose a
phase interpolator (PI) that employs the concept of constant-slope
operation. Consequently, a low-power highly-linear operation
is coupled with the wide dynamic range (i.e. phase wrapping)
capabilities of a PI. Furthermore, the PI is powered by a low-
dropout regulator (LDO) supporting fast transient operation.
Implemented in 65-nm CMOS technology, it consumes 350µW
at a 1.2-V supply and a 0.5-GHz clock; it improves energy
efficiency by 4×-15× over state-of-the-art (SoA) digital-to-time
converters (DTCs) and achieves an integral non-linearity (INL)
2.5×-3.1× better than SoA PIs, striking a good balance between
linearity and energy efficiency.
Index terms—Clock generator, digital-to-phase converter
(DPC), digital-to-time converter (DTC), phase interpolator
(PI), clock and data recovery (CDR), low-dropout regula-
tor (LDO)
I. INTRODUCTION
Due to the importance of multiple-phase generation, several
methods have been presented tackling the main trade-offs that
constrain phase generators: resolution, dynamic range, linear-
ity, and power consumption. Different applications have led
to different solutions dictated by their specific requirements.
Phase generators, or digital-to-phase converters (DPCs) as we
will be strictly concerned with digital control, can be classified
into digitally-controlled delay lines (DCDLs), digital-to-time
converters (DTCs) and phase interpolators (PIs) (Fig. 1).
Starting with DCDLs, a delay line is a single-input multiple-
output DPC based on identical delay cells. The minimum
resolution of a DCDL is limited by the minimum delay of the
delay cell, which in turn is limited by the minimum inverter
delay. The minimum operating frequency is limited by the
maximum delay of the delay line, which is limited by the
number of delay stages used, and more stages mean higher
power consumption [1]. Therefore, there is an inherent trade-
off between resolution, dynamic range, and power. Further-
more, a delay line could be followed by a multiplexer (MUX)
which implies that the MUX delay has to be taken into account
as well [3]. Delay lines are also susceptible to PVT variations,
which compromises linearity. Therefore, DCDLs are typically
used within a delay-locked loop (DLL) in order to control
these mismatches, resulting in extra overhead [4], [5].
The authors are with the University of Bologna, Bologna, Italy and ETH
Zurich, Zurich, Switzerland (e-mail: ahmedgamal.mahmoud2@unibo.it,
hayate.okuhara@unibo.it, tkjang@ethz.ch, davide.rossi@unibo.it,
luca.benini@unibo.it)
Fig. 1. Types of digital-to-phase converters (DPCs).
Fig. 2. Constant-slope vs. variable-slope operation.
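The DCDL trade-off described above can be made concrete with a little arithmetic. The sketch below is illustrative only: the helper name and the example cell delays are assumptions, not values taken from the cited designs. It counts how many identical delay cells a line would need to span one full clock period at a given cell delay.

```python
import math

def stages_needed(period_ps: float, cell_delay_ps: float) -> int:
    """Number of identical delay cells whose total delay spans one period.

    Hypothetical helper: the resolution equals the cell delay and the
    range equals stages * cell delay, so both are tied to the stage count.
    """
    if period_ps <= 0 or cell_delay_ps <= 0:
        raise ValueError("period and cell delay must be positive")
    return math.ceil(period_ps / cell_delay_ps)

# Example: covering a 2 ns period (0.5 GHz) at two assumed cell delays.
for cell_delay_ps in (62.5, 20.0):
    n = stages_needed(2000.0, cell_delay_ps)
    print(f"{cell_delay_ps} ps per cell -> {n} stages")
```

Under this simple model, a finer resolution at a fixed range multiplies the number of stages, and with it the power.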
Moving on to the DTC, it is a single-input single-output
block that typically generates delay through a comparator that
detects the threshold crossing of a current charging a capacitor.
Some delay cells that are used in delay lines can actually be
considered stand-alone DTCs (e.g. current-starved inverter).
DTCs are widely used in fractional-N frequency synthesizers
for quantization noise cancellation [6]-[8]; DTCs provide fine
resolution, but it can be difficult to achieve a wide dynamic
range while maintaining an acceptable linearity [16]. In [10],
the DTC is composed of a delay stage loaded by switched
capacitors; it can be classified as a variable-slope DTC and
will be discussed later in more detail. A delay-line-based DTC
can be found in [11] which is basically a delay line with a
single output, but the input is injected into any given delay cell
based on a digital control code. The delay cells are identical
inverters with a slight mismatch between even and odd cells.
While it consumes as much as 0.53mW at 50MHz, it achieves
0.11% INL with constant delay cell currents.
TABLE I
STATE-OF-THE-ART DTCS COMPARISON
                [6]             [7]             [9]                         [10]            [11]            [12]            [13]
Technology      28-nm           40-nm           65-nm                       28-nm           40-nm           65-nm           65-nm
Architecture    Variable-slope  Variable-slope  Multi-stage Variable-slope  Variable-slope  Delay Line      Constant-slope  Constant-slope
Supply Voltage  0.9V            1.1V            0.9V                        0.9V            1.1V            1.2V            1.0V
Frequency       40MHz           150MHz          2GHz                        40MHz           50MHz           55MHz           52MHz
Dynamic Range   512ps           1.1ns           256ps                       563ps           1.28ns          189ps           593ps
Resolution      500fs           1.075ps         2ps                         550fs           16ps            185fs           580fs
INL             1.5ps (0.29%)   2.8ps (0.25%)   2ps (0.78%)                 990fs (0.18%)   1.4ps (0.11%)   328fs (0.17%)   870fs (0.15%)
Power           0.5mW           -               -                           0.58mW          0.53mW          0.8mW           0.14mW
Fig. 3. Simplified block diagram of the proposed DPC.
Another approach is the constant-slope DTC [12]. Most
DTCs depend upon the delay generated by a voltage ramp
[6]-[10], and this delay is mainly adjusted through a tunable
capacitance; some DCDLs also use switched current sources
[5]. Both of these methods tune the delay by varying
the slope of the output signal. This was shown to be a source
of non-linearity due to the relationship between the input slope
and the delay of the comparator used to detect these signals;
the bandwidth limit of the comparator could be another source
of non-linearity as well [12]. Therefore, constant-slope operation
was proposed, in which the start voltage is varied rather than the
load capacitance, as shown in Fig. 2. In [13], a DTC based on this
concept was proposed for low-power applications achieving
an INL of 0.15%. Shown in Table I is a comparison of the
different DTC implementations that have been discussed. It
can be seen that constant-slope DTCs exhibit better linearity
compared to variable-slope DTCs.
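The difference between the two tuning styles can be sketched with an ideal linear ramp. The snippet below is an illustration under stated assumptions: the function names and all current, capacitance, threshold and voltage values are hypothetical, and the comparator is ideal, so the model does not reproduce the slope-dependent comparator delay itself. It only shows which slope the comparator would see at each code.

```python
def variable_slope(codes, i_a=100e-6, c0_f=50e-15, dc_f=5e-15,
                   v_start=1.0, v_th=0.5):
    """Tune the delay via the load capacitance (all values hypothetical)."""
    out = []
    for k in codes:
        c_f = c0_f + k * dc_f
        slope = i_a / c_f                  # V/s, changes with every code
        delay = (v_start - v_th) / slope   # s, time to reach the threshold
        out.append((delay, slope))
    return out

def constant_slope(codes, i_a=100e-6, c_f=50e-15, v0=0.7, dv=0.05, v_th=0.5):
    """Tune the delay via the start voltage (all values hypothetical)."""
    out = []
    for k in codes:
        v_start = v0 + k * dv
        slope = i_a / c_f                  # V/s, identical for every code
        delay = (v_start - v_th) / slope   # s, time to reach the threshold
        out.append((delay, slope))
    return out

for name, fn in (("variable", variable_slope), ("constant", constant_slope)):
    for k, (delay, slope) in enumerate(fn(range(4))):
        print(f"{name} code {k}: delay {delay * 1e12:.1f} ps, "
              f"slope {slope * 1e-9:.2f} V/ns")
```

In both cases the ideal delay grows with the code, but only the constant-slope variant presents the same ramp slope to the comparator at every code.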
PIs exploit multiple clock phases in order to generate an
intermediate phase. PIs can be found in fractional feedback
dividers [14], [16] and clock/data recovery (CDR) systems
[15]. In [14], the tail currents are manipulated to adjust the
weight of each input phase. The varying slope of the tail
current is a major source of non-linearity in such current-
mode interpolators [16]. They also suffer from the static
power consumption typical of CML-based circuitry. A charge-
mode PI was proposed in [17] which makes use of a tunable
capacitance to produce different phases similar to DTCs based
on a voltage ramp. However, to improve linearity, non-linear
capacitance values were used to counteract the variable slope
operation; this makes calibration and matching extremely
challenging, and leads to a degradation in linearity. Another
issue is the limited support for multi-rate operation because it
requires tuning a full array of capacitors, rather than a single
capacitance. Alternatively, a pipelined PI is proposed in [16]
that consists of several stages, each capable of phase forward-
ing or phase interpolation. This scheme aims at improving the
linearity by maintaining a constant current flowing through
each stage, which is the same concept proposed by constant-
slope DTCs. This indicates a trend of adopting constant-slope
operation in both DTCs and PIs.
In this paper, we present a highly-linear DPC suited to low-power
applications that demonstrates the similarities between
state-of-the-art PIs and DTCs, and draws inspiration from
both; it achieves the wide dynamic range of a PI and the
linearity of constant-slope signaling. Compared to DTCs, it improves
energy efficiency by 4×-15× while keeping the INL within 2×
at worst; compared to PIs, it improves linearity
by 2.5×-3.1× while keeping the energy efficiency within
1.1×-1.5×. In Section II, the proposed DPC design is described
in detail, and in Section III, the simulation results
of the DPC are presented and discussed. Section IV is the
conclusion.
Fig. 4. Schematic and operation of a PI half-cell.
II. PROPOSED PHASE GENERATOR OPERATION
Shown in Fig. 3 is a simplified block diagram of the
proposed DPC. The PI exploits constant-slope operation, so a
low-dropout regulator (LDO) provides the supply voltage (i.e.
start voltage) corresponding to a given output phase. The PI
was implemented within a CDR loop; the loop has an update
period of 128ns, is clocked by six phases provided by
an external frequency-locked loop (FLL) [18], and is integrated
with a low-power micro-controller system. The phase space is
divided into six regions (sextants) according to the six-phase
clock provided by the FLL; therefore, six PI unit cells are
needed.
Fig. 5. Phase interpolator unit cell.
Fig. 6. Low-dropout regulator (LDO) schematic.
A. Proposed Phase Interpolator
Shown in Fig. 4 is the schematic of a PI half-cell. Ms is
a constant current source and Co is the output capacitance.
M1 controls the discharge timing of Co; M2 is a reset switch
to discharge Co to ground; M3 and M4 are set switches that
charge Co to VLDO. The operation of the half-cell is shown
as well. In Mode I, where CLK0 and CLK120 are low, Co is
charged from 0 to VLDO (i.e. start voltage); in Mode II, where
only CLK0 is high, Co discharges through M1, and it is during
this mode that the threshold crossing is detected by subsequent
comparators. Then, Co continues to discharge through M2 as
well when CLK120 is high, and does so exclusively when
CLK0 eventually goes low; that is Mode III, which is the
reset mode.
Fig. 7. Block diagram of the full proposed DPC.
The slope of the output voltage (Vo) during the relevant
Mode II can be represented as
$$\frac{dV_o}{dt} = -\frac{I_c}{C_o}$$
where Ic is the current flowing through M1. Since Co is
constant, then in order to achieve constant-slope operation, Ic
needs to be kept constant across the different PI steps. To this
end, a constant current source Ms is used. Therefore, the slope
of the output voltage can be solely controlled by Co, which
can be tuned to support multi-rate operation. The maximum
current variation across the PI steps is kept at about 5%.
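Integrating the slope equation gives the delay that a given start voltage produces. The short derivation below is a sketch that assumes an ideal current source and an ideal comparator with threshold $V_{TH}$ (the threshold level marked in Fig. 4), with $V_o$ starting from $V_{LDO}$ at the beginning of Mode II; $t_d$ is a symbol introduced here for the threshold-crossing time.

```latex
V_o(t) = V_{LDO} - \frac{I_c}{C_o}\,t
\quad\Rightarrow\quad
t_d = \frac{C_o\,(V_{LDO} - V_{TH})}{I_c},
\qquad
\Delta t_d = \frac{C_o}{I_c}\,\Delta V_{LDO}
```

Under these assumptions the delay is linear in the start voltage, so equal LDO voltage steps would map to equal phase steps as long as $I_c$ stays constant.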
To build a unit cell, the outputs of two PI half-cells are
connected to toggle flip-flops (TFFs) as shown in Fig. 5; the
inputs of the second half-cell are the inverse of the inputs to
the first half-cell. The TFFs provide two half-rate clock signals
that are strictly 90° apart since they are generated by two
full-rate clock signals that are 180° apart. Finally, feeding both to a
2-input XOR gate generates the required full-rate clock signal.
Accordingly, the output clock phase is controlled by the input
clock phases (i.e. the phases fed to a given unit cell), and the
start voltage provided by the LDO.
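The TFF and XOR recombination can be checked with a small behavioural model. This is a minimal sketch under assumptions: the flip-flops and the gate are ideal, `delay_ps` is a hypothetical stand-in for the half-cell threshold-crossing delay, and all function names are illustrative.

```python
T_PS = 2000  # full-rate clock period at 0.5 GHz, in picoseconds

def tff_level(t, first_toggle, period=T_PS):
    """Level of an ideal toggle flip-flop (initially 0) at integer time t."""
    if t < first_toggle:
        return 0
    n_toggles = (t - first_toggle) // period + 1
    return n_toggles % 2

def unit_cell_output(t, delay_ps):
    """XOR of two half-rate TFF outputs triggered half a period apart."""
    a = tff_level(t, delay_ps)              # first half-cell
    b = tff_level(t, delay_ps + T_PS // 2)  # second half-cell, inverted inputs
    return a ^ b

def rising_edges(delay_ps, t_end=4 * T_PS):
    """Times (ps) at which the unit-cell output rises, sampled every 1 ps."""
    edges = []
    prev = unit_cell_output(0, delay_ps)
    for t in range(1, t_end):
        cur = unit_cell_output(t, delay_ps)
        if cur and not prev:
            edges.append(t)
        prev = cur
    return edges

print(rising_edges(125))  # [125, 2125, 4125, 6125]
print(rising_edges(250))  # [250, 2250, 4250, 6250]
```

The rising edges recur every 2000ps, so the XOR output runs at the full rate, and changing the half-cell delay shifts every edge by the same amount.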
B. Low-Dropout Regulator
Shown in Fig. 6 is the schematic of the LDO. Since high
gain and fast response are the main targets, a three-stage
regulator is used; the error amplifier consists of two stages.
Ms, M1, M2, M3, and M4 compose the first stage; the second
stage consists of M5, M6, M7, and M8; MP is the power
transistor, and Rf and R2 provide the feedback signal; Cm is
the compensation capacitor and Co is the output capacitance.
Vo is controlled by the variable resistance Rf, whose value
is determined by a digital control code based on the required
output phase. The range of Vo is from 0.7V to 1V.
Fig. 8. Proposed DPC operation with an output phase of 22.5° (sextant 1,
step 3).
Fig. 9. LDO step response.
The full DPC architecture is shown in Fig. 7; the control
bits for the LDO and the PI are both generated by the CDR
loop. The phase space is divided into sextants, with the first
sextant consisting of 6 steps and the second and third of 5 steps
each; these three sextants cover half the phase space, and they
are mirrored in the other half. The first multiplexer (MUX1) selects the
feedback resistor for a given start voltage (i.e. VLDO), thus
handling fine control within a sextant. The second multiplexer (MUX2)
handles coarse control by selecting the sextant. Shown in Fig.
8 is an example of the operation of the DPC where an output
phase of 22.5° is required; since 22.5° lies in the first sextant
(i.e. from 0° to 60°), the first PI is selected by MUX2 (i.e.
coarse control); a phase resolution of 11.25° is used, so MUX1
selects the third feedback resistor (i.e. voltage step), given that
the first step is 0°.
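The coarse and fine selection described above can be sketched as a small decoder. This is an illustration under an assumption: codes are taken to lie on a uniform 11.25° grid starting at 0°, which reproduces the 6/5/5 step split and the 22.5° example; the function and constant names are hypothetical.

```python
import math

N_CODES = 32                # 5-bit control word
STEP_DEG = 360 / N_CODES    # 11.25 degrees per code
SEXTANT_DEG = 60.0          # six-phase clock, six 60-degree regions

def decode(code: int):
    """Return (sextant, step, phase_deg); sextant and step are 1-based."""
    code %= N_CODES                              # phase wrapping
    phase = code * STEP_DEG
    sextant = int(phase // SEXTANT_DEG)          # coarse control (MUX2)
    first_code = math.ceil(sextant * SEXTANT_DEG / STEP_DEG)
    step = code - first_code                     # fine control (MUX1)
    return sextant + 1, step + 1, phase

steps_per_sextant = [0] * 6
for c in range(N_CODES):
    sextant, _, _ = decode(c)
    steps_per_sextant[sextant - 1] += 1

print(decode(2))            # (1, 3, 22.5)
print(steps_per_sextant)    # [6, 5, 5, 6, 5, 5]
```

Code 2 corresponds to 22.5° and decodes to sextant 1, step 3, matching the example of Fig. 8.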
III. SIMULATION RESULTS
The LDO has an open-loop gain of 50dB across the required
range of start voltages with a phase margin of 60°. The
step response of the LDO is shown in Fig. 9; the largest
possible output voltage transition (i.e. feedback resistor changing
from Rf0 to Rf5) was simulated and the output voltage
was observed; the LDO load is a PI unit cell. The output
voltage stabilizes within 8ns; this was deemed acceptable as
the sampling period of the CDR is 16ns. The ripples on the
output voltage are within 4mV. The quiescent current is around
50µA.
Fig. 10. INL Nominal Simulation. [Plot: INL (fs) versus input code, 0 to 15.]
Fig. 11. INL Monte Carlo Simulation. [Plot: INL (ps) versus input code, 0 to 15.]
The INL of the 5-bit PI is demonstrated in Fig. 10; the
error is within 300fs (0.015%). The effect of process
variations, obtained from Monte Carlo simulations, is shown
in Fig. 11. The worst-case maximum of
the INL is 4.32ps (0.22%), and the worst-case minimum is
-2.9ps (0.15%). Vthn is 392.73mV for the maximum points
and 390.28mV for the minimum points. To combat these
variations, the PI output capacitance Co can be dynamically
calibrated. Only one PI unit cell is kept powered-up at a given
time; a PI unit cell consumes about 290µW.
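The percentages quoted with the INL values appear to be normalized to the 2ns dynamic range (one period of the 0.5GHz clock); this is an inference that matches the figures in the text, not a definition stated by the authors. A quick check, with hypothetical helper names:

```python
PERIOD_PS = 2000.0   # one period of the 0.5 GHz clock, taken as the dynamic range
N_BITS = 5

def inl_percent(inl_ps: float, dynamic_range_ps: float = PERIOD_PS) -> float:
    """INL magnitude as a percentage of the dynamic range (assumed definition)."""
    return 100.0 * abs(inl_ps) / dynamic_range_ps

print(f"step size: {PERIOD_PS / 2**N_BITS} ps")
for label, inl_ps in (("nominal bound", 0.3),
                      ("Monte Carlo worst maximum", 4.32),
                      ("Monte Carlo worst minimum", -2.9)):
    print(f"{label}: {inl_percent(inl_ps):.3f} %")
```

The three results, about 0.015%, 0.216% and 0.145%, agree with the rounded values quoted above, and the step size equals the 62.5ps resolution of a 5-bit code over 2ns.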
To demonstrate the variation in the edges of the output
clock, the histogram of the zero-crossings of an output phase
of 22.5° is shown in Fig. 12; the mean is around 46.73ns in this
example and the standard deviation is 9.41ps. Also, shown in
Fig. 13 are the results of the Monte Carlo simulations for the
maximum INL and corresponding power consumption across
different values of the supply voltage. Under temperature
variations, the worst INL is 3.54ps at a temperature of -40°C
and -0.63ps at 125°C, resulting in a variation of 4.17ps across
this range.
The performance of the proposed PI is compared to other
state-of-the-art PIs and DTCs in Table II; the DTCs with the
highest linearity and the lowest power consumption from Table
I were included in this comparison. For the PIs, the traditional
phase deviation metric was replaced by the equivalent integral
non-linearity (INL) metric to have a basis for comparing them
to DTCs. The combination of a 0.7pJ/bit energy efficiency and
0.22% INL achieved by this work balances that of state-of-the-
art PIs and DTCs.
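The energy comparison can be reproduced from the power and frequency entries of Table II. The sketch below assumes that energy efficiency means power divided by clock frequency (energy per output cycle); that definition is an inference that reproduces the 0.7pJ figure, and the names used are illustrative.

```python
# (power in W, frequency in Hz) from Table II; [15] reports an upper bound.
DESIGNS = {
    "[11] DTC": (0.53e-3, 50e6),
    "[13] DTC": (0.14e-3, 52e6),
    "[15] PI": (3.8e-3, 6e9),
    "[16] PI": (2.3e-3, 5e9),
    "This work": (0.35e-3, 0.5e9),
}

def energy_pj(power_w: float, freq_hz: float) -> float:
    """Energy per clock cycle in picojoules (assumed efficiency metric)."""
    return power_w / freq_hz * 1e12

ours = energy_pj(*DESIGNS["This work"])
for name, (power_w, freq_hz) in DESIGNS.items():
    e = energy_pj(power_w, freq_hz)
    print(f"{name}: {e:.2f} pJ ({e / ours:.2f}x this work)")
```

With these numbers the two DTCs come out at roughly 15× and 4× the energy of this work, while this work is about 1.1× and 1.5× the energy of the two PIs, in line with the ranges quoted in the Introduction.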
TABLE II
STATE-OF-THE-ART DTCS AND PIS COMPARISON
                [11]            [13]            [15]            [16]            This Work
Technology      40-nm           65-nm           65-nm           65-nm           65-nm
Architecture    Delay line      -               Current-mode    Pipeline        Charge-mode
Slope           Constant        Constant        Variable        Constant        Constant
Supply Voltage  1.1V            1.0V            1.2V            1V              1.2V
Frequency       50MHz           52MHz           6GHz            5GHz            0.5GHz
Dynamic Range   1.28ns          593ps           166.66ps        200ps           2ns*
Resolution      16ps            580fs           4.63ps          6.25ps          62.5ps*
INL             1.4ps (0.11%)   870fs (0.15%)   920fs (0.55%)   1.4ps (0.7%)    4.32ps* (0.22%)
Power           0.53mW          0.14mW          <3.8mW          2.3mW           0.35mW*
* Simulation results
Fig. 12. Zero-crossing histogram of output phase 22.5°. [Histogram: frequency (no. of times) versus time (ns); Number = 60, Mean = 46.73ns, Std. Dev. = 9.41ps.]
Fig. 13. MC INL and power consumption versus supply voltage for output
phase 22.5°. [Plots: maximum INL (ps) and power consumption (µW) versus supply voltage (V), 1V to 1.3V.]
IV. CONCLUSION
A constant-slope phase generator in 65-nm CMOS was presented
that achieves a balance between the speed and energy efficiency
of phase interpolators, and the linearity of digital-to-time
converters. It employs a PI powered by a low-dropout
regulator, and it has a worst-case peak INL of 4.32ps when operating at
0.5GHz and consuming 350µW from a 1.2V supply.
ACKNOWLEDGMENT
This work was supported in part by the WiPLASH project,
funded by the European Union’s Horizon 2020 research
and innovation program under Grant Agreement No. 863337.
REFERENCES
[1] Chang, Hsiang-Hui, et al. ”A wide-range delay-locked loop with a fixed
latency of one clock cycle.” IEEE Journal of Solid-State Circuits 37.8
(2002): 1021-1027.
[2] Lo, Yu-Lung, et al. ”An all-digital DLL with dual-loop control for mul-
tiphase clock generator.” 2011 International Symposium on Integrated
Circuits. IEEE, 2011.
[3] De Caro, Davide. ”Glitch-free NAND-based digitally controlled delay-
lines.” IEEE Transactions on Very Large Scale Integration (VLSI)
Systems 21.1 (2012): 55-66.
[4] Lee, Hyun-Woo, et al. ”A 1.0-ns/1.0-V delay-locked loop with racing
mode and countered CAS latency controller for DRAM interfaces.”
IEEE Journal of Solid-State Circuits 47.6 (2012): 1436-1447.
[5] Zhang, Dandan, et al. ”A multiphase DLL with a novel fast-locking
fine-code time-to-digital converter.” IEEE Transactions on Very Large
Scale Integration (VLSI) Systems 23.11 (2015): 2680-2684.
[6] Raczkowski, Kuba, et al. ”A 9.2–12.7 GHz wideband fractional-N
subsampling PLL in 28 nm CMOS with 280 fs RMS jitter.” IEEE Journal
of Solid-State Circuits 50.5 (2015): 1203-1213.
[7] Tseng, Yen-Hsiang, Che-Wei Yeh, and Shen-Iuan Liu. ”A 2.25–2.7 GHz
area-efficient subharmonically injection-locked fractional-N frequency
synthesizer with a fast-converging correlation loop.” IEEE Transactions
on Circuits and Systems I: Regular Papers 64.4 (2016): 811-822.
[8] Chang, Wei-Sung, and Tai-Cheng Lee. ”A 5 GHz Fractional-N ADC-
Based Digital Phase-Locked Loops With 243.8 dB FOM.” IEEE Trans-
actions on Circuits and Systems I: Regular Papers 63.11 (2016): 1845-
1853.
[9] Elkholy, Ahmed, et al. ”Low-jitter multi-output all-digital clock gener-
ator using DTC-based open loop fractional dividers.” IEEE Journal of
Solid-State Circuits 53.6 (2018): 1806-1817.
[10] Markulic, Nereo, et al. ”A 10-bit, 550-fs step Digital-to-Time Converter
in 28nm CMOS.” ESSCIRC 2014-40th European Solid State Circuits
Conference (ESSCIRC). IEEE, 2014.
[11] Wu, Ying, et al. ”A 3.5–6.8-GHz Wide-Bandwidth DTC-Assisted
Fractional-N All-Digital PLL With a MASH ∆Σ-TDC for Low In-Band
Phase Noise.” IEEE Journal of Solid-State Circuits 52.7 (2017): 1885-
1903.
[12] Ru, Jiayoon Zhiyu, et al. ”A high-linearity digital-to-time converter
technique: Constant-slope charging.” IEEE Journal of Solid-State Circuits 50.6
(2015): 1412-1423.
[13] Liu, Hanli, et al. ”A Sub-mW Fractional-N ADPLL With FOM of 246
dB for IoT Applications.” IEEE Journal of Solid-State Circuits 53.12
(2018): 3540-3552.
[14] Nonis, Roberto, et al. ”digPLL-Lite: A low-complexity, low-jitter
fractional-N digital PLL architecture.” IEEE Journal of Solid-State
Circuits 48.12 (2013): 3134-3145.
[15] Abiri, Behrooz, et al. ”A 1-to-6Gb/s phase-interpolator-based burst-mode
CDR in 65nm CMOS.” 2011 IEEE International Solid-State Circuits
Conference. IEEE, 2011.
[16] Narayanan, Aravind Tharayil, et al. ”A fractional-N sub-sampling PLL
using a pipelined phase-interpolator with an FoM of -250 dB.” IEEE
Journal of Solid-State Circuits 51.7 (2016): 1630-1640.
[17] A. Elnaqib and S. A. Ibrahim. ”Low-power charge-steering phase
interpolator.” Electronics Letters 52.10 (2016): 810-812.
[18] Bellasi, David E., and Luca Benini. ”Smart energy-efficient clock
synthesizer for duty-cycled sensor SoCs in 65 nm/28 nm CMOS.” IEEE
Transactions on Circuits and Systems I: Regular Papers 64.9 (2017):
2322-2333.
