A single shot TDC with 4.8 ps resolution in 40 nm CMOS for high energy physics applications by Prinzie, Jeffrey et al.
Preprint typeset in JINST style - HYPER VERSION
A single shot TDC with 4.8 ps resolution in 40 nm
CMOS for high energy physics applications
J. Prinzie (a,b), M. Steyaert (b) and P. Leroux (a,b)
(a) KU Leuven, dept. ESAT-ADVISE,
Kleinhoefstraat 4, B-2440 Geel, Belgium
(b) KU Leuven, dept. ESAT-MICAS,
Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
E-mail: Jeffrey.Prinzie@esat.kuleuven.be
ABSTRACT: A robust TDC with 4.8 ps bin width has been designed for harsh environments and
high energy physics applications. The circuit uses a resistive interpolation DLL with a novel dual
phase detector architecture. This architecture improves startup and recovery speed after single
event strikes without a control voltage ripple trade-off and requires no off-line calibration. A DNL
of 0.43 LSB has been measured at a power consumption of 4.2 mW, with an extended frequency
range from 0.8 GHz to 2.4 GHz. The TDC has been processed in 40 nm CMOS technology.
KEYWORDS: TDC, DLL, single shot.
Contents
1. Introduction
2. DLL architecture
2.1 DLL phase detector
2.2 DLL architecture
3. DLL stabilized TDC
3.1 VCDL
3.2 Thermometer code to binary converter
4. Measurements
5. Conclusions
1. Introduction
High resolution time-to-digital converters (TDCs) are required in today’s demanding applications,
especially in high energy physics detectors such as ATLAS and CMS and in medical applications
including PET imaging. Besides these experimental and medical applications, TDCs are
gaining interest in sensor readout and signal processing applications. Nanometer CMOS technologies
with feature sizes of 65 nm and below suffer from ever decreasing power supply voltages,
making analog signal processing in the voltage domain increasingly difficult. Fortunately, smaller
technology nodes also offer ever increasing operating speeds, enabling highly accurate analog signal
processing in the time domain, where the analog information is stored in the zero-crossings of
a signal. Some of the above applications require true single shot TDCs. This is the case for most
high energy physics experiments because particles can only be measured once before disappearing.
Therefore, TDCs with high reliability are necessary to capture every event from a single hit.
In applications that do not require true single shot precision, ΣΔ modulation can be used to improve
the resolution even under severe radiation damage [1].
In general, different TDC topologies are used today. Flash based TDCs utilize a delay line with
an injected start pulse. Capture flip-flops sample this delay line upon a stop pulse to obtain the
time difference from the position of the start pulse edge in the delay line. This is one
of the simplest but also most limited architectures among TDCs. While the resolution of flash TDCs is
limited to the delay of the line’s delay cells, resistive interpolation can be used to improve the
resolution beyond this delay [2]. A Vernier topology increases the flash TDC’s resolution at the cost of
worse linearity and dynamic range [3]. Pipelined TDCs use multiple stages, acting as MDACs, to amplify
time residues. In this way, several very low resolution stages can be combined into a pipelined
structure with high resolution [4]. Coarse-fine TDCs work in a similar way, using a coarse TDC
and feeding the amplified residue back into the same coarse TDC [5]. However, these topologies mostly
rely on time-amplification circuits that exploit the metastability regions of latches. These circuits
may behave unexpectedly at low temperatures or high radiation levels.
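The flash principle described at the start of this paragraph can be illustrated with a short behavioral sketch in Python. It is a simplified model, not the circuit of this paper: the cell delay of 24 ps, the 8-cell line and the helper names are hypothetical, and metastability of the capture flip-flops is ignored.

```python
def sample_delay_line(t_elapsed_ps, cell_delay_ps, n_cells):
    """Flip-flop states at the stop event: 1 where the start edge has already arrived."""
    return [1 if (k + 1) * cell_delay_ps <= t_elapsed_ps else 0
            for k in range(n_cells)]


def thermometer_to_time(bits, cell_delay_ps):
    """Edge position (number of cells passed) multiplied by the cell delay."""
    return sum(bits) * cell_delay_ps


CELL_DELAY_PS = 24.0  # hypothetical cell delay, for illustration only

# A stop pulse arriving 100 ps after the start pulse.
bits = sample_delay_line(100.0, CELL_DELAY_PS, n_cells=8)
measured_ps = thermometer_to_time(bits, CELL_DELAY_PS)
print(bits, measured_ps)
```

In this model the measured value is always a multiple of the cell delay, which is why the resolution of a plain flash TDC cannot be better than one cell delay.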
Delay locked loop (DLL) stabilized TDCs incorporate a feedback loop to stabilize the delay
line of flash based TDCs [6]. The loop ensures that the delay at the end of the line is equal to
the reference clock’s period. A phase detector compares the reference and output signals and
adjusts the delay cells until both signals arrive at the same time, as shown in Figure 1a. The chip
discussed in this paper uses a DLL based TDC. Compared to open loop TDCs, the DLL locks the
end of the delay line to the beginning, resulting in a low uncertainty at the end of the line. Therefore,
the highest uncertainty is located at the center of the delay line. When a TDC without DLL is
compared to one with DLL, the maximum variance in the DLL TDC is only one quarter of that of the
open loop equivalent, as shown in Figure 1b [7]. Any external influences such as process, voltage and
temperature variations that may alter the delay line’s speed are compensated by the loop. Total
dose radiation effects on the delay line are also compensated by the loop if they are assumed to be
equally distributed over the delay cells. The amount of radiation that can be tolerated depends on the
cells’ delay sensitivity to total dose radiation and on the tuning range of the voltage controlled delay
line (VCDL). This kind of TDC measures the time difference between the (periodic) start
signal and the stop signal. When a stop event arrives, the state of the delay line is sampled into
the time capture registers. The position that the rising edge of the start signal has reached
in the delay line gives the time difference. Therefore, the edge (a “10” transition in
the delay line capture registers) has to be located and its position multiplied by the buffer delay to
get the correct timing information. Multiple channels with multiple layers of time capture registers
can be used to measure the time difference between two arbitrary signals relative to each other.
Figure 1. a) DLL stabilized TDC, b) Reduced variance in locked TDCs
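The factor of one quarter in Figure 1b can be reproduced with a simple statistical model. The following Python sketch assumes that every delay cell contributes an independent error of equal variance and that the locked loop distributes its correction equally over all cells. The cell count of 64 and the function names are hypothetical; the model is an illustration and not the analysis of [7].

```python
def open_loop_variance(k, sigma2=1.0):
    """Variance of the accumulated delay error at tap k of a free-running line."""
    return k * sigma2


def locked_variance(k, n_cells, sigma2=1.0):
    """Variance at tap k when the loop pins the total delay of n_cells cells.

    Subtracting the mean cell error from every cell leaves k * (n - k) / n
    times the single-cell variance at tap k (assumed model).
    """
    return sigma2 * k * (n_cells - k) / n_cells


N_CELLS = 64  # hypothetical number of delay cells

worst_open = max(open_loop_variance(k) for k in range(N_CELLS + 1))
worst_locked = max(locked_variance(k, N_CELLS) for k in range(N_CELLS + 1))
print(worst_locked / worst_open)
```

Under these assumptions the locked line has zero variance at both ends and its maximum at the center tap, where it equals one quarter of the open loop variance at the end of the line.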
While the DLL has some major practical advantages, it does require some startup
time before it acquires lock. During lock acquisition, the resolution is unknown until the loop has
converged to its programmed value, so the TDC cannot be used in most applications during this time.
Also, when the loop has lost lock, it needs a reacquisition
time. Traditional DLL phase detectors have a trade-off between speed and ripple (spur). This
paper proposes a solution that achieves both high (re)acquisition speed and low control voltage ripple.
2. DLL architecture
2.1 DLL phase detector
Figure 2. a) PFD transfer function b) Bang-bang phase detector transfer function
Generally, two kinds of phase detectors are used in today’s time based circuits: phase frequency
detectors (PFDs) and bang-bang phase detectors (BBPDs). Both have their own advantages and
disadvantages. Figure 2a and Figure 2b show the phase to current transfer function of the PFD and the
BBPD respectively. As can be seen from the transfer functions, the PFD has an output signal that
is proportional to the phase difference, whereas the BBPD does not. The BBPD only generates an “early”
or “late” signal that continuously charges or discharges the loop capacitor, resulting in a
periodic limit cycle stemming from the discrete non-linear detector.
Figure 3. a) BBPD acquisition graph b) PFD acquisition graph
The limit cycle of the BBPD involves a serious trade-off between control voltage ripple
and (re)acquisition speed. Because the current sources, driven by the BBPD, are continuously
charging the loop capacitor, the ripple depends on the absolute value of this current (and on the
capacitor). Low current operation results in low control voltage ripple but also slow startup, whereas
high currents improve acquisition speed but degrade the deterministic noise of the VCDL (control ripple).
A trade-off has to be made between speed and noise, as shown in Figure 3a. Doubling the startup
speed also doubles the voltage ripple in the limit cycle. In contrast, a PFD does not have this
problem of limit-cycle operation. During acquisition, a large output value quickly pulls in the
loop. The net output current decreases near the locking point, resulting in steady operation. However, a
large static phase offset can occur because of charge pump mismatch, as shown in Figure 3b. The
resulting gain error may cause a time offset of several picoseconds.
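The BBPD trade-off between ripple and speed follows from first-order charge pump arithmetic, sketched below in Python. The model assumes a constant current that charges an ideal loop capacitor; the capacitor value and the control voltage excursion are hypothetical and only serve to show the scaling.

```python
def bbpd_ripple_step(i_cp, t_ref, c_loop):
    """Control voltage change per reference period (V): the charge pump never stops."""
    return i_cp * t_ref / c_loop


def bbpd_slew_time(delta_v, i_cp, c_loop):
    """Time (s) needed to move the control voltage by delta_v at constant current."""
    return delta_v * c_loop / i_cp


T_REF = 1 / 2.4e9   # reference period at 2.4 GHz
C_LOOP = 10e-12     # hypothetical loop capacitor
DELTA_V = 0.3       # hypothetical control voltage excursion during startup

for i_cp in (2e-6, 4e-6):
    step = bbpd_ripple_step(i_cp, T_REF, C_LOOP)
    t_acq = bbpd_slew_time(DELTA_V, i_cp, C_LOOP)
    print(f"{i_cp * 1e6:.0f} uA: step {step * 1e6:.3f} uV, slew time {t_acq * 1e6:.2f} us")
```

Both quantities depend on the same ratio of current to capacitance, so in this model halving the acquisition time necessarily doubles the voltage step of the limit cycle.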
The proposed TDC uses a combination of both detectors to obtain a low static phase offset from
the BBPD, which is biased with a low current, and a high startup speed from the PFD. This eliminates
the trade-off between startup speed and control voltage ripple.
2.2 DLL architecture
The system architecture of the TDC is shown in Figure 4. Both phase detectors are combined
and process the VCDL and reference signals independently. The PFD has a built-in deadzone and
features a higher charge pump current (50 µA) than the BBPD. In this way, during start-up,
the PFD controls the loop at high speed. When the loop approaches its locking point, the PFD
enters its built-in deadzone. The deadzone ensures automatic shutdown and reactivation of the PFD
without any complex detection scheme. In this design a 20 ps deadzone window has been chosen.
After the PFD shuts down, the BBPD only has to acquire lock within this narrow timing range.
The low current (2 µA) charge pump of the BBPD ensures low ripple. In case of loss of lock, the
PFD leaves its deadzone and is reactivated automatically. The deadzone is built in accurately by
using a PFD without deadzone and applying dedicated deadzone logic to both the up and down
signals, as shown in Figure 5. The up and down signals are compared using a NAND gate that
generates a pulse with a width proportional to the phase error. However, for small phase errors
(within the targeted deadzone window), the NAND gate output cannot reach the switching
point of the inverter and no up pulse is generated at the output. The deadzone window can be
accurately controlled through the slew rate of the NAND gate, which depends on the drive current of
the gate and its load capacitance. If this load capacitor or the NAND gate current is made
programmable, the deadzone window becomes programmable.
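The interplay of the two detectors can be illustrated with a cycle-by-cycle behavioral model in Python. Only the charge pump currents (50 µA and 2 µA), the 20 ps deadzone window and the 2.4 GHz reference are taken from the text. The loop gain constant, the initial phase error, the 1 ps lock tolerance and the interpretation of the window as ±10 ps around lock are assumptions made for this sketch, which does not model the actual circuit.

```python
T_REF_PS = 1e6 / 2400.0   # reference period at 2.4 GHz, in ps
DEADZONE_PS = 20.0        # PFD deadzone window (assumed to span +/-10 ps around lock)
K_LOOP = 6e-5             # hypothetical loop gain: ps of delay change per (uA * ps) of charge


def simulate_lock(e0_ps, i_bbpd_ua, i_pfd_ua, n_cycles=10000, tol_ps=1.0):
    """Return (first cycle with |error| < tol_ps, peak-to-peak error of the last 200 cycles)."""
    e = e0_ps
    lock_cycle = None
    history = []
    for n in range(n_cycles):
        sign = 1.0 if e > 0 else -1.0
        # BBPD: full-period early/late pulse, independent of the error size.
        charge = i_bbpd_ua * T_REF_PS * sign
        # PFD: pulse width proportional to the error, silent inside the deadzone.
        if abs(e) > DEADZONE_PS / 2:
            charge += i_pfd_ua * e
        e -= K_LOOP * charge
        history.append(e)
        if lock_cycle is None and abs(e) < tol_ps:
            lock_cycle = n
    tail = history[-200:]
    return lock_cycle, max(tail) - min(tail)


slow_bbpd = simulate_lock(300.0, i_bbpd_ua=2.0, i_pfd_ua=0.0)
fast_bbpd = simulate_lock(300.0, i_bbpd_ua=50.0, i_pfd_ua=0.0)
dual = simulate_lock(300.0, i_bbpd_ua=2.0, i_pfd_ua=50.0)
print("BBPD 2 uA        :", slow_bbpd)
print("BBPD 50 uA       :", fast_bbpd)
print("PFD 50 + BBPD 2  :", dual)
```

In this model the dual detector loop locks much earlier than the low current BBPD alone while ending in the same small limit cycle, whereas the high current BBPD locks quickly but with a limit cycle that is 25 times larger.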
Figure 4. DLL architecture combining BBPD and PFD
Because the PFD originates from PLL designs, some adjustments have to be made to ensure
the correct state in each cycle. Because no cycle slipping is possible in a DLL, a PFD in an incorrectly
initialized state results in false locking and a saturated DLL. A simple way to prevent this is to
reset the detector to a known state in every period. The signal at the middle of the VCDL
is used to reset the PFD by triggering an internal glitch generator. This pulse resets the PFD via
the PFD’s internal reset path. By the time this edge has propagated to the output of the VCDL, the PFD
has already been reset and proper detection can occur. In case of a single event, the periodic resets
prevent corrupted DLL operation.
Compared to dynamic charge pump current regulation, this design needs no complex
detection circuits to change the amount of current at the correct time; it automatically adjusts its
operating state for startup or limit cycle. The PFD is implemented using 4 standard CMOS NOR
gates for each flip-flop. Charge pump mismatch is not critical in this design as long as the static
phase offset generated by the PFD and charge pump combination is below the deadzone window. No
special circuits are needed to compensate for this mismatch. This allows the DLL to be easily
integrated into a digital flow and even transferred to new technologies without intensive analog
design care.
3. DLL stabilized TDC
3.1 VCDL
The performance of the TDC is defined by the delay cells used in the VCDL. A resistive
interpolation scheme is used to achieve sub-gate delay resolution. The TDC incorporates a 2 stage
interpolation, the first being the DLL and the second a 5 tap resistive interpolator, as shown
in Figure 6. Each delay of the DLL is interpolated by a factor of 5. The resistor taps generate a
weighted value of the buffer’s input signal and its delayed output. The resistors are implemented
as poly tracks and scaled iteratively to ensure correct RC delays between all phases. In this chip, a
single-ended delay cell architecture is implemented. Non-inverting buffers have to be used because
of the resistive interpolation. This reduces the speed by a factor of 2 compared to inverting
stages or differential delay cells. However, for the same power consumption, a single-ended cell
architecture is less noisy if proper power supply decoupling is provided. To improve the speed of the
non-inverting stages, only one inverter per pair is speed-limited, and only for rising edges. While this
results in an asymmetric duty cycle, the DLL has no issues with this asymmetry because only rising
edges are processed. This reduces the speed penalty to less than a factor of 2. The delay cell
mismatch is only 156 fs/√ps rms open loop error.
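The weighting performed by the resistor taps can be illustrated with an idealized Python model. It assumes linear signal edges, tap weights of k/5 and no RC loading of the taps, which the real design has to correct by scaling the resistors. The 24 ps buffer delay corresponds to five 4.8 ps bins; the 60 ps rise time and all names are hypothetical.

```python
def ramp(t, t0, t_rise):
    """Ideal linear edge starting at t0, clipped to the range [0, 1]."""
    return min(1.0, max(0.0, (t - t0) / t_rise))


def crossing_time(weight, t_delay, t_rise, threshold=0.5):
    """Time at which the interpolated node crosses the threshold (found by bisection)."""
    lo, hi = 0.0, t_delay + t_rise
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        # Weighted sum of the buffer input edge and its delayed output edge.
        v = (1.0 - weight) * ramp(mid, 0.0, t_rise) + weight * ramp(mid, t_delay, t_rise)
        if v < threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


T_DELAY_PS = 24.0  # buffer delay, five bins of 4.8 ps
T_RISE_PS = 60.0   # hypothetical edge rise time, chosen larger than 2 * T_DELAY_PS

taps = [crossing_time(k / 5, T_DELAY_PS, T_RISE_PS) for k in range(5)]
steps = [b - a for a, b in zip(taps, taps[1:])]
print(taps)
print(steps)
```

In this idealized model the taps are evenly spaced only if both edges overlap in their linear region, which requires a rise time of at least twice the buffer delay. Shorter edges lead to unequal bins.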
Figure 5. Detailed PFD with controlled deadzone and cyclic reset
Figure 6. Delay cell resistive interpolation
3.2 Thermometer code to binary converter
After the capture stage of the delay line flip-flops, a pseudo thermometer code to binary conversion
has to be done to detect a “..1100..” sequence in the chain. A “10” detector can be used to search
the registers, but single bubbles, which may occur due to single event
upsets, can corrupt the output data. Therefore, a “100” detector is employed to correct single
bubbles. This bubble detector is implemented using an array of combinational 3-input AND gates that
flags the position where this pattern occurs. In a second stage, priority decoding is done to make
sure only one event is captured. If a two-level bubble occurs, it is possible that 2 outputs of the
AND array are high, which would corrupt the output of the following ROM without priority detection.
In the final stage, a ROM with a hardware programmed binary table converts the priority signal into
a binary value. The output of the ROM is used as the 6 bit output of the TDC.
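The three conversion stages can be sketched in Python as follows. This is a behavioral illustration only: the priority direction (lowest position wins), the bit ordering, the 8-bit example words and the function names are assumptions, and wrap-around at the end of the delay line is ignored.

```python
def detect_100(bits):
    """Stage 1: one flag per position i where bits[i:i+3] equals 1, 0, 0 (3-input AND array)."""
    return [bits[i] == 1 and bits[i + 1] == 0 and bits[i + 2] == 0
            for i in range(len(bits) - 2)]


def priority_select(flags):
    """Stage 2: keep only the first asserted flag (assumed priority: lowest index wins)."""
    one_hot = [False] * len(flags)
    for i, flag in enumerate(flags):
        if flag:
            one_hot[i] = True
            break
    return one_hot


def rom_lookup(one_hot):
    """Stage 3: table lookup from the one-hot position to a 6 bit binary word."""
    rom = [format(i, "06b") for i in range(len(one_hot))]
    for i, hit in enumerate(one_hot):
        if hit:
            return rom[i]
    return None


def decode(bits):
    return rom_lookup(priority_select(detect_100(bits)))


clean = [1, 1, 1, 1, 0, 0, 0, 0]
bubbled = [1, 1, 0, 1, 0, 0, 0, 0]   # single bubble at position 2
print(decode(clean), decode(bubbled))
```

A plain “10” detector would fire twice on the bubbled word, at positions 1 and 3, whereas the “100” detector fires only at position 3, so both words decode to the same value.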
4. Measurements
The acquisition range of the DLL extends from 800 MHz to 2.4 GHz. Depending on the applied
frequency, the LSB bin width can be programmed from 14 ps down to 4.8 ps using the 5 times
resistive interpolation DLL. The power consumption ranges from 2.4 mW to 4.2 mW respectively.
Different samples are measured to verify the linearity of the TDC in the presence of local process
mismatch. Global variations that change the entire chip’s performance are generally compensated
by the DLL’s control loop, which only requires enough tuning range to compensate for these
effects. However, local variation between different delay cells of the VCDL and capture time
differences of the flip-flops result in the remaining non-linearity. The linearity measurements are
shown in Figure 7, where the phase difference between the DLL’s reference clock and the event signal
is varied over 360° to characterize the entire range of the TDC using a code density test. No noise
is included in this measurement. The pattern appearing in the DNL results from non-uniform RC
loading in the resistive interpolation due to additional layout parasitics. However, this could be
removed with calibration. While this range covers only 6 bits, it can be extended for larger dynamic
ranges with a reference counter, in which case only the long term accumulated jitter of the reference
clock limits the range. Figure 7 shows the INL and DNL performance of the TDC expressed in LSB
between two interpolated nodes. Different measured chips show similar results. A DNL of 0.43 LSB
rms and an INL of 0.4 LSB rms are measured.
Figure 7. DNL and INL measurement
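A code density test derives the linearity from a histogram of output codes collected while the input phase is spread uniformly over the full range. The Python sketch below shows the usual histogram evaluation under that assumption. The histogram values are made up for illustration and are not measurement data from this chip.

```python
def code_density_linearity(histogram):
    """Return (DNL, INL) in LSB from the hit count of every output code."""
    ideal = sum(histogram) / len(histogram)        # hits per code for a perfect TDC
    dnl = [hits / ideal - 1.0 for hits in histogram]
    inl = []
    acc = 0.0
    for d in dnl:                                  # INL is the running sum of the DNL
        acc += d
        inl.append(acc)
    return dnl, inl


def rms(values):
    """Root mean square of a list of values."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


hits_per_code = [100, 120, 80, 100]                # hypothetical histogram
dnl, inl = code_density_linearity(hits_per_code)
print(dnl, inl, rms(dnl), rms(inl))
```

A bin that is wider than the ideal LSB collects proportionally more hits and therefore shows a positive DNL, which is how a systematic bin width pattern becomes visible in this kind of measurement.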
Figure 8 shows the startup speed of the DLL as a function of the applied bias current of the
PFD. Increasing the current does not result in excess control voltage ripple, as would be the case
with a single BBPD. For larger charge pump currents, the speed no longer increases
because the current sources saturate. After the time reported in Figure 8, the PFD enters its
deadzone window and shuts down automatically until the DLL is disturbed. Once it has shut down,
no current flows and no offset is introduced by the PFD’s charge pump. A standard charge pump
architecture is used without the need for any intensive analog design methods to cope with clock
feedthrough, charge injection or output impedance mismatch over the DLL’s control range.
Figure 8. DLL startup speed
The key specifications of the TDC are summarized in Figure 9 together with a chip photograph.
The entire TDC with its DLL occupies only 156 µm × 256 µm of silicon area. As shown
in the photograph, this is only a small portion of the entire chip; decoupling fills the unused area
created by the bonding pad requirements. The resolution can be tuned from 4.8 ps up to 14 ps
with a power consumption of 4.2 mW down to 2.4 mW respectively for the entire TDC with DLL. A
dynamic range of 6 bits is achieved but can be extended with a reference clock counter. The supply
voltage can range from 0.9 V to 1.2 V. The TDC is processed in a 40 nm CMOS technology.
Figure 9. TDC specifications and chip photograph
5. Conclusions
A high resolution TDC with improved startup speed has been designed for high energy physics
applications. An increased startup and recovery speed has been achieved with low static phase offset
by combining a phase frequency detector and a bang-bang phase detector. The phase frequency
detector has an intentional built-in deadzone that causes automatic shutdown near the locking point to
remove its offset. A resolution down to 4.8 ps with a power consumption of 4.2 mW is measured.
The design has been processed to silicon in a 40 nm CMOS technology.
References
[1] Y. Cao, W. De Cock, M. Steyaert, P. Leroux, 1-1-1 MASH Time-to-Digital Converters With 6 ps
Resolution and Third-Order Noise-Shaping, IEEE J. Solid-State Circuits Vol. 47, NO. 9, pp 2093-2106,
Sept. 2012
[2] S. Henzler, S. Koeppe, W. Kamp, H. Mulatz and D. Schmitt-Landsiedel, 90nm 4.7ps-Resolution
0.7-LSB Single-Shot Precision and 19pJ-per-Shot Local Passive Interpolation Time-to-Digital
Converter with On-Chip Characterization, IEEE ISSCC Dig. Tech. Papers, pp 548-549, Feb. 2008
[3] P. Dudek, S. Szczepanski, J. V. Hatfield, A High-Resolution CMOS Time-to-Digital Converter Utilizing
a Vernier Delay Line, IEEE J. Solid-State Circuits Vol. 35, NO. 2, pp 240-247, Feb. 2000
[4] Y. Seo, J. Kim, H. Park, J. Sim, A 0.63ps Resolution, 11b Pipeline TDC in 0.13µm CMOS, Symposium
on VLSI Circuits Dig. Tech. Papers, pp 152-153, Jun. 2011
[5] M. Lee, A. A. Abidi, A 9 b, 1.25 ps Resolution Coarse-Fine Time-to-Digital Converter in 90 nm CMOS
that Amplifies a Time Residue, IEEE J. Solid-State Circuits Vol. 43, NO. 4, pp 168-169, Apr. 2008
[6] L. Perktold, J. Christiansen, A Flexible 5 ps Bin-Width Timing Core for Next Generation
High-Energy-Physics Time-to-Digital Converter Applications, 8th Conference on Ph.D. Research in
Microelectronics and Electronics (PRIME), pp 179-182, Jun. 2012
[7] T. Toifl, R. Vari, P. Moreira, A. Marchioro, 4-Channel Rad-Hard Delay Generation ASIC with 1ns
Timing Resolution for LHC, IEEE Trans. on Nuclear Science, Vol. 46, NO. 3, pp 139-143, Jun. 1999
