Ultra Compact and Low-power TDC and TAC Architectures for Highly-Parallel Implementation in Time-Resolved Image Sensors by Stoppa, D. et al.
2011 International Workshop on ADC Modelling, Testing and Data Converter Analysis and Design  and IEEE 2011 ADC Forum 
June 30 - July 1, 2011. Orvieto, Italy. 
 
Ultra Compact and Low-power TDC and TAC Architectures  
for Highly-Parallel Implementation in Time-Resolved Image Sensors  
 
David Stoppa1, Fausto Borghetti1, Justin Richardson3,5, Richard Walker2,3  
Robert K. Henderson3, Marek Gersbach4, Edoardo Charbon4 
 
1Fondazione Bruno Kessler (FBK), Trento, Italy, +390461314531, +390451302040, stoppa@fbk.eu 
2STMicroelectronics Imaging Division, Edinburgh, U.K. 
3The University of Edinburgh, Edinburgh, U.K. 
4TU Delft, Delft, The Netherlands 
5Dialog Semiconductor, Edinburgh, U.K.  
 
 
Abstract - We report on the design and characterization of three different architectures, namely two Time-to-Digital Converters (TDCs) and a Time-to-Amplitude Converter (TAC) with embedded analog-to-digital 
conversion, implemented in a 130-nm CMOS imaging technology. The proposed circuit solutions are conceived 
for implementation at pixel-level, in image sensors exploiting Single-Photon Avalanche Diodes as 
photodetectors. The fabricated 32x32 TDC/TAC arrays have a pitch of 50µm in both directions, while the 
average power consumption is between 28mW and 300mW depending on the architectural choice. The TAC 
achieves a time resolution of 160ps on a 20-ns time range with a differential and integral non-linearity (DNL, 
INL) of 0.7LSB and 1.9LSB, respectively. The two TDCs have a 10-bit resolution with a minimum time 
resolution between 50ps and 119ps and a worst-case accuracy of ±0.5 LSB DNL and 2.4 LSB INL. An overview 
of the performance is given together with the analysis of the pros and cons for each architecture. 
 
I. Introduction 
 
Electronic circuits for time measurement have been developed for nuclear science applications since the 
1960s [1], and their performance has improved steadily, mainly thanks to the reduction of propagation 
delays as CMOS fabrication technologies progress towards nanometer channel lengths. However, 
these devices are typically based on complex circuits [2]-[5], whose area occupation and power consumption 
make it impossible to implement highly parallel on-chip architectures. Recent fabrication techniques for 
low-noise Single-Photon Avalanche Diodes (SPADs) in deep-submicron CMOS technologies, as proven in [6],[7], 
make it now possible to integrate a TAC/TDC inside each pixel, allowing fully parallel operation as required to 
realize a monolithic Time-Correlated Single-Photon Counting (TCSPC) imager. Several TAC/TDC architectures 
[8]-[10] have been developed and successfully implemented at pixel-level within the project MEGAFRAME 
[11], demonstrating the feasibility of the largest single-chip array of TACs/TDCs so far reported [13]. In this 
contribution a summary and comparison of these architectures is given.  
 
II. Time-resolved SPAD-based Pixel Architecture 
 
The pixel, developed within the MEGAFRAME project, supports both time-uncorrelated and time-correlated 
single-photon imaging. Each pixel of the array consists of a SPAD with quenching circuitry, a TAC/TDC stage, 
and a memory to allow a time-interleaved readout-convert operation. Three different time-measuring 
architectures have been developed and fully characterized: a TDC based on an external master clock (TDC-EC) 
[8], a ring-oscillator TDC (RO-TDC) [10] and a TAC with embedded ADC conversion (TADC) [9]. 
  
A. Time-to-Amplitude-to-Digital Converter (TADC)  
 
The schematic diagram of the TAC is shown in Figure 1. The basic cell consists of a current source (IbiasP), 
which charges up capacitor Cs when the switch-structure, composed of Mp1, Mp2, Mp3, and IbiasN, is turned 
on. The basic building block used to generate the voltage ramp is replicated into three layout-matched structures: 
Stage1 and Stage2 are used alternately to measure the number of events or the event arrival time in a time 
interleaved fashion, while StageREF is used to generate a reference voltage ramp used to implement the 
embedded ADC. The time-coding voltage ramp is started when a photon is detected, whilst the charging of 
capacitor Cs is stopped by the system clock (to map a 25-ns time range a 40-MHz frequency is used for the 
STOP signal). At the end of the observation window, a voltage is accumulated across capacitor Cs that is 
proportional to the photon arrival time.  At this point, the reference TAC block is externally stimulated by the 
signal CNT (globally distributed to the whole pixel matrix, synchronously with the 6-bit Gray Code Counter), 
while the output of TAC selector is applied to the second branch (time interleaved operation) and Vo1 is 
connected to the comparator. As the number of CNT pulses increases, the voltage at node VoCNT increases and 
when it reaches the same voltage previously stored on Vo1 the voltage comparator toggles, thus sampling the 
digital code GCC<0:5> into the memory.  
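
The conversion sequence described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the circuit reported here: the ramps are taken as ideal and linear, the full-scale voltage `v_full` is normalized, the reference ramp is assumed to span the whole observation window in 64 equal steps, and the function names are hypothetical. The 25-ns window and the 6-bit Gray code follow the text.

```python
def gray_encode(n: int) -> int:
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Inverse of gray_encode."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def tadc_convert(arrival_ns: float, window_ns: float = 25.0,
                 bits: int = 6, v_full: float = 1.0) -> int:
    """Return the Gray code latched for a photon arriving at arrival_ns.

    Assumptions (not from the paper): ideal linear ramps, a normalized
    full-scale voltage v_full, and a reference ramp that covers the full
    window in 2**bits equal steps.
    """
    # START on photon detection, STOP on the clock edge closing the window.
    ramp_time = window_ns - arrival_ns
    v_stored = v_full * ramp_time / window_ns
    steps = 1 << bits
    for count in range(steps):
        v_ref = v_full * (count + 1) / steps  # reference ramp after this CNT pulse
        if v_ref >= v_stored:                 # comparator toggles
            return gray_encode(count)         # memory samples the Gray code
    return gray_encode(steps - 1)             # saturate at full scale
```

In this model a photon detected at the start of the window charges the capacitor for the longest time and therefore latches the highest code, while a photon detected just before the STOP edge latches code 0. Because consecutive Gray codes differ in a single bit, a comparator that toggles while the counter is changing can be off by at most one count.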
 
Figure 1: Schematic of the TAC with embedded digital conversion (TADC). 
 
B. Time-to-Digital Converter based on External Clock (TDC-EC) 
 
This architecture exploits an external clock generated by an on-chip PLL that runs at 280MHz and is distributed 
to the whole TDC array, as depicted in Figure 2. The global clock is then doubled at the pixel level to achieve 
higher resolution. Each 10-bit TDC consists of a two-level (coarse and fine) interpolator activated by the digital 
pulse from a SPAD upon photon detection (START signal). The coarse interpolator consists of a 6-bit ripple 
counter clocked by the global clock CK. The fine interpolator further divides each clock cycle by sending the 
START signal through a delay chain consisting of 16 buffer propagation delays. To minimize dependence on 
substrate and supply noise, a differential buffer architecture was chosen for the implementation of the 16-tap 
delay line. On the first clock edge after the START signal the propagation of the pulse through the delay chain is 
stopped, thus the number of buffer elements that toggled between the photon detection and the subsequent clock 
edge corresponds to the time elapsed between these two events. A coder converts the resulting thermometer code 
of the fine interpolator output into a 4-bit binary number and the result is stored in one of the two 10-bit 
memories to allow the read out of the pixel array in parallel with the acquisition of the following frame. 
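
As a rough illustration of the coding step, the sketch below converts a 16-tap thermometer code into a 4-bit fine value and packs it with the 6-bit coarse count into a 10-bit word. The function names, the bit ordering (coarse count in the upper bits) and the saturation of a fully toggled delay line at 15 are assumptions made for this example; the actual coder logic is not detailed here.

```python
def thermometer_to_binary(taps) -> int:
    """Count the leading taps that toggled (ones) in a thermometer code."""
    count = 0
    for tap in taps:
        if not tap:
            break
        count += 1
    return count


def tdc_ec_code(coarse: int, taps) -> int:
    """Pack a 6-bit coarse count and a 4-bit fine value into 10 bits.

    Assumption: 16 taps can report 0..16 toggled buffers, so the fine
    value is clipped to 15 to fit in 4 bits.
    """
    fine = min(thermometer_to_binary(taps), 15)
    return ((coarse & 0x3F) << 4) | fine


# Example: one coarse clock cycle counted and five buffers toggled.
word = tdc_ec_code(1, [1] * 5 + [0] * 11)
```

Counting only the leading ones ignores any isolated bubble further down the delay line, which is one simple way to keep the coder tolerant of a metastable tap.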
 
Figure 2: Block diagram of the TDC-EC. 
 
C. Ring-Oscillator Time-to-Digital Converter (RO-TDC) 
 
In this case the clock is internally generated by means of a ring-oscillator realized by four differential buffers and 
activated by the START signal (see Fig. 3). Similarly to the TDC-EC architecture, coarse conversion is achieved 
by means of a 7-bit ripple counter while the fine conversion is provided by the dynamic state stored in the 
internal nodes of the ring (3 bits). The use of a 'power of 2' number of elements simplifies the fine state binary 
coding, while positive feedback in the ring is achieved by simply swapping the polarity of feedback on the last 
stage.  
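
To show why four stages yield three fine bits, the sketch below models the ring as an idealized twisted-ring (Johnson) sequence: four binary nodes with one polarity swap in the feedback path step through eight distinct states. This is a simplification assumed for illustration (the real ring is an analog circuit whose nodes switch continuously), and the state ordering, the packing of the 7-bit coarse count above the fine bits, and the function names are hypothetical.

```python
def ring_states(stages: int = 4):
    """Enumerate the states of an idealized ring with one inverted feedback."""
    state = [0] * stages
    sequence = []
    for _ in range(2 * stages):
        sequence.append(tuple(state))
        # The first node takes the inverted value of the last node,
        # every other node takes the value of its predecessor.
        state = [1 - state[-1]] + state[:-1]
    return sequence


# Lookup table from frozen ring state to a 3-bit fine value (0..7).
FINE_LUT = {s: i for i, s in enumerate(ring_states())}


def ro_tdc_code(coarse: int, ring_state) -> int:
    """Pack a 7-bit coarse count and the 3-bit fine state into 10 bits."""
    fine = FINE_LUT[tuple(ring_state)]
    return ((coarse & 0x7F) << 3) | fine
```

With four stages the sequence has exactly eight entries, so the frozen state maps directly onto three bits without unused codes, which is the simplification the 'power of 2' remark refers to.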
 
 
Figure 3: Block diagram of the RO-TDC. 
 
The differential buffers have an NMOS supply regulation to reduce the impact of power supply noise. The 
propagation delay of the buffers can be adjusted by tuning the gate-voltage of the NMOS transistors controlling 
the tail current, by means of a calibration loop locking the mean array time resolution to a stable external clock 
source using a PLL-like structure. 
 
III. Experimental Results and Comparison 
 
The three versions have been implemented in a 32x32-pixel array and fabricated in a 130-nm CMOS technology 
(die micrograph of the RO-TDC version is shown in Figure 4). The three pixel-arrays share the same SPAD 
front-end, readout circuitry and I/O padring in order to allow a direct comparison of the performance.  
 
 
Figure 4: 32x32-pixel array with in-pixel TDC. 
 
More details concerning the three structures and the characterization procedure are provided in [8]-[10], while a 
summary of the performance achieved by the three architectures is given in Table I. 
In light of the results obtained from the testing of the implemented TAC/TDC architectures it is possible to draw 
the following conclusions: 
 
TADC: The main advantage of this architecture, which is based on charging a capacitor with a constant 
current source, is that its time resolution is not limited by the use of deep-submicron technologies. 
The proposed design can therefore be easily ported to less-advanced fabrication processes without 
substantial degradation [12] and with only a minor impact on area occupation. On the other hand, the 
basic TAC circuit is unsuitable for achieving good uniformity across a large array of TACs, and 
the architecture proposed here adds complexity to cope with mismatch effects. This was also needed because, 
to achieve the high frame rate of 1Mfps targeted by the MEGAFRAME project, the only viable solution was to 
stream digital data directly from the chip, thus requiring on-chip high-speed analog-to-digital conversion. In this 
case our choice was to adopt a pixel-level ADC approach, and to exploit it to realize a TAC with embedded 
digital conversion exhibiting good immunity to mismatch effects. 
 
Table I: Performance summary of the TAC/TDC arrays. 
Parameter               TAC [9]   TDC-EC [8]   TDC-RO [10]   Unit
Bit resolution          6         10           10            bits
Time resolution (LSB)   160       119          178/52        ps
Uniformity              ±2        ±2           8             LSB
INL                     1.9       ±1.2         ±0.4/1.4      LSB
DNL                     0.7       ±0.4         ±0.5/2.4      LSB
Time jitter             < 600     185          107/32        ps @ FWHM
Power consumption       300       94           28/38         µW @ 500 kframe/s
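
The power figures in Table I can be related to the array-level range quoted in the abstract with a short calculation. The sketch assumes that the tabulated values are per-pixel figures and that all 32x32 pixels contribute equally. That reading is an assumption of this example, and the dictionary keys are only labels for the table columns.

```python
PIXELS = 32 * 32  # pixels in each test array

# Power values from Table I, in microwatts (assumed to be per pixel).
per_pixel_uw = {
    "TAC": 300,
    "TDC-EC": 94,
    "TDC-RO (first value)": 28,
    "TDC-RO (second value)": 38,
}


def array_power_mw(pixel_uw: float, pixels: int = PIXELS) -> float:
    """Scale a per-pixel power in microwatts to an array total in milliwatts."""
    return pixel_uw * pixels / 1000.0


array_mw = {name: array_power_mw(p) for name, p in per_pixel_uw.items()}
```

Under this assumption the totals span roughly 28.7 mW to 307 mW, which is consistent with the 28mW to 300mW range stated in the abstract.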
 
 
Another key point of the TAC structure is that the time range can be adjusted by simply changing the reference 
current used to charge up the capacitors, while in a TDC this is limited by the clock-frequency and the counter 
depth.  
From Table I it is evident that the main drawbacks of this architecture with respect to the TDCs are its higher 
power consumption, mainly due to the extra bias current (IbiasN in Fig. 1) needed by the high-bandwidth 
switches, and the worst jitter noise performance of the three versions. The latter figure, however, is dominated 
by the FPGA jitter noise contribution and can be considered a worst-case estimate of the TAC noise over the 
whole time range, whereas for the TDCs a refined characterization procedure was used. In any case it is reasonable 
to expect a higher jitter noise for the TADC because of the extra contribution given by the relatively low-slope 
ramp used for the analog-to-digital conversion. Another contribution to the TADC power dissipation is due to 
the distribution of the Gray-code signals to the whole pixel array, as required to implement the analog-to-digital 
conversion.  
Finally, the uniformity along the 32x32-pixel array is quite good in comparison to TDCs although there is no 
calibration loop implemented in the TADC. This has been achieved thanks to the use of the in-pixel reference 
stage, which dramatically improves the immunity to process mismatch.  
 
TDC-EC and RO-TDC: These structures provide excellent performance in this deep-submicron technology 
implementation, both in terms of timing resolution and area occupation. The TDC-EC could potentially be 
integrated in a very compact structure. However, successfully distributing the global clock over a large TDC 
array places an intrinsic limit on the maximum clock frequency, which requires a local frequency doubler 
and a longer delay line than in the RO-TDC, where the clock is generated locally and can therefore run at a 
higher frequency. The use of a global clock also impacts the total power consumption. As can be seen in 
Table I, the TDC-EC consumes more power than the RO-TDC because of the “CV²f” dissipation due 
to the distribution network. By contrast, the RO-TDC is very power efficient when the number of active 
TDCs is low, which is the typical operating condition in FLIM applications. Moreover, the local ring oscillator in 
the RO-TDC can run at very high frequency, providing the best time resolution (52ps) among the three 
structures while still exhibiting the lowest power consumption. There are two main design issues to be taken 
into account in the implementation of the RO-TDC. The first concerns the potential metastability of the 
structure, which has been solved by implementing a hysteresis output stage. The second is the periodic INL 
error intrinsic to this structure, which has been attenuated by means of an optimized layout of the four stages 
that keeps each stage perfectly symmetric (analog approach). 
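
The dynamic dissipation term mentioned above is the usual P = C·V²·f of a node switching a capacitance C over a voltage swing V at frequency f. The sketch below only illustrates how that term scales. The capacitance and supply values in the example call are hypothetical placeholders chosen for the arithmetic, not measurements of this chip, and only the 280MHz clock frequency is taken from the text.

```python
def dynamic_power_w(c_farad: float, v_volt: float, f_hz: float) -> float:
    """Dynamic switching power P = C * V^2 * f, in watts."""
    return c_farad * v_volt ** 2 * f_hz


# Hypothetical example: a 10 pF clock network switched at 1.2 V.
# Both values are placeholders; only the 280 MHz clock is from the text.
p_clock = dynamic_power_w(10e-12, 1.2, 280e6)

# Doubling the distributed clock frequency doubles this term, which is one
# reason to double the clock locally instead of distributing it at twice the rate.
p_doubled = dynamic_power_w(10e-12, 1.2, 560e6)
```

Because the term is linear in f and in the switched capacitance, a clock tree that spans the whole array pays this cost continuously, whereas a ring oscillator that runs only after a START event dissipates in proportion to the number of active pixels.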
The main drawback of the RO-TDC, at least in our implementation where the structure has been laid out 
very compactly (less than 50x50µm²), is the relatively high DNL and INL. Additionally, for longer conversion 
periods, oscillator-based jitter has more time to accumulate. 
On the basis of the above considerations, the final version of the MEGAFRAME sensor, consisting of 
a 160x128-pixel array, has been based on the RO-TDC version. Even in such a massively parallel TDC 
implementation, spread over a very large sensor area (8x6.4mm²), this approach confirmed its validity, as 
demonstrated by the excellent results summarized in [13]. 
 
 
 
 
IV. Conclusions 
 
Three different time-measurement circuit architectures have been designed and fabricated in a 130-nm 
CMOS technology. A Time-to-Amplitude Converter with embedded analog-to-digital conversion and two 
Time-to-Digital Converters have been described. The TDCs implement two different strategies for the reference 
clock: the first uses an external clock globally distributed to the whole TDC array, while the second generates the 
clock locally by means of a ring oscillator in each pixel. The fabricated 32x32 TDC/TAC arrays have a pitch 
of 50µm in both directions and share the very same front-end and readout electronics to allow a direct 
comparison of the performance. A summary of the most important parameters and an analysis of the pros and cons 
of each architecture have been given. The resulting time-correlated pixel array is a viable candidate for 
time-correlated single-photon counting (TCSPC) applications such as fluorescence lifetime imaging microscopy 
(FLIM), nuclear or 3D imaging, and permits scaling to larger array formats. 
 
References 
 
[1] R. Nutt, “Digital Time Intervalometer”, The Review of Scientific Instruments, 1968, Vol. 39, Issue 9. 
[2] J.-P. Jansson, et al., “A CMOS time-to-digital converter with better than 10 ps single-shot precision”, IEEE 
J. Solid-State Circuits, vol. 41, no.6, pp. 1286-1296, June 2006. 
[3] A. S. Yousif et al., “A Fine Resolution TDC Architecture for next generation PET imaging”, IEEE Trans. 
on Nuclear Science, vol. 54, no. 5, pp. 1574-1582, Oct. 2007. 
[4] M. Lee, et al., “A 9b, 1.25ps Resolution Coarse-Fine Time-to-Digital Converter in 90nm CMOS that 
Amplifies a Time Residue”, Digest of IEEE Symposium on VLSI Circuits, pp. 168-169, 2007. 
[5] P. Chen, et al., “A low-cost low-power CMOS time-to-digital converter based on pulse stretching”, IEEE 
Trans. on Nuclear Science, vol. 53, no. 4, pp. 2215-2220, Aug. 2006. 
[6] C. Niclass et al., “A single photon avalanche diode implementation in 130-nm CMOS technology”, IEEE J. 
of Sel. Top. in Quantum Electron., vol. 13, pp. 863-869, 2007.  
[7] J. A. Richardson et al., “Low Dark Count Single-Photon Avalanche Diode Structure Compatible With 
Standard Nanometer Scale CMOS Technology”, IEEE Photonics Technology Letters, vol. 21, no. 14, pp. 
1020-1022, 2009. 
[8] M. Gersbach et al., “A Parallel 32x32 Time-to-Digital Converter Array Fabricated in a 130nm Imaging 
CMOS Technology”, IEEE European Solid-State Circuits Conference (ESSCIRC’09), pp. 196-199, Sept. 
2009. 
[9] D. Stoppa et al., “A 32x32-Pixel Array with In-Pixel Photon Counting and Arrival Time Measurement in the 
Analog Domain”, IEEE European Solid-State Circuits Conference (ESSCIRC’09), pp. 204-207, Sept. 2009. 
[10] J. Richardson et al., “A 32x32 50ps Resolution 10 bit Time to Digital Converter Array in 130nm CMOS for 
Time Correlated Imaging”, IEEE Custom Integrated Circuits Conference (CICC’09), pp. 77-80, Sept. 2009. 
[11] EC FET Open MEGAFRAME project, www.megaframe.eu. 
[12] D. Stoppa, L. Pancheri, M. Scandiuzzo, L. Gonzo, G.-F. Dalla Betta, A. Simoni, “A CMOS 3-D Imager 
based on Single Photon Avalanche Diode”, IEEE Trans. On Circuits and Systems I, Vol. 54, No. 1, January 
2007. 
[13] C. Veerappan, J. Richardson, R. Walker, D.-U. Li, M. W. Fishburn, Y. Maruyama, D. Stoppa, F. Borghetti, 
M. Gersbach, R. K. Henderson, E. Charbon, "A 160x128 Single-Photon Image Sensor with On-Pixel 55ps 
10b Time-to-Digital Converter", IEEE International Solid-State Circuits Conference, San Francisco, CA, 
USA, vol. 54, 2011, pp. 312-313, 20-24 February 2011. 
 
 
 
