# Arrayable Voltage-Controlled Ring-Oscillator for Direct Time-of-Flight Image Sensors

I. Vornicu, R. Carmona-Galán, Senior Member, IEEE, Á. Rodríguez-Vázquez, Fellow, IEEE

Abstract—Direct time-of-flight (d-ToF) estimation with high frame rate requires the incorporation of a time-to-digital converter (TDC) at pixel level. A feasible approach to a compact implementation of the TDC is to use the multiple phases of a voltage-controlled ring-oscillator (VCRO) for the finest bits. The VCRO thus becomes central in determining the performance parameters of a d-ToF image sensor. This paper covers the modeling, design and measurement of a CMOS pseudo-differential VCRO. The oscillation frequency, the jitter due to mismatches and noise, and the power consumption are analytically evaluated. The design has been incorporated into a 64×64-pixel array fabricated in a 0.18µm standard CMOS technology. The occupied area is 28×29µm<sup>2</sup> and the power consumption is 1.17mW at 850MHz. The measured gain of the VCRO is 477MHz/V with a frequency tuning range of 53%. Moreover, it features a linearity of 99.4% over a wide frequency range, namely from 400MHz to 850MHz. The phase noise is -102dBc/Hz at 2MHz offset frequency from 850MHz. The influence of these parameters on the performance of the TDC has been measured. The minimum time bin of the TDC is 147ps with an RMS DNL/INL of 0.13/1.7LSB.

Index Terms—phase interpolator; pseudo-differential voltage-controlled ring-oscillator; time-to-digital converter

#### I. INTRODUCTION

Time-to-Digital Converters (TDC) are basic building blocks for the implementation of 3D CMOS Image Sensors (CIS) based on direct Time-of-Flight (d-ToF) measurements [1]. Similar to what is done for 2D imagers, TDCs can be embedded into a d-ToF-CIS by using per-chip, per-column or per-pixel architectures. Each option raises specific constraints and hence poses different design challenges. In particular, per-pixel architectures are largely constrained by TDC power consumption and area occupation. The reason is twofold: on the one hand, the area must shrink for minimum pixel pitch; on the other hand, since the total power is proportional to the pixel count, the power per TDC must be as low as possible.

The rationale for d-ToF-CIS with per-pixel TDCs is linked to the necessity of using thousands of image captures to properly reconstruct 3D scene maps. It requires large frame rates, in the range of 1-10kfps, and naturally calls for parallelism and hence for the usage of per-pixel TDCs.

This work has been funded by the Office of Naval Research (USA) ONR, grant No. N000141410355, the Spanish Ministry of Economy (MINECO) through project TEC2015-66878-C3-1-R (European Region Development Fund, ERDF/FEDER), and Junta de Andalucía, Consejería de Economía, Innovación, Ciencia y Empleo (CEICE) P12-TIC 2338

Ion Vornicu, Ricardo Carmona-Galán, Ángel Rodríguez-Vázquez - Instituto de Microelectrónica de Sevilla, IMSE, CNM (CSIC, Universidad de Sevilla), Avenida Americo Vespucio s/n, Parque Científico y Tecnológico de la Cartuja 41092 – Sevilla, Spain (ivornicu@imse-cnm.csic.es)

This paper reports a TDC circuit which is conceived to be embedded per-pixel into a d-ToF-CIS based on Single-Photon Avalanche-Diodes (SPADs). As required for the per-pixel implementation, the TDC is designed for low area and low power consumption when implemented in a CMOS technology. Design equations are reported and measurements from a $64\times64$ array implemented in a digital 180nm CMOS technology are presented for validation purposes.

Most compact options to implement TDCs are based on a coarse counter and the use of either delay lines or oscillators to encode the finer bits [2]. However, the former still requires the distribution of a high-speed clock across the pixel array. Regarding oscillator implementation, different options can be considered as well, such as ring oscillators and LC tanks. Although LC tanks feature better phase noise than ring oscillators, the latter are better suited for standard CMOS technologies. Besides that, considerations regarding versatility, compactness, power dissipation, frequency tuning range and simultaneous multi-phase generation lead us to choose Voltage-Controlled Ring Oscillators (VCRO) for the implementation of per-pixel TDCs.

Seeking to address speed challenges, the TDC reported in this paper employs a two-step architecture that requires a VCRO with an even number of phases. Either a true or a pseudo-differential ring oscillator can be used for this purpose. The latter [3], [4] has some advantages over the former [5]. First, pseudo-differential ring oscillators minimize the jitter due to thermal noise by maximizing the waveform amplitude [6]. Second, they have zero static power consumption. As disadvantages, they have worse supply noise rejection and higher jitter due to the positive feedback of the cross-coupled inverters. This paper concentrates on the analysis and design of the pseudo-differential scheme.

Frequency control is another relevant feature for TDC implementation. It can be achieved either by using current starved techniques [7], [8] or by resistive tuning of the delay cell [9]. Another widely used technique is based on tuning the voltage supply or the load capacitor of the delay cell [10], [11]. Our architecture employs resistive tuning. Specifically, tuning is achieved by connecting a variable resistor to the charging/discharging path of the individual pseudo-differential delay cell output nodes. In order to achieve minimum area and power consumption, this variable resistor is implemented by using transmission gates. To the best of our knowledge, such an approach has never been used before for this particular type of delay cell. As compared to the current starved technique, the variable resistor implemented with a transmission gate allows full swing between power rails and much higher oscillation frequencies (see Fig. 1). Moreover, the maximum deviation from 50% duty cycle is lowered from 9% down to less than 3.5% along the entire range of frequency control voltage (see Fig. 2).



Fig. 1 Comparison of VCROs based on the proposed time constant tuning and starved inverter scheme



Fig. 2 Duty cycle variation of VCROs based on the proposed time constant tuning and starved inverter scheme

Besides describing the proposed architecture and reporting measurement results, this paper also includes calculations for the oscillation frequency, the jitter due to white and flicker noise and the power consumption. These calculations are employed to support the VCRO design procedure by providing initial, rough estimations of the design parameters. Also, the insight provided by the analysis outcomes is useful for making refinements during an iterative design procedure.

This paper is organized as follows: Section II presents a short overview of the design and operation of the TDC building blocks. Section III concentrates on the model of the VCRO and the computation of the oscillation frequency. Section IV analyzes the VCRO limitations that have an impact on the TDC performance, namely mismatch and noise, which gives better insight into the design. Section V computes the power consumption of the in-pixel TDC. Section VI outlines a possible design guideline and Section VII describes the experimental setup and several measurement results. Section VIII draws the conclusions of this work.

#### II. ARCHITECTURE OF THE TDC BASED ON VCRO

Although TDCs can be implemented by using just one counter, this would require very large clock frequencies to achieve small time bins. This is overcome by performing the conversion in two steps: coarse and fine. Fig. 3 displays the concept of such a two-step TDC. The first step of the conversion is completed by a counter that is fed by the first phase of the VCRO. This counter operates at a much lower frequency than required for one-step architectures. The second step occurs at the end of the conversion interval, when the oscillation is stopped, and consists of encoding the VCRO phases in the final state. A thermometric-to-binary encoder is employed for this purpose.



Fig. 3 VCRO-based TDC with coarse/fine conversion steps

The VCRO in this paper delivers 8 phases from 4 pseudo-differential stages. This number provides a reasonable balance between area and oscillation frequency, i.e. the more phases, the more area and the lower the oscillation frequency required for the same temporal resolution. Under these conditions, time intervals between the edges established by the input logic are measured by counting the integer number of oscillation periods, which renders the coarser bits of the conversion (8b in this case), and then interpolating the 8 phases of the VCRO to get the 3 finest bits.

The building blocks of the in-pixel TDC are: the start/stop logic, the VCRO, the ripple counter, and the phase encoder. The time-to-digital conversion is realized as follows:

1) The Start/Stop logic (Fig. 4) defines the limits of the time interval to be measured. The width of its output pulse, EN\_VCRO, equals the time elapsed between the rising edges of the Start and Stop signals. The Start signal can be provided either externally, Ext\_Start, or by the local SPAD detector, Vout. The Stop signal, Ext\_Stop, is the synchronization signal of the pulsed laser which triggers the light pulse. In this scenario, the synchronization pulse occurs first, followed by the light pulse, which travels to the scene and back to the sensor. The light pulse is eventually detected by the SPAD, which provides the Start pulse for the TDC. Finally, the Stop pulse is given by the next synchronization pulse. This technique is called reverse start-stop. The most important feature of this block is that its output stays disabled as long as no Start pulse precedes a Stop pulse. This is the key to the power saving strategy: the TDC remains OFF if the SPAD detector is not fired. In the opposite arrangement, the TDC consumes power even if no light is detected, which is not power efficient. This is especially undesirable for in-pixel TDC architectures because all TDCs would turn ON at the same time, which for large array resolutions means a very large current peak.



Fig. 4 Schematic of the Start/ Stop logic block

2) Signal EN\_VCRO turns on the VCRO (Fig. 5). It is composed of 4 pseudo-differential stages, with positive feedback between each pair of complementary outputs. This shortens the start-up time, hence improving the overall TDC accuracy. Also, auto-alignment is achieved by forcing the oscillator to start each time with the same phase through the reset signal, R. The block labeled Tune, to be explained later, is employed to provide wide-range linear control of the oscillation frequency.

Post-layout simulations have been performed to evaluate the delays between the Ext\_Start or Vout signal and the VCRO output, and also the delays between the Ext\_Stop signal and the VCRO output. These delay paths are matched such that the difference between them is less than 110ps. It is worth mentioning that the delay between the rising edge of the EN\_VCRO signal and the VCRO output is about 50ps, which matches the delay between the falling edge of EN\_VCRO and the VCRO output. The overall mismatch of the delay paths translates into an offset error which can be easily canceled in the calibration phase. In any case, this error is much smaller than the FWHM jitter of the SPAD plus TDC ensemble.



Fig. 5 Pseudo-differential VCRO



Fig. 6 Schematic of the ripple counter and inset showing the DFF

The first phase of the VCRO (labeled out1) drives the ripple counter (Fig. 6), whose 8b output represents the most significant bits of the conversion (B10...B3). On the rising edge of Ext\_Stop, the counter holds the number of full oscillation periods, which is the coarse approximation of the input time interval. Signal R is an asynchronous reset that is also employed to reset the VCRO. Seeking to reduce the area and the switching power without sacrificing the maximum allowed input frequency, the ripple counter is based on CMOS D-type Flip-Flops (DFF) [13], [14]. Hence the channel length of the transistors is the minimum allowed by the technology.

Worst-case post-layout simulations have been performed. The DFF has been shown to work properly from 20kHz up to 2GHz. The lower limit of the input frequency in this circuit is given by the refresh rate requirement, i.e. the minimum retention time of the DFF internal capacitive nodes (Fig. 6 inset).

3) On the rising edge of Ext\_Stop, signal EN\_VCRO turns the VCRO OFF. The frozen oscillator phases are fed into an encoder (Fig. 7) to obtain the 3 least significant bits of the conversion. The encoder's outputs are described by:

$$B_0 = \overline{out1} \oplus \overline{out2} + \overline{out3} \oplus \overline{out4} \tag{1}$$

$$B_1 = \overline{out1 \cdot out2 \cdot out3 + \overline{out1} \cdot \overline{out2} \cdot \overline{out3}} \tag{2}$$

$$B_2 = out1 \tag{3}$$

By employing a CMOS XNOR, the total area of the encoder is less than  $260\mu m^2$  in this prototype. Basically at the end of the input time interval, on the rising edge of Ext\_Stop, the coarse counter holds the 8 most significant bits of the conversion. Right after that, the encoder provides the finest 3 bits. The 11b conversion code is stored in an in-pixel SRAM memory.
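The conversion sequence of steps 1) to 3) can be sketched behaviorally as follows. This is an idealized model, not the circuit: it ignores start-up time, path offsets and jitter, and it does not reproduce the gate-level encoder of Eqs. (1)-(3). The function names `convert` and `split` are hypothetical, and raising an error on counter overflow is a modelling choice made here.

```python
import math

COARSE_BITS = 8            # ripple counter width (B10...B3)
FINE_BITS = 3              # encoded VCRO phases (B2...B0)
PHASES = 1 << FINE_BITS    # 8 phases from 4 pseudo-differential stages


def convert(t_start, t_stop, f_osc_hz):
    """Ideal reverse start-stop conversion.

    t_start: time of the SPAD (Start) edge in seconds, or None if no photon.
    t_stop:  time of the next laser synchronization (Stop) edge in seconds.
    Returns the 11b code, or None when the TDC never turns on.
    """
    # EN_VCRO stays low unless a Start edge precedes the Stop edge.
    if t_start is None or t_start >= t_stop:
        return None
    # Number of phase steps elapsed while EN_VCRO is high.
    steps = math.floor((t_stop - t_start) * PHASES * f_osc_hz)
    coarse, fine = divmod(steps, PHASES)
    if coarse >= (1 << COARSE_BITS):
        # Assumption of this sketch: flag overflow instead of wrapping.
        raise ValueError("interval exceeds the 11b range of this model")
    return (coarse << FINE_BITS) | fine


def split(code):
    """Return (coarse count, fine phase index) of an 11b code."""
    return code >> FINE_BITS, code & (PHASES - 1)


code = convert(49.5e-9, 100e-9, 850e6)
print(code, split(code))
print(convert(None, 100e-9, 850e6))  # no photon detected: TDC stays OFF
```

Because the interval runs from the SPAD event to the next synchronization pulse, the time of flight would be recovered off-chip as the laser period minus the converted interval, after removing the calibrated offset.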

Let us consider that the conversion time,  $\tau_{conv}$  is the time elapsed between the end of the input time interval and the moment when the digital code is available at the output of the TDC.

$$\tau_{conv} = \max\{2\tau_{NAND} + 2\tau_{INV} + \tau_{NOR} + \tau_{recov}, 8\tau_{DFF}\} \tag{4}$$

where  $\tau_{NAND}$ ,  $\tau_{NOR}$ ,  $\tau_{INV}$ ,  $\tau_{DFF}$  and  $\tau_{recov}$  are the delays introduced by the logic gates and CMOS DFF and the recovery time of the VCRO internal nodes. The conversion time is about 2ns. This feature renders the proposed architecture very well suited for high frame rate d-ToF imagers.



Fig. 7 Schematic of the phase encoder and CMOS XNOR gate

#### III. MODEL OF THE VCRO

Fig. 8(a) shows the block diagram of the pseudo-differential stages composing the VCRO, where the asynchronous reset R is used for auto-alignment. The tunable element, labeled Tune, is basically a transmission gate (Fig. 8(b)) that is enabled by the signal EN\_VCRO. When this signal is low, the transmission gate is an open circuit, and therefore no oscillation takes place. When EN\_VCRO is high, the transmission gate is a voltage-controlled resistor whose resistance, $R_V$, is tuned through the voltage labeled TUNE to set the oscillation frequency. The larger $R_V$, the larger the delay introduced by the delay cell and thus the lower the oscillation frequency.

The behavior of the oscillator is not easy to describe because of its nonlinearity. Usual techniques employed for linear circuits do not apply here. However, it is possible to do a progressive analysis that starts with a linear step and then, in order to have sustained oscillations, introduces a nonlinear amplitude control. Conveniently, the oscillation frequency can be predicted in the first step from a linearized model of the delay cell [15]. First of all, we are interested in the delay cell when EN\_VCRO is high and the reset signal R is also high. This results in the simplified schematic of Fig. 9, where transistors MP1 and MN1 are the components of the inverters I in Fig. 8(a), transistors MP2 and MN2 correspond to the NAND (the other input is high and the corresponding branch is not shown), and $R_V$ is the resistance of the transmission gate inside the element Tune.

This simplified schematic can be modeled by the linearized equivalent of Fig. 10, from where the following transfer function is obtained:

$$H(s) = \frac{V_o(s)}{V_i(s)} = -\frac{(g_{mn1} + g_{mp1})Z_{eq}R_o}{R_V + R_o + Z_{eq}} \tag{5}$$

where  $g_{mn1}$ ,  $g_{mp1}$  are the transconductances of MN1 and MP1,  $R_o$  is their equivalent output resistance, and  $R_V$  is given by the inverse of the conductance of the transmission gate,  $G_V$ :

$$G_V = \beta_n \left[ \text{TUNE} - V_o(t) - V_{T_n} \right] + \beta_p \left[ \text{VDD} - V_{sat,MP1} - V_{T_p} \right] \tag{6}$$

with $\beta_n = \mu_n C'_{ox}(W/L)_n$ for the nMOS transistor of the transmission gate and $\beta_p = \mu_p C'_{ox}(W/L)_p$ for the pMOS. $V_o(t)$ is the voltage at the output of the transmission gate. Eq. (6) holds as long as $V_o(t) < \text{TUNE} - V_{T_n}$.



Fig. 8 (a) Block diagram of the delay cell and (b) schematic of Tune



Fig. 9 Equivalent schematic of the delay cell



Fig. 10 Linearized equivalent of the half circuit of the delay cell

In addition to this,  $Z_{eq}$  in Eq. (5) is given by:

$$Z_{eq} = \frac{R_N R_L}{s C_L R_N R_L + R_N + R_L} \tag{7}$$

where  $R_N$  is the negative input resistance of the feedback differential pair (Fig. 9):

$$R_N = -\frac{1}{g_{mn2} + g_{mp2}} \tag{8}$$

and  $R_L$  captures the equivalent positive output resistance of transistors MN2 and MP2:

$$R_L = r_{omn2} || r_{omp2} \tag{9}$$

Finally,  $C_L$  is the capacitance in the output node:

$$C_L = C_N + C_P \tag{10}$$

where  $C_N$ ,  $C_P$  are the lumped capacitances of the transistors MN1, MN2, MP1, MP2, and transistors MN and MP of the transmission gate:

$$C_N = C_{GS,MN1} + C_{GS,MN2} + C_{GS,MN} + C_{DB,MN2} + C_{SB,MN} + 2C_{GD,MN2} \tag{11}$$

$$C_P = C_{GS,MP1} + C_{GS,MP2} + C_{GD,MP} + C_{DB,MP2} + C_{DB,MP} + 2C_{GD,MP2} \tag{12}$$

These expressions employ the Miller effect for the calculation of parasitic capacitances of digital inverters [16]. Substituting Eqs. (7) and (8) into Eq. (5), H(s) can be written as:

$$H(s) = -\frac{(g_{mn1} + g_{mp1})R_o R_L}{R_L + (R_V + R_o)[1 - (g_{mn2} + g_{mp2})R_L + sC_L R_L]} \tag{13}$$

Let us assume that all the delay stages are described by H(s) so that the open loop gain is:

$$H_{op}(s) = [H(s)]^4 \tag{14}$$

According to the Barkhausen criterion, this open loop gain must yield a phase shift of $2\pi$ and a gain of unity at the oscillation frequency, $f_o$ [17]. Therefore the following must be fulfilled:

$$\left| H_{op}(j2\pi f_0) \right| = 1 \tag{15}$$

$$\varphi(j2\pi f_0) = \arg[H_{op}(j2\pi f_0)] = 4\arg[H(j2\pi f_0)] = \pi \tag{16}$$

The oscillation frequency is then:

$$f_o = \frac{1}{2\pi C_L} \left[ \frac{1}{R_L} + \frac{1}{R_V + R_o} - (g_{mn2} + g_{mp2}) \right] \tag{17}$$
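The step from Eq. (13) to Eq. (17) is not spelled out; one way to recover it is sketched here. The symbols $A_0$ (low-frequency gain magnitude) and $\omega_p$ (pole frequency) are introduced only for this sketch. Eq. (13) is a single-pole inverting response:

$$H(s) = -\frac{A_0}{1 + s/\omega_p}, \qquad \omega_p = \frac{1}{C_L}\left[\frac{1}{R_L} + \frac{1}{R_V + R_o} - (g_{mn2} + g_{mp2})\right]$$

The four sign inversions add a multiple of $2\pi$, so the phase condition of Eq. (16) reduces to the four poles contributing a total lag of $\pi$, i.e. $\pi/4$ per stage:

$$4\arctan\left(\frac{2\pi f_o}{\omega_p}\right) = \pi \;\Rightarrow\; 2\pi f_o = \omega_p$$

which is Eq. (17). Under this reading, the remaining $\pi$ of the $2\pi$ loop condition would come from the cross-connection that closes a ring with an even number of stages; that is an assumption here, since the wiring is only given in Fig. 5.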

This expression can be simplified taking into account that MN1 and MP1 act as switches and hence that the following assumptions apply:

$$R_o \ll R_V \tag{18}$$

$$\frac{1}{R_V + R_o} \gg \frac{1}{R_L} - (g_{mn2} + g_{mp2}) \tag{19}$$

This results in the following simplified expression for the oscillation frequency:

$$f_o \cong \frac{1}{2\pi C_L R_V} \tag{20}$$

$R_V$ is therefore the key parameter for oscillation frequency control as long as Eqs. (18) and (19) hold.

The previous analysis has been employed to support the design procedure for the chip in this paper. All the transistors have minimum length, L=180nm. The nominal values for the widths are:  $W_{MP_1}=2.4\mu$ m,  $W_{MN_1}=800$ nm,  $W_{MP}=1\mu$ m,  $W_{MN}=1.2\mu$ m,  $W_{MP_2}=1\mu$ m and  $W_{MN_2}=250$ nm.

The adequacy of the procedure and of the underlying calculations is illustrated in Fig. 11. The horizontal axis corresponds to the Multiplication Factor (MF), which varies between 0.5 and 3. When the sizes of the transistors of either $R_V$, I or the NAND gate are varied, the multiplying factor MF is applied to either $W_{MP}$ and $W_{MN}$, or $W_{MP_1}$ and $W_{MN_1}$, or $W_{MP_2}$ and $W_{MN_2}$, respectively. When all the transistors vary jointly, labeled in the figures with $I = R_V = NAND$, MF applies to all of them. This simulation shows the design tradeoff for sizing the positive feedback gain, seeking at the same time to minimize area without severely decreasing the oscillation frequency, which is crucial to get a small time resolution.



Fig. 11 Illustrating design choices

The unity value of these scaling factors corresponds to the nominal design case. The vertical axis corresponds to the oscillation frequency obtained by electrical simulations. The set of curves shows that the selected widths feature a reasonably high frequency. Note that around 20% larger frequencies might have been obtained by making the transistors involved in the block NAND more resistive. However, this choice brings the design closer to oscillation failure. The reason is that the positive feedback vanishes in the limit, thus failing to ensure the oscillation phase condition. Therefore MFs of NAND smaller than unity are not recommended. Note also that around 10% larger frequencies might have been obtained by using more conductive transistors for the I and R<sub>V</sub> blocks. However, this penalizes area occupation and power consumption (see Fig. 25).

Moreover, besides lowering the oscillation frequency, stronger positive feedback also decreases the tuning range (see Fig. 12).



Fig. 12 VCRO tuning range dependence on MF of NAND



Fig. 13 Predicted, simulated and measured oscillation frequency dependence on the frequency control voltage

The accuracy of the proposed linearized model of the VCRO oscillation frequency has been demonstrated by successfully fitting the parameters of Eqs. (17) and (20) to the measurement results (Fig. 13). Thus  $R_L=1\mathrm{M}\Omega$ ,  $R_o=700\Omega$  and  $g_{mn2}+g_{mp2}=8\mu\mathrm{S}$ .  $C_L$  and  $R_V$  dependence on VCRO control voltage are shown in Fig. 14 and Fig. 15.
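As a rough numerical illustration, Eqs. (17) and (20) can be evaluated with the fitted values above. The operating point used here is an assumption: $R_V = 3.3\mathrm{k}\Omega$ and $C_L = 70\mathrm{fF}$ are borrowed from the parameter lists of Section IV, whereas in the chip both depend on the control voltage (Fig. 14 and Fig. 15). The function names are hypothetical.

```python
import math


def f_osc_full(r_v, c_l, r_l=1e6, r_o=700.0, g_m2=8e-6):
    """Eq. (17): oscillation frequency from the linearized delay cell.

    g_m2 stands for g_mn2 + g_mp2; defaults are the fitted values
    quoted in the text (R_L = 1 MOhm, R_o = 700 Ohm, 8 uS).
    """
    return (1.0 / r_l + 1.0 / (r_v + r_o) - g_m2) / (2.0 * math.pi * c_l)


def f_osc_simplified(r_v, c_l):
    """Eq. (20): simplified expression, valid when Eqs. (18)-(19) hold."""
    return 1.0 / (2.0 * math.pi * c_l * r_v)


# Assumed operating point (illustrative, not a measured bias condition).
R_V = 3.3e3
C_L = 70e-15

f17 = f_osc_full(R_V, C_L)
f20 = f_osc_simplified(R_V, C_L)
print(f"Eq. (17): {f17 / 1e6:.0f} MHz, Eq. (20): {f20 / 1e6:.0f} MHz")
```

At this operating point $R_o$ is about a fifth of $R_V$, so Eq. (20) lands noticeably above Eq. (17); the two converge as $R_V$ grows and assumption (18) is better satisfied.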



Fig. 14  $C_L$  dependence on the frequency control voltage



Fig. 15  $R_V$  dependence on the frequency control voltage

#### IV. VCRO LIMITATIONS

The main errors impacting the timing accuracy of VCRO-based TDCs are mismatch and jitter, which are analyzed below. Although temperature and supply voltage variations also have a significant impact on the TDC time bin, they can be compensated by a global scheme [18].

## A. Mismatch

Taking into account that the VCRO has eight phases, the time bin,  $T_{bin}$ , of the in-pixel TDC can be expressed as:

$$T_{bin} = \frac{1}{8f_o} \tag{21}$$

Hence, the local time bin deviation is given as:

$$\Delta T_{bin} = T_{bin} \left( \frac{\Delta f_o}{f_o} \right) \tag{22}$$

showing that time bin uniformity is linked to the pixel-to-pixel mismatch of the oscillation frequency.
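Eqs. (21) and (22) are easy to check numerically. The sketch below uses the 850MHz top frequency quoted in the abstract; the 1% frequency mismatch is an arbitrary illustrative value, not a measured one, and the function names are hypothetical.

```python
def time_bin(f_osc_hz, phases=8):
    """Eq. (21): time bin of a TDC interpolating `phases` VCRO phases."""
    return 1.0 / (phases * f_osc_hz)


def time_bin_deviation(f_osc_hz, rel_freq_mismatch, phases=8):
    """Eq. (22): first-order time bin deviation for a relative
    oscillation frequency mismatch delta_f / f."""
    return time_bin(f_osc_hz, phases) * rel_freq_mismatch


t_bin = time_bin(850e6)
dt_bin = time_bin_deviation(850e6, 0.01)  # assumed 1% pixel-to-pixel mismatch
print(f"T_bin = {t_bin * 1e12:.1f} ps, dT_bin = {dt_bin * 1e12:.2f} ps")
```

At 850MHz the time bin evaluates to about 147ps, which agrees with the minimum time bin reported in the abstract.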

We have analyzed the effect of mismatch by making use of Monte Carlo simulation. We have simulated the behavior of the VCRO allowing a  $3\sigma$  spread of transistor mismatch parameters. We have then obtained maximum and minimum oscillation frequencies for each value of the multiplication factor (MF) that describes the scaling of the transistors in the delay cell. With this, we have calculated the deviation in the time bin for each value of MF using Eq. (22). These values are represented in Fig. 16 (square markers).



Fig. 16 Deviation of the time bin vs. MF

Moreover, we have measured the maximum deviation of the time bin across the TDC array for 29 chips (Fig. 16, circle marker). While increasing device dimensions improves mismatch, it is detrimental to power consumption and oscillation frequency. This trade-off has been addressed during the design process. In fact, the deviations of the time bin (Fig. 16) are smaller for MFs larger than the nominal value of unity. However, the area and power penalties of these larger factors may not be acceptable for a rather modest mismatch improvement. This becomes evident by looking at the Figure-of-Merit (FoM) presented in Section VI.

## B. Noise

Timing accuracy of the TDC is limited by the period jitter of the VCRO. It is defined as the standard deviation $\sigma_{T_o}$ of the oscillation period, $T_o = 1/f_o$ [19]. The positive feedback action of transistors MN2 and MP2 in Fig. 9 is a source of time uncertainty. Hence, careful design is needed to prevent the overall positive feedback from exceeding unity gain [6]. Still, the positive feedback provides prompt start-up, thus preventing further increases of the jitter through period-to-period coupling mechanisms, i.e. jitter from one delay cell affecting the jitter in another delay cell [20]. The impact of thermal and flicker noise on jitter is addressed in the next sections.

# 1) VCRO jitter due to white noise

Let us consider the half circuit of the delay cell depicted in Fig. 17(a). Assume that a negative step signal is applied at the input and that the trip point of the inverters is located at VDD/2. The step turns MN1 off and places MP1 in triode. Because the output voltage is initially at ground, MN2 is placed in triode and MP2 is turned off. Under these conditions, the voltage $V_{o-}$ on the capacitor $C_L$ starts to build up. But $V_{o-}$ is also connected to the cross-coupled inverter cell, which has been set low. This cell therefore slows down the charging of $C_L$ until its switching point is reached. Fig. 17(b, lower inset) shows the equivalent circuit when $V_{o-}$ is below VDD/2. This case corresponds to the upper branch of Eq. (23).

When $V_{o-}$ exceeds VDD/2, it is sped up towards VDD by the positive feedback loop. Fig. 17(b, upper inset) shows the equivalent circuit. The nodal equation corresponds to the lower branch of Eq. (23). In order to fulfill the continuity condition, both branches have to meet at VDD/2.



Fig. 17 (a) Half circuit and (b) approximate model of the half delay cell

The output voltage is calculated as:

$$\begin{cases} V_o(t) = \frac{R_{eqn}}{R_V + R_{eqn}} \text{VDD} \left( 1 - e^{-t/\tau_n} \right), & V_o(t) \le \frac{\text{VDD}}{2} \\ V_o(t) \cong \text{VDD} \left( 1 - \frac{R_{eqn}}{R_{eqn} - R_V} e^{-t/\tau_p} \right), & V_o(t) > \frac{\text{VDD}}{2} \end{cases} \tag{23}$$

where $\tau_n = \tau_p$ has been assumed in order to approximate the lower branch of the equation. The time constants $\tau_n$, $\tau_p$ are given by:

$$\begin{cases}
\tau_n = \frac{R_V R_{eqn}}{R_V + R_{eqn}} C_L \\
\tau_p = \frac{R_V R_{eqp}}{R_V + R_{eqp}} C_L
\end{cases} \tag{24}$$

and  $R_{eqn}$ ,  $R_{eqp}$  are the equivalent resistances loading the output node of the delay cell. Their particular values are obtained by fitting the model to the simulation or measurement results. The propagation delay  $t_d$  is defined as the time interval between the ideal input step and the moment when the output ramp crosses the trip point of the next delay cell. The propagation delay is therefore given by:

$$t_d = \tau_n \ln \left( \frac{2R_{eqn}}{R_{eqn} - R_V} \right), \quad \text{as long as } R_{eqn} > R_V \tag{25}$$

There are three resistors contributing noise to $V_o(t)$, namely $R_V$, $R_{eqn}$ and $R_{eqp}$. The relation between this voltage noise and the jitter of the time delay is [19], [20]:

$$v_n^2 = \left(\frac{dV_o}{dt}\right)^2 \sigma_{t_d}^2 \tag{26}$$

On the one hand, the thermal noise of each resistor is fully integrated by  $C_L$  thus yielding a  $kT/C_L$  noise power term per resistor, totaling:

$$v_n^2 = 3 kT/C_L \tag{27}$$

On the other hand, the slope of the output voltage can be approximated by the ratio of the voltage range, VDD, to the rising time  $t_r$ , the latter obtained from Eq. (23) by considering that the rise time ends when 90% of the final voltage has been reached:

$$\frac{dV_o}{dt} \approx \frac{\text{VDD}}{t_r} = \frac{\text{VDD}}{\tau_n \ln \frac{2R_{eqn}}{R_{eqn} - R_V} + \tau_p \ln \frac{5}{3} \frac{R_{eqn}}{R_{eqn} - R_V}} \tag{28}$$

The jitter of the half delay cell is then:

$$\sigma_{t_d}^2 \approx 3 \frac{t_r^2}{\text{VDD}^2} \frac{kT}{C_L} \tag{29}$$

and that of the oscillation period is:

$$\sigma_{T_o}^2 = 2M\sigma_{t_d}^2 \tag{30}$$

where M is the number of delay cells in the ring. We have evaluated the impact of scaling individual blocks such as I, $R_V$ and NAND on the cycle-to-cycle jitter over the full range of the TDC (Fig. 18). The unit MF design choice is justified from the jitter point of view as follows: according to Eq. (17), $f_o$ could be increased by decreasing $R_V$ while the rest of the blocks remain the same. However, the jitter also increases (square marker). Moreover, the jitter improvement obtained by increasing only the widths of the transistors in I (circle marker), or by increasing the widths of the transistors in all of the blocks jointly (asterisk marker), is not worthwhile because, as will be shown later, it involves a significant increase of dynamic power.



Fig. 18 Cycle-to-cycle jitter vs. MF

According to Eq. (30), the jitter due to white noise depends on the number of delay cells in the loop, the variable resistor and the regenerative pair. If $R_{eqn}$ and $R_{eqp}$ decrease with respect to $R_V$, then the slope of $V_o(t)$ decreases and thus the jitter associated with the output node due to white noise increases. This result is consistent with the theory that the jitter increases with the strength of the positive feedback. It is also shown by the simulation results (Fig. 19, square marker). In order to demonstrate the validity of the proposed model, the jitter predicted by Eq. (30) has been compared to the simulated one (see Fig. 19, circle marker). The parameters of the model are as follows: $R_V = 3.3 \mathrm{k}\Omega$, $T = 300 \mathrm{K}$, k is Boltzmann's constant, M is the number of delay cells and $VDD = 1.8 \mathrm{V}$. $C_L$, $R_{eqn}$ and $R_{eqp}$ are shown in Fig. 20 and Fig. 21.
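The chain of Eqs. (24), (25) and (27) to (30) can be followed numerically with the sketch below. Several inputs are assumptions: $C_L$, $R_{eqn}$ and $R_{eqp}$ are only given graphically (Fig. 20 and Fig. 21), so $C_L = 70\mathrm{fF}$ and $R_{eqn} = R_{eqp} = 30\mathrm{k}\Omega$ (the inverse of the $G_{eqn}$ listed for the flicker noise model) are used as placeholders. The second logarithm of Eq. (28) is read here as $\ln\left[\tfrac{5}{3} R_{eqn}/(R_{eqn}-R_V)\right]$, which is one possible interpretation of the printed formula. The function name is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def white_noise_jitter(r_v, r_eqn, r_eqp, c_l, vdd=1.8, temp=300.0, m=4):
    """Return (t_d, t_r, sigma_To) in seconds for an M-stage ring."""
    if r_eqn <= r_v:
        raise ValueError("Eq. (25) requires R_eqn > R_V")
    tau_n = r_v * r_eqn / (r_v + r_eqn) * c_l          # Eq. (24)
    tau_p = r_v * r_eqp / (r_v + r_eqp) * c_l          # Eq. (24)
    ratio = r_eqn / (r_eqn - r_v)
    t_d = tau_n * math.log(2.0 * ratio)                # Eq. (25)
    # Rise time: denominator of Eq. (28), under the reading stated above.
    t_r = t_d + tau_p * math.log(5.0 / 3.0 * ratio)
    v_n2 = 3.0 * K_B * temp / c_l                      # Eq. (27)
    sigma_td2 = (t_r / vdd) ** 2 * v_n2                # Eq. (29)
    sigma_to = math.sqrt(2.0 * m * sigma_td2)          # Eq. (30)
    return t_d, t_r, sigma_to


# Placeholder operating point (see the assumptions in the lead-in).
t_d, t_r, sigma_to = white_noise_jitter(3.3e3, 30e3, 30e3, 70e-15)
print(f"t_d = {t_d * 1e12:.0f} ps, t_r = {t_r * 1e12:.0f} ps, "
      f"sigma_To = {sigma_to * 1e15:.0f} fs")
```

With these placeholder values the model gives a stage delay of a few hundred picoseconds and a period jitter of a few hundred femtoseconds; lowering `r_eqn` towards `r_v` lengthens the rise time and increases the jitter, in line with the trend described above.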



Fig. 19 Predicted and simulated cycle-to-cycle jitter vs. MF



Fig. 20  $C_L$  dependence on MF



Fig. 21 Output resistance of the cross coupled cell dependence on MF

# 2) VCRO phase noise due to flicker noise

Let us assume that the large signal oscillation frequency is the inverse of the accumulated delay of the stages, and that all stages have the same delay. Therefore:

$$f_o = (2Mt_d)^{-1} \tag{31}$$

Using Eq. (25), and reformulating it in terms of conductance  $G_V$  and  $G_{eqn}$ , yields:

$$f_o = \frac{1}{2Mt_d} \approx \frac{G_V + G_{eqn}}{2MC_L \left(\ln 2 + G_{eqn}/G_V\right)} \tag{32}$$
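The intermediate step behind Eq. (32) can be sketched as follows, assuming $G_{eqn} \ll G_V$ so that the logarithm can be expanded to first order, with $G_V = 1/R_V$ and $G_{eqn} = 1/R_{eqn}$:

$$\tau_n = \frac{C_L}{G_V + G_{eqn}}, \qquad \ln\frac{2R_{eqn}}{R_{eqn} - R_V} = \ln 2 - \ln\left(1 - \frac{G_{eqn}}{G_V}\right) \approx \ln 2 + \frac{G_{eqn}}{G_V}$$

Substituting both expressions into $t_d$ of Eq. (25), and the result into Eq. (31), gives Eq. (32).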

The sensitivity of $f_o$ to $G_V$ is calculated from here as:

$$\frac{\partial f_o}{\partial G_V} = -2M f_o^2 \left( \frac{\partial t_d}{\partial G_V} \right) \tag{33}$$

Using this sensitivity, the spectral density of the flicker noise contribution is given as:

$$S_{f_o}^{1/f} = \left(\frac{\partial f_o}{\partial G_V}\right)^2 S_{G_V}^{1/f} = 4M^2 f_o^4 \left(\frac{\partial t_d}{\partial G_V}\right)^2 S_{G_V}^{1/f} \tag{34}$$

which contains two components:

- the sensitivity of the time delay to  $G_V$ ;
- the spectral density of the flicker noise for  $G_V$ .

The first component can be approximated by:

$$\frac{\partial t_d}{\partial G_V} \approx -\frac{C_L}{{G_V}^2} \cdot \frac{G_V \ln 2 + 2G_{eqn}}{G_V + 2G_{eqn}} \tag{35}$$
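The approximations in Eqs. (32) and (35) can be checked against the exact delay of Eq. (25) rewritten in conductances. The sketch below uses the parameter values listed later for the phase noise model ($G_V = 333.3\mu\mathrm{S}$, $G_{eqn} = 33.3\mu\mathrm{S}$, $C_L = 70\mathrm{fF}$) and $M = 4$; the function names and the finite-difference step are choices made here.

```python
import math


def t_d_exact(g_v, g_eqn, c_l):
    """Eq. (25) with R_V = 1/G_V and R_eqn = 1/G_eqn (needs G_V > G_eqn)."""
    return c_l / (g_v + g_eqn) * math.log(2.0 * g_v / (g_v - g_eqn))


def f_osc_approx(g_v, g_eqn, c_l, m=4):
    """Eq. (32): first-order approximation of the oscillation frequency."""
    return (g_v + g_eqn) / (2.0 * m * c_l * (math.log(2.0) + g_eqn / g_v))


def dtd_dgv_approx(g_v, g_eqn, c_l):
    """Eq. (35): approximate sensitivity of the delay to G_V."""
    return (-c_l / g_v ** 2
            * (g_v * math.log(2.0) + 2.0 * g_eqn) / (g_v + 2.0 * g_eqn))


def dtd_dgv_numeric(g_v, g_eqn, c_l, rel_step=1e-4):
    """Central finite difference of the exact delay, for comparison."""
    h = g_v * rel_step
    return (t_d_exact(g_v + h, g_eqn, c_l)
            - t_d_exact(g_v - h, g_eqn, c_l)) / (2.0 * h)


G_V, G_EQN, C_L, M = 333.3e-6, 33.3e-6, 70e-15, 4
f_exact = 1.0 / (2.0 * M * t_d_exact(G_V, G_EQN, C_L))   # Eq. (31)
f_approx = f_osc_approx(G_V, G_EQN, C_L, M)
s_exact = dtd_dgv_numeric(G_V, G_EQN, C_L)
s_approx = dtd_dgv_approx(G_V, G_EQN, C_L)
print(f"f_o: exact {f_exact / 1e6:.0f} MHz, Eq. (32) {f_approx / 1e6:.0f} MHz")
print(f"sensitivity error of Eq. (35): {abs(s_approx / s_exact - 1) * 100:.1f} %")
```

For these values both approximations stay within a few percent of the exact expressions, and the resulting frequency falls close to the 850MHz operating point at which the phase noise is evaluated.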

On the other hand, the spectral density of  $G_V$  is composed of the terms corresponding to both transistors employed to implement it:

$$S_{G_V}^{1/f} = \beta_n^2 S_{V_{Gn}}^{1/f} + \beta_p^2 S_{V_{Gp}}^{1/f} \tag{36}$$

The flicker noise of the NMOS transistor is mainly caused by carrier number fluctuation $(\Delta N)$ [19], [21], [22]. According to the McWhorter model, the spectral density of the 1/f noise referred to the gate of an NMOS transistor in the linear region is:

$$S_{V_{Gn}}^{1/f} = \left(\frac{q}{C'_{OX}}\right)^2 \frac{kTN_T E_F}{\gamma (WL)_n f} \tag{37}$$

where $kTN_TE_F$ is the interface state density per unit energy at the Fermi energy level, and $\gamma$ is McWhorter's tunneling parameter [23].

Regarding PMOS transistors, the 1/f noise within the linear region is mostly due to mobility fluctuation $(\Delta \mu)$ [23], [24], [25]. Hooge's model states that the flicker noise spectral density depends on the gate voltage:

$$S_{V_{Gp}}^{1/f} = \frac{\alpha_H q \left( V_{GSp} - V_{T_p} \right)}{C'_{ox}(WL)_p f}$$
 (38)

where  $\alpha_H$  is the Hooge's parameter.

Combining all previous equations, the spectral density of  $f_o$  increases with the strength of the regenerative switching, as can be seen in the approximate form of the spectral density:

$$S_{f_o}^{1/f} \cong \frac{1}{4 \ln 2M^2 C_L^2} \left( \frac{1 + 6G_{eqn}/G_V}{1 + 8G_{eqn}/G_V} \right) S_{G_V}^{1/f} \tag{39}$$

Moreover, the spectral density of  $f_o$  is inversely proportional to the square of the load capacitance. Also, Eqs. (36)-(39) show that it is inversely proportional to the cube of the length of the transistors of the variable resistor. Nevertheless, decreasing the jitter due to flicker noise by increasing the length of the transistors in  $R_V$  eventually also decreases the top oscillation frequency (Fig. 22, circle marker). The phase noise has been evaluated at 2MHz offset frequency (Fig. 22, asterisk marker). We have finally chosen the smallest length available because the phase noise varies little around this value, while the oscillation frequency degrades rapidly for longer transistors.



Fig. 22 Phase noise (asterisk marker) vs. oscillation frequency (circle marker) tradeoff



Fig. 23 Predicted and simulated phase noise of the VCRO

The phase noise predicted by Eq. (39) is compared to the simulated one. The parameters have the following values:

$V_{GSp} = 900\,\mathrm{mV}$; $C_L = 70\,\mathrm{fF}$; $G_V = 333.3\,\mu\mathrm{S}$; $G_{eqn} = 33.3\,\mu\mathrm{S}$; $\mu_n = 0.0314\,\mathrm{m}^2/\mathrm{Vs}$; $v_{thn} = 307\,\mathrm{mV}$; $\mu_p = 0.0114\,\mathrm{m}^2/\mathrm{Vs}$; $v_{thp} = -455\,\mathrm{mV}$; $t_{ox} = 4.2\,\mathrm{nm}$; $\varepsilon_{ox} = 35.13\,\mathrm{pF/m}$; $K_{fn} = 1\times10^{-24}$; $K_{fp} = 1\times10^{-24}$, where  $K_{fn} = q^2kTN_TE_F/\gamma$  and  $K_{fp} = \alpha_H q$  have empirical values [19].
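As a quick numerical cross-check of Eqs. (32), (33) and (35), the following sketch plugs in the values of $C_L$, $G_V$ and $G_{eqn}$ listed above, together with $M=4$ stages (the value quoted in Section V). It also compares the closed-form sensitivity of Eq. (35) with a finite-difference derivative of the stage delay implied by Eq. (32). The function names are illustrative only and are not part of the original design flow.

```python
import math

# Values quoted in the text; M = 4 is taken from the power-consumption section.
M = 4
C_L = 70e-15        # load capacitance [F]
G_V = 333.3e-6      # variable-resistor conductance [S]
G_EQN = 33.3e-6     # equivalent conductance of the regenerative pair [S]


def stage_delay(g_v, g_eqn=G_EQN, c_l=C_L):
    """Stage delay t_d implied by Eq. (32), where f_o = 1 / (2 M t_d)."""
    return c_l * (math.log(2) + g_eqn / g_v) / (g_v + g_eqn)


def osc_frequency(g_v, m=M):
    """Oscillation frequency from Eq. (31)."""
    return 1.0 / (2 * m * stage_delay(g_v))


def dtd_dgv_eq35(g_v, g_eqn=G_EQN, c_l=C_L):
    """Closed-form approximation of the delay sensitivity, Eq. (35)."""
    return -(c_l / g_v**2) * (g_v * math.log(2) + 2 * g_eqn) / (g_v + 2 * g_eqn)


def dtd_dgv_numeric(g_v, rel_step=1e-4):
    """Central finite difference of stage_delay, used only as a sanity check."""
    h = g_v * rel_step
    return (stage_delay(g_v + h) - stage_delay(g_v - h)) / (2 * h)


def dfo_dgv(g_v, m=M):
    """Frequency sensitivity from Eq. (33)."""
    return -2 * m * osc_frequency(g_v) ** 2 * dtd_dgv_eq35(g_v)


f_o = osc_frequency(G_V)
print(f"f_o            = {f_o / 1e6:.1f} MHz")
print(f"dt_d/dG_V (35) = {dtd_dgv_eq35(G_V):.4e} s/S")
print(f"dt_d/dG_V (FD) = {dtd_dgv_numeric(G_V):.4e} s/S")
print(f"df_o/dG_V (33) = {dfo_dgv(G_V):.4e} Hz/S")
```

With these values the model yields an oscillation frequency of about 825 MHz, close to the 850 MHz top frequency, and the two sensitivity estimates agree to within a few percent.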

#### V. POWER CONSUMPTION

The power consumption of the TDC is mainly due to the VCRO and the CMOS ripple counter.

## 1) VCRO power consumption

The two main contributions to the power drawn by the VCRO are the dynamic power and the direct-path power. On the one hand, using the model of Fig. 17, the instantaneous dynamic power related to half of the delay cell is:

$$\begin{cases} P_d(t) = \text{VDD}\left(C_L \frac{dV_o}{dt} + \frac{V_o}{R_{eqn}}\right) & \text{if } V_o(t) \le \frac{\text{VDD}}{2} \\ P_d(t) = \text{VDD} \cdot C_L \frac{dV_o}{dt} & \text{if } V_o(t) > \frac{\text{VDD}}{2} \end{cases} \tag{40}$$

Combining this equation with Eq. (23), the average dynamic power consumption is:

$$P_{d,avg} \approx 2M \frac{VDD^2 C_L}{T} \left[ 1 + \frac{R_{eqn} R_V}{\left(R_{eqn} + R_V\right)^2} \left( \ln 2 - \frac{R_{eqn} - R_V}{2R_{eqn}} \right) \right] \tag{41}$$

On the other hand, the average direct-path power is:

$$P_{sc,avg} = t_{sc}I_{peak}VDDf_o \tag{42}$$

where  $t_{sc}$  is the time interval during which both MN1 and MP1 (Fig. 9) are ON:

$$t_{sc} = \tau_p \ln \frac{\text{VDD}}{V_{T_p}} \frac{R_{eqn}}{R_{eqn} - R_V} - \tau_n \ln \frac{\text{VDD}R_{eqn}}{\text{VDD}R_{eqn} - V_{T_n}(R_{eqn} + R_V)} + \tau_n \ln \frac{2R_{eqn}}{R_{eqn} - R_V} \tag{43}$$

## 2) Ripple counter power consumption

The dynamic power drawn by the 8-bit CMOS coarse counter is given by:

$$P_{d,avg} = VDD^{2} (C_{1} + C_{2} + C_{QN} + C_{Q}) f_{o} \sum_{k=1}^{8} \frac{1}{2^{k}} \tag{44}$$

where the capacitors are shown in Fig. 6.

Eq. (41) shows that the VCRO dynamic power is proportional to the square of the supply voltage and to the load capacitance, and that it depends on the strength of the positive feedback in the regenerative pair. Also, a comparative evaluation of Eqs. (41) and (42) shows that the VCRO dynamic power is far larger than the direct-path power dissipation. This is not surprising because the rising and falling edges at the input and output of the delay cells are symmetrical (Fig. 24). Any increase in the dimensions of the devices employed to implement the VCRO stages results in higher power consumption (Fig. 25).

The average dynamic power of the coarse counter is proportional to the capacitance of the CMOS flip-flop. The side effect of decreasing this capacitance is the increase of the minimum input frequency required for the counter to work.

The predictions of Eqs. (41) and (44) have been compared with the simulated average power (Fig. 26). The parameters involved in these equations are: M=4,  $R_{eqn}=60\mathrm{k}\Omega$ ,  $VDD=1.8\mathrm{V}$ ,  $T=1/f_o$ ;  $R_V$ ,  $C_L$  and  $f_o$  are the same as the ones used in Eq. (17);  $C_1=6\mathrm{fF}$ ,  $C_2=4\mathrm{fF}$ ,  $C_{QN}=8\mathrm{fF}$ ,  $C_Q=2\mathrm{fF}$ .



Fig. 24 VCRO output waveforms



Fig. 25 VCRO power consumption vs. MF



Fig. 26 Predicted and simulated average power consumption

#### VI. DESIGN GUIDELINES

The overriding parameters of the VCRO optimized for in-pixel TDC are area and power consumption. Concurrently, the time bin has to be pushed to its limits for this technology in order to achieve the best depth resolution. Thus, a basic design objective is maximizing the oscillation frequency  $f_o$  by choosing  $C_L$ ,  $R_V$  and the strength of the regenerative pair. In order to meet this objective, it is convenient to use larger values of  $C_L$  and smaller values of  $R_V$ . The reason is that increasing  $C_L$  reduces the jitter due to flicker noise, while weakening the positive feedback in the cross-coupled pair lowers the jitter due to white noise.

Furthermore, mismatches, and hence device dimensions, are traded against area and power consumption, using the equations presented in Sections III, IV and V. These equations provide initial values of the design parameters and guidelines for further iterations depending on the outcome of the simulation results. Throughout the manuscript it has been shown that the selected transistor sizes, i.e. unit MFs, represent a good design compromise. This is further confirmed by the FoM of phase noise (FoM\_VCRO) in Fig. 27. It has been calculated as:

$$FoM\_VCRO = PN + 10\log_{10} \left[ \left( \frac{\Delta f}{f_o} \right)^2 \frac{P_{d,avg}}{1\,\mathrm{mW}} \right] \tag{45}$$

where the phase noise PN is computed by Eq. (39), the offset frequency  $\Delta f$  is set to 2MHz and the average power  $P_{d,avg}$  is computed by Eqs. (41) and (44). Note that the FoM is best at the nominal unit value of MF, which is our design choice.
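The figure of merit of Eq. (45) is straightforward to evaluate numerically. The sketch below is a minimal illustration; the function name is ours, and the inputs are the figures quoted elsewhere in the paper (a phase noise of -102 dBc/Hz at 2 MHz offset from 850 MHz, and the 663 µW and 142 nW obtained from Eqs. (41) and (44)). Using the sum of those two powers as $P_{d,avg}$ is an assumption on our part.

```python
import math


def fom_vcro(pn_dbc_hz, offset_hz, f_osc_hz, power_w):
    """Phase-noise figure of merit of Eq. (45), in dBc/Hz."""
    power_mw = power_w / 1e-3  # Eq. (45) normalizes the power to 1 mW
    return pn_dbc_hz + 10 * math.log10((offset_hz / f_osc_hz) ** 2 * power_mw)


# Inputs quoted in the paper: -102 dBc/Hz at 2 MHz offset from 850 MHz,
# 663 uW (VCRO, Eq. 41) plus 142 nW (ripple counter, Eq. 44).
fom = fom_vcro(-102.0, 2e6, 850e6, 663e-6 + 142e-9)
print(f"FoM_VCRO = {fom:.2f} dBc/Hz")
```

Under these assumptions the result is roughly -156 dBc/Hz, in line with the value listed for this work in TABLE I.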



Fig. 27 FoM\_VCRO as a function of MF

#### VII. EXPERIMENTAL RESULTS

The proposed VCRO has been employed in an array of 64×64 TDCs. Fig. 28 shows the microphotograph of the chip along with the floor plan of the pixel. The analog voltage that controls the oscillation frequency of the VCROs array is provided by an on-chip PLL whose core oscillator is an instance of the same VCRO. This enables the implementation of a global compensation mechanism to mitigate the effect of PVT variations on the time accuracy of the TDCs [18].



Fig. 28 Microphotograph of a pixel within the prototype array

#### 1) Characterization of the VCRO-based TDC

The first characteristic that we have measured is the code uniformity without any pixel-to-pixel calibration. Deviations are due to the variations of the VCRO oscillation frequency and the duration of EN\_VCRO. In order to measure these deviations, the time bin has been set to 147ps by feeding the appropriate reference voltage. The input time interval is set to the maximum value, as it is the worst case for uniformity. In this case, the intervals are 297.48ns on average with a standard deviation of 56ps. These intervals are provided by a Time Interval Generator (TIG). The standard deviation across the TDC array is 32 output codes. If needed, these deviations of the time bin can be lowered by applying, for instance, a calibration cycle based on a look-up table.

It is important to properly characterize the TIG, as it is going to be the instrument used to excite the TDC. The TIG reported in [26] delivers time intervals from hundreds of picoseconds to 870ns with 27ps incremental time resolution. Measurements of the intervals have been repeated 10<sup>4</sup> times. The maximum DNL and INL of the TIG across the full range of 870ns are 0.59 and 4.66 LSB, respectively.

The control voltage of the VCRO array can be either external or internal, in which case it comes from the compensation loop. The same PLL is also used to program the TDC time bin. In this experiment, the PLL division factor ÷N has been swept from the minimum to the maximum value. Fig. 29 shows the output characteristic of the programmable TDC. The minimum and maximum time bins of 147ps and 432ps (red plot) are achieved with external control voltages of 0V and 1.8V, respectively. The rest of the curves have been obtained by switching the control voltage to the internal voltage reference, which is the output of the PLL's loop filter.
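The link between the oscillation frequency and the programmed time bin can be illustrated with a short sketch. It assumes that the time bin equals the stage delay of Eq. (31), $t_d = 1/(2Mf_o)$ with $M=4$, and that the full range corresponds to the 8-bit coarse counter counting whole oscillation periods. Both assumptions are ours, chosen because they reproduce the figures quoted in the text; the function names are illustrative.

```python
M = 4               # number of delay stages (value quoted in Section V)
COUNTER_BITS = 8    # width of the coarse ripple counter


def time_bin(f_osc_hz, m=M):
    """Time bin, assumed equal to the stage delay t_d = 1 / (2 M f_o)."""
    return 1.0 / (2 * m * f_osc_hz)


def osc_frequency_for_bin(t_bin_s, m=M):
    """Inverse relation: oscillation frequency needed for a given time bin."""
    return 1.0 / (2 * m * t_bin_s)


def full_range(f_osc_hz, counter_bits=COUNTER_BITS):
    """Full-scale range, assuming the counter counts oscillation periods."""
    return (2 ** counter_bits) / f_osc_hz


for f_osc in (850e6, osc_frequency_for_bin(432e-12)):
    print(f"f_o = {f_osc / 1e6:6.1f} MHz -> "
          f"bin = {time_bin(f_osc) * 1e12:6.1f} ps, "
          f"range = {full_range(f_osc) * 1e9:6.1f} ns")
```

Under these assumptions, 850 MHz maps to a bin of about 147 ps and a range of about 300 ns, while a 432 ps bin corresponds to an oscillation frequency of roughly 290 MHz.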

The performance of an individual programmable TDC based on the VCRO employed as time interpolator has been measured as well. The time bin has been set to 147ps and 432ps. The DNL and INL are 0.55 and 3.11 LSB at 147ps, and 0.56 and 4.61 LSB at 432ps, respectively (Fig. 30 and Fig. 31). RMS DNL and INL computed across the array are less than 0.35LSB and 1.5LSB [26].

In order to measure the single-shot precision of the TDC we have considered the following scenario: the TDC is set at the maximum and minimum time bin, for which the full range of the TDC is about 870ns and 300ns, respectively. In both cases the TIG is set to generate 10<sup>5</sup> time intervals of 10% and 90% of the full range. The standard deviations of the TIG at 28.4ns/ 255.9ns and 83.5ns/ 787ns are 17.3ps/ 15.6ps and 16.2ps/ 18.6ps, respectively. The histograms of the input time intervals and TDC output codes are depicted on the left and right sides of Fig. 32 and Fig. 33, respectively. The TDC jitter is computed by subtracting the variance of the input time interval from the variance of the measured TDC output. The single-shot precision of the TDC is affected by the jitter of the VCRO. Moreover, a larger time bin is obtained by decreasing  $f_o$ , hence increasing the jitter. Therefore, at the same TDC output code, the standard deviation of the single-shot precision is larger when the time bin is larger.
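The subtraction described above amounts to removing the TIG contribution in quadrature. The following sketch illustrates it with numbers quoted in this section: a spread of 0.78 output codes at the 147ps time bin (reported later in Section VII) and a TIG standard deviation of 17.3ps. Pairing these two figures, and treating the 0.78-code value as the raw measured spread, are assumptions made for illustration; the function name is hypothetical.

```python
import math


def tdc_jitter(sigma_measured_s, sigma_input_s):
    """Remove the input (TIG) jitter from the measured spread in quadrature."""
    variance = sigma_measured_s ** 2 - sigma_input_s ** 2
    if variance < 0:
        raise ValueError("input jitter exceeds the measured spread")
    return math.sqrt(variance)


# Example: 0.78 output codes at a 147 ps bin, with a TIG spread of 17.3 ps.
T_BIN = 147e-12
sigma_meas = 0.78 * T_BIN            # measured spread converted to seconds
sigma_tdc = tdc_jitter(sigma_meas, 17.3e-12)
print(f"measured: {sigma_meas * 1e12:.1f} ps, TDC only: {sigma_tdc * 1e12:.1f} ps")
```

In this example the TIG contribution is small compared with the measured spread, so the correction changes the result by only about 1 ps.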



Fig. 29 Output characteristics of the programmable TDC



Fig. 30 TDC DNL for Tbin of 147ps and 432ps



Fig. 31 TDC INL for Tbin of 147ps and 432ps

Potential metastability problems of the VCRO have been considered as well. Metastability may occur at only one phase at a time, when the VCRO is stopped at an integer number of oscillation periods. We have performed post-layout simulations to investigate how the VCRO settles its outputs in this case. The VCRO has been stopped in 10ps steps around the switching point of a certain phase. The internal nodes successfully recover to the correct states, such that the encoder and ripple counter give the correct output codes. The worst-case recovery time, or the propagation delay through the ripple counter, is less than 2ns.

#### 2) Measurements on the VCRO operation

The measured sensitivity of the oscillation frequency to the control voltage,  $K_{VCRO}$ , is consistent with the simulated curve (see Fig. 13).  $K_{VCRO}$  computed in both cases is 477MHz/V. The oscillation frequency ranges from 300MHz to 800MHz when the control voltage ranges from 0.67V to 1.7V. VCRO linearity is a measure of how linear the dependence of the oscillation frequency on the control voltage is; our design achieves a very good linearity of 99.4%. As the circuit is also employed as the core oscillator of the PLL, a high gain is required to prevent the PLL from losing lock. Using Eqs. (41) and (44), one obtains the power drawn by the VCRO and by the ripple counter, which are  $663\mu W$  and 142nW, respectively, for an oscillation frequency of 850MHz.
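As a rough cross-check of the reported gain, the end points quoted above can be used to estimate an average slope. The sketch below also evaluates the tuning range for the 400MHz to 850MHz span listed in TABLE I, assuming it is defined relative to the top frequency; that definition is our assumption, and the function names are illustrative.

```python
def endpoint_gain(f_lo_hz, f_hi_hz, v_lo, v_hi):
    """Average VCRO gain between two (voltage, frequency) end points [Hz/V]."""
    return (f_hi_hz - f_lo_hz) / (v_hi - v_lo)


def tuning_range(f_lo_hz, f_hi_hz):
    """Tuning range relative to the top frequency (assumed definition)."""
    return (f_hi_hz - f_lo_hz) / f_hi_hz


k_est = endpoint_gain(300e6, 800e6, 0.67, 1.7)
print(f"end-point gain: {k_est / 1e6:.0f} MHz/V")
print(f"tuning range  : {tuning_range(400e6, 850e6) * 100:.1f} %")
```

The end-point estimate comes out at about 485 MHz/V, within a few percent of the reported 477 MHz/V, and the assumed definition gives a tuning range of about 53%.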

The deviation of  $f_o$  is effectively mitigated by activating the compensation loop based on a PLL integrated on-chip [18]. Thus it decreases from 20% down to 2.4% when the temperature varies from  $0^{\circ}$ C to  $100^{\circ}$ C. When the voltage supply changes within  $\pm 10\%$  of its nominal value, it decreases from 27% down to 0.27%.



Fig. 32 TIG and TDC jitters at 10% and 90% of full range; Tbin = 147ps



Fig. 33 TIG and TDC jitter at 10% and 90% of full range; Tbin = 432ps

The dependence of the VCRO output frequency on the PLL's frequency division factor  $\div N$  is shown in Fig. 34. Notice that as long as the PLL is locked, the dependence is linear for a wide range of frequencies from 363MHz up to 765MHz.

The proposed VCRO has been tested also as a building block of the on-chip PLL (Fig. 35). As long as the PLL is locked, the synthesized output frequencies and loop filter output voltage are linearly dependent on the frequency division factor. The frequency range is from 400MHz to 850MHz, with a division factor step of 50MHz. The loop filter output, which is later buffered to the control input of the array of VCROs, ranges from 0.81V to 1.67V.

According to post-layout simulations, the phase noise is -102dBc/Hz at 2MHz offset from 850MHz. The RMS value of the in-pixel VCRO jitter is measured by running the VCRO continuously over the whole range of control voltages (Fig. 36).



Fig. 34 VCRO output frequency vs. PLL division factor ÷N



Fig. 35 Measured output of the PLL's master VCRO (circle marker) and loop filter (dot marker) vs.  $\div N$ 



Fig. 36 Measured in-pixel VCRO jitter vs. external control voltage

The jitter of the TDC has been measured as well for both extremes of the time bin. The standard deviation of the TDC output code at 10% and 90% of the dynamic range is 0.78 and 13.88 codes at the 147ps time bin, and 2.36 and 24.44 codes at the 432ps time bin.

TABLE I
COMPARISON WITH THE STATE-OF-THE-ART VCO

| Ref. | [3] | [9] | [10] | [11] | [12] | [27] | This work |
|---|---|---|---|---|---|---|---|
| Tech. [μm] | TSMC 0.18 | TSMC 0.18 | ST 0.090 | 0.13 | 0.028 | 0.35 | UMC 0.18 |
| Voltage supply [V] | 1.8 | 3.3 | 1.75 | 1.1 | 0.85-1.05 | 3.3 | 1.8 |
| Delay cell | PS | SE | LC | PS | PS | Diff. | PS |
| No. of cells | 2 | 3 | - | 3 | 4 | 4 | 4 |
| Freq. range [MHz] | 440-1595, 72% | 16-367, 95.6% | 3364.8 | 364.8 1500 | | 1070 and 2060 | 400-850, 53% |
| No. of phases | 4 | 3 | 1 | - | 8 | 8 | 8 |
| $K_{VCO}$ [MHz/V] | 825 | 153 | - | - | - | 561 | 477 |
| Linearity | 87.3% | 68.6% | - | - | - | - | 99.4% |
| PN [dBc/Hz]; Δf [MHz] | -93; 1 | -88; 0.1 | -116; 0.4 | -88; 1 | - | -99; 2 | -102; 2 |
| Area/A<sub>VCRO</sub> | 3.11 | 1.53 | 460(\*) | 126(\*) | 0.8(\*) | - | 1 |
| Avg. power [mW] | 26 | 35.5 | 121(\*\*) | 0.25 | 2.2 5.3(\*\*) | 14.6 | 1.17 |
| FoM\_VCRO [dBc/Hz] | -143 | -129 | -173 | -158 | - | -147 | -156.3 |

PS = Pseudo-differential; SE = Single Ended

(\*) Estimation of the oscillators' area based on the chip microphotograph

(\*\*) Total power of the DPLL

Comparison with the state-of-the-art is provided in TABLE I. With respect to references [10], [11] and [12], they are all VCROs controlled by a digital word and a DAC generating the tuning voltage. In [10] the mechanism for TDC operation relies on time amplification. The reported phase noise is -116dBc/Hz at 0.4MHz offset, which is lower than ours, and its FoM\_VCRO is better. However, this has been obtained with a circuit of much larger area, which is not acceptable for a per-pixel TDC. In [11] the resistance of a transistor that introduces some delay between cells is modified, a mechanism similar to the one that we implement. The main difference is that our variable resistor is in the signal path, while the one in [11] introduces a lossy path, which can have an impact on power consumption. Our VCRO has better phase noise and a FoM\_VCRO close to the one reported in [11], which however occupies a larger area. Concerning [12], the reported area is smaller than that of our VCRO. However, it is achieved in a 28nm technology which, at this time, is hardly suitable for integrating SPAD detectors on the same chip. They report a jitter between 16.1ps and 19.3ps, which is close to our cycle-to-cycle jitter of 20ps. With respect to references [3], [9] and [27], the VCRO reported in this paper presents better phase noise and FoM\_VCRO with less area. The improvement in terms of power is explained as follows: the designs in [3] and [9] have the same voltage supply and transistor channel length, but use transistors 10 times larger. Besides, the oscillation frequency and the number of stages are different; a higher oscillation frequency increases the dynamic power. The design in [27], instead, is implemented in a 350nm technology, where the power consumption is higher. In addition, that design draws static power as well. Moreover, its voltage supply is 3.3V, while we use 1.8V. This makes a big difference because the dynamic power consumption is proportional to the square of the voltage supply.

TABLE II
COMPARISON WITH THE STATE-OF-THE-ART TDCS

| Ref. | Year | Architecture | Tech [nm] | Bits | LSB [ps] | Meas. rate [MHz] | Range [ns] | DNL [LSB] | INL [LSB] | Power [mW] | Area [mm<sup>2</sup>] | ENOB<sup>(\*)</sup> | FoM<sup>(\*\*)</sup> [pJ/conv.] | App. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [28] | 2015 | Gateable Vernier RO | 130 | 11 | 7.3 | 1 | 9 | 3.2 | 4 | 1.2 | 0.03 | 8.68 | 2.93 | PET |
| [29] | 2015 | Cyclic interp. + RO | 350 | 16 | 0.61 | 0.8 | 327000 | 0.4 | 7.37 | 80 | 0.64 | 13 | 12.21 | ToF |
| [30] | 2015 | TVC + SAR ADC | 65 | 9 | 0.63 | 120 | 0.3 | 0.98 | 3.01 | 3.7 | 0.064 | 6.99 | 0.244 | - |
| [31] | 2015 | 2-step time amp. | 65 | 9 | 1.2 | 10/150 | 0.614 | 0.67 | 0.62 | 0.602/8.299 | - | 8.30 | 0.191/0.176 | ADPLL |
| [32] | 2015 | GRO+ $\Sigma\Delta$ | 65 | 11 | 0.48 | 150 | 1.8 | - | 2.24 | 3.52 | 0.03 | 9.30 | 0.037 | ADPLL |
| [33] | 2015 | TA+DL | 65 | 4 | 0.9 | - | - | 0.2 | 0.25 | 0.2 | 0.045 | 3.68 | 0.310 | DPLL |
| [34] | 2014 | Vernier DL | 65 | 7 | 5.7 | 100 | 0.73 | 1 | 2.5 | 1.75 | 0.004 | 5.19 | 0.479 | DPLL |
| [35] | 2014 | 2-step 3D-vernier sp. | 130 | 11 | 6.98 | 1 | 14 | 0.8 | 1.5 | 0.329 | 0.28 | 9.68 | 0.400 | ToF |
| [36] | 2014 | GmC int+SAR ADC | 90 | 9 | 1 | 10 | 0.256 | 0.7 | 2.3 | 20.4 | 0.31 | 7.28 | 13.13 | ToF |
| [37] | 2014 | Pulse shrinking | 350 | ≈9 | 40 | 0.00001 | 22 | - | 0.6 | 0.0017 | 0.025 | 8.3 | 539.4 | - |
| [38] | 2014 | Analog time exp. | 350 | ≈10 | 844 | 131 | 75 | - | 0.03 | 7.9 | 0.012 | 9.96 | 0.060 | ToF |
| [39] | 2013 | 2-step pulse amp. | 65 | 7 | 3.75 | 200 | - | 0.9 | 2.3 | 3.6 | 0.02 | 5.28 | 0.463 | ADPLL |
| This work | 2017 | Multiphase VCRO | 180 | | 147 | 2 | 297 | 0.55 | 3 | 0.009 | 0.0017 | 9.00 | 0.009 | ToF |

In order to provide a straightforward comparison with state-of-the-art TDCs, we have composed TABLE II. In addition, we have computed the FoM\_TDC employed in [39], and plotted it vs. the time resolution (Fig. 37).
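The two derived columns of TABLE II follow the definitions given in its footnotes, ENOB = Bits - log<sub>2</sub>(INL+1) and FoM\_TDC = Power/(2<sup>ENOB</sup>·Fs). The short sketch below evaluates them for the entry of reference [28] as an example; the function names are illustrative, and Fs is taken to be the measurement rate listed in the table.

```python
import math


def enob(bits, inl_lsb):
    """Effective number of linear bits: ENOB = Bits - log2(INL + 1)."""
    return bits - math.log2(inl_lsb + 1)


def fom_tdc(power_w, enob_bits, rate_hz):
    """Energy per conversion step: Power / (2^ENOB * Fs), in joules."""
    return power_w / (2 ** enob_bits * rate_hz)


# Row [28] of TABLE II: 11 bits, INL = 4 LSB, 1.2 mW, 1 MHz measurement rate.
e = enob(11, 4)
f = fom_tdc(1.2e-3, e, 1e6)
print(f"ENOB = {e:.2f}, FoM_TDC = {f * 1e12:.2f} pJ/conv.")
# prints: ENOB = 8.68, FoM_TDC = 2.93 pJ/conv.
```

These values match the ENOB and FoM entries tabulated for [28].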



#### VIII. CONCLUSION

The modeling, design and measurement of a pseudo-differential VCRO aimed at in-pixel TDCs for d-ToF image sensors are reported. The proposed VCRO has been tested both as a PLL building block and as a time interpolator for the pixel-level TDC. We have provided a detailed analysis of the oscillation frequency, the impact of the mismatch on the deviation of the TDC time bin, the jitter due to white noise, the phase noise due to flicker noise and the power consumption of the VCRO and ripple counter. All the proposed models are meant to obtain a first-order approximation in an iterative simulator-assisted design procedure. All models have been validated by comparing them with simulations and/or measurement results. A comparison with state-of-the-art VCROs and TDCs has been provided as well.


#### REFERENCES

- [1] M. Gersbach, Y. Maruyama, R. Trimananda et al., "A time-resolved, low-noise single-photon image sensor fabricated in deep-submicron CMOS technology", *IEEE Journal of Solid-State Circuits*, Vol. 47, No. 6, pp. 1394-1407, June 2012
- [2] Z. Cheng, X. Zheng, M. J. Deen, and H. Peng, "Recent developments and design challenges of high-performance ring oscillator CMOS time-to-digital converters", *IEEE Transactions on Electron Devices*, Vol. 63, No. 1, pp. 235–251, Jan. 2016.
- [3] Y. Chuang, S. Jang, J. Lee, S. Lee, "A low voltage 900MHz voltage controlled ring oscillator with wide tuning range", *IEEE Asia-Pacific Conference on Circuits and Systems*, Vol. 1, pp. 301-304, Dec. 2004
- [4] M. Sheu, T. Lin, W. Hsu, "Wide frequency range voltage controlled ring oscillators based on transmission gates", *IEEE International Symposium on Circuits and Systems*, Vol. 3, pp. 2731-2734, May 2005
- [5] F. Kallel, M. Fakhfakh, M. Loulou, S. Oumansour, M. Sbaa, "Modelling the frequency sensitivity Kvco of a ring oscillator", 15<sup>th</sup> IEEE International Conference on Electronics, Circuits and Systems, pp. 794-797, Aug. 2008
- [6] J. A. McNeil, D. Ricketts, The designer's guide to jitter in ring oscillators, New York, NY 10013, Springer Science + Business Media, 2009, Classification of ring oscillators, pp. 10-34
- [7] N. Retdian, S. Takagi, N. Fujii, "Voltage controlled ring oscillator with wide tuning range and fast voltage swing", *IEEE Asia-Pacific Conference on ASIC*, pp. 201-204, Aug. 2002
- [8] M. Deen, M. Kazemeini, S. Naseh, "Performance characteristics of an ultra-low power VCO", *IEEE International Symposium on Circuits and Systems*, pp. I-697 – I-700, May 2003
- [9] N. Gupta, "Voltage-controlled ring oscillator for low phase noise application", *IEEE International Journal of Computer Applications*, Vol. 14- No. 5, pp 23-27, Jan. 2011
- [10] M. Lee, M. E. Heidari, A. A. Abidi, "A low-noise wideband digital phase-locked loop based on a coarse–fine time-to-digital converter with subpicosecond resolution", *IEEE Journal of Solid-State Circuits*, Vol. 44, No. 10, June 2009

<sup>(\*)</sup> Effective number of linear bits: ENOB = Bits - log<sub>2</sub>(INL+1) [39]

<sup>(\*\*)</sup> FoM\_TDC = Power/(2<sup>ENOB</sup>·Fs) [39]

- [11] A. Elshazly, R. Inti, B. Young, P. K. Hanumolu, "Clock multiplication techniques using digital multiplying delay-locked loops", *IEEE Journal of Solid-State Circuits*, Vol. 48, No. 6, June 2013
- [12] T. Jang, X. Nan, F. Liu, et al., "A 0.026mm2 5.3mW 32-to-2000MHz Digital Fractional-N Phase Locked-Loop Using a Phase-Interpolating Phase-to-Digital Converter", *IEEE International Solid-State Circuits Conference*, pp. 254-255, Feb. 2013
- [13] M. Afghahi, C. Svensson, "A unified single-phase clocking scheme for VLSI systems", *IEEE Journal of Solid-State Circuits*, Vol. 25, No. 1, Feb. 1990
- [14] X. P. Yu, M. A. Do, W. M. Lim, K. S. Yeo, J.-G. Ma, "Design and optimization of the extended true single-phase clock-based prescaler", *IEEE Transaction on Microwave Theory and Techniques*, Vol. 54, No. 11, Nov. 2006
- [15] H. Chen, R. Geiger, "Transfer characterization of CMOS ring voltage controlled oscillators", 44th IEEE Midwest Circuits and Systems (MWSCAS), Vol. 1, pp. 66-70, Aug. 2001
- [16] J. M. Rabaey, A. Chandrakasan, B. Nikolic, *Digital integrated circuits—A design perspective*, New Jersey 07458, Pearson Education Inc., 2003, *The CMOS inverter*, pp. 191
- [17] B. Razavi, Design of analog integrated circuits, New York, NY 10020, McGraw-Hill, 2001, Oscillators, pp. 482-531
- [18] I. Vornicu, R. Carmona-Galán, Á. Rodríguez-Vázquez, "Compensation of PVT variations in ToF imagers with in-pixel TDC", Sensors, Vol. 17, No. 5, pp. 1072, May 2017
- [19] A. A. Abidi, "Phase noise and jitter in CMOS ring oscillators", IEEE Journal of Solid-State Circuits, Vol. 41, No. 8, Aug. 2006
- [20] J. A. McNeil, Jitter in ring oscillators, Ph.D. thesis, http://users.wpi.edu/~mcneill/papers/thesis.pdf, pp. 83, 1994
- [21] D. Xie, M. Cheng, L. Forbes, "SPICE models for flicker noise in n-MOSFETs from subthreshold to strong inversion", *IEEE Transaction on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 19, No. 11, Nov. 2000
- [22] J. Rhayem, R. Gillon, M. Tack, M. Valenza, A. Hoffmann, A. E. Mvongbote, D. Rigaud, "Comments on existing 1/f noise models: Spice, Hspice and BSIM3v3 for MOSFETs in circuit simulators", <a href="http://www.essderc2002.deis.unibo.it/data/pdf/Joseph.pdf">http://www.essderc2002.deis.unibo.it/data/pdf/Joseph.pdf</a>, 2002
- [23] J. Chang, A. A. Abidi, C. R. Viswanathan, "Flicker noise in CMOS transistors from subthreshold to strong inversion at various temperatures", *IEEE Transactions on Electron Devices*, Vol. 41, No. 11, Nov. 1994
- [24] L. K. J. Vandamme, X. Li, D. Rigaud, "1/f noise in MOS devices, mobility or number fluctuations?", *IEEE Transactions on Electron Devices*, Vol. 41, No. 11, Nov. 1994
- [25] K.-L. Yeh, C.-Y. Ku, W.-L. Hong, J.-C. Guo, "Flicker noise in nanoscale pMOSFETs with mobility enhancement engineering and dynamic body biases", IEEE Radio Frequency Integrated Circuits Symposium, pp. 347-350, Jun. 2009
- [26] I. Vornicu, R. Carmona-Galán, Á. Rodríguez-Vázquez, "Time interval generator with 8ps resolution and wide range for large TDC array characterization", Analog Integrated Circuits and Signal Processing, Vol. 87, Issue 2, pp. 181-189, May 2016
- [27] Y. Han, K. Kim, J. Kim, K. Yoon, "A dual band CMOS VCO with a balanced duty cycle buffer", *Current Applied Physics*, Vol. 5, No. 3, pp. 277-280, Mar. 2005.
- [28] Z. Cheng, M. J. Deen, and H. Peng, "A Low-Power Gateable Vernier Ring Oscillator Time-to-Digital Converter for Biomedical Imaging Applications," *IEEE Transactions on Biomedical Circuits and Systems*, Vol. 10, No. 2, pp. 445–454, 2016.
- [29] P. Keränen and J. Kostamovaara, "A wide range, 4.2 ps(rms) precision CMOS TDC with cyclic interpolators based on switched-frequency ring oscillators," *IEEE Transactions on Circuits and Systems I, Regular Paper*, Vol. 62, No. 12, pp. 2795–2805, 2015.
- [30] J. Kim, Y.-H. Kim, K. Kim, W. Yu, and S. Cho, "A Hybrid-Domain Two-Step Time-to-Digital Converter Using a Switch-Based Time-to-Voltage Converter and SAR ADC", *IEEE Transactions on Circuits and Systems II, Express Briefs*, Vol. 62, No. 7, pp. 631–635, Jul. 2015.
- [31] A. Hamza, S. Ibrahim, M. El-Nozahi, and M. Dessouky, "A low-power, 9-Bit, 1.2 ps resolution two-step time-to-digital converter in 65 nm CMOS", *IEEE 13th International New Circuits and Systems Conference* (NEWCAS), pp. 1–4, 2015.
- [32] W. Yu, K. Kim, and S. Cho, "A 0.22 ps rms Integrated Noise 15 MHz Bandwidth Fourth-Order  $\Sigma\Delta$  Time-to-Digital Converter Using Time-Domain Error-Feedback Filter", *IEEE Journal of Solid-State Circuits*, Vol. 50, No. 5, pp. 1251–1262, May 2015.
- [33] A. Elkholy, T. Anand, W.-S. Choi, A. Elshazly, and P. K. Hanumolu, "A 3.7 mW Low-Noise Wide-Bandwidth 4.5 GHz Digital Fractional-N PLL Using Time Amplifier-Based TDC", *IEEE Journal of Solid-State Circuits*, Vol. 50, No. 4, pp. 867–881, Apr. 2015.
- [34] N. U. Andersson and M. Vesterbacka, "A Vernier Time-to-Digital Converter With Delay Latch Chain Architecture," *IEEE Transactions on Circuits and Systems II, Express Briefs*, Vol. 61, No. 10, pp. 773–777, Oct. 2014.
- [35] Y. Kim and T. W. Kim, "An 11 b 7 ps Resolution Two-Step Time-to-Digital Converter With 3-D Vernier Space," *IEEE Transactions on Circuits and Systems I, Regular Paper*, Vol. 61, No. 8, pp. 2326–2336, Aug. 2014
- [36] Z. Xu, M. Miyahara, and A. Matsuzawa, "Picosecond Resolution Time-to-Digital Converter Using GmC Integrator and SAR-ADC," *IEEE Transactions on Nuclear Science*, Vol. 61, No. 2, pp. 852–859, Apr. 2014.
- [37] C.-C. Chen, S.-H. Lin, and C.-S. Hwang, "An Area-Efficient CMOS Time-to-Digital Converter Based on a Pulse-Shrinking Scheme," *IEEE Transactions on Circuits and Systems II-Express Briefs*, Vol. 61, No. 3, pp. 163–167, 2014.
- [38] M. Tanveer, I. Nissinen, J. Nissinen, J. Kostamovaara, J. Borg, and J. Johansson, "Time-to-digital converter based on analog time expansion for 3D time-of-flight cameras," *Image Sensors Imaging Systems*, Febr. 5, 2014 Febr. 6, 2014, Vol. 9022.
- [39] K. Kim, Y.-H. Kim, W. Yu, and S. Cho, "A 7 bit, 3.75 ps Resolution Two-Step Time-to-Digital Converter in 65 nm CMOS Using Pulse-Train Time Amplifier," *IEEE Journal of Solid-State Circuits*, Vol. 48, No. 4, pp. 1009–1017, Apr. 2013.



Ion Vornicu graduated in Electronics Engineering (specialization in Micro-technologies) in 2008 and got a M.Sc. degree in Modern Signal Processing Techniques and Ph.D. degree in Microelectronics in 2011 from the Faculty of Electronics and Telecommunications, Gh. Asachi Technical University of Iasi, Romania. During the Ph.D. his main research focus has been on CMOS implementation of a type of cellular neural networks for spatiotemporal analog image processing, log-domain circuits, smart CMOS imagers. Since December 2011, he is an Associate Researcher at the Institute of Microelectronics of Seville, Spain. His current research interests lie in the design and test of CMOS sensors based on single-photon avalanche diodes (SPADs) used for 3D vision and nuclear medicine imaging such as positron emission tomography (PET).



**Ricardo Carmona-Galán** graduated in Physics in 1993 and received a Ph.D. in Microelectronics in 2002 from the University of Seville, Spain. From July 1996 to June 1998, he worked as a Research Assistant at Prof. Chua's laboratory in the EECS Department of the University of California, Berkeley. From 1999 to 2005 he was an Assistant Professor in the Department of Electronics of the University of Seville. Since 2005, he has been a Tenured Scientist at the Institute of Microelectronics of Seville (IMSE-CNM-CSIC). His main research focus has been on VLSI implementation of concurrent sensor/processor arrays for real-time image processing and vision. He also held a postdoctoral position at the University of Notre Dame, Indiana (2006–2007), where he worked on interfaces for CMOS-compatible nanostructures for multispectral light sensing. He has collaborated with start-up companies in Seville (Anafocus) and Berkeley (Eutecus). He has designed several vision chips implementing different focal plane operators for early vision processing. His current research interests lie in the design of low-power smart image sensors and 3-D integrated circuits for autonomous vision systems. He has co-authored more than 120 papers in refereed journals and conferences and several book chapters. Dr. Carmona-Galán received a Best Paper Award from the International Journal of Circuit Theory and Applications. He is a co-recipient of an award of the ACET and a Certificate of Teaching Excellence from the University of Seville.



**Ángel Rodríguez-Vázquez** is currently a Full Professor of electronics with the University of Seville, Seville, Spain and is appointed for research at the Institute of Microelectronics of Seville, Centro Nacional de Microelectrónica, Consejo Superior de Investigaciones Científicas-University of Seville. He has authored eight books; approximately 50 chapters in edited books, including original tutorials on chaotic integrated circuits, design of data converters, and design of chips for vision; and some 500 articles in peer-reviewed specialized publications. His research work is widely quoted, and he has an h-index of 35. His current research interests are in the areas of imagers and vision systems using 3-D integration technologies and of ultra-low-power medical electronic devices. Prof. Rodríguez-Vázquez has served and is currently serving as an Editor, an Associate Editor, and a Guest Editor for different IEEE and non-IEEE journals. He serves on the committees of many international journals and conferences and has chaired different international IEEE and Society of Photo-Optical Instrumentation Engineers conferences. He has received a number of international awards for his research work (IEEE Guillemin-Cauer Best Paper Award, two Best Paper awards from Wiley's International Journal of Circuit Theory and Applications, IEEE European Conference on Circuit Theory and Design Best Paper Award, and IEEE International Symposium on Circuits and Systems Best Demo-Paper Award).