properly reconstruct 3D scene maps. It requires large frame rates, in the range of 1-10 kfps, and naturally calls for parallelism and hence for the usage of per-pixel TDCs.
This paper reports a TDC circuit which is conceived to be embedded per pixel into a d-ToF-CIS based on Single-Photon Avalanche Diodes (SPADs). As required for the per-pixel implementation, the TDC is designed for low area and low power consumption when implemented in a CMOS technology. Design equations are reported and measurements from a 64 × 64 array implemented in a digital 180 nm CMOS technology are presented for validation purposes.
The most compact options to implement TDCs are based on a coarse counter and the use of either delay lines or oscillators to encode the finer bits [2]. However, the former still require the distribution of a high-speed clock across the pixel array. Regarding oscillator implementation, different options can be considered as well, such as ring oscillators and LC tanks. Although LC tanks feature better phase noise than ring oscillators, the latter are better suited for standard CMOS technologies. In addition, considerations regarding versatility, compactness, power dissipation, frequency tuning range and simultaneous multi-phase generation lead us to choose Voltage-Controlled Ring Oscillators (VCROs) for the implementation of per-pixel TDCs.
Seeking to address speed challenges, the TDC reported in this paper employs a two-step architecture that requires a VCRO with an even number of phases. Either a true or a pseudo-differential ring oscillator can be used for this purpose. The latter [3], [4] has some advantages over the former [5]. First, pseudo-differential ring oscillators minimize the jitter due to thermal noise by maximizing the waveform amplitude [6]. Second, they have zero static power consumption. As disadvantages, they have worse supply noise rejection and higher jitter due to the positive feedback of the cross-coupled inverters. This paper concentrates on the analysis and design of the pseudo-differential scheme.
Frequency control is another relevant feature for TDC implementation. It can be achieved either by using current-starved techniques [7], [8] or by resistive tuning of the delay cell [9]. Another widely used technique is based on tuning the supply voltage or the load capacitor of the delay cell [10], [11]. Our architecture employs resistive tuning. Specifically, tuning is achieved by connecting a variable resistor to the charging/discharging path of the individual pseudo-differential delay cell output nodes. In order to achieve minimum area and power consumption, this variable resistor is implemented by using transmission gates. To the best of our knowledge, such an approach has never been used before for this particular type of delay cell. As compared to the current-starved technique, the variable resistor implemented with a transmission gate allows full swing between power rails and much higher oscillation frequencies (see Fig. 1). Moreover, the maximum deviation from 50% duty cycle is lowered from 9% to less than 3.5% along the entire range of the frequency control voltage (see Fig. 2).
Besides describing the proposed architecture and reporting measurement results, this paper also includes calculations for the oscillation frequency, the jitter due to white and flicker noise, and the power consumption. These calculations support the VCRO design procedure by providing initial, rough estimations of the design parameters. The insight provided by the analysis is also useful for making refinements during an iterative design procedure. This paper is organized as follows: Section II presents a short overview of the design and operation of the TDC building blocks. Section III concentrates on the model of the VCRO and the computation of the oscillation frequency. Section IV analyzes the VCRO limitations that have an impact on the TDC performance, namely mismatch and noise, which gives better insight into the design. Section V computes the power consumption of the in-pixel TDC. Section VI outlines a possible design guideline, and Section VII describes the experimental setup and several measurement results. Section VIII draws the conclusions of this work.
II. ARCHITECTURE OF THE TDC BASED ON VCRO
Although TDCs can be implemented by using just one counter, this would require very high clock frequencies to achieve small time bins. This limitation is overcome by performing the conversion in two steps: coarse and fine. Fig. 3 displays the concept of such a two-step TDC. The first step of the conversion is completed by a counter that is fed by the first phase of the VCRO. This counter operates at a much lower frequency than required for one-step architectures. The second step occurs at the end of the conversion interval, when the oscillation is stopped, and consists of encoding the VCRO phases in their final state. A thermometric-to-binary encoder is employed for this purpose.
The VCRO in this paper delivers 8 phases from 4 pseudo-differential stages. This number provides a reasonable balance between area and oscillation frequency: the more phases, the larger the area and the lower the oscillation frequency required for the same temporal resolution. Under these conditions, time intervals between the edges established by the input logic are measured by counting the integer number of oscillation periods, which renders the coarser bits of the conversion (8 bits in this design), and then interpolating the 8 phases of the VCRO to get the 3 finest bits.
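The two-step conversion described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the relation between the time bin and the oscillation frequency assumes that one oscillation period is split evenly among the 8 phases, which is consistent with the 147 ps bin and the 850 MHz frequency quoted in the measurement sections.

```python
# Sketch of the two-step code assembly; all names are illustrative.
N_PHASES = 8       # VCRO phases (from the text)
FINE_BITS = 3      # bits obtained by interpolating the 8 phases
COARSE_BITS = 8    # width of the coarse ripple counter (from the text)


def time_bin(f_osc_hz: float) -> float:
    """Time bin in seconds, assuming one period is split into 8 equal phases."""
    return 1.0 / (N_PHASES * f_osc_hz)


def assemble_code(coarse_count: int, fine_phase: int) -> int:
    """Concatenate the coarse count (MSBs) and the fine phase index (LSBs)."""
    if not 0 <= coarse_count < 2 ** COARSE_BITS:
        raise ValueError("coarse count exceeds the 8-bit counter range")
    if not 0 <= fine_phase < N_PHASES:
        raise ValueError("fine phase index must be in 0..7")
    return (coarse_count << FINE_BITS) | fine_phase


def code_to_time(code: int, f_osc_hz: float) -> float:
    """Convert an 11-bit output code into a time interval in seconds."""
    return code * time_bin(f_osc_hz)


if __name__ == "__main__":
    f_osc = 850e6  # Hz, example value
    code = assemble_code(coarse_count=100, fine_phase=5)
    print(f"code = {code}, interval = {code_to_time(code, f_osc) * 1e9:.2f} ns")
```

With 11 bits in total, the full range in this sketch is 2048 time bins, which is close to the full-range figures of about 300 ns and 870 ns reported for the 147 ps and 432 ps bins.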
The building blocks of the in-pixel TDC are: the start/stop logic, the VCRO, the ripple counter, and the phase encoder. The time-to-digital conversion is realized as follows:
1) The Start/Stop logic (Fig. 4) defines the limits of the time interval to be measured. The width of the output pulse EN_VCRO equals the time elapsed between the rising edges of the Start and Stop signals. The Start signal can be provided either externally, Ext_Start, or by the local SPAD detector, Vout. The Stop signal, Ext_Stop, is the synchronization signal of the pulsed laser which triggers the light pulse. In this scenario, the synchronization pulse occurs first, followed by the light pulse, which travels to the scene and back to the sensor. The light pulse is eventually detected by the SPAD, which provides the Start pulse for the TDC. Finally, the Stop pulse is given by the next synchronization pulse. This technique is called reverse start-stop. The most important feature of this block is that its output stays disabled as long as no Start pulse precedes a Stop pulse. This is the key to the power saving strategy: the TDC remains OFF if the SPAD detector is not fired. With the opposite ordering, the TDC would consume power even when no light is detected, which is not power efficient. This is especially undesirable for in-pixel TDC architectures, because all TDCs would turn ON at the same time, which for large array resolutions means a very large current peak.
2) Signal EN_VCRO turns ON the VCRO (Fig. 5). The VCRO is composed of 4 pseudo-differential stages, with positive feedback between each pair of complementary outputs. This shortens the start-up time, hence improving the overall TDC accuracy. Also, auto-alignment is achieved by forcing the oscillator to start each time with the same phase through the reset signal, R. The block labeled Tune, to be explained later, is employed to provide wide-range linear control of the oscillation frequency.
Post-layout simulations have been performed to evaluate the delays between the Ext_Start or Vout signal and the VCRO output, and also the delays between the Ext_Stop signal and the VCRO output. These delay paths are matched such that the difference between them is less than 110 ps. It is worth mentioning that the delay between the rising edge of the EN_VCRO signal and the VCRO output is about 50 ps, which matches the delay between the falling edge of the EN_VCRO signal and the VCRO output. The overall mismatch of the delay paths translates into an offset error which can be easily canceled in the calibration phase. In any case, this error is much smaller than the FWHM jitter of the SPAD plus TDC ensemble.
The first phase of the VCRO (labeled out1) drives the ripple counter (Fig. 6), whose 8-bit output represents the most significant bits of the conversion (B10…B3). On the rising edge of Ext_Stop, the counter holds the number of full oscillation periods, which is the coarse approximation of the input time interval. Signal R is an asynchronous reset that is also employed to reset the VCRO. Seeking to reduce the area and the switching power without reducing the maximum allowed input frequency, the ripple counter is based on CMOS D-type Flip-Flops (DFFs) [13], [14]. Accordingly, the channel length of the transistors is the minimum allowed by the technology. Worst-case post-layout simulations have been performed, and the DFF has been shown to work properly from 20 kHz up to 2 GHz. The lower input frequency limit in this circuit is given by the refresh rate requirement or the minimum retention time of the DFF internal capacitive nodes (Fig. 6, inset).
3) On the rising edge of Ext_Stop, signal EN_VCRO turns the VCRO OFF. The frozen oscillator phases are fed into an encoder (Fig. 7 ) to obtain the 3 least significant bits of the conversion. The encoder's outputs are described by:
( 1 )
By employing a CMOS XNOR, the total area of the encoder is less than 260 μm² in this prototype. At the end of the input time interval, on the rising edge of Ext_Stop, the coarse counter holds the 8 most significant bits of the conversion. Right after that, the encoder provides the 3 finest bits. The 11-bit conversion code is stored in an in-pixel SRAM memory. Let us consider that the conversion time, $\tau_{conv}$, is the time elapsed between the end of the input time interval and the moment when the digital code is available at the output of the TDC.
where $\tau_{NAND}$, $\tau_{NOR}$, $\tau_{INV}$ and $\tau_{DFF}$ are the delays introduced by the logic gates and the CMOS DFF, and $\tau_{recov}$ is the recovery time of the VCRO internal nodes. The conversion time is about 2 ns. This feature renders the proposed architecture very well suited for high-frame-rate d-ToF imagers.
III. MODEL OF THE VCRO
Fig. 8(a) shows the block diagram of the pseudo-differential stages composing the VCRO, where the asynchronous reset R is used for auto-alignment. The tunable element, labeled Tune, is basically a transmission gate (Fig. 8(b)) that is enabled by the signal EN_VCRO. When this signal is low, the transmission gate is an open circuit, and therefore no oscillation takes place. When EN_VCRO is high, the transmission gate is a voltage-controlled resistor whose resistance, $R_V$, is tuned through the voltage labeled TUNE to set the oscillation frequency. The larger $R_V$, the larger the delay introduced by the delay cell and thus the lower the oscillation frequency.
The behavior of the oscillator is not easy to describe because of its nonlinearity, and the usual techniques employed for linear circuits do not apply directly. However, it is possible to carry out a progressive analysis that starts with a linear step and then, in order to have sustained oscillations, introduces a nonlinear amplitude control. Conveniently, the oscillation frequency can be predicted in the first step from a linearized model of the delay cell [15]. First of all, we are interested in the delay cell when EN_VCRO is high and the reset signal R is also high. This results in the simplified schematic of Fig. 9, where transistors MP1 and MN1 are the components of the inverters I in Fig. 8(a), transistors MP2 and MN2 correspond to the NAND (the other input is high and the corresponding branch is not shown), and $R_V$ is the resistance of the transmission gate inside the element Tune.
This simplified schematic can be modeled by the linearized equivalent of Fig. 10 , from where the following transfer function is obtained:
where $g_{mn1}$, $g_{mp1}$ are the transconductances of MN1 and MP1, $R_o$ is their equivalent output resistance, and $R_V$ is given by the inverse of the conductance of the transmission gate, $G_V$:
with $\beta_n = \mu_n C_{ox} (W/L)_n$ for the nMOS transistor of the transmission gate and
is the voltage at the output of the transmission gate. Hence, Eq. (6) holds as long as $V_o(t) < \mathrm{TUNE} - V_{Tn}$. In addition to this, $Z_{eq}$ in Eq. (5) is given by:
where $R_N$ is the negative input resistance of the feedback differential pair (Fig. 9):
(8)
and $R_L$ captures the equivalent positive output resistance of transistors MN2 and MP2:
Finally, $C_L$ is the capacitance at the output node:
where $C_N$, $C_P$ are the lumped capacitances of the transistors MN1, MN2, MP1, MP2, and transistors MN and MP of the transmission gate:
These expressions employ the Miller effect for the calculation of parasitic capacitances of digital inverters [16]. By substituting Eqs. (7) and (8) into Eq. (5), $H(s)$ can be written as:
Let us assume that all the delay stages are described by $H(s)$ so that the open loop gain is:
According to the Barkhausen criterion, this open loop gain must yield a phase shift of $2\pi$ and a gain of unity at the oscillation frequency, $f_o$ [17]. Therefore, the following must be fulfilled:
The oscillation frequency is then:
This expression can be simplified taking into account that MN1 and MP1 act as switches and hence that the following assumptions apply:
resulting in the following simplified oscillation frequency expression:
$R_V$ is therefore the key parameter for oscillation frequency control as long as Eqs. (18) and (19) hold. The previous analysis has been employed to support the design procedure for the chip in this paper. All the transistors have minimum length, L = 180 nm. The nominal values for the widths are:
The adequacy of the procedure and the underlying calculations is illustrated in Fig. 11. The horizontal axis corresponds to the Multiplication Factor (MF), which varies between 0.5 and 3. When the sizes of the transistors of either $R_V$, I or the NAND gate are varied, MF is applied to either $W_{MP}$ and $W_{MN}$, or $W_{MP1}$ and $W_{MN1}$, or $W_{MP2}$ and $W_{MN2}$, respectively. When all the transistors vary jointly, labeled in the figures as I = $R_V$ = NAND, MF applies to all of them. This simulation shows the design tradeoff in sizing the positive feedback gain, seeking at the same time to minimize area without severely decreasing the oscillation frequency, which is crucial to obtain a small time resolution.
The unity value of these scaling factors corresponds to the nominal design case. The vertical axis corresponds to the oscillation frequency obtained by electrical simulations. The set of curves shows that the selected widths yield a reasonably high frequency. Note that frequencies around 20% higher might have been obtained by making the transistors in the NAND block more resistive. However, this choice brings the design closer to oscillation failure, because in the limit the positive feedback disappears and the oscillation phase condition is no longer ensured. Therefore, NAND MFs smaller than unity are not recommended. Note also that frequencies around 10% higher might have been obtained by using more conductive transistors for the I and $R_V$ blocks. However, this penalizes area occupation and power consumption (see Fig. 25).
Moreover, besides lowering the oscillation frequency, stronger positive feedback also decreases the tuning range (see Fig. 12).
The accuracy of the proposed linearized model of the VCRO oscillation frequency has been demonstrated by successfully fitting the parameters of Eqs. (17) and (20) to the measurement results (Fig. 13). Thus $R_L$ = 1 MΩ, $R_o$ = 700 Ω and
Although temperature and supply voltage variations also have a significant impact on the TDC time bin, they can be compensated by a global scheme [18].
A. Mismatch
Taking into account that the VCRO has eight phases, the time bin, $T_{bin}$, of the in-pixel TDC can be expressed as:
Hence, the local time bin deviation is given as:
(22)
showing that time bin uniformity is linked to the pixel-to-pixel mismatch of the oscillation frequency.
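As a sketch, and assuming that one oscillation period is divided evenly among the eight phases (which agrees with the 147 ps bin reported for an oscillation frequency of 850 MHz), the time bin and its first-order sensitivity to a frequency deviation $\Delta f_o$ can be written as follows. The exact form of Eqs. (21) and (22) may differ from this one.

```latex
T_{bin} = \frac{T_o}{8} = \frac{1}{8 f_o},
\qquad
\frac{\Delta T_{bin}}{T_{bin}} \approx -\frac{\Delta f_o}{f_o}
```

Under this assumption, a 1% pixel-to-pixel spread of the oscillation frequency maps into a 1% spread of the time bin, with opposite sign.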
We have analyzed the effect of mismatch by means of Monte Carlo simulation. We have simulated the behavior of the VCRO allowing a 3σ spread of the transistor mismatch parameters, and obtained the maximum and minimum oscillation frequencies for each value of the multiplication factor (MF) that describes the scaling of the transistors in the delay cell. With this, we have calculated the deviation in the time bin for each value of MF using Eq. (22). These values are represented in Fig. 16 (square markers).
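The post-processing of such a Monte Carlo run can be sketched as follows. The sample frequencies are made up for illustration, the helper names are hypothetical, and the deviation metric (worst-case relative deviation with respect to the nominal bin) is an assumption that may not coincide with the definition used in Eq. (22).

```python
# Sketch: worst-case time-bin deviation from a set of simulated frequencies.
N_PHASES = 8  # VCRO phases (from the text)


def time_bin(f_osc_hz):
    """Time bin in seconds, assuming one period split into 8 equal phases."""
    return 1.0 / (N_PHASES * f_osc_hz)


def time_bin_deviation(f_samples_hz, f_nominal_hz):
    """Worst-case relative deviation of the time bin over the samples."""
    t_nom = time_bin(f_nominal_hz)
    t_max = time_bin(min(f_samples_hz))  # slowest oscillator, largest bin
    t_min = time_bin(max(f_samples_hz))  # fastest oscillator, smallest bin
    return max(t_max - t_nom, t_nom - t_min) / t_nom


if __name__ == "__main__":
    # Hypothetical Monte Carlo outcomes around an 850 MHz nominal design.
    samples = [830e6, 850e6, 872e6]
    dev = time_bin_deviation(samples, f_nominal_hz=850e6)
    print(f"worst-case time-bin deviation: {100 * dev:.2f} %")
```

Repeating this calculation for each MF value would give one point per MF, in the spirit of the square markers of Fig. 16.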
Moreover, we have measured the maximum deviation of the time bin across the TDC array for 29 chips (Fig. 16, circle marker). While increasing device dimensions improves mismatch, it is detrimental to power consumption and oscillation frequency. This trade-off has been addressed during the design process. In fact, the deviations of the time bin (Fig. 16) are smaller for MFs larger than the nominal value of unity. However, the area and power penalties of these larger factors may not be acceptable for a rather modest mismatch improvement. This becomes evident by looking at the Figure-of-Merit (FoM) presented in Section VI.
B. Noise
The timing accuracy of the TDC is limited by the period jitter of the VCRO, which is defined as the standard deviation $\sigma_{T_o}$ of the oscillation period, $T_o = 1/f_o$ [19]. The positive feedback action of transistors MN2 and MP2 in Fig. 9 is a source of time uncertainty. Hence, careful design is needed to prevent the overall positive feedback from exceeding unity gain [6]. Still, the positive feedback provides prompt start-up, thus preventing further increases of the jitter through period-to-period coupling mechanisms, in which jitter from one delay cell affects the jitter of another delay cell [20]. The impact of thermal and flicker noise on jitter is addressed in the next sections.
1) VCRO Jitter Due to White Noise: Let us consider the half circuit of the delay cell depicted in Fig. 17(a). Assume that a negative step signal is applied at the input and that the trip point of the inverters is located at VDD/2. The step turns MN1 OFF and places MP1 in triode. Because the output voltage is initially set to ground, MN2 is placed in triode and MP2 is turned OFF. Under these conditions, the voltage $V_{o-}$ on the capacitor $C_L$ starts to build up. But $V_{o-}$ is also connected to a cross-coupled inverter cell which has been set low. This cell therefore slows down the charging of $C_L$ until its switching point is reached. The output voltage is calculated as:
where $\tau_n = \tau_p$ to approximate the lower branch of the equation. The time constants $\tau_n$, $\tau_p$ are given by:
and $R_{eqn}$, $R_{eqp}$ are the equivalent resistances loading the output node of the delay cell. Their particular values are obtained by fitting the model to the simulation or measurement results. The propagation delay $t_d$ is defined as the time interval between the ideal input step and the moment when the output ramp crosses the trip point of the next delay cell. The propagation delay is therefore given by:
There are three resistors contributing noise to $V_o(t)$, namely $R_V$, $R_{eqn}$ and $R_{eqp}$. The relation between this voltage noise and the jitter of the time delay is [19], [20]:
On the one hand, the thermal noise of each resistor is fully integrated by $C_L$, thus yielding a $kT/C_L$ noise power term per resistor, totaling:
On the other hand, the slope of the output voltage can be approximated by the ratio of the voltage range, VDD, to the rise time $t_r$, the latter obtained from Eq. (23) by considering that the rise time ends when 90% of the final voltage has been reached:
The jitter of the half delay cell is then:
(29)
and that of the oscillation period is:
where M is the number of delay cells in the ring. We have evaluated the impact of scaling individual blocks such as I, $R_V$ and NAND on the cycle-to-cycle jitter over the full range of the TDC (Fig. 18). The unit MF design choice is justified from the jitter point of view as follows: according to Eq. (17), $f_o$ could be increased by decreasing $R_V$ while the rest of the blocks remain the same. However, the jitter also increases (square marker). Moreover, the jitter improvement obtained by increasing only the widths of the transistors in I (circle marker), or by increasing the widths of the transistors in all of the blocks jointly (asterisk marker), is not worthwhile because, as will be shown later, it involves a significant increase of dynamic power. According to Eq. (30), the jitter due to white noise depends on the number of delay cells in the loop, the variable resistor and the regenerative pair. If $R_{eqn}$ and $R_{eqp}$ decrease with respect to $R_V$, then the slope of $V_o(t)$ decreases and thus the jitter associated with the output node due to white noise increases. This result is consistent with the theory that the jitter increases with the strength of the positive feedback. It is also shown by the simulation results (Fig. 19, square marker). In order to demonstrate the validity of the proposed model, the jitter predicted by Eq. (30) has been compared to the simulated one (see Fig. 19, circle marker). The parameters of the model are as follows: $R_V$ = 3.3 kΩ, T = 300 K, k is Boltzmann's constant, M is the number of delay cells and VDD = 1.8 V. $C_L$, $R_{eqn}$ and $R_{eqp}$ are shown in Fig. 20 and Fig. 21.
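The chain of reasoning for the white-noise jitter of a single transition (three kT/C noise terms, divided by the slope of the output ramp) can be sketched numerically as below. The load capacitance and rise time used here are placeholders, since the actual values are only given graphically in Fig. 20 and Fig. 21, and the accumulation over the M delay cells of Eq. (30) is deliberately not included.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def noise_voltage_rms(c_load_f, temp_k=300.0, n_resistors=3):
    """RMS noise voltage on C_L: one kT/C_L power term per resistor."""
    return math.sqrt(n_resistors * K_BOLTZMANN * temp_k / c_load_f)


def edge_jitter_rms(c_load_f, vdd_v, t_rise_s, temp_k=300.0):
    """Jitter of one transition: sigma_t = sigma_v / slope.

    The slope is approximated as VDD / t_rise, as described in the text.
    """
    slope = vdd_v / t_rise_s
    return noise_voltage_rms(c_load_f, temp_k) / slope


if __name__ == "__main__":
    c_load = 10e-15   # F, hypothetical value
    t_rise = 100e-12  # s, hypothetical value
    sigma_v = noise_voltage_rms(c_load)
    sigma_t = edge_jitter_rms(c_load, vdd_v=1.8, t_rise_s=t_rise)
    print(f"sigma_v = {sigma_v * 1e3:.3f} mV, sigma_t = {sigma_t * 1e15:.1f} fs")
```

The sketch makes the trade-off visible: a slower edge (larger rise time) increases the jitter for the same noise voltage, which is the mechanism invoked above for stronger positive feedback.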
2) VCRO Phase Noise Due to Flicker Noise: Let us assume that the large-signal oscillation frequency is the inverse of the accumulated delay of the stages, and that all stages have the same delay. Therefore:
Using Eq. (25), and reformulating it in terms of the conductances $G_V$ and $G_{eqn}$, yields:
The sensitivity of $f_o$ to $G_V$ is calculated from here as:
Using this sensitivity, the spectral density of the flicker noise contribution is given as:
which contains two components:
• the sensitivity of the time delay to $G_V$;
• the spectral density of the flicker noise for $G_V$.
The first component can be approximated by:
On the other hand, the spectral density of $G_V$ is composed of the terms corresponding to both transistors employed to implement it:
The flicker noise of the NMOS transistor is mainly caused by carrier number fluctuation ($\Delta N$) [19], [21], [22]. According to the McWhorter model, the spectral density of the $1/f$ noise referred to the gate of an NMOS transistor in the linear region is:
where $kT\,N_T(E_F)$ is the interface state density per unit energy at the Fermi energy level, and $\gamma$ is McWhorter's tunneling parameter [23].
Regarding PMOS transistors, the $1/f$ noise within the linear region is mostly due to mobility fluctuation ($\Delta\mu$) [23], [24], [25]. Hooge's model states that the flicker noise spectral density depends on the gate voltage:
where $\alpha_H$ is Hooge's parameter. Combining all previous equations, the spectral density of $f_o$ increases with the strength of the regenerative switching, as can be seen in the approximate form of the spectral density:
Moreover, the spectral density of $f_o$ is inversely proportional to the square of the load capacitance. Eqs. (36)-(39) also show that it is inversely proportional to the cube of the length of the transistors of the variable resistor. Nevertheless, decreasing the jitter due to flicker noise by increasing the length of the transistors in $R_V$ can eventually also decrease the top oscillation frequency (Fig. 22, circle marker). The phase noise has been evaluated at a 2 MHz offset frequency (Fig. 22, asterisk marker). We have finally chosen the smallest length available because the phase noise does not vary much around this value, while the oscillation frequency degrades rapidly for longer transistors.
The phase noise predicted by Eq. (39) is compared to the simulated one (Fig. 23).
V. POWER CONSUMPTION
The power consumption of the TDC is mainly due to the VCRO and the CMOS ripple counter.
A. VCRO Power Consumption
The two main contributions to the power drawn by the VCRO are the dynamic power and the direct-path power. On the one hand, using the model from Fig. 17 , the instantaneous dynamic power related to half of the delay cell is:
Combining this equation with Eq. (23), the average dynamic power consumption is:
On the other hand, the average direct-path power is:
where $t_{sc}$ is the time interval during which both MN1 and MP1 (Fig. 9) are ON:
B. Ripple Counter Power Consumption
The dynamic power drawn by the 8-bit CMOS coarse counter is given by:
where the capacitors are shown in Fig. 6. Eq. (41) shows that the VCRO dynamic power scales with the supply voltage, the load capacitance and the strength of the positive feedback in the regenerative pair. Also, a comparative evaluation of Eqs. (41) and (42) shows that the VCRO dynamic power is far larger than the direct-path power dissipation. This is not surprising because the rising and falling edges at the input and output of the delay cells are symmetrical (Fig. 24). Any increase in the dimensions of the devices employed to implement the VCRO stages results in higher power consumption (Fig. 25).
The average dynamic power of the coarse counter is proportional to the capacitance of the CMOS flip-flop. The side effect of decreasing this capacitance is the increase of the minimum input frequency required for the counter to work.
The prediction of Eqs. (41) and (44) has been compared with the simulated average power (Fig. 26) . The parameters involved in these equations are:
and $f_o$ are the same as the ones used in Eq. (17);
VI. DESIGN GUIDELINES
The overriding parameters of the VCRO optimized for in-pixel TDC are area and power consumption. Concurrently, the time bin has to be pushed to its limits for this technology in order to achieve the best depth resolution. Thus, a basic
Furthermore, mismatches, and hence device dimensions, are traded against area and power consumption, using the equations presented in Sections III, IV and V. These equations provide initial values of the design parameters and guidelines for further iterations depending on the outcome of the simulation results. Throughout the manuscript it has been shown that the selected transistor sizes, i.e. unit MFs, represent a good design compromise. This is further confirmed by the FoM of phase noise (FoM_VCRO) in Fig. 27. It has been calculated as:
where the phase noise $PN$ is computed by Eq. (39), the offset frequency $\Delta f$ is set to 2 MHz and the average power $P_{d,avg}$ is computed by Eqs. (41) and (44). Note that the FoM is best at the nominal, unit value of MF, which is our design choice.
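For orientation, the sketch below evaluates the figure of merit that is commonly used for oscillators, which combines phase noise, the ratio of oscillation to offset frequency, and power normalized to 1 mW. Whether FoM_VCRO in this paper follows exactly this definition is an assumption. The example inputs are the values quoted in the experimental section (850 MHz, 2 MHz offset, 663 μW), with the phase noise taken as -102 dBc/Hz, i.e. assuming a negative sign.

```python
import math


def fom_vcro(phase_noise_dbc_hz, f_osc_hz, f_offset_hz, power_w):
    """Common oscillator figure of merit in dBc/Hz (assumed definition).

    FoM = PN - 20*log10(f_osc / f_offset) + 10*log10(P / 1 mW)
    More negative values indicate a better oscillator.
    """
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f_osc_hz / f_offset_hz)
            + 10.0 * math.log10(power_w / 1e-3))


if __name__ == "__main__":
    fom = fom_vcro(phase_noise_dbc_hz=-102.0,  # sign assumed negative
                   f_osc_hz=850e6,
                   f_offset_hz=2e6,
                   power_w=663e-6)
    print(f"FoM_VCRO = {fom:.2f} dBc/Hz")
```

With this definition, lowering the power or the phase noise improves the FoM, while scaling up the transistors tends to trade one against the other, which is why an optimum around the nominal MF is plausible.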
VII. EXPERIMENTAL RESULTS
The proposed VCRO has been employed in an array of 64×64 TDCs. Fig. 28 shows the microphotograph of the chip along with the floor plan of the pixel. The analog voltage that controls the oscillation frequency of the VCROs array is provided by an on-chip PLL whose core oscillator is an instance of the same VCRO. This enables the implementation of a global compensation mechanism to mitigate the effect of PVT variations on the time accuracy of the TDCs [18] .
A. Characterization of the VCRO-Based TDC
The first characteristic that we have measured is the code uniformity without any pixel-to-pixel calibration. Deviations are due to the variations of the VCRO oscillation frequency and of the duration of EN_VCRO. In order to measure these deviations, the time bin has been set to 147 ps by feeding the appropriate reference voltage. The input time interval is set to the maximum value, as it is the worst case for uniformity. In this case, the intervals are 297.48 ns on average with a standard deviation of 56 ps. These intervals are provided by a Time Interval Generator (TIG). The standard deviation across the TDC array is 32 output codes. If needed, these deviations of the time bin can be lowered by applying, for instance, a calibration cycle based on a look-up table.
It is important to properly characterize the TIG, as it is the instrument used to excite the TDC. The TIG reported in [26]
The control voltage of the VCRO array can be either external or internal, in which case it comes from the compensation loop. The same PLL is also used to program the time bin of the TDCs. In this experiment, the PLL division factor ÷N has been swept from the minimum to the maximum value. Fig. 29 shows the output characteristic of the programmable TDC. The minimum and maximum time bins of 147 ps and 432 ps (red plot) are achieved with external control voltages of 0 V and 1.8 V, respectively. The rest of the curves have been obtained by switching the control voltage to the internal voltage reference, which is the output of the PLL loop filter.
The performance of an individual programmable TDC based on the VCRO employed as time interpolator has been measured as well. The time bin has been set to 147 ps and 432 ps. The DNL and INL are 0.55 LSB and 3.11 LSB at the 147 ps time bin, and 0.56 LSB and 4.61 LSB at the 432 ps time bin, respectively (Fig. 30 and Fig. 31). The RMS DNL and INL computed across the array are less than 0.35 LSB and 1.5 LSB [26].
In order to measure the single-shot precision of the TDC, we have considered the following scenario: the TDC is set at the maximum and at the minimum time bin, for which the full range of the TDC is about 870 ns and 300 ns, respectively. In both cases the TIG is set to generate $10^5$ time intervals at 10% and 90% of the full range. The standard deviations of the TIG at 28.4 ns / 255.9 ns and 83.5 ns / 787 ns are 17.3 ps / 15.6 ps and 16.2 ps / 18.6 ps, respectively. The histograms of the input time intervals and of the TDC output codes are depicted on the left and right sides of Fig. 32 and Fig. 33, respectively. The TDC jitter is computed by subtracting the variance of the input time interval from the variance of the measured TDC output. The single-shot precision of the TDC is affected by the jitter of the VCRO. Moreover, a larger time bin is obtained by decreasing $f_o$, which increases the jitter. Therefore, at the same TDC output code, the standard deviation of the single-shot measurement is larger when the time bin is larger.
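The variance subtraction used to isolate the TDC jitter can be sketched as follows, assuming that the TIG jitter and the TDC jitter are statistically independent so that their variances add. The measured value in the example is hypothetical; only the 17.3 ps TIG figure is taken from the text.

```python
import math


def codes_to_seconds(sigma_codes, t_bin_s):
    """Convert a standard deviation expressed in output codes to seconds."""
    return sigma_codes * t_bin_s


def tdc_jitter(sigma_measured_s, sigma_input_s):
    """TDC-only jitter, assuming independent input and TDC contributions.

    sigma_measured^2 = sigma_input^2 + sigma_tdc^2
    """
    variance = sigma_measured_s ** 2 - sigma_input_s ** 2
    if variance < 0:
        raise ValueError("measured jitter is smaller than the input jitter")
    return math.sqrt(variance)


if __name__ == "__main__":
    sigma_meas = 120e-12  # s, hypothetical measured standard deviation
    sigma_tig = 17.3e-12  # s, TIG standard deviation quoted in the text
    print(f"TDC jitter = {tdc_jitter(sigma_meas, sigma_tig) * 1e12:.2f} ps")
```

When the TIG jitter is much smaller than the measured spread, as in this example, the correction is small, and the measured value is dominated by the TDC itself.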
The potential metastability problems of the VCRO have been considered as well. Metastability may occur only at one phase at a time, when the VCRO is stopped at an integer number of oscillation periods. We have performed post-layout simulations to investigate how the VCRO settles its outputs in this case. The VCRO has been stopped in 10 ps steps around the switching point of a certain phase. The internal nodes successfully recover to the correct states, such that the encoder and the ripple counter give the correct output codes. The worst-case recovery time, or the propagation delay through the ripple counter, is less than 2 ns.
B. Measurements on the VCRO Operation
The measured sensitivity of the oscillation frequency to the control voltage, $K_{VCRO}$, is consistent with the simulated curve (see Fig. 13). The value of $K_{VCRO}$ computed in both cases is 477 MHz/V. The oscillation frequency ranges from 300 MHz to 800 MHz when the control voltage ranges from 0.67 V to 1.7 V. VCRO linearity, a measure of how linearly the oscillation frequency depends on the control voltage, is 99.4% in our design. As the circuit is also employed as the core oscillator of the PLL, a high gain is required to prevent the PLL from losing lock. Using Eqs. (41) and (44), one obtains the power drawn by the VCRO, which is 663 μW and 142 nW, respectively, for an assumed oscillation frequency of 850 MHz.
The deviation of f o is effectively mitigated by activating the compensation loop based on a PLL integrated on-chip [18] . Thus it decreases from 20% down to 2.4% when the temperature varies from 0°C to 100°C. When the voltage supply changes within ±10% of its nominal value it decreases from 27% down to 0.27%.
The dependence of the VCRO output frequency on the PLL's frequency division factor ÷N is shown in Fig. 34 . Notice that as long as the PLL is locked, the dependence is linear for a wide range of frequencies from 363 MHz up to 765 MHz.
The proposed VCRO has also been tested as a building block of the on-chip PLL (Fig. 35). As long as the PLL is locked, the synthesized output frequency and the loop filter output voltage depend linearly on the frequency division factor. The synthesized frequency ranges from 400 MHz to 850 MHz, in 50 MHz steps set by the division factor. The loop filter output, which is subsequently buffered to the control input of the array of VCROs, ranges from 0.81 V to 1.67 V.
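The linear dependence on the division factor follows from the basic relation of an integer-N PLL in lock, f_out = N × f_ref. The sketch below illustrates it under a loudly stated assumption: the 50 MHz reference frequency and the range N = 8 to 17 are not given in the text, and are inferred here only because they reproduce the quoted 400 MHz to 850 MHz range in 50 MHz steps.

```python
# ASSUMPTION: reference frequency inferred from the 50 MHz step; not stated in the text.
F_REF_MHZ = 50.0


def synthesized_frequency_mhz(n, f_ref_mhz=F_REF_MHZ):
    """Output frequency of a locked integer-N PLL with division factor n."""
    return n * f_ref_mhz


# ASSUMPTION: division factors 8..17, chosen to span 400 MHz to 850 MHz.
frequencies = [synthesized_frequency_mhz(n) for n in range(8, 18)]
print(frequencies)
```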
According to post-layout simulations, the phase noise is -102 dBc/Hz at a 2 MHz offset from the 850 MHz carrier. The RMS jitter of the in-pixel VCROs is measured by running the VCRO continuously over the whole range of control voltages (Fig. 36). The jitter of the TDC has been measured as well for both extremes of the time bin. The standard deviation of the TDC output code at 10% and 90% of the dynamic range is 0.78 and 13.88 codes for the 147 ps time bin, and 2.36 and 24.44 codes for the 432 ps time bin.
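The code-domain deviations can be expressed in time units by multiplying by the corresponding time bin. The snippet below is a simple conversion sketch using the figures quoted in the text; the function and variable names are hypothetical.

```python
def codes_to_ps(sigma_codes, time_bin_ps):
    """Convert a standard deviation expressed in output codes to picoseconds."""
    return sigma_codes * time_bin_ps


# time bin (ps) -> (sigma at 10% of range, sigma at 90% of range), in codes
measured_sigma_codes = {147.0: (0.78, 13.88), 432.0: (2.36, 24.44)}

sigma_ps = {
    time_bin: tuple(codes_to_ps(s, time_bin) for s in sigmas)
    for time_bin, sigmas in measured_sigma_codes.items()
}

for time_bin, (s10, s90) in sigma_ps.items():
    print(f"bin {time_bin:.0f} ps: {s10:.2f} ps at 10%, {s90:.2f} ps at 90%")
```

In both settings the deviation grows with the measured interval, as expected when the VCRO jitter accumulates over a longer run time.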
A comparison with the state of the art is provided in TABLE I. The designs in [10], [11] and [12] are all VCROs controlled by a digital word, with a DAC generating the tuning voltage. In [10] the mechanism for TDC operation relies on time amplification. The reported phase noise is -116 dBc/Hz at 0.4 MHz, which is lower than ours, and the design has a better FoM_VCRO. However, this has been obtained with a circuit of much larger area, which is not acceptable for a per-pixel TDC. In [11] the delay between cells is tuned by modifying the resistance of a transistor, a mechanism similar to the one that we implement. The main difference is that our variable resistor is in the signal path, while the one in [11] introduces a lossy path, which can affect the power consumption. Our VCRO has a better phase noise and a FoM_VCRO close to the one reported in [11], which however occupies a larger area. Concerning [12], the reported area is smaller than that of our VCRO. However, it is achieved in a 28 nm technology which, at this time, is hardly suitable for integrating SPAD detectors on the same chip. They report a jitter between 16.1 ps and 19.3 ps, which is close to our cycle-to-cycle jitter of 20 ps. With respect to [3], [9] and [27], the VCRO reported in this paper presents better phase noise and FoM_VCRO with less area. The advantage in terms of power is explained as follows: [3] and [9] have the same supply voltage and transistor channel length, but use transistors 10 times larger. In addition, the oscillation frequency and the number of stages are different, and a higher oscillation frequency increases the dynamic power. The design in [27], in turn, is implemented in a 350 nm technology, where the power consumption is higher, and it also draws static power. Moreover, its supply voltage is 3.3 V, while ours is 1.8 V. This makes a big difference because the dynamic power consumption is proportional to the square of the supply voltage.
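The supply-voltage argument can be made concrete with the usual first-order expression for dynamic power, P = C·V²·f. The sketch below assumes, purely for illustration, that the switched capacitance and the oscillation frequency are equal in both designs, which is not the case for the circuits compared here; it isolates the contribution of the supply voltage alone.

```python
def dynamic_power_ratio(v_a, v_b):
    """Ratio of dynamic power at supply v_a to that at v_b.

    Assumes equal switched capacitance and frequency, so only V^2 matters.
    """
    return (v_a / v_b) ** 2


# 3.3 V supply of [27] versus the 1.8 V supply used in this work.
ratio = dynamic_power_ratio(3.3, 1.8)
print(f"Dynamic power ratio: {ratio:.2f}x")
```

Under these assumptions, the higher supply alone accounts for more than a threefold increase in dynamic power.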
To provide a straightforward comparison with state-of-the-art TDCs, we have compiled TABLE II. In addition, we have computed the FoM_TDC employed in [39] and plotted it versus the time resolution (Fig. 37).
VIII. CONCLUSION
The modeling, design and measurement of a pseudo-differential VCRO intended for in-pixel TDCs in d-ToF image sensors are reported. The proposed VCRO has been tested both as a PLL building block and as a time interpolator for the pixel-level TDC. We have provided a detailed analysis of the oscillation frequency, the impact of mismatch on the deviation of the TDC time bin, the jitter due to white noise, the phase noise due to flicker noise, and the power consumption of the VCRO and ripple counter. All the proposed models are meant to provide a first-order approximation within an iterative, simulator-assisted design procedure. All models have been validated by comparison with simulations and/or measurement results. A comparison with state-of-the-art VCROs and TDCs has been provided as well.
