This paper presents an architecture and the achievable performance of a time-to-digital converter for 3D time-of-flight cameras. The design is partitioned into two levels. In the first level, an analog time expansion, in which the time interval to be measured is stretched by a factor k, is achieved by charging a capacitor with a current I and then discharging it with a current I/k. In the second level, the final time-to-digital conversion is performed by a global gated ring oscillator-based time-to-digital converter. The performance can be increased by exploiting its intrinsic scrambling of quantization noise and mismatch error and its first-order noise shaping. The stretched time interval is measured by counting full clock cycles and storing the states of the nine phases of the gated ring oscillator. The frequency of the gated ring oscillator is approximately 131 MHz, and an appropriate stretch factor k can give a resolution of ≈ 57 ps. The combined low nonlinearity of the time stretcher and the gated ring oscillator-based time-to-digital converter can achieve a distance resolution of a few centimeters with low power consumption and small area occupation. The carefully optimized circuit configuration, achieved by using an edge aligner, the time amplification property and the gated ring oscillator-based time-to-digital converter, may lead to a compact, low-power single-photon configuration for 3D time-of-flight cameras, aimed at a measurement range of 10 meters.
INTRODUCTION
Time-resolved imaging in three-dimensional image sensors has shown remarkable potential in recent years, as it offers advantages over conventional intensity-based imaging in vision, engineering and medical research. 1
The continual emergence of new applications requiring depth data often surpasses the capabilities of existing products and technologies in attributes such as accuracy and precision, acquisition time, size and cost. In 3D time-of-flight (TOF) cameras, the time information from each pixel is processed to compute the depth and distance information. 2, 3 The most critical component in the design of picosecond-resolution time-resolved imagers is the time-to-digital converter (TDC). These devices might have a global TDC to acquire fast images over the entire pixel array, or many TDCs might be needed to enable independent and simultaneous acquisition and time discrimination at the pixel level. 1 However, to integrate many more TDCs on a chip, possibly one per detector, very compact TDCs need to be designed.
Background
Time interval measurement methods described in the literature use digital, analog or combined approaches, e.g., the Nutt method, 5, 6 depending on the targeted application. In analog TDCs the time interval is measured by counting cycles of a low-frequency clock and linearly expanding the fractional clock cycles. 7, 8, 9, 10, 11 Digital interpolation, on the other hand, uses gate delays to estimate the fractional clock cycles. 12, 13, 14, 15, 16, 17, 18, 19, 20 A simple TDC consists of a high-frequency clock and a counter incremented at each clock edge. In such a case, the resolution is limited by the reference clock frequency. For example, since f_clk = 1/t is needed for a resolution of t, t = 100 ps implies f_clk = 10 GHz. This frequency is not feasible for use across an array of pixels, as it causes problems with noise, crosstalk and power dissipation. One way to improve the performance of a TDC is to use analog time expansion, 5, 28 based on current integration, where the time interval to be measured is stretched by a factor k. The stretched time interval can then be measured by any TDC with improved effective resolution. Time stretchers (TS) with precisely matched charging and discharging currents can possibly achieve better linearity than their digital counterparts. 8 Additionally, methods based on in-pixel analog electrical storage, exploited through pulse time-stretching techniques, may achieve better resolution, low power consumption and high geometric fidelity. 11, 12, 13 The majority of the published devices mentioned above function as stand-alone custom integrated circuits, and hence the requirements on their area and power consumption are much relaxed.
However, the fabrication of an array of pixels with in-pixel TDCs places strict demands in terms of fidelity of geometry, power consumption and homogeneity in performance. 
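The clock-frequency argument in this section can be checked numerically. The following is a minimal sketch, not part of the original design flow; the function names and the stretch factor used in the usage lines are illustrative assumptions.

```python
def counter_clock_hz(resolution_s: float) -> float:
    """Clock frequency a plain counter-based TDC needs for a given resolution (f_clk = 1/t)."""
    if resolution_s <= 0:
        raise ValueError("resolution must be positive")
    return 1.0 / resolution_s


def backend_clock_hz(resolution_s: float, stretch_factor: float) -> float:
    """Equivalent counting rate needed after the interval has been stretched by a factor k.

    Stretching by k relaxes the time step the back-end TDC must resolve
    from t to k * t (idealized, lossless stretching is assumed).
    """
    return counter_clock_hz(resolution_s * stretch_factor)


if __name__ == "__main__":
    target = 100e-12  # 100 ps, the example used in the text
    print(counter_clock_hz(target))        # direct counting
    print(backend_clock_hz(target, 10.0))  # k = 10 is an arbitrary illustrative value
```

The second function only illustrates why time expansion relaxes the demands on the digital back end; it does not model the linearity or noise of a real stretcher.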
Aim and Proposed Idea
The aim of this work is to develop a high-precision time-to-digital converter as an integrated CMOS chip for 3D time-of-flight cameras, which measure the depth and distance of an optically visible object in application fields such as obstacle recognition, the automotive industry and video surveillance. In this paper, we propose an architecture for a global gated ring oscillator-based time-to-digital converter (GRO-TDC) combined with per-pixel time stretchers as analog blocks. To achieve sensitivity to individual photons and to extract 3D time-of-flight information, a time stretcher needs to be coupled to each pixel. Additionally, for a SPAD sensor to operate efficiently in direct time-of-flight mode, and to reduce the parasitics associated with the detector, a time stamp must be generated for every impinging photon at every pixel detector. This necessitates the combination of a per-pixel time stretcher with a global TDC (time-digitization circuitry). Figs. 1(a) and (b) show a simplified front-end circuit and timing diagram of the concept. Radiation from a pulsed laser diode with a short (≈ 100 ps) pulse width illuminates the target surface and sets a start timing mark. The reflected photons are absorbed by the SPAD, 21, 25, 26, 27 which sets a stop timing mark. Meanwhile, a pixel-based capacitor is charged with a constant current from the rising edge of the start timing mark until the rising edge of the stop timing mark. The charged capacitor is then discharged with a much smaller current, so that the circuit functions as a time stretcher based on analog time expansion. The global GRO-TDC runs during the discharge of the capacitor, between the timing marks Start-GRO and TDC-Stop, as shown in Fig. 2. This discharge (stretched) time interval is thereby measured by the GRO-TDC by counting full clock cycles and registering the states of the interpolators within the clock period.
THE ARCHITECTURE OF THE PROPOSED TIME-TO-DIGITAL CONVERTER
The conceptual block diagram of the time-of-flight distance measurement system with multiple stop signals at different time intervals is illustrated in Fig. 2. The system consists of pixel-based time stretchers and comparators, TDC-input logic, a nine-phase gated ring oscillator (GRO), an edge aligner (EA) and registers that store the state of the ring oscillator on the timing signals (EN-GRO and TDC-Stop). A monostable multivibrator (MMV) can be used to generate an output pulse whose duration corresponds to the final stop signal, as presented in the references. 28, 29 The logic-level signal from the comparator and the TDC-input logic block starts and stops the GRO for the stretched time interval measurement. The global TDC operates by counting full clock cycles from the rising edge of the comparator output (Start-GRO), which starts the GRO. The falling edge of the comparator output (TDC-Stop) disables the GRO, and hence the counter also settles. The registers determine the timing position within the clock cycle. The resolution of the GRO-based TDC is the period of the ring oscillator divided by the number of phases of the gated ring oscillator and by the stretch factor k.
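The resolution definition above can be written as a one-line calculation. This is a small sketch under stated assumptions: the 131 MHz frequency and nine phases come from the text, while the stretch factor of 15 in the usage lines is an assumed round value chosen only for illustration, and the function name is hypothetical.

```python
def gro_tdc_resolution_s(f_gro_hz: float, n_phases: int, stretch_factor: float) -> float:
    """Effective resolution: oscillator period / number of phases / stretch factor k."""
    if f_gro_hz <= 0 or n_phases <= 0 or stretch_factor <= 0:
        raise ValueError("all arguments must be positive")
    period_s = 1.0 / f_gro_hz
    return period_s / (n_phases * stretch_factor)


if __name__ == "__main__":
    # 131 MHz and 9 phases are taken from the text; k = 15 is an assumption.
    res = gro_tdc_resolution_s(131e6, 9, 15.0)
    print(f"{res * 1e12:.1f} ps")
```

With these inputs the result lands in the mid-50 ps range, which is consistent with the ≈ 57 ps quoted in the abstract.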
Time Stretcher and Sub-circuits
The time stretching method first converts the measurement time interval to a corresponding analog voltage. In the second step, this analog voltage is converted back to a time interval that is stretched by a factor k to achieve time amplification. Hence, two conversions are performed: time to charge and charge to time. Whereas these are performed in each individual pixel, the final time-to-digital conversion is performed by the global GRO-based TDC. The simplest topology for a time stretcher is similar to the Wilkinson ADC. 22, 23 In this topology the analog voltage is created by charging a capacitor during the measurement time interval and then discharging the capacitor with a much smaller current after the arrival of the stop signal. The ratio between the charge and discharge currents determines the stretch factor k. If the stretch factor is large enough, even a low-resolution TDC can be used to measure the discharge time (stretched time), and the original time interval is finally obtained with improved resolution. The sensitivity to noise at the integrating node and to non-linearity in the capacitor can be minimized by using a differential design with two identical capacitors that are discharged by different currents on the arrival of the start and stop signals. 24 Other design challenges are the demanding requirements on comparator offset, propagation delay stability and area that must be considered for a 3D image sensor composed of pixel array-based structures. For an in-pixel solution, the area is especially critical. Therefore, to save area, a single-ended solution is selected for the work presented in this paper.
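The two conversions can be summarized in a short derivation. This is an idealized sketch: the symbols t_in, t_out and the back-end step Δt_TDC are notation introduced here, and constant currents, a linear capacitor and a discharge back to the starting voltage are assumed.

```latex
\begin{align}
  V_C &= \frac{I\, t_{in}}{C}
      && \text{(time to charge: capacitor charged with current } I\text{)} \\
  t_{out} &= \frac{C\, V_C}{I/k} = k\, t_{in}
      && \text{(charge to time: discharged with current } I/k\text{)} \\
  \Delta t_{eff} &= \frac{\Delta t_{TDC}}{k}
      && \text{(back-end step referred to the input)}
\end{align}
```

The capacitance cancels in the ideal case, which is why the stretch factor depends only on the ratio of the two currents.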
Design and operation of open-loop integration-based time stretcher
An open-loop integration-based approach was studied previously, 3 where a wide-swing cascode current source/sink and a capacitor were used to implement the time stretcher, as shown in Fig. 3a. In an open-loop integration approach based on current sinks/sources, the important features to consider are high output impedance, high linearity and small area. These are achieved by using wide-swing cascode structures. 38 The schematic of the time stretcher comprises a wide-swing cascode current source and sink, a conversion capacitor C_conv, and switches to initialize the circuit and to control the charging and discharging of the capacitor. In the schematic of Fig. 3a, M1 is the current source, M2 through M5 are the biasing transistors, and M6 through M9 form a cascode current source. Similarly, in the current sink, M11 serves as the current sink, M10 through M14 are the biasing transistors, and M15 through M18 form the cascode current sink. The Start signal opens the NMOS switch M_s, and the voltage across the capacitor increases linearly until the Stop signal arrives. Upon arrival of the final stop, the conversion capacitor is discharged with a much smaller current to achieve analog time expansion. The final stop, which also corresponds to the rising edge of the comparator output, simultaneously starts the GRO-TDC for the time interval measurement, as shown in Fig. 7(a). During charging, the voltage across the capacitor is a linear ramp directly proportional to time t: V_out(t) = I_1 · t / C_conv, where C_conv = 300 fF is the conversion capacitance and I_1 = 8.4 µA is the constant charging current. After the arrival of the Stop signal, the conversion capacitor is isolated and the voltage across it remains constant until the final stop arrives. A current sink with NMOS transistors then discharges the capacitor with a much smaller current I_2 = 557 nA, starting from the timing mark set by the final Stop.
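The component values quoted above fix the ramp slope and the stretch factor. The sketch below plugs them into the ideal ramp equation; it assumes ideal current sources and a discharge back to the starting voltage, and it ignores the comparator threshold and any switching delays. The function names are illustrative.

```python
C_CONV_F = 300e-15      # conversion capacitance from the text
I_CHARGE_A = 8.4e-6     # charging current I_1 from the text
I_DISCHARGE_A = 557e-9  # discharging current I_2 from the text


def ramp_voltage_v(t_in_s: float) -> float:
    """Ideal ramp V_out(t) = I_1 * t / C_conv reached after charging for t_in_s."""
    return I_CHARGE_A * t_in_s / C_CONV_F


def stretch_factor() -> float:
    """Ideal stretch factor k = I_1 / I_2."""
    return I_CHARGE_A / I_DISCHARGE_A


def stretched_time_s(t_in_s: float) -> float:
    """Time needed to remove the stored charge with I_2 (ideal model)."""
    return ramp_voltage_v(t_in_s) * C_CONV_F / I_DISCHARGE_A


if __name__ == "__main__":
    t_in = 10e-9  # an arbitrary 10 ns input interval
    print(f"ramp voltage  : {ramp_voltage_v(t_in) * 1e3:.0f} mV")
    print(f"stretch factor: {stretch_factor():.2f}")
    print(f"stretched time: {stretched_time_s(t_in) * 1e9:.1f} ns")
```

Under these assumptions the ramp slope is 28 mV/ns and the stretch factor is slightly above 15.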
Design of the comparator
For the realization of the stretched time interval, an analog decision circuit such as a comparator is required for each pixel. The comparator design and performance present major obstacles to achieving high-precision conversion. The parameters of the comparator considered in our design include a) minimum detectable input voltage, b) small delay to reduce non-linearity, and c) small offset and gain variation. 31 Additionally, to reduce the pixel area while conserving the fill factor, it is important to simplify the comparator structure so that it operates at a low supply voltage in a specific technology. Design trade-offs might be necessary, e.g., among architecture complexity, area efficiency, delay requirements and power consumption. Here, a comparator with internal hysteresis has been selected for its robustness, compactness, low power dissipation, reduced offset and low noise. The comparator with internal hysteresis also achieves a short delay response and has good linearity. 31
The continuous-time comparator shown in Fig. 3b consists of a PMOS input differential pair, M1 and M2, that sets the transconductance and speed of the comparator and also contributes an equal current at the detectable operating point by keeping the tail current source of the differential pair in saturation. The decision circuit is the heart of the comparator and must be able to discriminate mV-level signals. The circuit uses positive feedback from the cross-gate connection of M6 and M7 to increase the gain of the decision element. The input step (initial saturated input condition) of the comparator is Vout-TS, which is driven to the output trip state at V_ref = 500 mV, as shown in Figs. 3(a) and (b). The logic AND of the comparator output and Start-GRO gives an output pulse that realizes the stretched time interval, as shown in Fig. 7(a); this pulse enables the GRO and also sets the stop timing mark for the global TDC as TDC-Stop.
Proposed Global GRO-based TDC and sub-circuits
The basic TDC architectures are based on linear delay elements, as shown in Fig. 4a and discussed earlier in Section 1.1. These architectures operate on the rising edge of the Start signal, representing the first event, which is successively delayed by a series of inverter gates (ignoring polarity), each with an average delay T_q. The outputs of these inverters are the inputs to a register, which is clocked by the rising edge of the Stop signal representing the stop event. The register outputs form a thermometer code, which corresponds to the number of delay elements that have transitioned within the measurement interval T_in. The average mapping from the input time difference T_in to the digital output code Out has limited resolution and suffers from nonlinearity due to mismatch between the delay elements.
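The delay-line behaviour described above can be illustrated with a short behavioural model. This is a sketch rather than a circuit-accurate simulation: inverter polarity, register setup and hold times and metastability are ignored, and the function name and delay values are illustrative assumptions.

```python
from typing import List, Tuple


def delay_line_tdc(t_in: float, stage_delays: List[float]) -> Tuple[List[int], int]:
    """Behavioural model of a delay-line TDC.

    The Start edge propagates through the stages; at the Stop edge (t_in after
    Start) each register bit is 1 if the edge has already passed that stage.
    Returns the thermometer code and the number of stages that transitioned.
    """
    code = []
    elapsed = 0.0
    for delay in stage_delays:
        elapsed += delay
        code.append(1 if elapsed <= t_in else 0)
    return code, sum(code)


if __name__ == "__main__":
    # Ideal line: eight stages, each with delay T_q = 1.0 (arbitrary time units).
    print(delay_line_tdc(3.5, [1.0] * 8))
    # Mismatched line: the same input interval can map to a different code.
    print(delay_line_tdc(2.25, [1.0, 1.5, 0.5, 1.0]))
```

In the second call the slow second stage causes the code to read one stage instead of two, which is the mismatch-induced nonlinearity mentioned in the text.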
The main idea of the gated ring oscillator-based TDC, 32, 34, 35 shown in Fig. 4b, is similar to that of an oscillator-based TDC with a counter, 33 as it measures the clock cycles and the number of delay element transitions generated by the ring oscillator structure during the time interval measurement. However, unlike a traditional oscillator-based TDC, the GRO structure allows the oscillator to run only while it is gated on by Start-GRO, from the rising edge of the comparator output, and strives to freeze its state once the time interval measurement has been made (on the falling edge of the comparator output). Fig. 4b shows a practical implementation of the GRO. Gating functionality is added to a conventional inverter-based ring oscillator by placing two switches in series with the positive and negative supplies of each inverter. When the switches are closed, oscillation is enabled; when the switches are open, the oscillation is suspended.
Two important properties that improve the performance of the GRO-TDC are the scrambling of quantization noise and the first-order shaping of delay-element mismatch, 32, 34, 35 which can improve the linearity even in the presence of considerable mismatch. GROs with differential delay elements are also used in TDCs to achieve good differential nonlinearity performance by mitigating the mismatch between rising and falling edges. However, a single-ended configuration may be preferable because the differential nonlinearity is first-order shaped and the circuit requires only modest complexity, so it can be implemented with smaller area and reduced power consumption.
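The first-order shaping of quantization error follows from the oscillator holding its phase between measurements. The model below is a simplified behavioural sketch of that mechanism only: the class name is hypothetical, the frozen state is assumed to be held perfectly, and delay mismatch, gating skew and leakage are not modelled.

```python
import math


class GatedRingOscillatorModel:
    """Behavioural GRO: phase is frozen between measurements, never reset."""

    def __init__(self, stage_delay: float) -> None:
        self.stage_delay = stage_delay
        self.phase = 0.0  # accumulated phase, in units of stage delays

    def measure(self, t_in: float) -> int:
        """Count the delay-element transitions that occur while gated on for t_in."""
        before = math.floor(self.phase)
        self.phase += t_in / self.stage_delay
        return math.floor(self.phase) - before


if __name__ == "__main__":
    gro = GatedRingOscillatorModel(stage_delay=1.0)
    outputs = [gro.measure(2.25) for _ in range(4)]
    print(outputs, sum(outputs) / len(outputs))
```

Because the residue of one measurement becomes the starting phase of the next, the individual codes vary while their sum tracks the total input time, so the quantization error telescopes instead of accumulating. Averaging repeated measurements therefore recovers a value finer than one stage delay.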
The GRO-TDC operates by counting the pulses during the stretched time interval and recording the states of its nine phases at the moment of arrival of the timing signals. The clock signal P1 in the timing diagram of the 9-phase GRO-TDC, shown in Fig. 2, corresponds to one phase of the GRO. A counter is started by the next rising edge of clock P1 after the GRO is enabled (for counting of the stretched time interval), and the counter is stopped by the next rising edge of the clock after the TDC-Stop signal, as shown in Fig. 2. Special attention has been given to the layout design of the GRO. The GRO inverters are placed in two rows running in opposite directions, to minimize the variation in parasitic capacitive loading at each stage.
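One way to picture how the cycle count and the stored phase states combine into a single result is sketched below. This is an illustration under assumptions, not the decoding logic of the actual circuit: the nine stored phase states are assumed to be already decoded into an index from 0 to 8, and the stretch factor of 15 in the usage lines is an assumed value.

```python
def gro_tdc_code(cycle_count: int, phase_index: int, n_phases: int = 9) -> int:
    """Combine the coarse cycle count with the fine phase position into one code."""
    if not 0 <= phase_index < n_phases:
        raise ValueError("phase_index must lie in [0, n_phases)")
    return cycle_count * n_phases + phase_index


def measured_interval_s(code: int, f_gro_hz: float, n_phases: int,
                        stretch_factor: float) -> float:
    """Refer the code back to the unstretched input interval."""
    lsb_stretched_s = 1.0 / (f_gro_hz * n_phases)  # one phase step of the GRO
    return code * lsb_stretched_s / stretch_factor


if __name__ == "__main__":
    code = gro_tdc_code(cycle_count=131, phase_index=4)
    t_in = measured_interval_s(code, 131e6, 9, 15.0)  # k = 15 is an assumption
    print(code, f"{t_in * 1e9:.1f} ns")
```

With these illustrative numbers, about 131 full cycles plus a fractional phase correspond to an input interval of roughly 67 ns, which is close to the round-trip time for a 10 m target.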
Edge Aligner and GRO-TDC input logic
If a time difference is present between EN and ENB, i.e., if they are not perfectly aligned as shown in Fig. 5a, only the PMOS or only the NMOS switch turns on during one inverter delay, contaminating the timing regions. Hence, the phase stored intrinsically by the GRO-TDC will contain unwanted in-band noise. 35 Therefore, the GRO-TDC uses an edge aligner to generate perfectly aligned timing signals in order to achieve lower in-band quantization noise and stronger noise shaping. In the edge aligner, an SR latch with an inverter is used to convert the single-ended signal into perfectly aligned differential signals for the GRO-TDC. The output of the comparator is fed to the TDC-in-logic block, which contains an edge aligner and a D flip-flop (DFF), as shown in Fig. 2. This input logic block detects the rising edge of the comparator output and generates the start signal EN-GRO for the GRO through the DFF. The falling edge of the comparator output sets the stop timing mark, TDC-Stop, for the counter dual edge synchronizer (CDES) of the GRO-TDC. To achieve effective resolution, and to operate the GRO-TDC in the vicinity of ideal first-order noise shaping, the EN-GRO signal is again passed through the edge aligner to turn the PMOS and NMOS switches on and off simultaneously and to generate perfectly aligned EN/ENB signals, as shown in Fig. 5a.
Counter Dual Edge Synchronizer
The timing signals (EN-GRO and TDC-Stop) are typically asynchronous inputs to the synchronous measurement logic. They may arrive at any time within the clock period, including instants at which the registers cannot record them reliably. This may cause a metastability problem, i.e., if the propagation delay, setup and hold times are violated. 36 Additionally, differences in register processes, layout parameters and routing, as well as noise, can shift the recording of the timing signal in the counter to the next clock cycle, creating an error of one reference clock cycle in the final GRO-TDC result.
This metastability problem can be handled by a dual edge synchronization scheme, 36 in which the counting is delayed by one reference clock cycle and the counter control (enable/disable) is steered into the correct, safe position by the settled register result, as shown in Fig. 6. The stop signal TDC-Stop is also synchronized to both the rising and falling edges of the reference clock of the GRO. In Fig. 2, inp3 determines whether stop-q1 appeared during the negative or positive half cycle of the clock signal. The interpolator result inp3 is selected for the counter control on the next rising edge of the reference clock P1. The counter, synchronized with the dual edge synchronization scheme, records the timing signals reliably, as one of the two flip-flops always has enough delay between the data change and the clock edge. It is evident from Fig. 6 that the counter is enabled/disabled one clock cycle later, because the counter is also clocked on the rising edge.
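The claim that one of the two flip-flops always samples safely can be illustrated with a simple timing model. This is a conceptual sketch, not the synchronizer circuit of the cited scheme: a 50% duty cycle is assumed, the unsafe window around each clock edge is an arbitrary fraction of the period, and the function name is hypothetical.

```python
from typing import Tuple


def safe_samplers(arrival_phase: float, window: float = 0.1) -> Tuple[bool, bool]:
    """Return (rising_edge_safe, falling_edge_safe) for an asynchronous edge.

    arrival_phase: arrival time as a fraction of the clock period, in [0, 1).
    window: half-width of the unsafe region around each clock edge, as a
            fraction of the period (assumed value, must be below 0.25).
    The rising clock edge sits at phase 0.0 and the falling edge at 0.5.
    """
    def circular_distance(a: float, b: float) -> float:
        d = abs(a - b) % 1.0
        return min(d, 1.0 - d)

    rising_safe = circular_distance(arrival_phase, 0.0) > window
    falling_safe = circular_distance(arrival_phase, 0.5) > window
    return rising_safe, falling_safe


if __name__ == "__main__":
    for phase in (0.02, 0.25, 0.5):
        print(phase, safe_samplers(phase))
```

Since the distances to the two clock edges always add up to half a period, at least one of them exceeds a quarter period, so at least one sampler is outside its unsafe window for any arrival time.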
RESULTS
The design discussed above has been implemented in a 0.35 µm CMOS process with a supply voltage of 3.3 V. The complete architecture (GRO-TDC and time stretcher) occupies an area of 150 × 100 µm, as shown in Fig. 7. The per-pixel time stretcher can be configured further to occupy an area of 40 × 40 µm, and a fill factor of ≈ 25% can be obtained by using 35 × 35 µm SPADs. Simulations have been performed using Cadence Spectre with post-layout parasitic effects included. The waveform of the time stretcher and comparator output is shown in Fig. 8(a). The analog voltage Vout-TS is created on the in-pixel capacitor for the time interval measurement and stretched by a factor of k to achieve time amplification and sufficient resolution. The capacitor and the show the gated ring oscillator mismatch and corresponding differential (DNL) and integral (INL) nonlinearities of the interpolators from post-layout simulations. The standard deviation of the interpolator is under 10 ps and the INL is 24.7 ps, which is very small compared to the LSB of the GRO-TDC. Hence, their impact on the precision is minimal. The proposed design can result in very low current consumption even during the time interval measurements. The average current consumption of the design while measurements are performed is 2.4 mA. The obtained results are comparable to already published designs, as shown in Table 1.
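The time figures in this paper can be related to distance with the usual round-trip relation d = c · t / 2. The sketch below applies it to the ≈ 57 ps resolution and the 10 m range quoted in the abstract; it considers only the quantization step and does not model jitter, nonlinearity or other error sources, so the distance figure it prints is a lower bound on the achievable distance resolution.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def distance_step_m(time_resolution_s: float) -> float:
    """Distance corresponding to one time step, accounting for the round trip."""
    return SPEED_OF_LIGHT_M_S * time_resolution_s / 2.0


def round_trip_time_s(distance_m: float) -> float:
    """Time of flight to a target at distance_m and back."""
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_S


if __name__ == "__main__":
    print(f"{distance_step_m(57e-12) * 1e3:.2f} mm per time step")
    print(f"{round_trip_time_s(10.0) * 1e9:.1f} ns round trip for 10 m")
```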
CONCLUSION
The concept of time amplification has been exploited by using an in-pixel time stretcher to achieve sufficiently improved resolution. Additionally, a gated ring oscillator-based time-to-digital converter is described that takes advantage of the scrambling of quantization noise and mismatch error and of first-order noise shaping. The simulation results show that, with an appropriate stretch factor and a suitable ring oscillator frequency, a distance resolution of a few centimeters can be obtained. This design is comparable to other published designs in terms of low power consumption, compactness and high resolution. The design can be developed further to achieve a reasonable fill factor and good linearity for in-pixel signal processing in the first prototype of a 3D time-of-flight camera based on 2D arrays of single-photon avalanche diodes.
