Abstract.
A four-channel, self-calibrating High Resolution Time to Digital Converter (HRTDC) with an RMS error of 35 ps over a dynamic range of 3.2 µs has been developed. Its architecture is based on an array of delay locked loops and an 8-bit coarse time counter driven by an 80 MHz reference clock. Time measurements are buffered in two time registers per channel, followed by a common 32-word-deep read-out FIFO. The HRTDC has been built in a 0.7 µm CMOS process using 23 mm² of silicon area.
Introduction.
High Energy Physics experiments require high resolution time measurements on a very large number (~10^5) of detector channels. The time resolution needed for measuring the flight time of particles ranges from 10 ps to 100 ps RMS over a limited dynamic range. An extended dynamic range is, however, useful at the system level to time tag events for future off-line analysis. Traditionally, time measurements have been performed using time to amplitude converters followed by an ADC [1]. These converters can achieve the required resolution but are not easy to integrate into a single IC and usually feature a small dynamic range. Temperature stability, supply voltage sensitivity and linearity are often problematic for these devices.
In this paper, a High Resolution TDC (HRTDC) developed to serve the needs of detectors such as the ALICE Pestov spark counter [2] will be described. This detector requires precise time measurements on ~380,000 channels in a highly integrated environment.
HRTDC Architecture.
The converter architecture can be divided into two main building blocks (see Fig. 1). In the timing core, an array of Delay Locked Loops (DLLs) performs time interpolation within a period of the reference clock (fine time). A clock synchronous counter is used to obtain a larger dynamic range (coarse time). Measurements are stored in two levels of temporary asynchronous channel buffers. In the read-out and logic block, the data stored in the channel buffers are multiplexed into a single data stream.
The status of the array (fine time) is converted into a binary code and the correct coarse time is selected. These data (fine and coarse time, channel id, status flags) are then stored as a single word in the read-out FIFO. By using a common derandomizer FIFO to gather data from all channels, the channel buffers can be freed to acquire new hits. This buffering architecture, with small buffers per channel and a common derandomizer, is very efficient in terms of occupied area and keeps the dead time small.
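As an illustration of the single-word format and the common FIFO, the following is a minimal sketch only. The field order, the 8-bit fine code, the two status flags and the helper names (`pack_word`, `unpack_word`, `push`) are hypothetical; only the 8-bit coarse counter, the four channels and the 32-word depth come from the text.

```python
from collections import deque

# Hypothetical field widths; the word layout is not specified in the text.
FINE_BITS = 8     # assumed width of the binary-coded fine time
COARSE_BITS = 8   # 8-bit coarse counter (from the text)
CHAN_BITS = 2     # four channels (from the text)
FLAG_BITS = 2     # assumed number of status flags
FIFO_DEPTH = 32   # 32-word read-out FIFO (from the text)

def pack_word(fine, coarse, channel, flags=0):
    """Pack one measurement into a single integer: flags|channel|coarse|fine."""
    fields = ((flags, FLAG_BITS), (channel, CHAN_BITS),
              (coarse, COARSE_BITS), (fine, FINE_BITS))
    word = 0
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
        word = (word << bits) | value
    return word

def unpack_word(word):
    """Inverse of pack_word; returns (fine, coarse, channel, flags)."""
    fine = word & ((1 << FINE_BITS) - 1)
    word >>= FINE_BITS
    coarse = word & ((1 << COARSE_BITS) - 1)
    word >>= COARSE_BITS
    channel = word & ((1 << CHAN_BITS) - 1)
    word >>= CHAN_BITS
    flags = word & ((1 << FLAG_BITS) - 1)
    return fine, coarse, channel, flags

def push(fifo, word):
    """Append to the common FIFO; return False when it is already full."""
    if len(fifo) >= FIFO_DEPTH:
        return False
    fifo.append(word)
    return True

fifo = deque()
push(fifo, pack_word(fine=139, coarse=255, channel=3, flags=1))
```

When `push` returns False the measurement would have to wait in its channel buffer, which is the situation the derandomizer is meant to make rare.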
Timing core.
An array of DLLs is at the core of the architecture implemented. A single DLL has a time resolution limited to the basic cell delay ($T_d$). The resolution can be improved by using an array of DLLs [5]. In this scheme a small phase shift between consecutive DLLs is used to obtain a bin size that is a fraction $F^{-1} \cdot T_d$ of the gate delay ($F > 1$). Such a small delay cannot be obtained directly. It is, however, possible to generate a delay of $(1 + F^{-1})$ times the basic cell delay. This is done using an additional phase shifting DLL, with a smaller number ($M$) of delay elements than the other DLLs in the array ($N$ delay elements each), locked to the same reference. An arrangement such as this (Fig. 3) results in a delay of $(1 + F^{-1}) \cdot T_d$ between corresponding taps in consecutive DLLs. Due to the symmetry of the array, one cell delay can easily be subtracted, resulting in a delay between consecutive taps that is only $F^{-1} \cdot T_d$.
The time bins are obtained from small time differences of accumulated delays in consecutive DLLs. Any error in these delays will be amplified by the nature of the interpolation and will severely degrade the linearity of the converter. It is therefore essential to obtain good matching characteristics between devices throughout the time critical paths. Unfortunately, the array scheme is unable to produce a number of bins that is a power of 2. This results in a fine time word that is coded in an $N \cdot F$ base and not in the usual $2^b$ binary code (where $b$ is the number of bits). However, conversion to normal binary code can easily be performed off-line.
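The arithmetic of the array scheme can be checked with a short sketch. It assumes that each DLL spans exactly one reference period, so that a cell in an N-element DLL delays by T_clk/N and a cell in the M-element phase shifter by T_clk/M. The value N = 35 is not stated in this text; it is inferred here from the 12.5 ns period of the 80 MHz clock, F = 4, and the reported bin size of about 89 ps (107 ps for 1.2 bins). The function name `dll_array_timing` is hypothetical.

```python
from fractions import Fraction

def dll_array_timing(t_clk_ps, n_cells, f_dlls):
    """Return (cell delay, phase-shifter cell count M, bin size), all in ps."""
    # Cell delay of a main DLL locked to one reference period: T_clk / N.
    t_d = Fraction(t_clk_ps) / n_cells
    # Choose M so that T_clk / M = (1 + 1/F) * T_d, i.e. M = N*F / (F + 1).
    m_cells = Fraction(n_cells * f_dlls, f_dlls + 1)
    if m_cells.denominator != 1:
        raise ValueError("N*F/(F+1) must be a whole number of delay cells")
    # Cell delay of the phase shifting DLL.
    shift = Fraction(t_clk_ps) / m_cells
    # Subtracting one main cell delay leaves the fine bin, T_d / F.
    bin_size = shift - t_d
    return t_d, int(m_cells), bin_size

# 80 MHz reference (12.5 ns), F = 4 from the text; N = 35 is an inferred value.
t_d, m, bin_ps = dll_array_timing(12500, 35, 4)
print(float(t_d), m, float(bin_ps))
```

With these assumed numbers the phase shifter needs 28 cells and the bin equals one quarter of the cell delay, which matches the F = 4 interpolation described above.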
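One possible off-line conversion treats the coarse count and the fine code as a mixed-radix number, which yields an ordinary binary integer counting fine bins. This is a sketch under assumptions: 140 bins per clock period (N · F with an inferred N = 35 and F = 4) and a 12.5 ns reference period; the function names are hypothetical.

```python
def to_binary_time(coarse, fine, bins_per_period=140):
    """Combine coarse count and base-(N*F) fine code into one integer bin count."""
    if not 0 <= fine < bins_per_period:
        raise ValueError("fine code out of range")
    return coarse * bins_per_period + fine

def to_picoseconds(total_bins, t_clk_ps=12500.0, bins_per_period=140):
    """Scale an integer bin count to picoseconds."""
    return total_bins * t_clk_ps / bins_per_period

# Example: two full clock periods plus five fine bins.
total = to_binary_time(coarse=2, fine=5)
print(total, to_picoseconds(total))
```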
Since all DLLs are locked to the same reference, dynamic range extension is easily achieved by introducing a coarse counter that counts reference clock cycles. In order to avoid metastability due to the sampling of the counter's outputs close to their transition, two counters synchronous to opposite phases of the reference clock are used. Selection of the correct value is done in accordance with the relative position of the hit arrival within the clock period, as given by the fine time (Fig. 4). The time information stored in a channel buffer contains the status of the array and of the coarse time counters. In order to minimize dead time between measurements, the architecture includes a small buffer per channel, controlled via an asynchronous state machine.
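The selection between the two coarse counters can be sketched as follows. This is a behavioural model under stated assumptions, not the circuit's actual logic: counter A is taken to increment at the rising edge (fine code 0), counter B at mid-period, the period is divided into 140 fine bins (an inferred value), and the quarter-period thresholds are an arbitrary illustrative choice.

```python
BINS = 140          # assumed fine bins per clock period
COARSE_MOD = 256    # 8-bit coarse counters (from the text)

def sample(t_bins):
    """Ideal counter values seen by a hit at time t_bins (in fine bins)."""
    period, fine = divmod(t_bins, BINS)
    cnt_a = period % COARSE_MOD                       # changes at fine = 0
    cnt_b = (period + (fine >= BINS // 2)) % COARSE_MOD  # changes at mid-period
    return fine, cnt_a, cnt_b

def select_coarse(fine, cnt_a, cnt_b):
    """Use whichever counter is far from its own transition."""
    quarter = BINS // 4
    if quarter <= fine < 3 * quarter:
        return cnt_a                      # A is stable in the middle of the period
    if fine < quarter:
        return cnt_b                      # B has not yet incremented this period
    return (cnt_b - 1) % COARSE_MOD       # B already incremented at mid-period

# Dynamic range with an 80 MHz clock: 256 periods of 12.5 ns.
DYNAMIC_RANGE_NS = COARSE_MOD * 12.5
print(DYNAMIC_RANGE_NS)
```

A hit that arrives right at the rising edge never relies on counter A, so an unsettled value on A at that instant does not affect the result.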

Implementation Issues.
Two main criteria drove the design effort of the time critical circuitry of the converter: matching and noise immunity. They are reflected in the topologies chosen for the circuitry. In the following the implementation of the most important circuits is discussed.
Delay cells.
A delay cell is schematically drawn in Fig. 5. Device matching also depends on the operating point of the device. All critical devices were designed with gate areas large enough to guarantee an acceptable level of matching. To maintain the matching characteristics under any process and environment conditions, the delay control transfer curve was divided into several partially overlapping ranges. The corresponding loop gain is determined by a programmable bias current (signals sel_rangeN and sel_rangeP in Fig. 5) and by the connection of one or more current-starved transistors to the control voltage generated in the charge pump (signals control0 ... 2). In this way it is always possible to obtain lock at an operating point that is favourable in terms of matching. An added advantage of this scheme is a reduced noise sensitivity, due to the smaller cell gain (slope of the control voltage to cell delay transfer curve in Fig. 6) for any selected range. With this topology, the circuit can acquire lock under any process or environment conditions. Lock can be maintained over variations of ±15 °C in temperature and ±300 mV in power supply. This is considered sufficient for the controlled environment where the circuit will work.
Phase detector.
The phase detector (PD) samples the status of the last tap in the DLL at the instant a rising edge of the reference enters the DLL. A basic D flip-flop can, in principle, be used to perform this task. The loop is then controlled using a "bang-bang" configuration. Standard D flip-flops usually display different setup and hold times, and a badly defined sampling time. This may result in a significant static phase error. A thoroughly balanced design (Fig. 7, [7]) and good device matching characteristics are therefore required to ensure minimal static phase error. The delays of the signals to the PD must also match very closely, due to their direct contribution to the phase error.
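The behaviour of a bang-bang loop can be illustrated with a simplified model. It is a sketch only: the phase detector is reduced to a one-bit early/late decision, the charge pump and delay cell are replaced by a fixed delay step per reference cycle, and the 35-cell line, starting delay and step size are assumed values chosen for illustration.

```python
def bang_bang_lock(t_clk_ps=12500.0, n_cells=35,
                   cell_delay_ps=300.0, step_ps=0.05, cycles=5000):
    """Simulate a DLL whose cell delay is nudged up or down once per cycle."""
    history = []
    for _ in range(cycles):
        total_delay = n_cells * cell_delay_ps
        # One-bit phase detector: is the last tap later than one period?
        late = total_delay > t_clk_ps
        # Bang-bang control: fixed-size correction, sign only.
        cell_delay_ps += -step_ps if late else step_ps
        history.append(cell_delay_ps)
    return history

history = bang_bang_lock()
target = 12500.0 / 35
print(round(history[-1], 3), round(target, 3))
```

Once locked, the cell delay dithers by one step around the target. Any offset in the early/late decision shifts the point around which the loop dithers, which is why the static phase error of the detector matters.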
For the time resolution required, distributed parasitic RC models of signal wires must be used to evaluate and minimize these potential skews.
Input signal integrity, noise avoidance.
In contrast to a PLL, a DLL is unable to filter jitter present on the reference clock. To obtain the high resolution required, special care must therefore be taken to ensure that the handling of the critical signals (the reference clock, but also the hit inputs) does not compromise the required resolution. Differential signalling levels have therefore been used for these input signals.
Inside the circuit, special care has been taken to avoid coupling between the potentially noisy digital read-out block and the time-critical DLLs via the power supply or the common substrate. Separate power supplies and careful use of guard rings minimize noise coupling.
Demonstrator.
A demonstrator circuit was designed in a 0.7 µm CMOS technology. The main performance design goal was a resolution better than 50 ps RMS across the 3.2 µs dynamic range. An 80 MHz reference clock is propagated through four (F = 4) DLLs. Matching limitations and jitter associated with the reference and the closed loop were estimated to contribute a σ of ~15 ps, degrading the expected resolution to ~30 ps. An 8-bit coarse time counter was used for dynamic range expansion.
Characterization of the converter's Differential and Integral Non-Linearities (DNL and INL, respectively) was performed using Statistical Code Density Tests, from a data set of 840,000 random hits. The resulting histograms are shown in Fig. 8. A time sweep test was also performed in order to measure errors of a random nature such as jitter, internal and external noise, etc. These kinds of errors are not captured by statistical tests, due to their random behaviour. A motor-driven coaxial phase shifter was used to generate a time sweep. This passive instrument is very linear and generates very little jitter, so it is better suited to this application than active delay generators. Using this setup, the RMS resolution of the converter was measured to be 34.3 ps (0.38 LSB), with a maximum error smaller than ±1.2 bins (107 ps). The plot in Fig. 9 shows the conversion error histogram. It includes 60,000 measurements generated with a time step of ~2.8 ps (accumulating 5 measurements per time step).
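The two calculations behind these figures can be sketched briefly. The first function shows a common way to derive DNL and INL from a code density histogram, assuming hits uniformly distributed in time; it is not necessarily the exact procedure used for Fig. 8. The second combines the quantization error of an ideal bin (bin size divided by the square root of 12) in quadrature with an additional random contribution. The bin size of 12.5 ns / 140 is an inferred value and both function names are hypothetical.

```python
import math

def code_density(hist):
    """Return (DNL, INL) in LSB from a histogram of hits per fine-time code."""
    total = sum(hist)
    if total == 0:
        raise ValueError("empty histogram")
    ideal = total / len(hist)            # expected count for a perfect bin
    dnl = [count / ideal - 1.0 for count in hist]
    inl = []
    running = 0.0
    for d in dnl:                        # INL is the running sum of DNL
        running += d
        inl.append(running)
    return dnl, inl

def expected_rms_ps(bin_ps, extra_sigma_ps):
    """Quantization error of a uniform bin combined in quadrature with jitter."""
    quantization = bin_ps / math.sqrt(12.0)
    return math.hypot(quantization, extra_sigma_ps)

# Toy histogram: the third bin is twice as wide as ideal, the fourth is missing.
dnl, inl = code_density([10, 10, 20, 0])
print(dnl, inl)
print(round(expected_rms_ps(12500.0 / 140, 15.0), 1))
```

With the assumed bin size and the ~15 ps contribution quoted above, the combined figure comes out close to the ~30 ps expected resolution.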
