We present a compact time-resolved spectrometer suitable for optical spectroscopy at wavelengths from 400 nm to 1 µm. The detector consists of a monolithic array of 16 Single-Photon Avalanche Diodes (SPADs), each paired with a high-precision Time-to-Digital Converter (TDC). The instrument has 10 ps resolution and reaches 70 ps (FWHM) timing precision over a 160 ns full-scale range, with a Differential Non-Linearity (DNL) better than 1.5% LSB rms. The core of the spectrometer is an application-specific integrated circuit composed of 16 pixels with 250 µm pitch, each containing a 20 µm diameter SPAD and an independent TDC, fabricated in a 0.35 µm CMOS technology. A monochromator in front of the array focuses different wavelengths onto different pixels. The spectrometer has been used for fluorescence lifetime spectroscopy, achieving 5 nm spectral resolution over an 80 nm bandwidth. Lifetime spectroscopy of Nile blue is demonstrated.
INTRODUCTION
High-precision time-interval measurements are required in many fields, ranging from high-energy physics and optical distance measurements to biology and medical imaging. Therefore, instruments able to accurately measure time intervals down to the picosecond scale at an affordable cost find a broad market within both the scientific and the industrial communities. Specific applications like Fluorescence Lifetime Imaging (FLIM) [1], Förster Resonance Energy Transfer (FRET) [2] and Diffuse Optical Spectroscopy (DOS) [3] make use of the Time-Correlated Single Photon Counting (TCSPC) technique [4] for the reconstruction of fast, low-intensity, repetitive optical waveforms. This technique is based on the detection of single photons of the optical signal and on the precise measurement of their arrival times; after many acquisitions of individual events, it is possible to reconstruct the waveform of the optical signal down to a few picoseconds, with no need for electronics with hundreds of GHz of bandwidth. As single-photon detectors, Single-Photon Avalanche Diodes (SPADs) or photomultiplier tubes (PMTs) are usually employed.
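The TCSPC principle described above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the instrument's firmware: the function name `tcspc_histogram`, the mono-exponential waveform and the event count are hypothetical, while the 10 ps bin width and 160 ns full-scale range are taken from the text.

```python
import random

def tcspc_histogram(lifetime_ns, n_events, bin_ps=10, full_scale_ns=160, seed=0):
    """Accumulate single-photon arrival times into a TCSPC histogram."""
    rng = random.Random(seed)
    n_bins = int(full_scale_ns * 1000 / bin_ps)
    hist = [0] * n_bins
    for _ in range(n_events):
        # One detected photon per event: its delay is drawn from the
        # (assumed mono-exponential) shape of the optical waveform.
        t_ps = rng.expovariate(1.0 / lifetime_ns) * 1000.0
        b = int(t_ps // bin_ps)
        if b < n_bins:  # events beyond the full-scale range are dropped
            hist[b] += 1
    return hist

# Hypothetical example: 100 000 photons from a 0.38 ns decay.
hist = tcspc_histogram(lifetime_ns=0.38, n_events=100_000)
```

Each event contributes a single count, so the waveform shape emerges only after many events; the time resolution is set by the bin width rather than by the analog bandwidth of the electronics.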
The state of the art in TCSPC measurement is represented by Becker&Hickl GmbH boards [5] and by PicoQuant GmbH modules [6]. However, such solutions are not compact and have high power consumption. These issues severely limit the number of channels and the achievable count rate for each channel [7].
In this paper we present a high-performance, low-power, time-resolved spectrometer based on a 16-pixel array chip, operated through a USB 2.0 interface for user-friendly control, parameter setting and data upload to a remote computer. The system reaches 70 ps Full-Width at Half-Maximum (FWHM) precision and a DNL better than 0.015 LSB (rms), which makes it viable for extremely demanding time-resolved spectroscopy applications. This paper is organized as follows. The architecture of the spectrometer is described in Section 2. The experimental characterization and the fluorescence lifetime measurement performed using the module are presented in Section 3. Finally, Section 4 draws the conclusions.
SPECTROMETER ARCHITECTURE
The spectrometer is composed of a detection module and a monochromator. The monochromator spectrally separates the input light and focuses the output light onto the chip. The detection module (see Figure 1a) provides a synchronization output (e.g. to trigger a laser), a STOP input, a USB 2.0 connector for PC interfacing and a 5 V DC power supply plug. The dimensions of the detection module are 8 cm x 8 cm x 12 cm. The module is composed of three electronic boards, mounted one on top of the other, for: i) holding the chip's package and providing the timing signals; ii) FPGA processing; iii) power supply (from the 5 V DC input).
The core of the spectrometer is the application-specific integrated circuit (ASIC) composed of 16 pixels, fabricated in a 0.35 µm CMOS technology and shown in Figure 1b. Each pixel has 250 µm pitch and contains a SPAD with a 20 µm diameter active area and an in-pixel TDC for the measurement of START/STOP delay times with respect to the reference clock. Moreover, the chip hosts the global electronics, composed of three Delay-Locked Loops (DLLs) for synchronization and reference voltage generation. The chip dimensions are 5.9 mm x 1.5 mm, and the TDC architecture is described in detail in Ref. [8].
16-Pixel SPAD plus TDC Array Chip
The chip is composed of 16 pixels and a common interpolator. Each pixel provides the START event when a photon is detected by the SPAD, and the in-pixel TDC computes the elapsed time, based on a global counter and a START interpolator. The 'coarse' counter provides the large full-scale range by counting the number of reference clock pulses between the START and STOP signals, while the 'fine' resolution is attained through interpolators, which measure the time elapsed between the START instant and the following clock edge. The common interpolator, in the global electronics, measures the time elapsed between the STOP event and the next reference clock edge [8].
The TDC operation principle is shown in Figure 2. Within each pixel, a 'coarse' counter provides the number $T_{counter}$ of clock rising edges between the START and STOP events. An interpolator then improves the time resolution. In order to reduce area occupation, the interpolator employs two stages. A 'coarse' first stage counts the sub-phases (16 per period) of the reference clock between the first clock phase following the START ($T_{11}$) or STOP ($T_{21}$) event and the next rising edge of the reference clock. Finally, the 'fine' second interpolation stage resolves the time elapsed between two successive phases of the multiphase clock within 62 time bins, yielding $T_{12}$ and $T_{22}$.
The measured time interval is given by
$$T_{meas} = T_{counter} + T_{11} + T_{12} - T_{21} - T_{22}.$$
In particular, $T_{counter}$, $T_{11}$ and $T_{12}$ are independent for each pixel, while $T_{21}$ and $T_{22}$ are common to all pixels. The overall interpolation factor is 992, being the product of the 'coarse' (16) and 'fine' (62) interpolation factors. Therefore, with a 100 MHz reference clock the resolution is LSB = 10 ns / 992 ≈ 10 ps.
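The arithmetic above can be sketched in a few lines. The clock frequency and the interpolation factors come from the text; the sign convention (START interpolation added, STOP interpolation subtracted) follows from both interpolators measuring the delay to the next clock edge. The code encoding and the names `interpolator_ps` and `measured_time_ps` are hypothetical, chosen only for illustration.

```python
F_CLK_HZ = 100e6     # reference clock frequency (from the text)
COARSE_PHASES = 16   # sub-phases per clock period (first interpolation stage)
FINE_BINS = 62       # time bins per sub-phase (second interpolation stage)

T_CLK_PS = 1e12 / F_CLK_HZ                       # 10 000 ps clock period
LSB_PS = T_CLK_PS / (COARSE_PHASES * FINE_BINS)  # 10 ns / 992, about 10.08 ps

def interpolator_ps(coarse_code, fine_code):
    """Delay from an event to the next clock edge (assumed encoding)."""
    return (coarse_code * FINE_BINS + fine_code) * LSB_PS

def measured_time_ps(n_clk, start_codes, stop_codes):
    """T_meas = T_counter + (T_11 + T_12) - (T_21 + T_22), in picoseconds."""
    t_counter = n_clk * T_CLK_PS
    return t_counter + interpolator_ps(*start_codes) - interpolator_ps(*stop_codes)

# Hypothetical reading: 2 clock periods, START 1 sub-phase before its edge,
# STOP 31 fine bins before its edge.
t_example = measured_time_ps(2, (1, 0), (0, 31))
```

In this sketch one coarse sub-phase corresponds to 62 LSB, i.e. 625 ps, so the two stages together span exactly one 10 ns clock period.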
Signal-Conditioning Electronics
The TDC chip accepts signals with 3.3 V CMOS levels; therefore, front-end signal-conditioning electronics are needed in order to guarantee compatibility with any kind of signal level, such as CMOS, TTL, ECL, NIM, and even non-standard signals, either positive or negative edge-triggered [9]. As shown in Figure 3, the front-end is based on a fast comparator (ADCMP561 by Analog Devices), powered at ±5 V, while the output stage is biased between 0 V and +3.3 V. The comparator output uses fast PECL logic, providing differential signals with 1.6 V / 2.4 V logic levels. The positive input of the comparator is fed by the STOP input of the module, while the negative input is connected to two 8-bit digital potentiometers, controlled by the FPGA through an I²C bus. In this way, the user can set the threshold for the STOP input from -2 V to +3 V with a resolution of 19.5 mV.
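The quoted 19.5 mV resolution is consistent with a 5 V span divided into 2^8 steps. The sketch below assumes a simple linear code-to-voltage mapping; how the two potentiometers actually combine is not described here, so `threshold_to_code` and `code_to_threshold` are hypothetical helpers rather than the module's firmware interface.

```python
V_MIN = -2.0   # lowest settable threshold, in volts (from the text)
V_MAX = 3.0    # highest settable threshold, in volts (from the text)
N_BITS = 8     # potentiometer resolution (from the text)

STEP_V = (V_MAX - V_MIN) / 2**N_BITS   # 5 V / 256, about 19.5 mV

def threshold_to_code(v_thr):
    """Nearest code for a requested threshold (assumed linear mapping)."""
    if not V_MIN <= v_thr <= V_MAX:
        raise ValueError("threshold outside the -2 V to +3 V range")
    return min(2**N_BITS - 1, round((v_thr - V_MIN) / STEP_V))

def code_to_threshold(code):
    """Threshold voltage actually produced by a given code."""
    return V_MIN + code * STEP_V
```

Under this assumption a request for 0 V maps to code 102, which yields about -7.8 mV, i.e. the setting error stays within one step.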
The input comparator is followed by an edge selector and a monostable, in order to detect either the rising or the falling edge of the input signal and to provide a STOP pulse of fixed width to the chip. 
Control and Data Processing Electronics
A high-speed USB 2.0 interface, managed by a dedicated microcontroller, is used to connect the module to a remote computer. The microcontroller is controlled by the FPGA, which is responsible for the overall management of module operations, data processing and data transfer.
The first task of the FPGA is to receive the settings selected by the user on the computer's graphical interface and to configure the module accordingly. For instance, the FPGA sets the threshold level for the STOP input by means of the digital potentiometers, and it selects the trigger edge for the input.
The second task is to control the operation of the chip and to acquire the raw data measured by the 16 TDCs. Finally, the FPGA is responsible for the raw data processing in order to obtain the final results of the measurement.
Power Supply
All necessary power supplies are derived from a +5 V DC unregulated supply. The necessary voltages are: ±5 V for the first signal-conditioning stage with the input comparators and for the digital potentiometers; +3.3 V for the PECL and CMOS logic; and +38 V for the SPAD reverse bias. The +5 V and +38 V rails are obtained using DC-DC boost converters, the -5 V rail using a buck-boost converter, and the +3.3 V rail using a buck converter. These voltages are fed to LDO (Low DropOut) regulators, which provide separate voltages to the different parts of the module, guaranteeing minimum mutual interference. In addition, PECL and CMOS components are kept on separate supply rails and ground planes in order to minimize disturbances and cross-talk between the two input signals. The DC-DC switching converters are placed on a separate board, i.e. the power supply board, while the LDO regulators are placed on the timing board.
The timing board consumes 1.5 W and the FPGA board also consumes 1.5 W, while the overall efficiency of the power supply board is 85%. Therefore, the total consumption of the detection module is about 3.5 W, independent of the conversion rate.
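The power budget can be checked with a short calculation. It assumes that both boards are supplied entirely through the power supply board, so that the 85% efficiency applies to the whole 3 W load; the function name `input_power_w` is hypothetical.

```python
P_TIMING_W = 1.5    # timing board consumption (from the text)
P_FPGA_W = 1.5      # FPGA board consumption (from the text)
EFFICIENCY = 0.85   # overall efficiency of the power supply board (from the text)

def input_power_w(load_w, efficiency):
    """Power drawn from the 5 V input to deliver load_w to the boards."""
    return load_w / efficiency

total_w = input_power_w(P_TIMING_W + P_FPGA_W, EFFICIENCY)  # about 3.5 W
```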
EXPERIMENTAL CHARACTERIZATION
Each TDC channel provides a timing resolution of 10 ps, which sets the channel width of the instrument, a 160 ns full-scale range and a maximum conversion time (i.e. dead time) of 150 ns; it can thus reach a conversion rate of about 3 Msamples/s. The SPADs have a Dark Count Rate (DCR) of about 5 kcps at 5 V excess bias, and the array shows one hot pixel on average. The monochromator grating has 150 g/mm, so the spectral resolution of the overall system is 5 nm.
The precision of the detection module was measured by means of a short laser pulse (less than 10 ps FWHM). After a large number of measurements, we computed the precision as the Full-Width at Half-Maximum (FWHM) of the distribution of the measured conversions around the mean value. We repeated the acquisitions while delaying the STOP pulse over the whole 160 ns full-scale range. The average single-shot precision is better than 70 ps FWHM.
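One common way to extract the FWHM from a measured histogram is to locate the two half-maximum crossings with linear interpolation between bins. The sketch below illustrates that approach; it is not necessarily the procedure used for the measurements above, and the function name `fwhm` and the triangular test histogram are hypothetical.

```python
def fwhm(hist, bin_width):
    """Full width at half maximum of a single-peaked histogram."""
    peak = max(hist)
    half = peak / 2.0
    i_peak = hist.index(peak)

    # Walk left from the peak to the last bin still at or above half maximum.
    i = i_peak
    while i > 0 and hist[i - 1] >= half:
        i -= 1
    if i == 0:
        left = 0.0
    else:  # interpolate between bin i-1 (below half) and bin i (above half)
        left = (i - 1) + (half - hist[i - 1]) / (hist[i] - hist[i - 1])

    # Walk right from the peak in the same way.
    j = i_peak
    while j < len(hist) - 1 and hist[j + 1] >= half:
        j += 1
    if j == len(hist) - 1:
        right = float(j)
    else:  # interpolate between bin j (above half) and bin j+1 (below half)
        right = j + (hist[j] - half) / (hist[j] - hist[j + 1])

    return (right - left) * bin_width

# Hypothetical triangular peak sampled with 10 ps bins.
width_ps = fwhm([0, 2, 4, 6, 8, 10, 8, 6, 4, 2, 0], bin_width=10.0)
```

For noisy data with few counts per bin, fitting a peak model before taking the width would be more robust than this direct estimate.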
The linearity of the detection module was characterized by means of the code density test, using a periodic STOP signal and the (uniformly distributed) random START signals provided by the dark counts of the sixteen SPADs.
After accumulating a very large number of samples, the non-uniformities of the sixteen reconstructed histograms reflect the non-linearity of the 16 converters. The measured Differential Non-Linearity (DNL) reaches the excellent value of 1.5% LSB rms, i.e. 150 fs.
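In a code density test, uniformly distributed events should fill every bin equally, so the relative deviation of each bin from the mean count estimates its DNL in LSB. The following sketch shows that calculation; the function names and the example histogram are hypothetical, and only the 1.5% LSB rms and 10 ps figures are taken from the text.

```python
import math

def dnl_from_code_density(hist):
    """Per-bin DNL in LSB: relative deviation of each bin from the mean count."""
    mean = sum(hist) / len(hist)
    return [h / mean - 1.0 for h in hist]

def rms(values):
    """Root-mean-square of a list of values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical 4-bin histogram with a +/-10% error on two bins.
dnl = dnl_from_code_density([90, 110, 100, 100])
dnl_rms_lsb = rms(dnl)

# Converting the reported figure: 0.015 LSB rms with a 10 ps LSB.
LSB_PS = 10.0
dnl_rms_fs = 0.015 * LSB_PS * 1000.0   # 150 fs
```

Because Poisson counting noise adds to the apparent non-uniformity, a very large number of samples per bin is needed before the rms value reflects the converter rather than the statistics.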
Spectrally-resolved Fluorescence Lifetime Measurements
The spectrometer was used to perform fluorescence lifetime measurements, using the system shown in Figure 4a [10]. A pulsed diode laser emitting at 635 nm (PicoQuant GmbH, Germany) was collimated by a lens (25 mm focal length), attenuated and focused on a cuvette with a second lens (75 mm focal length). The cuvette was filled with a solution of Nile blue diluted in water (the dye exhibits a very short excited-state lifetime in water, 0.38 ns). Fluorescence light was detected at 90°, using an emission filter (centered at 680 nm) and a lens (50 mm focal length), which creates an image of the fluorescence spot on the input of the monochromator (SpectraPro 2300i, Acton Research Corporation, USA). The output light is focused on the SPAD array chip. The in-pixel TDCs compute the time measurements and, through the USB link, the results are shown on the PC. Figure 4b shows the acquired time-resolved and spatially resolved fluorescence data. The fitted lifetime was 0.42 ns, slightly longer than expected, because of reflections in the optical system and the non-negligible duration of the laser pulse (approximately 150 ps width). The system is currently being optimized, with the goal of demonstrating that this time-resolved spectrometer is also well suited for other applications.
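A lifetime can be estimated from a decay histogram in several ways; the fitting method used for the result above is not specified. As a minimal sketch, the code below fits a straight line to the logarithm of the counts, whose slope is -1/τ for a mono-exponential decay. The function name `fit_lifetime_ns` and the noiseless synthetic data are hypothetical, and the sketch deliberately ignores the instrument response and laser pulse width, which is exactly the kind of simplification that biases the fitted value upward.

```python
import math

def fit_lifetime_ns(times_ns, counts):
    """Least-squares line through (t, ln(counts)); the slope equals -1/tau."""
    pts = [(t, math.log(c)) for t, c in zip(times_ns, counts) if c > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    return -1.0 / slope

# Synthetic, noiseless decay with the 0.38 ns literature lifetime,
# sampled in 10 ps bins over 2 ns.
times = [0.01 * k for k in range(200)]
counts = [1000.0 * math.exp(-t / 0.38) for t in times]
tau_ns = fit_lifetime_ns(times, counts)
```

With real photon-counting data, an unweighted logarithmic fit is biased in low-count bins; a weighted or maximum-likelihood fit that includes the measured instrument response would be a more careful choice.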
CONCLUSION
We presented a spectrometer that shows high performance (10 ps resolution, 70 ps FWHM precision and DNL better than 1.5% LSB rms) for spectrally and time-resolved measurements, with low power consumption (3.5 W). The overall system and graphical user interface are easy to use and fully programmable.
The module enables a high-performance, low-cost, low-power and very compact implementation of a large set of applications, including demanding TCSPC-based measurements such as FLIM, FRET and DOS, where very compact dimensions and many channels working in parallel are needed.
ACKNOWLEDGMENTS
This work has been funded by the Italian Research Ministry, under the COFIN-PRIN 2009 project XT785A.
