I. INTRODUCTION
Recent developments in the integration of single-photon avalanche diodes (SPADs) in standard CMOS technology [1] have opened new research lines in 3D vision and medical applications. Time-of-flight (ToF) estimation has been successfully employed for distance measurement and for generating a depth map of the scene [2]. It is also employed in enhanced imaging techniques for nuclear medicine, such as positron emission tomography (PET) [3]. ToF measurement requires control of the illumination source. Pulse modulation is the most straightforward technique [4], [5], and it has some benefits over continuous-wave modulation: it is unambiguous and ensures a higher SNR with lower average optical power [6]. For 3D imaging, the most important parameters are the overall time resolution and the uniformity of the array.
In nuclear medicine imaging, several techniques can be classified as single-photon imaging. One of them detects the high-energy photons emitted directly by a radioactive nucleus (SPECT: single-photon emission computed tomography). Another detects pairs of 511-keV photons emitted in exactly opposite directions (PET: positron emission tomography) and evaluates their coincidence [7]. ToF measurement has started to be used in this type of application in order to reduce the uncertainty along the line of response (LOR) [8]. A PET detector is composed of a scintillator crystal, which converts those high-energy photons into visible-light photons [9], and an array of single-photon detectors, in this case SPADs, close to the imaging volume [10]. Because the conversion into visible photons can occur at different depths within the scintillator crystal (the depth of incidence, DOI), there is uncertainty in the determination of the actual LOR, which results in a blurred reconstruction. To enhance the spatial resolution of the detection, statistics on the light spot can help to determine the DOI precisely [11].
The chip presented in this paper has a double functionality: it can be employed for ToF imaging by multiplexing each SPAD to an external time-to-digital converter; in addition, each photon detection event is accounted for at each column and row of the array in order to realize focal-plane statistics.
II. ARRAY OF SPADS
The test chip is composed of an array of 8 × 8 SPAD cells. Each cell includes the photosensitive device, the active quenching/reset (AQR) circuitry, and peripheral circuits that independently select and connect each pixel to the output, build statistics on the actual spot position, and store and serialize the data at the output. The block diagram of the chip is presented in Fig. 1. The chip core measures 256 × 256 μm², and the design can easily be scaled to a larger array size.
A. ToF configuration
The digital output of each pixel is multiplexed to a single output channel so that ToF measurements are performed off-chip. This scheme is more appropriate than column-level buffering because a bank of digital buffers can present a large delay mismatch, which can seriously affect the time resolution. The time accuracy of an individual SPAD is given by the jitter of the avalanche current spike, while in arrays the uniformity of the back-end circuitry is also an important factor. Therefore the variability of the delay, if it is larger than the jitter of the SPAD, can compromise the desired time resolution. A constant delay, in contrast, is a systematic error and can be removed through calibration. Finally, the overall distance uncertainty can be reduced by a factor of √M by averaging M ToF measurements [5].
B. Light spot statistics configuration
Fig. 2 sketches the block diagram of the array. Each row and each column is connected to a different 12b counter through a pulled-up line. Therefore each individual event detected at pixel level is counted once by the row counter and once by the column counter. Events that are detected at the same time on the same column or row cannot be counted individually, but this scheme is intended for applications in which this situation is not common. In terms of area and power consumption, the proposed scheme is a better alternative than integrating a counter in each pixel. When the accumulation time window ends, the content of the counters can be serially read out. An additional output flag signals a counter overflow.

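The row and column counting scheme described in this section can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the class name RowColumnCounters, the notion of a discrete "time slot" for coincident events, and the wrap-to-zero behaviour on overflow are hypothetical and are not taken from the chip design. Only the 12-bit counter width, the single count per shared line, and the existence of an overflow flag come from the text.

```python
COUNTER_BITS = 12                       # counter width stated in the text
COUNTER_MAX = (1 << COUNTER_BITS) - 1   # 4095


class RowColumnCounters:
    """Behavioural model of one counter per row and one per column."""

    def __init__(self, size=8):
        self.rows = [0] * size
        self.cols = [0] * size
        self.overflow = False           # models the overflow output flag

    def _bump(self, counters, index):
        if counters[index] == COUNTER_MAX:
            counters[index] = 0         # assumption: the counter wraps around
            self.overflow = True
        else:
            counters[index] += 1

    def detect(self, pixels):
        """Register the (row, col) pixels that fire in the same time slot.

        Pixels sharing a row (or column) pull the same line down together,
        so that line is counted only once for the slot.
        """
        for r in {r for r, _ in pixels}:
            self._bump(self.rows, r)
        for c in {c for _, c in pixels}:
            self._bump(self.cols, c)


ctr = RowColumnCounters()
ctr.detect([(2, 5)])            # a single event
ctr.detect([(2, 1), (2, 6)])    # two coincident events on row 2
print(ctr.rows[2], sum(ctr.cols))
```

In this model the two coincident events on row 2 increment the row counter only once while each column counter still sees its own event, which is the limitation the text accepts for applications where coincidences are rare.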
C. Digital pixel
The proposed AQR circuit is presented in Fig. 3. It is based on the same principle as the circuit reported in [12], but employs a much simpler hold-off circuit. Compared to passive QR, AQR has the advantage of achieving fast transitions in both phases, an increased sensitivity that avoids count losses, and an adjustable hold-off time [13]. The model of the SPAD used in the simulations [14] merges the DC characteristic of the model in [15] and the statistical model in [16]. The cross section of the implemented SPAD is depicted in Fig. 4. The SPAD is a typical P+-diffusion/N-well diode with an additional T-well guard ring, i.e., an additional P-well layer, whose purpose is to equalize the breakdown voltage (V_BD) [17].
The quenching circuit operates as follows. Before a photon is detected, no current flows through the reverse-biased diode. The anode is connected to ground and M4 is turned off, meaning that the SPAD is turned on. In other words, the avalanche is ready to be triggered. Whenever a photon event is detected, a large current flows through the SPAD and, consequently, the anode voltage (node A) is pulled up. The output of inverter Inv1 is therefore pulled down, and transistors M2 and M3 are turned off and on, respectively. The anode terminal is thus connected to VDD, which quenches the avalanche and disables the SPAD by setting V_E to be at most equal to zero. Furthermore, with the output of Inv1 set to zero, the MOS capacitor starts to charge through M6 and M5. The time constant is controlled by the V_hold_off signal. When the voltage of the capacitor M8 reaches the threshold of Inv3, V_reset is set high, turning on transistor M4. The anode of the SPAD is then pulled down. For a short time, until the output of Inv1 goes to VDD and turns off M3, a large current flows through M3 and M4.
Because of this reset mechanism, a large current spike is required, sometimes of the same magnitude as the current through the SPAD. To avoid this inconvenience, the width of M1 and M2 should be larger than that of M4. On the other hand, M4 should be stronger than M3 in order to be able to pull down the anode terminal. Inv2 drives the readout circuitry, which consists of a transmission gate that connects the output signal to a column bus (signal V_out in Fig. 3) and two transistors, M9 and M10, that pull down the column (V_col) and row (V_row) detection signals, respectively. Each SPAD can be asynchronously reset through the RST signal. Fig. 5 depicts the layout of the pixel. The SPAD array has been designed in a 0.18 μm standard CMOS technology. Each SPAD has a quasi-circular shape with an active-area diameter of 12 μm. Since we are only interested in detecting the impinging photons, and a linear I_SPAD versus V_E characteristic is not required, the maximum current through the P+-diffusion/N-well diode can be limited to 3.5 mA by means of transistors M1 and M2.
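The effect of the hold-off time on the detected count can be sketched with a simple behavioural model of the quench, hold-off and reset cycle. This is a sketch, not the authors' simulation setup: it assumes a non-paralyzable dead time (photons arriving while the SPAD is disabled are lost and do not extend the hold-off), and the function names and the 100 ns dead time used in the example are hypothetical values chosen for illustration only.

```python
def detected_events(arrival_times_ns, dead_time_ns):
    """Return the photon arrival times that trigger an avalanche.

    Assumption: after each detection the SPAD is disabled for dead_time_ns
    (quench + hold-off + reset); photons arriving in that window are lost.
    """
    detected = []
    ready_at = float("-inf")            # time at which the SPAD is re-armed
    for t in sorted(arrival_times_ns):
        if t >= ready_at:
            detected.append(t)
            ready_at = t + dead_time_ns
    return detected


def max_count_rate_hz(dead_time_ns):
    """Upper bound on the count rate imposed by the dead time."""
    return 1e9 / dead_time_ns


arrivals = [0, 30, 120, 150, 400]       # hypothetical arrival times in ns
print(detected_events(arrivals, dead_time_ns=100))
print(max_count_rate_hz(100))
```

A longer hold-off time lowers the maximum count rate but gives trapped carriers more time to be released before the SPAD is re-armed, which is the trade-off that a tunable hold-off time is meant to address.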
III. SIMULATION RESULTS
The 8 × 8 SPAD array still needs to be measured and characterized, together with individual SPAD devices. In the following, post-layout simulation results regarding ToF measurement and AQR circuit functionality are presented.
A. ToF measurements
Each SPAD is multiplexed to the output node. Therefore, besides the jitter of the current pulse, the time accuracy also depends on the uniformity of the signal path. Fig. 6 shows the jitter of the digital output due to non-uniformities of the readout path, which amounts to 115 ps. Thus, if the maximum jitter of the SPADs is small compared with this 115 ps, the overall time accuracy is mainly limited by the non-uniformity of the signal paths. Notice that for pixel-level ToF measurement, with one TDC per pixel, the time resolution is defined only by the maximum jitter of the sensors and the TDC non-uniformity.
Figure 6. Post-layout simulations of the time accuracy
Fig. 7(a) sketches the worst-case signal path delay, where the contributions of the different elements at pixel, array and pad level can be inferred from the accumulated delay. Voltage V_out_pixel is the output of one pixel, V_out_array is the output before the row/column multiplexing scheme, and V_out_PAD is the output of the array taken at pad level. This last component is the same for all the pixels of the array, so it can be cancelled by calibration without jeopardizing the actual ToF. Fig. 7(b) illustrates the operation of the AQR circuit: V_A is the anode voltage and V_reset is the reset signal that turns on the SPAD. Fig. 7(c) shows the range within which the dead time (DT) can be continuously tuned in order to lower the dark-count rate as much as possible.
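To put the 115 ps figure in perspective, the following sketch converts a timing uncertainty into a distance uncertainty. It assumes a round-trip measurement (the light pulse travels to the target and back, so d = c·t/2) and independent, identically distributed timing errors, so that averaging M measurements reduces the standard deviation by √M. The function names are hypothetical.

```python
import math

C_M_PER_S = 299_792_458.0   # speed of light in vacuum


def tof_to_distance_m(tof_s):
    """Distance for a round-trip time of flight: d = c * t / 2."""
    return C_M_PER_S * tof_s / 2.0


def distance_uncertainty_m(sigma_t_s, n_avg=1):
    """Distance standard deviation after averaging n_avg measurements.

    Assumption: timing errors are independent, so the standard deviation
    of the mean shrinks by sqrt(n_avg).
    """
    return C_M_PER_S * sigma_t_s / (2.0 * math.sqrt(n_avg))


single = distance_uncertainty_m(115e-12)
averaged = distance_uncertainty_m(115e-12, n_avg=100)
print(f"{single * 1e3:.1f} mm, {averaged * 1e3:.2f} mm")
```

Under these assumptions, a 115 ps timing uncertainty corresponds to roughly 17 mm for a single measurement and to about 1.7 mm after averaging 100 measurements.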
Here, p_i is the probability that the i-th row/column of the array is hit. According to [11], the standard deviation of the light distribution (σ = √μ_2) is directly related to the DOI (z) as follows:
where z_0 and σ_0 are obtained by experimental data fitting. Fig. 8 depicts the signals for the light-spot statistics calculation. All the SPADs receive the same illumination in this simulation; therefore all of them detect photons at the same rate. The signal labeled DIGITAL OUTPUT shows the output of one of them. The programmed EN signal allows only 10 pulses to be counted. After that, the content of the counters is serially read out through the pads labeled COLUMN OUTPUT/ROW OUTPUT. The waveform shows how code 10 is delivered for every row and column.
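The light-spot statistics can be computed off-chip from the row or column counter contents. The sketch below is an assumption-laden illustration rather than the paper's exact formulation: it estimates p_i as the count of line i divided by the total count, and uses the standard definitions of the centroid (first moment) and of the second central moment μ_2, with σ = √μ_2. The function name spot_statistics is hypothetical, and results are in units of the line index.

```python
import math


def spot_statistics(counts):
    """Centroid and standard deviation of a 1-D count profile.

    counts[i] is the content of the i-th row (or column) counter.
    Assumption: p_i is estimated as counts[i] / sum(counts).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no events counted")
    probs = [n / total for n in counts]
    centroid = sum(i * p for i, p in enumerate(probs))
    mu2 = sum((i - centroid) ** 2 * p for i, p in enumerate(probs))
    return centroid, math.sqrt(mu2)


# Uniform illumination, as in the simulation: code 10 on every line.
centroid, sigma = spot_statistics([10] * 8)
print(centroid, round(sigma, 3))
```

For the uniform case the centroid falls in the middle of the array (index 3.5) and σ is about 2.29 line pitches. Multiplying by the pixel pitch converts these values to physical units.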
Characterization of statistical magnitudes such as the dark-count rate (DCR) and after-pulsing (AP) for different excess voltages (V_E) and hold-off times, the variation of DCR with temperature, the linearity of the sensor, and the estimation of the power consumption as a function of the number of detected events will be reported after full testing of the chip.
IV. CONCLUSION
An 8 × 8 scalable array of SPADs has been designed and sent to fabrication in a 0.18 μm standard CMOS technology. The pixel pitch is 32 μm. Each pixel incorporates a SPAD with a quasi-circular shape and active quenching/reset circuitry with a tunable hold-off time. The chip is able to perform ToF measurements and to estimate the actual position of the center and the dispersion of a light spot, as needed in PET applications. Each pixel can be connected in turn to a single digital output to measure the overall time accuracy. The non-uniformity of the multiplexing scheme limits the time resolution to 115 ps. In addition, maximum spot detection is implemented using a scheme of column-level and row-level counters.
