Abstract: Time-resolved imaging by means of single-photon avalanche diodes (SPADs) has achieved widespread interest in recent years, especially since technological progress has opened the way to the development of multichannel time-correlated single-photon counting (TCSPC) acquisition systems. Unfortunately, currently available TCSPC imagers feature relatively low performance with respect to state-of-the-art single-channel systems. A real breakthrough in this field would be the exploitation of large arrays of high-performance SPAD detectors developed by means of dedicated fabrication processes, usually referred to as custom technology. Custom-technology SPADs require external electronics potentially leading to interconnection issues for densely integrated arrays. In this paper, we present a new fully integrated front-end circuit able to provide both quenching/reset and timing functionalities while requiring a single connection toward the SPAD. This is the first fully integrated circuit reported in literature that can provide both the timing information about the photon time of arrival with a jitter as low as 37 ps and apply high-voltage pulses up to 50 V in order to meet the requirements of several detectors, including the new red-enhanced SPAD. Combining these two capabilities in a single circuit strongly reduces the complexity of the connection between an array of custom-technology SPADs and the relative external front end, thus paving the way for the exploitation of high-performance SPADs in TCSPC imaging systems.
Introduction
The Time-Correlated Single Photon Counting (TCSPC) technique is an extremely powerful tool to analyze fast and faint optical signals. Therefore, TCSPC is exploited in numerous applications such as Fluorescence Lifetime Imaging (FLIM) and Förster Resonance Energy Transfer (FRET) [1] in life sciences, Laser Imaging Detection and Ranging (LIDAR) [2] in remote object sensing and defense, and Quantum Key Distribution (QKD) [3] in cryptography and communication. Unfortunately, TCSPC is intrinsically a slow technique: a statistically significant number of events has to be collected to achieve a proper reconstruction of the signal, resulting in a fairly long acquisition time, especially if a single acquisition channel is used [1]. For this reason, in recent years there has been a fast trend towards the development of multichannel TCSPC systems. Parallelism not only increases the measurement speed, but it also enables advanced measurements, such as Fluorescence Correlation Spectroscopy (FCS) [4], that require the simultaneous acquisition of data from different spots of the same sample and/or the acquisition of different wavelengths of the signal at the same time. Making use of standard scaled CMOS technologies to integrate both the detector and the electronics on the same chip has led to the design of TCSPC acquisition systems featuring hundreds or even thousands of pixels [5]-[10]. However, inherent features of scaled CMOS technology conflict with overall SPAD performance: for example, thin p- and n-well layers limit the Photon Detection Efficiency (PDE), especially in the red region of the spectrum [6], [11]; high dopant levels can increase the Dark Count Rate (DCR) due to band-to-band and trap-assisted tunneling effects; and the complexity of the CMOS process has detrimental effects on afterpulsing [12].
Recently, researchers have successfully investigated the development of back-illuminated CMOS SPADs in order to increase the PDE in the near infrared [13] , [14] . This solution, along with the possibility of increasing the fill factor, has been pushing the exploitation of 3-D stacking in this field. Vertical interconnection of chips can be a breakthrough in this field, especially because it opens the way to the combined use of different technologies for the detectors and the electronics on a large scale. Concerning the detector, best in class results for SPADs have been achieved so far resorting to dedicated fabrication processes developed on purpose, usually referred to as custom technologies [15] . Unfortunately, the exploitation of large arrays of high-performance custom-technology SPADs has been prevented so far since external front end electronics is required to operate these detectors at high voltage. Available solutions require the connection of two different circuits to the opposite terminals of the detector (i.e., anode and cathode) leading to interconnection issues. In this paper, we present a circuit that requires a single connection to the detector providing both quenching/reset and timing capabilities. In this way, interconnection issues are minimized and 3D stacking with custom technology SPADs becomes feasible. To the best of our knowledge, this is the first fully integrated front end for SPADs that can both apply large voltage variations up to 50 V and extract the timing information with a jitter as low as 37 ps Full Width at Half Maximum (FWHM).
The paper is organized as follows: in Section 2 the issues related to the design of a fully-integrated front end with quenching/reset and timing capabilities for high-performance SPADs are outlined, in Section 3 the design of the circuit is reported while in Section 4 the experimental results are shown. Finally, conclusions are drawn in Section 5.
High-Performance SPADs Features and Challenges for the Front-End Design
Custom technologies are particularly suited to optimize SPADs, since they give the designer the degrees of freedom necessary to pursue the best device performance. Among others, the planar custom-technology process developed at Politecnico di Milano [12], [15] has enabled the fabrication of SPAD arrays featuring remarkable performance, such as a PDE of several tens of percent also in the red region of the spectrum [16] combined with low dark counts and afterpulsing probability [17]. Unfortunately, this custom fabrication process prevents the integration of complex front end and processing electronics on the same chip as the detectors; therefore, external electronics is required, potentially leading to interconnection issues, especially if wire-bonding is exploited. The proper operation of a SPAD detector and the extraction of the timing information require an Active Quenching Circuit (AQC) [18] and an avalanche-current pick-up circuit. Historically, these two blocks have been implemented separately and connected to the opposite terminals of the SPAD (i.e., anode and cathode). In fact, high-performance custom-technology SPADs require quenching pulses that can range from a few volts, for standard-process custom technology SPAD detectors, up to tens of volts for the recently introduced Red-Enhanced SPAD (RE-SPAD) [19]. Managing such high quenching pulses with fast leading edges is not only a challenging task for fully integrated AQCs that drive external SPADs (because of the capacitive load introduced by the detector itself and connection-related parasitics), but it is also hardly compatible with picosecond-timing functionalities that require a low-current sensing threshold. In this scenario, purpose-designed front end and processing electronics is essential to achieve a timing precision of a few tens of picoseconds, as needed by highly demanding applications, especially in the biomedical and biological fields.
In 2017 we demonstrated that a timing jitter of a few tens of picoseconds can be achieved with a fully integrated pick-up circuit featuring a low input impedance and a high bandwidth [20]. Unfortunately, these constraints conflict with the exploitation of High Voltage (HV) transistors that would easily make the pick-up circuit compatible with the integration of quenching and reset capabilities on the same chip. In order to overcome these issues, we designed a new solution to enable the coexistence of a high-voltage AQC and a pick-up circuit within a single front-end circuit for high-performance SPAD detectors.
Time-Resolving Active Quenching Circuit
The architecture of the designed circuit is reported in Fig. 1 . The circuit consists of three main blocks, namely an Active Quenching Circuit (AQC), a current pick-up circuit and an auxiliary Logic circuit that prevents any conflict between the other two parts. The circuit operation can be divided into four phases:
• During the idle phase, the pick-up circuit is enabled, meaning that it is ready to sense the avalanche current. High-voltage transistors of the AQC, namely M Q and M R, are both off. The anode voltage is set by the pick-up circuit and it is equal to 900 mV.
• Upon the detection of a photon, the pick-up circuit senses the avalanche current, compares the TIA output signal with an externally tunable threshold and feeds the output of the comparator to a Low-Voltage Differential Signaling (LVDS) driver (sense phase). In this way, the timing information about the photon time of arrival is extracted and can be fed to an external time measurement circuit.
• After the avalanche-current sensing, the Logic block disables the TIA and sends the photon detected signal (ph_det in Fig. 1) to the AQC: the active quenching phase starts. In this phase, M Q is activated and the SPAD anode is driven up to VddH, an externally tunable voltage that the end user can set as high as 50 V.
• At the end of the quenching phase, M Q is switched off while M R is switched on: the reset phase takes place. The goal of the reset phase is restoring the initial bias condition of the SPAD to make the device ready to detect another impinging photon. To this aim, the anode voltage at the end of the reset phase should be 900 mV, which is the same value as in the idle phase. This result has been achieved by placing the resistor R in series with M R, while the diode D has been exploited to break the tradeoff between final value and speed, as will be clarified later in this section.
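The four phases listed above can be summarized as a small state machine. The following is only a behavioral sketch, not the authors' logic: the signal names ph_det, eoq and reconnect are taken from the text, while the "avalanche" event and the function names are hypothetical labels introduced here for illustration.

```python
from enum import Enum, auto


class Phase(Enum):
    IDLE = auto()
    SENSE = auto()
    QUENCH = auto()
    RESET = auto()


# (current phase, event) -> next phase.
# "avalanche" is a hypothetical event label; ph_det, eoq and reconnect
# are signal names used in the text.
TRANSITIONS = {
    (Phase.IDLE, "avalanche"): Phase.SENSE,   # avalanche current reaches the TIA
    (Phase.SENSE, "ph_det"): Phase.QUENCH,    # Logic block disables the TIA, M_Q turns on
    (Phase.QUENCH, "eoq"): Phase.RESET,       # M_Q turns off, M_R turns on
    (Phase.RESET, "reconnect"): Phase.IDLE,   # M_R turns off, TIA is re-enabled
}


def step(phase, event):
    """Return the next phase; events that are not valid in this phase are ignored."""
    return TRANSITIONS.get((phase, event), phase)


def switch_states(phase):
    """Return (TIA enabled, M_Q on, M_R on) for the given phase."""
    return {
        Phase.IDLE: (True, False, False),
        Phase.SENSE: (True, False, False),
        Phase.QUENCH: (False, True, False),
        Phase.RESET: (False, False, True),
    }[phase]


def run(events, start=Phase.IDLE):
    """Apply a sequence of events and return the list of visited phases."""
    phase = start
    trace = [phase]
    for event in events:
        phase = step(phase, event)
        trace.append(phase)
    return trace
```

In this sketch the TIA and the two high-voltage transistors are never active at the same time, which is the conflict the Logic block is described as preventing.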
Active Quenching Circuit
The main role of the AQC is to quench the SPAD avalanche current and reset the device to its initial bias condition, where it is ready to detect another photon. The AQC structure is similar to the one reported in [21], but with two substantial differences. First of all, the AQC sense stage has been removed: its functionality (i.e., sensing the avalanche current) is now carried out by the pick-up circuit. The AQC architecture is reported in Fig. 1: it consists of High-side and Low-side logic blocks plus the anode driver consisting of HV transistors M Q and M R, the resistor R and the diode D. In order to maximize the circuit speed, the two AQC logic blocks have been designed exploiting low voltage transistors that can tolerate at most 2 V between any pair of their terminals. Nevertheless, the bias voltages of the two logic blocks have been separated to allow high-voltage operation. The Low-side logic block is biased between ground and 1.8 V while the High-side logic block bias voltages are VddH and VddH-1.8 V. VddH is chosen by the end user and it can be as high as 50 V. During the quenching phase, M Q drives the SPAD anode up to VddH, while in the reset phase, the reset network composed of M R, R and D resets the SPAD anode to approximately 900 mV, which is also the anode voltage set by the pick-up circuit in idle state. In this scenario, VddH-900 mV is the quenching/reset pulse amplitude applied to the SPAD. This has to be at least equal to the overvoltage applied to the SPAD in order to guarantee a proper operation of the detector. The designed AQC can apply quenching/reset pulses up to 50 V, a feature that makes it suitable to operate different kinds of detectors including the new Red-Enhanced SPAD, which is a silicon, planar, high-performance SPAD able to provide a PDE as high as 40% at 800 nm, the wavelength of interest in many applications, especially in the life sciences. A simplified schematic of the AQC logic blocks is reported in Fig. 2.
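As a rough illustration of the amplitude condition stated above (pulse amplitude VddH - 900 mV at least equal to the SPAD overvoltage), the sketch below computes the smallest VddH for a given overvoltage. The function names are hypothetical, and the 5 V overvoltage used in the usage note is the value quoted for the timing measurements rather than a design requirement.

```python
ANODE_IDLE_V = 0.9   # idle anode voltage set by the pick-up circuit (from the text)
VDDH_MAX_V = 50.0    # maximum VddH stated in the text


def pulse_amplitude(vddh_v):
    """Quenching/reset pulse amplitude applied to the SPAD, in volts."""
    return vddh_v - ANODE_IDLE_V


def min_vddh(overvoltage_v):
    """Smallest VddH for which the pulse amplitude covers the overvoltage."""
    vddh_v = overvoltage_v + ANODE_IDLE_V
    if vddh_v > VDDH_MAX_V:
        raise ValueError("required VddH exceeds the 50 V limit stated in the text")
    return vddh_v
```

For example, `min_vddh(5.0)` gives 5.9 V; in practice some margin above this minimum would presumably be chosen.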
The Low-side logic block generates most of the internal signals that are necessary for the proper operation of the AQC. Then, two voltage translators are used to feed the set and reset signals (S_L1 and R_L1, respectively) to the Latch L1 of the High-side logic.
A simple reset path sets the initial condition of the latches L1, L2, L3 (Q = 0); this is not shown in Fig. 2 for simplicity.
The rising edge of the ph_det signal coming from the circuit Logic block (see Fig. 1) sets flip-flop FF and latch L1, which starts the active quenching phase since the output of L1 drives the gate of the HV PMOS transistor M Q. In order to provide fast voltage variations across the external SPAD detector, M Q features a large W/L as reported in Table 1. Consequently, the gate capacitance of M Q is as high as 1.1 pF, thus requiring a suitable driver. Here a cascade of four inverters with progressive gate sizes has been exploited. The Q output of FF is also fed to a programmable delayer circuit: the output of this path ("eoq" in Fig. 2) is used to reset the Latch L1, resulting in the end of the quenching phase. Since the same signal (Q of FF) controls both the beginning and the end of the quenching phase, the duration of this phase is set by the delay introduced by the programmable delayer.
The schematic of this circuit is shown in Fig. 3. The delay introduced by this stage on a positive step voltage input depends on the stray capacitance C P and on the source-drain resistance (R DS,ON) of a PMOS transistor. The gate terminal of the PMOS device has been made externally available as the CTQ input pin to give the end user the flexibility of choosing the delay of the programmable delayer block, that is, the hold-off time of the AQC. Depending on the bias applied to the CTQ pin, it is possible to introduce a delay spanning from 2.8 ns up to a few ms. Monte Carlo simulations for process and mismatch variations showed that the variability of the programmable delayer output with respect to its mean value is below 8%, e.g., with CTQ = 1.1 V the mean delay is equal to 4.89 ns and the standard deviation is 0.38 ns. The temperature dependence of the delay has also been simulated and it is as low as 20 ps/°C.
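The simulated figures quoted for the programmable delayer can be cross-checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, and the linear temperature model is an assumption made here for illustration (the text only gives 20 ps/°C as the simulated dependence).

```python
MEAN_DELAY_NS = 4.89          # mean delay at CTQ = 1.1 V (from the text)
STD_DELAY_NS = 0.38           # standard deviation at CTQ = 1.1 V (from the text)
TEMP_COEFF_PS_PER_C = 20.0    # simulated temperature dependence (from the text)


def relative_spread(mean_ns, std_ns):
    """Standard deviation expressed as a fraction of the mean delay."""
    return std_ns / mean_ns


def delay_at_temperature(delta_t_c, nominal_ns=MEAN_DELAY_NS):
    """Delay after a temperature change, assuming a linear drift (assumption)."""
    return nominal_ns + TEMP_COEFF_PS_PER_C * delta_t_c / 1000.0
```

With the quoted values, `relative_spread(MEAN_DELAY_NS, STD_DELAY_NS)` is about 0.078, consistent with the stated bound of 8%, and under the linear assumption a 10 °C rise would shift the 4.89 ns delay by 0.2 ns.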
At the end of the quenching phase, the reset phase begins: its duration is set by another programmable delayer, identical to the one used for the quenching phase except that it is controlled by a separate input pin (CTR). Finally, the AQC features a "Gate" signal that keeps the SPAD off as long as this signal is high. Its path has not been shown in the schematics for the sake of simplicity. Only when the "Gate" signal is lowered is the SPAD reactivated. This feature enables the gated-mode operation of the SPAD that is commonly used to suppress background pulses from the detector in the time intervals where no signal is present, or to discriminate between fast and slow effects, e.g., between Raman scattering and fluorescence [1].
The Logic Block
The Logic block generates all the control signals necessary to prevent any conflict between the AQC anode driver and the pick-up circuit. The schematic of this block is reported in Fig. 4 . It consists of a comparator that takes the differential output of the pick-up circuit and converts it into a single ended signal suitable to be fed to standard logic gates. Upon the detection of a photon, the Logic block generates the ph_det signal, which is a rectangular pulse lasting 3 ns. This is fed to the AQC logic to initiate the active quenching phase. At the same time, EN and EN_n are set to '0' and '1' respectively. As a result, the TIA is disabled and the AQC can perform its operations without damaging the pick-up circuit. At the end of the reset phase, the AQC switches off M R so both the high-voltage transistors of the AQC anode driver are off; the pick-up circuit can be re-enabled. The AQC sends the reconnect signal to the circuit Logic block, FF2 is reset and transistors M 4 and M 5 of the pick-up circuit are switched off, thus restoring the avalanche current read-out functionality of the TIA.
The timing diagram of the main signals of the circuit upon the detection of a photon is shown in Fig. 5.
The Pick-Up Circuit
The main task of the pick-up circuit is collecting all the avalanche current while keeping the detector anode at a fixed voltage. This aspect is of the utmost importance to prevent any current loss due to the SPAD anode-substrate capacitance, which is on the order of a few picofarads (depending on the characteristics of the detector [22]). Since the lower the current threshold used to sense the avalanche, the lower the timing jitter of a custom-technology SPAD [23], the minimization of the current loss is crucial. In single channel systems the substrate is typically biased with a high-value resistor (e.g., 100 kΩ) in series with the bias source, a solution that strongly increases the overall impedance of this path, preventing any current loss. Unfortunately, the same solution cannot be exploited with an array of detectors, because the common substrate must be kept at a fixed voltage to avoid electrical cross-talk among the channels. In this scenario, a front-end circuit featuring a very low input impedance is required. To this aim, a Trans-Impedance Amplifier (TIA) has been designed, followed by a fast latched comparator (see Fig. 1). The complete schematic of the TIA is shown in Fig. 6 and the device dimensions are reported in Table 2. When the detector is biased above its breakdown voltage and it is ready to detect photons (idle phase), the TIA is enabled, meaning that M 4 and M 5 are switched off. In this configuration, the circuit can be divided into two parts: the input stage, consisting of M 1, M 2, R 1, M 3 and cascodes HV 1 and HV 2, which implements a current amplifier with an active shunt feedback, and an output stage consisting of M 6 and R OUT.
IEEE Photonics Journal 37ps-Precision Time-Resolving Active Quenching Circuit
Upon the detection of a photon, the avalanche current starts flowing through the pick-up circuit: thanks to the negative feedback implemented by M 2, R 1, M 3 and cascodes HV 1 and HV 2, the incoming current is sunk by M 3 and mirrored into M 6, causing a voltage drop across resistor R OUT. The output of the TIA is then compared to an externally tunable threshold voltage by means of a fast latched comparator (the same one reported in [24]) to determine the onset of the photo-generated current. Finally, the differential output pulse of the comparator is fed to an LVDS output driver (not shown in Fig. 1), which provides the information to the external processing electronics, and to the Logic block that communicates with the AQC part. Upon a photon detection, the Logic block sets the EN and EN_n signals of the TIA to a low and a high value, respectively. In this way, transistors M 4 and M 5 behave as closed switches and the internal nodes of the TIA cannot exceed 1.8 V, which is lower than the limit imposed by the technology for low voltage transistors (2 V). HV 1 and HV 2 are high-voltage transistors that can tolerate up to 50 V between their gate and drain terminals: they have been used to separate the anode of the detector from the low voltage transistors. In parallel to the activation of M 4 and M 5, the Logic block enables the AQC as well. At this point, the AQC performs a quenching transition followed by the reset phase. In the designed circuit, the anode voltage during the idle phase is determined by the bias condition of the TIA and it is around 900 mV, as a result of the current flowing through M 3 and HV 2 in this phase and their equivalent impedance. This characteristic of the circuit makes the operation of the AQC during the reset phase rather delicate. In fact, to properly accomplish the reset phase, M R must be able to drive the SPAD anode until it reaches 900 mV.
Table 2. Dimensions of the TIA devices.
This cannot be easily achieved since 900 mV is not a voltage reference, but the steady state condition set by the TIA. To overcome this issue, a resistor (R in Fig. 1 ) has been placed in series to M R .
The value of R (0.7 kΩ as reported in Table 1) has been chosen to guarantee that the impedance of the network consisting of M R and R, multiplied by the current flowing through them during the reset phase, results in a final voltage value of approximately 900 mV. Monte Carlo simulations have been carried out to verify the stability of this solution against process and mismatch variations: the mean difference between the anode voltage reached at the end of the reset transition and the steady state voltage set by the TIA is as low as 19% (i.e., 169 mV with a steady state value of 900 mV) with a standard deviation of 5%. Not only is the anode voltage after the reset phase close to the steady state value, but the time needed to reach the final value is also as short as 5.9 ns with a standard deviation of 1.3 ns. It is worth highlighting that the current at the end of the reset phase comes from M 5 (see Fig. 2). Thus, the use of this PMOS transistor has a twofold advantage: it prevents M 3 and HV 2 from exceeding the voltage limitations set by the technology and it contributes to adjusting the anode value reached at the end of the reset phase. Unfortunately, placing a resistor in series with M R sets a strong current limitation that significantly affects the speed of the reset transition.
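The relation between R and the final anode voltage can be illustrated with Ohm's law. The sketch below is a simplification made here, not the authors' sizing procedure: the on-resistance of M R is not given in the text, so `r_on_ohm` is a hypothetical parameter, and setting it to zero only yields an upper bound on the current flowing at the end of the reset phase.

```python
R_OHM = 700.0     # series resistor R (0.7 kOhm, from the text)
V_FINAL_V = 0.9   # target anode voltage at the end of the reset phase (from the text)


def reset_end_current(r_on_ohm=0.0):
    """Current (A) for which the M_R + R series drops V_FINAL_V.

    r_on_ohm is a hypothetical on-resistance for M_R; the text does not
    report it, so the default of 0 gives an upper bound on the current.
    """
    return V_FINAL_V / (R_OHM + r_on_ohm)


def offset_fraction(offset_v, steady_state_v=V_FINAL_V):
    """Offset of the final reset voltage as a fraction of the steady-state value."""
    return offset_v / steady_state_v
```

Under this simplification the current is at most about 1.3 mA, and `offset_fraction(0.169)` returns roughly 0.19, matching the 19% mean difference quoted from the simulations.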
In order to break the tradeoff between the speed of the reset transition and its final value, we placed a diode in parallel to R. During most of the reset transition, the diode is forward conducting and the current limitation due to the resistor R is avoided. As a result, the slope of the reset voltage transient depends only on the maximum current that M R can sink and on the characteristics of the external SPAD. As will be shown in the next section, with a standard custom technology SPAD the reset transition slope is as high as 1.5 V/ns. Then, as soon as the anode voltage approaches 900 mV, the diode turns off and the final value is set by the M R and R series, as desired. At the end of the reset phase, whose duration is also tunable by the end user by means of the CTR input pin, the AQC communicates to the Logic block that it has finished its operations: the Logic resets the EN and EN_n signals and the corresponding transistors M 4 and M 5 are switched off (see Fig. 6). From now on, the pick-up circuit is re-enabled and it can sense the avalanche current as soon as it occurs. It is important to notice that the solution adopted to make the anode voltage at the end of the reset phase equal to the idle value set by the TIA provides two distinct advantages. First, as soon as M 4 and M 5 are switched off, the pick-up circuit functionality is immediately restored. In this way, the duration of the recovery phase of the pick-up circuit is minimized, thus enabling high count rates. Second, the SPAD anode voltage is accurately restored to the idle value at the end of the reset phase. This is extremely important, since any variation of the anode voltage in this phase would result in a shift of the excess bias across the detector, leading to a variation of important parameters like PDE, DCR and amplitude of the avalanche current signal. This, in turn, would lead to a distortion in the measurement when a photon is detected right at the end of the reset phase.
With our solution this problem is minimized.
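To get a feeling for the diode-bypassed part of the reset transition, the slew-limited time can be estimated as the voltage swing divided by the slope. This is a first-order sketch under a stated assumption: a constant 1.5 V/ns slope (the value reported for a standard custom technology SPAD) over the whole swing, ignoring the final resistor-limited settling towards 900 mV. The function name is hypothetical.

```python
RESET_SLOPE_V_PER_NS = 1.5   # reset slope with a standard custom-technology SPAD (from the text)
ANODE_IDLE_V = 0.9           # final anode voltage after reset (from the text)


def slew_limited_reset_ns(vddh_v, slope_v_per_ns=RESET_SLOPE_V_PER_NS):
    """Time (ns) to ramp the anode from VddH down to the idle value.

    Assumes a constant slope over the full swing, which neglects the
    slower settling once the diode turns off.
    """
    return (vddh_v - ANODE_IDLE_V) / slope_v_per_ns
```

For instance, with VddH = 5.9 V (a 5 V pulse) the slew-limited part would take about 3.3 ns; a different detector or a larger swing would change both the slope and the result.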
The exploitation of HV transistors in the pick-up circuit, namely HV 1 and HV 2, introduced some constraints in the design. First of all, the resistance of HV 1 has a direct impact on the overall input impedance of the circuit. Since the TIA must provide a low input impedance, a high aspect ratio is needed (the maximum gate-source voltage for this transistor is 2 V). On the other hand, high values of W and L lead to high stray capacitances that contribute to limiting the bandwidth of the TIA.
An accurate design of this circuit has led to the values reported in Table 2. The simulated input resistance of the TIA at low frequency is 53 Ω and the circuit features a bandwidth as high as 238 MHz, considering a capacitive input load of 4 pF, which is a conservative estimate of the overall capacitance at the anode terminal considering the SPAD and parasitics. Finally, to avoid speed issues, the gates of HV 1 and HV 2 are set to a fixed value, meaning that they are used as cascodes, not as switches. Switches M 4 and M 5 help to limit the voltage of the TIA internal nodes, while HV 1 and HV 2 are only subject to large voltage variations at their drain terminals, which can tolerate voltage swings up to 50 V with respect to both gate and source terminals.
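As a sanity check on these figures, the pole formed by the input resistance and the input capacitance alone can be estimated with the usual first-order formula f = 1/(2πRC). This is a simplification introduced here: it treats the low-frequency input resistance as constant over frequency, which the text does not claim.

```python
import math

R_IN_OHM = 53.0          # simulated low-frequency input resistance (from the text)
C_IN_F = 4e-12           # conservative input capacitance estimate (from the text)
TIA_BANDWIDTH_HZ = 238e6 # simulated TIA bandwidth (from the text)


def rc_pole_hz(r_ohm, c_f):
    """First-order RC pole frequency in hertz."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_f)


input_pole_hz = rc_pole_hz(R_IN_OHM, C_IN_F)
```

The result is roughly 750 MHz, well above the quoted 238 MHz bandwidth, which suggests (under this simplification) that the overall bandwidth is set by other nodes or by the feedback loop rather than by the input RC alone.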
Experimental Results
The Time-Resolving AQC has been designed and fabricated exploiting a High-Voltage 0.18 μm technology (AMS H18 by Austriamicrosystems). The circuit has been extensively characterized with an 80 μm-diameter custom technology SPAD detector directly connected to the circuit input pin by means of wire bonding. First of all, the quenching/reset capabilities of the circuit have been verified. Fig. 7 shows the waveform at the anode terminal of the SPAD. The measurement has been carried out connecting the detector anode to the probe of the oscilloscope by means of a capacitive divider to minimize the capacitive load due to the measurement setup. As can be seen, when an avalanche event is triggered, the circuit makes the SPAD anode voltage increase: the final value is set by the VddH bias voltage, which has to be properly chosen by the end user to ensure that the detector bias voltage is reduced below the breakdown value in this phase. At the end of the quenching phase, whose duration is externally tunable by varying the CTQ input voltage, the circuit enters the reset phase, where the anode voltage is decreased until it reaches its final value of about 900 mV. The circuit can carry out a complete quenching-reset transition in a time span as short as 12.5 ns, which corresponds to a maximum count rate as high as 80 Mcps. The key feature of the designed circuit is its ability to provide not only quenching/reset but also timing capabilities. In order to evaluate the timing performance of the circuit, an ultrafast laser diode (Antel MPL-820 laser module) emitting optical pulses at 820 nm with about 10 ps FWHM and a TCSPC module (Becker&Hickl SPC-130) have been used. The B&H module receives the electrical output of the laser, which is synchronous to the optical signal, as sync input (STOP) and a NIM signal provided by a PCB mounting the 80 μm-diameter SPAD and the Time-Resolving AQC as CFD input (START).
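The relation between the quenching-reset duration and the maximum count rate quoted above is simply the reciprocal of the dead time. A minimal sketch (the function name is hypothetical):

```python
def max_count_rate_cps(dead_time_s):
    """Upper bound on the count rate when each event occupies dead_time_s seconds."""
    return 1.0 / dead_time_s


rate_cps = max_count_rate_cps(12.5e-9)   # 12.5 ns quenching-reset transition
```

With the 12.5 ns figure this gives 80 Mcps, matching the value stated in the text; longer hold-off settings on the CTQ and CTR pins would lower this bound accordingly.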
The Instrument Response Function (IRF) of the circuit under test, operating the SPAD at 5 V above its breakdown voltage (V BD = 34.2 V), is shown in Fig. 8. As can be seen, the circuit can read the SPAD avalanche current and provide the timing information about the photon arrival time with a timing jitter as low as 37 ps FWHM. This result is not only comparable with the best timing precision reported in literature for a fully integrated pick-up circuit and a custom technology high-performance SPAD detector [20], but it has also been achieved with the first fully integrated circuit able to provide such high timing performance while integrating HV quenching and reset functionalities on the same chip. In the experimental conditions of the reported measurements, the best timing performance is achieved with a threshold as high as 100 mV. The measured timing jitter as a function of the threshold used to determine the onset of the photo-generated current is reported in Table 3.
Table 3. Timing jitter as a function of the circuit threshold.
Fig. 9. FWHM timing jitter and peak shift as a function of the overall detector counts.
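For readers who prefer to reason in terms of standard deviation, the FWHM figure can be converted under a Gaussian assumption, and the laser pulse width can be removed in quadrature. Both steps are assumptions introduced here: real SPAD response functions typically show non-Gaussian tails, and the text does not state that the 37 ps figure has been (or should be) corrected for the roughly 10 ps laser pulse.

```python
import math

FWHM_TO_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))   # about 2.355 for a Gaussian


def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given FWHM (same units)."""
    return fwhm / FWHM_TO_SIGMA


def subtract_in_quadrature(total_fwhm, source_fwhm):
    """Remove an independent Gaussian contribution from a total FWHM."""
    return math.sqrt(total_fwhm ** 2 - source_fwhm ** 2)


sigma_ps = fwhm_to_sigma(37.0)                         # measured IRF, 37 ps FWHM
detector_fwhm_ps = subtract_in_quadrature(37.0, 10.0)  # laser pulse about 10 ps FWHM
```

Under these assumptions the 37 ps FWHM corresponds to a standard deviation of about 15.7 ps, and removing the laser contribution would leave about 35.6 ps FWHM for the detector plus electronics.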
The circuit has been designed to be the building block of a densely integrated TCSPC acquisition system based on high performance SPAD arrays. Future evaluations will disclose the impact of large voltage variations applied by the AQCs to adjacent SPADs on the operation of the pick-up circuit. In this regard, Table 3 shows that the circuit provides very high performance even with very high threshold voltages, a feature that makes this circuit particularly robust against threshold-voltage variations and thus suitable to be exploited in densely integrated systems.
The timing measurement has also been repeated at different levels of background light intensity. Fig. 9 shows that the FWHM timing jitter is practically unchanged over a very wide range, spanning from 55 kcps to several hundred kcps, with a slight increase at very high count rates (52 ps at 10 Mcps), and the IRF shift is limited to a few tens of picoseconds even in the presence of high background illumination. Therefore, the circuit is well suited to timing measurements even in the presence of large background variations.
The power dissipation of the main blocks is the sum of a static contribution of 2 mW plus a dynamic contribution that depends on the operating rate and is as low as 30 μW per MHz. The prototype presented in this paper has an overall static power consumption of 11 mW, because of an additional contribution due to the LVDS driver (9 mW). It is worth noting that in densely integrated systems the number of LVDS drivers is not necessarily equal to the number of pixels, but resource sharing can be exploited as presented in [25] or [26]. Therefore, the circuit is suitable to be a building block of a densely integrated system. Finally, the designed Time-Resolving AQC features an area occupation of 0.048 mm² (without pads) and the layout is reported in Fig. 10.
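The power figures above combine into a simple linear model. The sketch below uses only the numbers given in the text; the function names and the idea of evaluating the model at the 80 Mcps maximum rate are illustrative choices made here.

```python
STATIC_CORE_MW = 2.0        # static power of the main blocks (from the text)
DYNAMIC_MW_PER_MHZ = 0.030  # 30 uW per MHz of operating rate (from the text)
LVDS_DRIVER_MW = 9.0        # LVDS driver contribution in the prototype (from the text)


def core_power_mw(rate_mcps):
    """Power of the main blocks at a given count rate, excluding the LVDS driver."""
    return STATIC_CORE_MW + DYNAMIC_MW_PER_MHZ * rate_mcps


def prototype_power_mw(rate_mcps):
    """Power of the prototype, including its dedicated LVDS driver."""
    return core_power_mw(rate_mcps) + LVDS_DRIVER_MW
```

At zero rate `prototype_power_mw(0.0)` returns the 11 mW static figure, and at 80 Mcps the main blocks alone would dissipate about 4.4 mW according to this model.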
Conclusion
We presented the first fully integrated circuit able to drive external high-performance SPAD detectors providing both quenching/reset capabilities with pulses up to 50 V and timing functionalities with a jitter as low as 37 ps FWHM. A new architecture has been designed to allow the coexistence of these two capabilities while also achieving a minimum duration of the overall quenching-reset phase as short as 12.5 ns. Requiring a single connection towards the SPAD anode, the designed circuit paves the way to the exploitation of custom-technology SPAD arrays in densely integrated imagers thanks to the minimization of the interconnection issues.
