Single photon avalanche diodes (SPADs) have improved rapidly in recent years. In particular, custom technologies specifically developed to fabricate SPAD devices give the designer the freedom to pursue the best detector performance required by applications. A significant breakthrough in this field is the recent introduction of a red enhanced SPAD (RE-SPAD) technology, capable of attaining a good photon detection efficiency in the near infrared range (e.g. 40% at a wavelength of 800 nm) while maintaining a remarkable timing resolution of about 100 ps full width at half maximum. Being planar, the RE-SPAD custom technology opened the way to the development of SPAD arrays particularly suited for demanding applications in the field of life sciences. However, to achieve such excellent performance, custom SPAD detectors must be operated with a purpose-designed external active quenching circuit (AQC). The next steps toward the development of compact and practical multichannel systems will require a new generation of monolithically integrated AQC arrays. In this paper we present a new, fully integrated AQC fabricated in a high-voltage 0.18 µm CMOS technology able to provide quenching pulses up to 50 Volts with fast leading and trailing edges. Although specifically designed for optimal operation of RE-SPAD devices, the new AQC is quite versatile: it can be used with any SPAD detector, regardless of its fabrication technology, reaching remarkable count rates up to 80 Mcounts/s and generating a photon detection pulse with a timing jitter as low as 119 ps full width at half maximum. The compact design of our circuit has been specifically laid out to make this IC a suitable building block for monolithically integrated AQC arrays.
Introduction
Single Photon Avalanche Diodes (SPADs) proved to be an excellent alternative to PMTs in many fields [1]. Along with the intrinsic advantages of solid state devices (ruggedness, small size, low supply voltage, high reliability), SPADs also provide photon detection efficiency inherently higher than PMTs, especially in the red and near-infrared regions of the spectrum. For these reasons, SPADs have been exploited in a steadily increasing number of applications, such as Förster Resonance Energy Transfer (FRET) [2], Single Molecule Fluorescence Spectroscopy (SMFS) [3], Laser Imaging Detection and Ranging (LIDAR) [4] and Quantum Key Distribution (QKD) [5].
SMFS, for instance, is a very powerful experimental tool in cell biology, biochemistry, and biophysics. The basic feature of SMFS is to excite and collect light from a very small volume and work in a low concentration regime: the infrequent transit of a single molecule into this volume gives rise to a burst-like event [3]. Detecting photon bursts is a challenging task: first of all, SMFS requires high detector sensitivity since a small number of photons is emitted in each burst; secondly, the burst events occur at a very short distance from each other, so a detector featuring a short dead time is necessary; finally, many bursts need to be accumulated to achieve proper statistical accuracy, resulting in long measurement time. Multichannel solutions would speed up the data acquisition process but they require arrays of SPADs [6]. The recently developed Red-Enhanced SPAD (RE-SPAD) [7] not only provides excellent performance in terms of Photon Detection Efficiency (PDE), Dark Count Rate (DCR) and timing jitter but it is also fabricated in a fully planar custom technology process that makes the development of RE-SPAD arrays feasible.
Fig. 1. Photon Detection Efficiency (PDE) of SPADs developed with different technologies: RE-SPAD [7]; CMOS camera SPC3 [14]; PDM [16]; τ-SPAD [17]; SPCM-AQRH [18]; SAP-500 [19]; CMOS SPAD by Webster et al. [21].
In order to fully exploit the intrinsic performance of SPAD devices, suitable external circuitry is required, capable of sensing the leading edge of the avalanche current, actively quenching the avalanche by lowering the bias voltage below the breakdown voltage, and then restoring the photodiode voltage to the operating level. This circuit is referred to as Active Quenching Circuit (AQC) [8].
The characteristics of the quenching circuit considerably affect the operating conditions of the detector and, thus, its actual performance [8]: prompt sensing of the avalanche current has beneficial effects on power consumption and afterpulsing probability [8]; a short dead time translates into a wide dynamic range; and a constant and accurately controllable dead time duration enables dead time correction techniques [8].
In this paper we present a new, fully integrated AQC fabricated in a high-voltage 0.18 µm CMOS technology (Ams H18 from Austriamicrosystems) able to provide quenching pulses up to 50 Volts with fast leading and trailing edges. Although specifically designed for optimal operation of RE-SPAD devices, the new AQC is quite versatile: it can be used with any SPAD detector, regardless of its fabrication technology, reaching remarkable count rates up to 80 Mcounts/s and generating a photon detection pulse with a timing jitter as low as 119 ps full width at half maximum (FWHM). The design and layout of the circuit have been specifically carried out to make the circuit easily scalable to a multi-element AQC array. This paper is organized as follows: in section II the main features of custom technology SPAD devices, particularly the RE-SPAD, are reviewed and the challenges that must be faced in order to properly operate these detectors are outlined; in section III the new active quenching circuit is presented and in section IV the experimental results are reported. Finally, conclusions are drawn in section V.
Custom technology single photon avalanche diodes
SPADs are essentially p-n junctions biased above breakdown: silicon-based devices have been developed in the past exploiting either standard CMOS technologies or custom technologies developed on purpose.
Standard technologies offer manifest advantages, i.e. the availability of a fully supported, mature and reliable fabrication process at reasonably low cost, and the possibility of developing complete systems on chip with a high degree of complexity.
The integration of SPAD devices and associated electronics in submicron and deep-submicron CMOS technologies paved the way for the fabrication of high density systems based on smart pixels [9] [10] [11]. However, a challenging basic issue must be faced: the inherent features of CMOS technologies, namely the relentless trend toward higher doping levels, lower thermal budget, and thinner p- and n-well layers, conflict with SPAD detector performance. The smaller depth of carrier collection layers usually limits the PDE to a few percent at 800 nm [11] [12] [13]. In addition, the high electric fields arising from higher doping result in a strongly enhanced DCR due to band-to-band and trap-assisted tunneling effects, and the reduced thermal budget along with the lack of external gettering processes also have adverse effects on afterpulsing [1].
On the other hand, technologies purposely developed for the fabrication of SPADs give the designer the freedom to pursue better device performance, thus opening the way to the improvements most requested by users [14]. As reported in Fig. 1, the exploitation of custom fabrication processes led to the development of SPAD devices [15] with a PDE much higher than standard-CMOS detectors, even noticeably higher than 30% in the red and near-infrared regions up to 900 nm [16] [17] [18]. Nevertheless, these devices [16] [17] [18] have historically relied on non-planar processes, resulting in significant variations from sample to sample [19]. Along with high operating voltages and thus high power dissipation, the inherent complexity of these fabrication processes has so far prevented the fabrication of detector arrays.
Recently, various approaches have been investigated to increase CMOS SPAD photon detection efficiency, relying on back-flipped solutions [20], on a high bias of the detector [21], or on modifications of the fabrication process that introduce extra layers only for the detector [22]. Remarkable results have been achieved with respect to traditional CMOS structures; however, some drawbacks have to be taken into account, such as a low PDE in the blue region of the spectrum [20] and high DCR and afterpulsing [21], mainly due to the tradeoff that the use of a single technology for both the detector and the electronics imposes. In Fig. 1, the PDE of the 130 nm CMOS SPAD by Webster et al. [21] is reported at 6 V excess bias, where the DCR is comparable to that of the recently developed Red Enhanced SPAD (RE-SPAD) [7].
Thanks to new developments in planar-epitaxial custom technology, the RE-SPAD features a PDE as high as 40% at 800 nm. The intrinsic advantage of planar processes is the feasibility of fabricating SPAD arrays. On the other hand, the custom technology fabrication process prevents the integration of complex circuitry on the same chip as the detector [23], thus making the exploitation of external AQCs mandatory.
A critical parameter of the AQC is the amplitude of the quenching pulse, which has to be slightly larger than the overvoltage applied to the detector (i.e. the difference between the bias voltage and the breakdown voltage). Depending on both the structural characteristics of the detector and the specific application, the overvoltage to be applied ranges from a few Volts for SPADs fabricated with standard custom technology to tens of Volts for RE-SPADs [7]. Managing such high overvoltages represents a challenge in the design of integrated AQCs. CMOS technology does now offer high-voltage devices suitable for operation at tens of Volts. However, these devices typically provide poorer performance than their low voltage counterparts, especially in terms of switching speed. Speed, though, is also crucial for AQCs: a quenching pulse having a fast leading edge is fundamental to limit the power dissipation and the afterpulsing probability, whereas the dead time (related to the duration of the quenching pulse) has a strong impact on the dynamic range and its stability is mandatory for the application of the known dead time correction techniques [2].
Another important issue in the design of the AQC is the generation of the output voltage pulse synchronous with the detection of a photon. A careful design of the avalanche current sense sub-circuit in order to minimize the timing jitter of the output pulse would open the way to the exploitation of compact systems based on just SPADs and AQCs not only in photon-counting based applications, but also in some time-correlated photon-counting ones. In the next section, we report a new circuit design that provides a good trade-off between the requirements mentioned above, combined with low area occupation and low power dissipation. The latter features are mandatory for the AQC scalability to large arrays, in view of future implementations of multichannel acquisition systems.
The active quenching circuit
As described in the previous section, custom technology SPAD detectors require an external quenching circuit to be properly operated and this circuit should be capable of applying voltage pulses up to some tens of Volts in order to be compatible with different SPAD technologies and applications.
To meet this constraint along with the capability of providing high count rates we designed an AQC featuring the architecture illustrated in Fig. 2 .
The circuit consists of four main blocks: the sense stage that senses the avalanche current at the anode terminal of the detector, the high-side and the low-side logic blocks that generate all the control signals of the circuit, and the high voltage transistors M Q and M R that regulate the quenching and reset phases.
This architecture has a twofold advantage: first of all, the circuit provides the end user with the flexibility to choose the voltage at which the SPAD is biased during quenching and reset phases individually. The quenching-reset voltage swing is limited only by the maximum drain-source voltage that M Q and M R can tolerate: in our circuit this swing is as high as 50 Volts. Second, this structure makes it possible to exploit low voltage transistors for the logic, which are faster and smaller than their high voltage (HV) counterparts. Indeed, all the logic blocks have a rail-to-rail power supply as low as 1.8 Volts irrespective of the voltage swing chosen for the SPAD.
This also guarantees the gate-source voltage compliance with the limits set by the technology: since every gate-source voltage in the circuit is generated by one of the logic blocks, none of them can exceed 1.8 V, which is lower than the limit imposed by the technology (2 V). The signals are fed from one logic block to the other by means of two voltage translators. Thanks to this structure a state-of-the-art count rate of 82.2 Mcounts/s has been achieved, as shown in the next section.
The presented AQC features a mixed passive-active quenching and active reset as classified in [8]. The avalanche current flowing through the sense stage causes the anode voltage to increase (passive quench); this event triggers a logic circuit that switches on the HV MOSFET M Q, which actively completes the quench phase by forcing the anode of the SPAD up to the highest potential of the circuit (V DD,AQC in Fig. 2). After a given delay, M Q is switched off and M R is switched on: the reset transistor is responsible for the reset phase and it drives the anode of the SPAD down to the lowest supply of the circuit (ground in Fig. 2).
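The quench and reset sequence described above can be sketched as a simple timing model. This is only an illustration: the function name `aqc_timeline`, the `Phase` record and the durations used in the usage note are hypothetical and are not taken from the circuit; gate delays and edge times are ignored.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    start_ns: float
    end_ns: float
    anode: str  # qualitative description of what drives the SPAD anode


def aqc_timeline(t_photon_ns, sense_delay_ns, holdoff_ns, reset_ns):
    """Return the ordered phases that follow one avalanche.

    All durations are caller-supplied assumptions, not measured values.
    """
    t_quench = t_photon_ns + sense_delay_ns  # M_Q is switched on
    t_reset = t_quench + holdoff_ns          # M_Q off, M_R on
    t_ready = t_reset + reset_ns             # M_R off, detector armed again
    return [
        Phase("passive quench", t_photon_ns, t_quench,
              "rises as avalanche current flows in the sense stage"),
        Phase("active quench", t_quench, t_reset,
              "forced to V_DD,AQC by M_Q"),
        Phase("reset", t_reset, t_ready,
              "driven to ground by M_R"),
    ]
```

For example, `aqc_timeline(0.0, 2.0, 5.0, 3.0)` returns three contiguous phases, the last of which ends 10 ns after the photon; the sum of the three durations is the dead time of this simplified model.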
The duration of the active quenching period is externally controlled by the "ControlQ" signal (see Fig. 2 ): by varying this bias voltage the end user can set the duration of the holdoff time which directly affects the system performance in terms of afterpulsing and maximum count rate.
Similarly, the duration of the reset phase can be chosen in order to properly operate different SPAD devices: the capacitive load at the input of the AQC and the overvoltage to be applied, indeed, depend on the characteristics of the chosen detector and they directly influence the minimum duration of the reset phase that can be used. By means of the "ControlR" pin, the presented circuit provides the end user with the flexibility to properly choose the duration of the reset phase in order to guarantee that a complete reset of the device takes place.
The sense stage is illustrated in Fig. 3: it consists of five low voltage transistors that are connected to the anode of the SPAD through an HV MOSFET M SWITCH. The role of M SWITCH, which can tolerate up to 50 V between its source and drain terminals, is to protect the low voltage transistors from damage when variations of several Volts are applied. To this aim, the gate voltage of M SWITCH is upper limited to 1.8 V: in this way, its source voltage, and therefore all the low voltage circuitry that is connected to this terminal, cannot exceed 1.8 V, which is below technology limits. M SWITCH is switched on when the circuit is ready to detect a photon, while it is switched off right after an avalanche detection and before the active quenching phase takes place, in order to prevent cross-conduction current from flowing during the quench phase. Being synchronous to the detection of a photon (except for a fixed delay introduced by the logic gates), the control signal of M SWITCH ("pass" in Fig. 3) is also buffered out as the counting output of the circuit.
The main role of the sense stage is to determine when an avalanche current is flowing through the detector and feed a signal to the AQC logic stages in order to trigger the active quenching and reset phases. Assuming M SWITCH closed, the input current flows into the equivalent resistor of M SENSE giving rise to a voltage variation at its drain terminal. The resistance of M SENSE is externally adjustable by varying its gate voltage that is connected to the "ControlS" pin: the value of this equivalent resistor (R SENSE ) determines the speed of the response of the circuit to a photon detection event. According to simulations, for example, with "ControlS" equal to 900 mV and using the equivalent model of a custom technology SPAD biased 5 V above its breakdown voltage, the active quenching phase starts 2.3 ns after the arrival of a photon.
A higher value of R SENSE would reduce this delay but it would also increase the sensitivity of the M 4 gate potential to noise and disturbances; this could be a problem especially in an array of AQCs because of the electrical coupling between adjacent channels.
For all these reasons, "ControlS" is an external signal available to the end user, so that the most suitable value of the equivalent R SENSE can be chosen depending on the characteristics of the system.
The dimensions of M SENSE (see Table 1) have been chosen so as to obtain a good range of R SENSE values and a significant dependence of this value on the externally tunable gate voltage.
The output signal of the sense stage ("out sense") is generated by means of the inverter constituted by M 3 and M 4 . In order to rapidly generate this signal, M 4 has been dimensioned with a large W/L as reported in Table 1 . Finally, the role of M 1 and M 2 transistors is to keep the internal nodes of the sense stage fixed during the reset phase in order to avoid the rising of spurious signals at the sense output.
The high-side and low-side logic blocks generate all the internal signals necessary for the proper operation of the AQC. These logic blocks have been designed exploiting low voltage transistors and minimizing the number of logic gates in order to maximize the operating frequency of the circuit. A simplified schematic of the two logic blocks along with the voltage translators is reported in Fig. 4. A simple reset path sets the initial condition of the latches L1, L2, L3 (Q = 0); this is not shown in Fig. 4 for the sake of simplicity. The falling edge of the "out sense" signal coming from the sense stage sets latch L1, with a twofold effect: first of all, the "pass" signal goes low and M SWITCH (Fig. 3) is turned off, disconnecting the low voltage sense transistor from the anode of the SPAD detector.
Secondly, this event sets latch L2 and consequently starts the quench phase: M Q, indeed, is a PMOS transistor that is connected to the negative output of L2. M Q has a large W/L to provide fast voltage variations across the external SPAD detector.
On the other hand, this results in a gate capacitance of M Q as high as 1.1 pF: in order to properly drive this capacitance, a driver consisting of a cascade of five inverters with progressive gate sizes has been designed [24]. The sizes of M Q and of the transistors of the first and last inverter of its driver circuit are reported in Table 2.
The Q output of L1 is also fed to a programmable delayer circuit: the output of this path ("Eq" in Fig. 4) is used to reset latch L2, which ends the quench phase. Since the same signal (Q L1) controls both the beginning and the end of the quench phase, the hold-off duration is set by the delay introduced by the programmable delayer.
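As a rough illustration of what "progressive gate sizes" can mean, the sketch below sizes a five-stage inverter chain with a constant (geometric) taper. The constant taper is a common textbook assumption and is not stated in the text; the first-stage input capacitance of 2 fF is a hypothetical placeholder, and only the 1.1 pF load and the five stages come from the description above. The actual transistor sizes are those reported in Table 2.

```python
def taper_factor(c_load_f, c_in_f, n_stages):
    """Per-stage size ratio for a chain with a constant (geometric) taper."""
    if c_load_f <= 0 or c_in_f <= 0 or n_stages < 1:
        raise ValueError("capacitances must be positive and n_stages >= 1")
    return (c_load_f / c_in_f) ** (1.0 / n_stages)


def stage_input_caps(c_load_f, c_in_f, n_stages):
    """Input capacitance of each inverter, from the first to the last stage."""
    f = taper_factor(c_load_f, c_in_f, n_stages)
    return [c_in_f * f ** k for k in range(n_stages)]


# Hypothetical first-stage input capacitance (assumption, not from the paper).
C_IN_FIRST_STAGE_F = 2e-15
# Gate capacitance of M_Q quoted in the text.
C_GATE_MQ_F = 1.1e-12

caps = stage_input_caps(C_GATE_MQ_F, C_IN_FIRST_STAGE_F, 5)
```

Under these assumptions each inverter is about 3.5 times larger than the one before it, so that every stage sees the same ratio between the capacitance it drives and its own input capacitance.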
The schematic of this circuit is illustrated in Fig. 5. The delay introduced by this stage on a positive step voltage input depends on the stray capacitance C P and on the R DS,ON resistance of a PMOS transistor. Since R DS,ON depends on the voltage applied to the gate of the transistor, this terminal of the PMOS device has been made externally available as the ControlQ input pin: in this way, the end user is provided with the flexibility to choose the delay of the programmable delayer block, that is, the hold-off time of the AQC. Depending on the bias applied to the ControlQ pin, the programmable delayer can introduce a delay spanning from 5.4 ns up to a few ms. Monte Carlo simulations for process and mismatch variations showed that the variability of the programmable delayer output with respect to its mean value is on the order of 10%, e.g. with ControlQ = 0.9 V the mean delay is equal to 29.7 ns and the standard deviation is 2.3 ns. This would not be a problem in an array of AQCs since the dead-time correction can be carried out by measuring the delay introduced by each AQC individually before the measurement starts. Temperature dependence of the delay has also been evaluated and it is as low as 20 ps/°C.
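The per-channel correction mentioned above can be sketched as follows. The text does not specify which correction formula is used; the sketch assumes the standard non-paralyzable (fixed dead time) model, in which the true rate is n = m / (1 - m·τ) for a measured rate m and dead time τ. The function names and all numeric values in the usage note are hypothetical.

```python
def correct_count_rate(measured_cps, dead_time_s):
    """Non-paralyzable dead-time correction (assumed model): n = m / (1 - m*tau)."""
    loss = measured_cps * dead_time_s  # fraction of time the channel is dead
    if not 0.0 <= loss < 1.0:
        raise ValueError("measured rate is inconsistent with the dead time")
    return measured_cps / (1.0 - loss)


def correct_array(measured_cps, dead_times_s):
    """Correct each channel with its own, individually measured dead time."""
    if len(measured_cps) != len(dead_times_s):
        raise ValueError("one dead time per channel is required")
    return [correct_count_rate(m, t) for m, t in zip(measured_cps, dead_times_s)]
```

For instance, `correct_array([1e6, 1e6], [30e-9, 32e-9])` applies a slightly different correction to two channels that measured the same rate, which is the point of calibrating each AQC separately rather than using a single nominal dead time.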
At the end of the quench phase, the reset phase is started: the "Eq" signal switches on the reset transistor M R by setting the Q output of latch L3 to a high value. For the same reasons, M R is driven by a driver equal to the one used for M Q: the gate capacitance of M R, indeed, is as high as 1.9 pF (the transistor W/L is reported in Table 2). L3 is then reset by its own output after a delay introduced by another programmable delayer, which is equal to the one used for the quench phase except that it is controlled by the ControlR input pin, in order to allow the end user to choose the duration of the reset and quench phases individually.
The output of L3 is used to reset latch L1: in this way the transistor M SWITCH is switched on before the reset phase takes place. This leads to a better response of the circuit, which will be ready to detect an avalanche right after the end of the reset phase. It is worth noting that voltage compliance with technology limits is ensured also in this case since the gate voltage of M SWITCH ("pass") is upper limited to 1.8 V, preventing the drain-source voltage of the underlying M SENSE transistor from exceeding 2 V.
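The latch sequence described in this section can be summarized with a small behavioural sketch. It is a simplification under stated assumptions: gate, driver and voltage-translator delays are ignored, the two programmable delays are passed in as plain numbers, and the signal names `MQ_on` and `MR_on` are illustrative labels rather than nets of the actual schematic.

```python
def latch_sequence(holdoff_ns, reset_ns):
    """Trace the L1/L2/L3 states for one detection event (idealized timing)."""
    q = {"L1": 0, "L2": 0, "L3": 0}  # initial condition: all latches reset
    trace = []

    def record(t_ns, event):
        trace.append({
            "t_ns": t_ns,
            "event": event,
            "pass": 1 - q["L1"],  # "pass" is low while L1 is set
            "MQ_on": q["L2"],     # quench transistor follows L2
            "MR_on": q["L3"],     # reset transistor follows L3
        })

    record(0.0, "ready")
    # Falling edge of "out sense": L1 is set, which in turn sets L2.
    q["L1"] = 1
    q["L2"] = 1
    record(0.0, "avalanche sensed, quench starts")
    # "Eq" arrives after the hold-off delay: L2 is reset and L3 is set;
    # the output of L3 resets L1, so "pass" goes high again.
    q["L2"] = 0
    q["L3"] = 1
    q["L1"] = 0
    record(holdoff_ns, "quench ends, reset starts")
    # L3 is reset by its own delayed output.
    q["L3"] = 0
    record(holdoff_ns + reset_ns, "reset ends, ready")
    return trace
```

In this idealized trace M Q and M R are never on at the same time, and "pass" is low only during the quench phase.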
Finally, the AQC features a "Gate" signal to switch the SPAD off as long as this signal is high. Only when the "Gate" signal is lowered, the SPAD is reactivated. This feature gives the end user the possibility to operate the SPAD in gated-mode configuration that is commonly used to suppress background pulses from the detector in the time intervals where no signal is present, or to discriminate between fast and slow effects, e.g. between Raman scattering and fluorescence [2] .
The layout of the circuit is shown in Fig. 6: the area occupation is 150 × 150 µm², which makes this circuit suitable to be the building block of AQC arrays.
Experimental results
The designed AQC has been extensively characterized with two different custom technology SPAD detectors. First of all, the circuit has been tested with a RE-SPAD. As previously stated, indeed, the presented IC is the first fully integrated AQC that can work with RE-SPADs, being capable of applying voltage variations of tens of Volts (up to 50 V). In Fig. 7 the waveforms at the anode of the RE-SPAD are shown: the measurements have been carried out for quenching pulses equal to 20 Volts, which is a typical overvoltage for these detectors, and 30 Volts. It is worth noting that in all the measurements the capacitive loading of the setup has been minimized in order to make its contribution negligible with respect to the overall capacitance of the detector, the circuit itself and the parasitics of the wire-bonding connection between the two.
Secondly, the maximum count rate of the AQC has been evaluated. To this aim, the circuit has been connected to a 100 µm-diameter custom technology SPAD that not only features a lower anode capacitance with respect to the RE-SPAD but, most importantly, is typically biased 5 Volts above its breakdown voltage, which is a good tradeoff between PDE and DCR. The result is shown in Fig. 8: with the minimum duration of the reset and quenching phases, the detector is ready to detect a subsequent photon 12.2 ns after triggering, corresponding to a maximum operating count rate equal to 82 Mcounts/s. To the best of our knowledge this is the first fully integrated AQC capable of achieving such a high count rate with an external SPAD. The performance of the AQC in terms of count rate is quite insensitive to process and mismatch variations: simulations show a standard deviation as low as 0.9 ns on the shortest dead time. Similarly, the temperature dependence is limited to less than 20 ps/°C.
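The relation between the dead time and the maximum count rate quoted above is simply the reciprocal of the dead time. The short sketch below makes the arithmetic explicit; the function name is illustrative, and using the simulated 0.9 ns standard deviation as a plus or minus one sigma spread is an assumption made only for the sake of the example.

```python
def max_count_rate_mcps(dead_time_ns):
    """Maximum count rate in Mcounts/s for a fixed dead time given in ns."""
    if dead_time_ns <= 0:
        raise ValueError("dead time must be positive")
    # 1 / (1 ns) = 1 Gcounts/s = 1000 Mcounts/s
    return 1e3 / dead_time_ns


nominal = max_count_rate_mcps(12.2)       # measured shortest dead time
slow = max_count_rate_mcps(12.2 + 0.9)    # dead time one sigma longer
fast = max_count_rate_mcps(12.2 - 0.9)    # dead time one sigma shorter
```

With these inputs the nominal value rounds to 82 Mcounts/s, and a dead time one standard deviation longer or shorter would correspond to roughly 76 and 88 Mcounts/s respectively.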
The 100 µm-diameter SPAD allowed us to evaluate the timing performance of the AQC output: this detector, indeed, can provide the information about the arrival time of the photon with a timing jitter as low as 35 ps FWHM [15]. However, to achieve this goal a dedicated pick-up circuit must be designed, and so far this kind of circuit has always been connected to the opposite terminal of the SPAD with respect to the AQC in order to avoid electrical damage and/or a worsening of the pick-up performance because of the high AQC voltage swings.
This approach complicates the design of the whole system especially if the ultimate goal is the design of a high-density multichannel module where the number of interconnections can definitely be a limiting factor.
The presented AQC, instead, not only properly operates the detector but also provides the timing information on the arrival time of the photon with a FWHM jitter down to 119 ps, as shown by the Instrument Response Function (IRF) in Fig. 9. The setup consisted of an ultrafast laser diode (Antel MPL-820 laser module) emitting optical pulses at 820 nm with about 10 ps FWHM and a TCSPC module (Becker&Hickl SPC-130). The B&H module receives the electrical output of the laser, which is synchronous to the optical signal, as sync input (STOP) and a NIM signal provided by a PCB mounting the 100 µm-diameter SPAD and the AQC as CFD input (START). This result opens the way to the exploitation of the designed AQC also in all those applications where a timing jitter around 100 ps is satisfactory, with the advantage of reducing the complexity of the system by removing the need for a pick-up circuit. The measurement has also been repeated varying the background light intensity: Fig. 10 shows that dark count rate variations in the wide range spanning from 10 kcps to 2 Mcps result in a shift of the IRF peak as low as 20 ps, making the circuit perfectly suitable to perform timing measurements even in the presence of large background variations.
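To put the jitter figures above in perspective, the sketch below converts a FWHM value into a standard deviation and removes a known contribution in quadrature. Both steps assume Gaussian, statistically independent contributions, which is an assumption and not a result of the measurement; the jitter of the TCSPC electronics is not modelled, and the function names are illustrative.

```python
import math

# For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.355 * sigma).
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given FWHM (same units)."""
    return fwhm * FWHM_TO_SIGMA


def remove_contribution(total_fwhm, known_fwhm):
    """Subtract an independent Gaussian contribution in quadrature."""
    if known_fwhm > total_fwhm:
        raise ValueError("known contribution exceeds the total width")
    return math.sqrt(total_fwhm ** 2 - known_fwhm ** 2)


sigma_ps = fwhm_to_sigma(119.0)                   # measured IRF width
without_laser = remove_contribution(119.0, 10.0)  # laser pulse of about 10 ps
```

Under these assumptions the 119 ps FWHM corresponds to a standard deviation of about 50 ps, and removing the laser pulse width changes the result by less than 1 ps, so the measured IRF is dominated by the detector and AQC chain.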
Power dissipation of the circuit has been evaluated: the contribution of the logic circuits (High-Side Logic, Low-Side Logic and voltage translators) is lower than 200 µW at 10 MHz; this contribution makes the presented circuit suitable for AQC arrays and it is negligible with respect to the typical contribution due to the current flowing from the detector, which depends on the characteristics of the detector itself and on the applied overvoltage.
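The logic power figure above can be turned into an energy per detection event and scaled to an array. The sketch assumes that the logic power is purely dynamic and scales linearly with the event rate, which the text does not state; since the quoted 200 µW is an upper limit, the results are upper bounds as well. The function names and the 32-channel array are hypothetical.

```python
def energy_per_event_pj(power_uw, rate_mhz):
    """Energy per event in pJ: (uW) / (MHz) = 1e-6 W / 1e6 Hz = 1e-12 J."""
    if rate_mhz <= 0:
        raise ValueError("rate must be positive")
    return power_uw / rate_mhz


def array_logic_power_mw(n_channels, energy_pj, rate_mhz):
    """Logic power of n identical channels, assuming linear scaling with rate."""
    return n_channels * energy_pj * rate_mhz / 1e3  # pJ * MHz = uW, then to mW


e_event = energy_per_event_pj(200.0, 10.0)
p_array = array_logic_power_mw(32, e_event, 10.0)
```

With these inputs the logic consumes at most 20 pJ per event, and a hypothetical 32-channel array running at 10 MHz per channel would dissipate at most 6.4 mW in its logic, not counting the detector current.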
Conclusions
We presented a new fully integrated Active Quenching Circuit (AQC): the designed circuit has been fabricated in a High Voltage 180 nm standard CMOS technology and experimentally characterized with two different SPAD detectors. This is the first integrated AQC that can provide quenching pulses up to 50 Volts to the detector, thus being capable of driving SPADs of different technologies including the new RE-SPAD. Thanks to its architecture, the circuit can achieve a state-of-the-art counting rate of 82 Mcounts/s with an external SPAD and it provides a photon counting output pulse with a timing jitter as low as 119 ps FWHM. Its compact design makes this circuit suitable to be the building block of AQC arrays for multichannel acquisition systems exploiting custom technology SPAD arrays.
Funding
National Institute of General Medical Sciences of the National Institutes of Health (NIH) (5R01 GM095904).
