We present the challenges of using single-photon avalanche diodes in large arrays and the architectures employed in their readout systems. We also discuss the features that make these imagers appealing for certain applications.
Introduction
Photon-counting devices have long existed in non-solid-state form (photomultiplier tubes, PMTs, and microchannel plates, MCPs) and in solid-state form (silicon photomultipliers, SiPMs). These devices typically offer a large active area (cm²) and a high fill factor, but they usually provide only a single channel, or they are bulky and require vacuum. Recently, compact multi-channel solid-state single-photon detectors have emerged that are based on the same technology as SiPMs but are designed for imaging applications. The detector, known as a single-photon avalanche diode (SPAD), is capable of photon counting and has a high time-of-arrival resolution, of the order of a few tens of picoseconds. SPADs may be fabricated in dedicated silicon processes or in standard CMOS; while most SPADs operate at room temperature, they can also be cooled for better noise performance.
Cova and McIntyre were among the first to advocate the use of SPADs for fast timing applications in the 1980s [1, 2]. Thanks to their picosecond timing resolution, SPADs are natural candidates for techniques requiring high-resolution time-resolved photon detection, such as time-correlated single-photon counting (TCSPC). Since the emergence of CMOS SPADs at the beginning of the millennium, it has become possible to integrate thousands of SPADs on the same substrate and to use them as an imager [3]. Since then, large-format CMOS image sensors capable of photon counting with time-of-arrival (TOA) evaluation have become increasingly common. This paper reports on the innovations that have followed since 2005 in the domain of CMOS SPAD arrays and SPAD-TOA imagers.
SPAD Principles
A SPAD consists of a p-n junction that relies on impact ionization to multiply photon-generated electrons and holes. Fig. 1 shows the steady-state I-V characteristics of a typical p-n diode. What this curve does not show is the pseudo-steady-state behavior in the breakdown operating condition. A voltage above the breakdown voltage can be applied so long as no carriers exist in the diode's depletion region. As soon as a carrier is injected into the depletion region, impact ionization may cause an avalanche, and the diode shifts its operating point to the steady state [4]. These Geiger-mode events can be converted to digital pulses using the circuit shown in the figure. The figure also shows the circuitry used to quench the avalanche and to recharge the diode to its initial idle state. The probability that the carriers generated by a single photon are detected is called the photon detection probability (PDP). Noise sources include thermally generated carriers and tunneling carriers, as well as fabrication defects, which ease valence-to-conduction band transitions. The resulting dark counts are characterized by the dark count rate (DCR). The dead time is the time required to complete a detection cycle, generally in the nanosecond to microsecond range. The dead time determines the maximum count rate a SPAD can support. With active quenching, the maximum count rate is the inverse of the dead time; with passive quenching, that rate is further divided by e (e = 2.718281…). The ratio between the maximum count rate and the DCR gives an indication of the dynamic range, which in SPAD imagers is usually over 80 dB.
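The count-rate and dynamic-range relations described above can be sketched numerically. The following is a minimal illustration, not a device model: the function names, the 50 ns dead time, and the 100 counts/s DCR are hypothetical values chosen for the example, and the decibel figure assumes the 20·log10 convention commonly used for image-sensor dynamic range, which the text does not state explicitly.

```python
import math


def max_count_rate(dead_time_s, quenching="active"):
    """Maximum count rate (counts/s) for a given dead time in seconds."""
    if dead_time_s <= 0:
        raise ValueError("dead time must be positive")
    rate = 1.0 / dead_time_s  # active quenching: inverse of the dead time
    if quenching == "active":
        return rate
    if quenching == "passive":
        return rate / math.e  # passive quenching: inverse dead time divided by e
    raise ValueError("quenching must be 'active' or 'passive'")


def dynamic_range_db(max_rate_cps, dcr_cps):
    """Ratio of maximum count rate to DCR in dB (assumed 20*log10 convention)."""
    if max_rate_cps <= 0 or dcr_cps <= 0:
        raise ValueError("rates must be positive")
    return 20.0 * math.log10(max_rate_cps / dcr_cps)


# Hypothetical example values, not taken from the text.
dead_time = 50e-9  # 50 ns dead time
dcr = 100.0        # 100 dark counts per second

for mode in ("active", "passive"):
    rate = max_count_rate(dead_time, mode)
    print(f"{mode}: {rate:.3e} counts/s, {dynamic_range_db(rate, dcr):.1f} dB")
```

With these assumed values, both quenching modes give a dynamic range above 80 dB, in line with the figure quoted for SPAD imagers.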
The fabrication of a SPAD requires a structure that prevents premature edge breakdown; in essence, either the electric field at the edges of the junction must be controlled or the breakdown voltage there must be raised, so as to maximize the probability that the avalanche is initiated in the center of the multiplication region. Examples are shown in Fig. 1(a), (b), and (c). Shallow trench isolation (STI) can also be used to delimit the junction, provided that it is surrounded by multiple layers of doped silicon that force recombination of the charges generated in the defect-rich STI, as shown in structure (d) [5]. These structures are usually shaped as a ring around the junction and are known as guard rings. Guard rings can also be defined implicitly by proper definition of the drawn layers.
A variety of avalanche quenching techniques exist, divided into active and passive methods. The literature on these variants is extensive [6]. In active methods, the avalanche is detected and stopped by acting on the bias. In passive methods, the p-n junction bias adjusts itself, e.g. through a ballast resistor. When SPADs are implemented in large arrays, several measures have to be taken to limit crosstalk and substrate-induced interference, and to ensure that DCR and afterpulsing are contained [7, 8, 9, 10, 11, 12]. Dead time, timing jitter, and PDP, as well as DCR uniformity, have to be accounted for and properly characterized, as in [7]. PDP is, of course, also a function of the wavelength of the incident light.
Readout Architectures vs. Applications
Creating large arrays of essentially independent digital pulse generators calls for efficient data readout mechanisms. The simplest readout architecture, implementing on-chip photon counting in combination with random-access single-photon detection, was first demonstrated in [3]. In this readout scheme, all time-sensitive operations had to be performed sequentially. The micrograph of the chip is shown in Fig. 2(a). The readout bottleneck was partially addressed by means of a latchless pipeline, a technique proposed in [13] and shown in Fig. 2(b), where a time-to-digital converter (TDC) at the column level was used to determine where in the column and when the photon was received. The first fully integrated SPAD array was reported in LASP [8, 14], shown in Fig. 2(c), where column-parallel TDCs were used to process photon arrivals in an event-driven fashion. Finally, in the MEGAFRAME project [15, 16, 17], a pixel-parallel array of 32×32 TDC-SPAD pixels was implemented. The chip reported in [15] is shown in Fig. 2(d). A larger version of the chip (160×128 TDC-SPAD pixels) was later reported in [18], while other column-parallel arrays have recently been reported in [19].
In MEGAFRAME, the TDCs had resolutions ranging from 52 ps to 119 ps, with a depth of 10 bits and a cycle time of 1 µs. In these chips, the differential non-linearity (DNL) and integral non-linearity (INL) typically ranged from 1 to 4 LSB; in LASP, they were recently improved to ±0.1 LSB and ±0.25 LSB, respectively [14].
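To put the TDC figures above in perspective, the short sketch below derives the full-scale time range implied by a 10-bit depth at each quoted resolution, together with an upper bound on the raw output data rate of a 32×32 pixel-parallel array. The function names are hypothetical, and the data-rate bound assumes that every pixel delivers one 10-bit code per 1 µs cycle with no overhead bits, which is an assumption made for illustration rather than a reported specification.

```python
def tdc_full_scale_s(lsb_s, bits):
    """Full-scale time range of a TDC: LSB duration times the number of codes."""
    return lsb_s * (2 ** bits)


def peak_raw_data_rate_bps(n_pixels, bits, cycle_time_s):
    """Upper bound on raw data rate, assuming one code per pixel per cycle."""
    return n_pixels * bits / cycle_time_s


BITS = 10            # TDC depth quoted in the text
CYCLE_TIME_S = 1e-6  # TDC cycle time quoted in the text

for lsb_ps in (52, 119):  # resolution extremes quoted in the text
    full_scale_ns = tdc_full_scale_s(lsb_ps * 1e-12, BITS) * 1e9
    print(f"LSB {lsb_ps} ps -> full-scale range {full_scale_ns:.3f} ns")

# 32x32 pixel-parallel array; assumes every pixel converts in every cycle.
rate = peak_raw_data_rate_bps(32 * 32, BITS, CYCLE_TIME_S)
print(f"peak raw data rate: {rate / 1e9:.2f} Gbit/s")
```

Under these assumptions the finer resolution covers a shorter time window, and the pixel-parallel organization implies a multi-gigabit raw data rate, which illustrates why readout architecture is a central design concern.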
Future Challenges and Outlook
Single-photon detection has given way to single-photon imaging over the last few years. Thanks to very large scale integration and miniaturization, new functionality is now possible with optical performance comparable to that of non-solid-state solutions. However, limitations still exist: fill factor and PDP are still relatively low and are only partially mitigated by the use of optical concentrators [20]. SiPMs have traditionally exhibited high fill factors at the cost of imaging capability. Digital SiPMs, while keeping the SiPM philosophy, enable basic processing and have thus been adopted in arrays where they are employed as mini SiPMs. An example of this trend was reported in [21], where four 10×10 mini SiPMs were implemented in 0.18 µm HV CMOS technology and combined with a mirror to scan large areas in TCSPC mode. Recent SPAD structures demonstrated in nanoscale CMOS exhibit improved DCR and spectral efficiency, as well as compatibility with through-silicon vias (TSVs) and backside processing [22, 23, 24]. Adoption of these advanced processes brings the prospect of simultaneous improvements in time resolution, fill factor, and pixel pitch, as well as the capacity to integrate on-chip time-of-flight computation to ease I/O data rate demands. Analog approaches to time-resolved SPAD pixels offer a route to the smallest pixel pitch, provided that uniformity issues are addressed [25]. On another front, the emergence of materials other than silicon, such as III-V compounds and germanium, in configurations that are fully compatible with a CMOS fabrication line may bring these materials into the mainstream. Examples of this trend are two independent works reporting the first Ge-on-Si SPADs fabricated in a way that is fully compatible with a conventional CMOS technology [26, 27].
