Infrared sensor designers have long maximized S/N ratio by employing pixel-based amplification in conjunction with
Introduction
While nascent (ca. 1970s) imaging sensor arrays used MOS technology for readout, designers of visible and infrared sensors relied on CCD technology to produce the best imaging arrays for various applications through the mid-1980s. In the late 1980s, however, compliant detector physics (allowing relatively large pixels to match optical blur) and the emergence of affordable CMOS foundries enabled hybrid infrared (IR) focal plane arrays (FPAs) to evolve rapidly using CMOS-based circuits for readout. Today's hybrid IR FPAs (Fig. 1), now encompassing over 4 million pixels, exclusively use CMOS readouts mated to detector arrays via flip-chip packaging and indium interconnects. One such early CMOS-based active pixel sensor (APS), now ~15 years old, is still active as a vehicle for breakthrough cosmology in the Hubble Space Telescope. The CMOS readout for Hubble's NICMOS sensor [1], which has a 40 µm by 40 µm pixel pitch, was developed using 2-µm photolithography in 1988.
Ongoing advances in CMOS have since enabled proliferation of infrared sensor products [2]. As another consequence of the inexorable development progress encapsulated by Moore's Law, there is also now a "full circle" homecoming to MOS technology to produce higher performance visible sensors. Although a few skeptics still question the basic capability of CMOS to dethrone the venerable CCD, leading CCD manufacturer Sony recently acknowledged the turnabout. In a Reuters interview on September 28, 2004, the head of Sony's imaging device unit declared, "We have to win in CMOS." As head of the world's leading CCD production group, Mr Suzuki also stated, "We will develop the new CMOS sensor for the high end market." This stunning admission of the imminent cannibalization of Sony's profitable CCD business by its own nascent CMOS team and competing CMOS companies corroborates the revolution. Consequently, CMOS is now arguably the common platform for both the infrared and visible sensor communities.
In this paper we report the conception and successful development of a circuit-based noise reduction method that leverages system-on-chip functionality via CMOS integration to suppress pixel noise. By sharing the circuit burden with transistors external to the pixel, fill factor is maximized, pixel area minimized, and photon collection maximized. Furthermore, the technique enables low noise that is independent of video rate for reset times of ~5 µs and longer.
Background
While the IR community has for nearly 20 years conveniently incorporated amplification at each pixel by exploiting relatively large pixel area, adverse physics (smaller optical blur needing smaller pixel pitch) and strong consumer affection for compact, portable electronics have worked against adopting CMOS for visible sensors. In Fig. 2, we plot pixel area vs. year of introduction for the various imaging sensor arrays reported in the pertinent IEEE publications from 1970; MOS/CMOS pixel area was consistently much larger than that of CCD technology until the late 1990s. Availability of deep submicron lithography (≤ 0.25 µm) then enabled CMOS designers to suddenly close the gap by shrinking area while excluding functionality beyond basic signal detection, storage and readout. Consequently, lagging behind both the latest CCDs and CMOS sensors in the figure are next generation CMOS sensors with global shutter. A global shutter boosts functionality by adding a second storage element at each CMOS pixel to deliver progressive readout of true "snapshot" images akin to photographic film and frame transfer CCD but at smaller "grain" and chip size.
The primary thrust of imaging sensor development for visible cameras has thus been to increase resolution while maintaining acceptable sensitivity and noise. Figure 3 summarizes, from pertinent IEEE publications, progress in increasing pixel count over three decades of CCD and MOS/CMOS sensor development. The largest commercial CMOS sensors, which are used in film-replacement digital single lens reflex (DSLR) cameras currently having market volume of over 1 million units per year, now boast up to 16.7 million pixels. Although higher resolution CCDs exist, these are sold in small quantities at higher prices. The larger CMOS sensors having ≥ 8 million pixels lead DSLR sales.
On the other hand, the ongoing, myopic shrinking of pixel area to increase resolution is problematic since the laws of physics apply. Image sensor sensitivity at standard illumination is plotted vs. pixel area in Fig. 4. Even though many different types of imaging sensors with both standard and advanced microlenses are included, sensitivity is clearly limited by pixel area. Furthermore, the most recent CCD data at pitch < 2 µm fall significantly below the area-limited trend. While early CMOS devices using 0.5 µm lithography also fell significantly below the implicit physical limit evinced by the roll-off at small pitch, the sensitivity of deep submicron CMOS sensors now matches the best CCD levels and is also now limited by physics rather than technology. This is because the sales volume of CMOS sensors recently became sufficiently high to justify foundry investment to specifically optimize optical behaviour. One outcome is the thinning of the metal stack over each photodiode, which minimizes the distance between the photodiode and the microlens and largely eliminates vignetting at each pixel. This departure from the International Technology Roadmap for Semiconductors (ITRS) is extremely significant since it means that imaging sensors are now on a development track separate from microprocessors and memory. The prior optical limitations caused by leveraging the standard ITRS trend have hence been ameliorated. Yet ongoing reductions in metal pitch and improvements in transistor performance are still available for developing the next imaging sensors.
Since signal cannot be boosted any further and is instead dropping off with increasing sensor resolution, it is paramount that noise be reduced in order to continue increasing sensor resolution and/or reducing the size of consumer electronics. Unfortunately, CCD noise performance has not improved over the last decade, even as pixel size has shrunk. Figure 5 shows sensor noise vs. year of introduction for both CCD and CMOS. Arguably due to the unrelenting push to higher resolution, recent CCDs exhibit higher random noise in terms of electrons because their noise is dominated by output amplifier thermal noise and video rate has increased. CCD random noise is approximately 10 to 20 e-, depending on video rate and sensor size, which suggests that CCD S/N is now degrading with increasing resolution. CMOS random noise varies greatly, reflecting the fact that the CMOS literature is skewed toward academic R&D sensors. Figure 6 provides another view of the lack of progress in reducing noise. By normalizing random noise to each sensor's pixel area, it is clear that CCD noise performance is incapable of compensating for the relentless drop in sensor sensitivity; the noise per unit area would have to decrease each year to compensate for signal loss as pixels have shrivelled over the last decade. The normalized CMOS noise performance data likely support the assertion that these data are skewed by research devices having large pixel areas that ineffectually lower the noise per unit area. The useful path is to reduce noise rather than to increase area. On the other hand, these data do not clearly show that CMOS technology is the solution to the problem; the large scatter masks any possible trend.
Although both CCD and CMOS sensors today have similar resolution, sensitivity, and noise, albeit at disparate market price and production cost, there is a basic difference between the two technologies that can enhance performance. The minimum theoretical read noise of a CCD is limited in large format imagers by the output amplifier's thermal noise after correlated double sampling (CDS) is applied in off-chip support circuits. On the other hand, CMOS can offer lower temporal noise because the relevant noise bandwidth is fundamentally several orders of magnitude smaller to better match the signal bandwidth. Furthermore, CMOS supports system-on-chip (SoC) integration at low power. We assert and show in this paper that the key to circumventing sensitivity fall-off is to use CMOS SoC integration to suppress noise and the available pixel area to maximize sensitivity.
SoC noise suppression
Figure 7 conceptually shows the SoC architecture for the proposed CMOS active pixel sensor. Noise suppression is implemented by augmenting the three transistors in the pixel with active components located in the buffers supporting each sensor column (i.e., column buffers). During pixel readout, the distributed and pixel components work together to constitute a source follower amplifier. During pixel reset, the SoC embodiment instead transforms to a single-stage amplifier with feedback capacitor and a reset switch having variable resistance. The number of pixel transistors is minimized and thus optical fill factor maximized. The distributed feedback amplifier resets each pixel using a tapered reset clock that is tailored by additional support circuits in the sensor periphery to extinguish reset noise and limit mode-switching noise.
We now summarize the theoretical analysis of reset noise on a capacitive node that is connected to a single-stage feedback amplifier through a variable resistor. The latter acts as the reset switch wherein a tapered reset clock supervises its variable resistance. Thorough analyses are reported elsewhere in the literature [3, 4]. One application is CMOS active pixel sensors (APS) requiring small pixels having high sensitivity, high optical fill factor and low noise. We use the distributed single-stage feedback amplifier formed by the transistors in the pixel and in the column buffer to reduce photodiode kTC noise and simplify active pixel design. A key objective of the analysis is a simple expression to predict noise and enable intuitive design.
Active pixel design
Figure 8 illustrates the progressive readout scheme for reset and readout, respectively. On the left is a transistor level diagram for pixel reset; the pixel consists entirely of n-type MOSFETs and an n-type photodiode in a p-substrate. In reset mode, M col acts as a current source set by V bias , M1 acts as a transconductance, and M3 acts as a variable resistance, R sw , controlled by V rst . The series resistance of M3 must be increased gradually by a slowly decreasing V rst ramp, which is common to all pixels being reset, to enable the feedback transconductance of M1 to null the reset noise. Here, M2 is conducting ("row select" is high) and the output column must be tied to a low impedance voltage source. This type of array can reset (i.e., integration then starts) within an aperture on the order of microseconds.
The right half of Fig. 8 shows the same pixel during readout mode. To read out, V bias is simply brought down to turn M col on harder, so that it acts like a closed switch and M1 has power to operate as a source follower with its current source in the column buffer outside of the imaging array.
The pixel thus physically reduces to the compact three-transistor (3T) layout used for the classical source follower per detector [5]. Topologically the scheme is similar to a distributed capacitive transimpedance amplifier (CTIA) [5] readout, but without the explicit feedback capacitor. It is also similar to Fowler's active reset, which has the reset amplifier collocated within each pixel. Figure 9 shows the small-signal circuit model during reset, which we call tapered reset since noise suppression involves tapering the reset waveform to enable active feedback and minimize generation of excess noise [8]. This model is used to calculate the steady-state noise envelope at the reset node corresponding to a fixed value of the reset switch resistance, R sw , i.e., the envelope reached after all transients have decayed. Of course, this is not quite the real situation; it is only the first step toward studying the decay rate of the transients and its dependence on R sw . The objective is ultimately to design an appropriate waveform for V rst . If V rst ramps down too slowly, it might take too long to reset the array or one of its rows. If V rst ramps down too quickly, the initially large kTC noise envelope will not sufficiently decay before the switch opens completely. The process hence involves "bandwidth control noise suppression" [9].
Noise model
The photodiode node has voltage v 1 and capacitance C 1 to ground. The amplifier output node has voltage v 2 , output capacitance C o and output conductance G o to ground. C o is the capacitance associated with the M1-M3-M col junction in the pixel and the entire reset access bus, most of which comes from the M1-M3 junctions of all the rows. g m is the transconductance of M1, possibly degenerated by M2; it is shown as a controlled current source. The feedback capacitance, C fb , is normally a parasitic or could include a separate component. Noise from M1 is represented by current source i n , and noise from M3 (which is in the ohmic region) is represented by voltage source v n .
This model excludes noise from capacitive feedthrough of V rst . We also note that we do not consider the impact of CMOS distributed channel resistance; while it does not appear in standard noise models for FETs in saturation, it is still physically coupled to C 1 through the gate capacitance. Finally, we also do not consider excess noise coupled onto C 1 via C fb , which is manageable by considering pixel layout and clock transitions.
Simplified noise expression
Equation 1 summarizes the simplified expression for the rms noise charge Q n (not noise power), which involves two terms. The first is kTC noise from the photodiode node capacitance C 1 , with reduction factor 1/(1 + K 1 + K 2 ). The second term is noise from C fb . We assume that A dc is high so that the first term of C sw is negligible.
Simplified reset noise approximation:
• 
Key observations
Significant noise reduction occurs for K 1 + K 2 much greater than unity, and K 1 + K 2 is proportional to R sw . Consequently, there are two ways to extract more noise reduction out of the same R sw for fixed g m :
• increase the output conductance (lower the dc gain),
• increase the feedback capacitance.
In a globally resettable array where the p-MOSFET M col is included at each pixel site rather than at the column buffer, each pixel amplifier likely operates subthreshold to keep total power at a reasonable level. This gives a very low G o . In fact, g m could be about 1000 G o , and hence C fb would be key to improving noise reduction (i.e., K 2 would dominate). In a progressive row reset array, on the other hand, the designer can add G o to the output node (as it is a column bus) and does not have to rely on C fb . The circuit's decay time constant is an issue with regard to the coefficient on R sw for K 1 + K 2 . By lowering the dc gain, we get more noise reduction at lower R sw along with a shorter time constant, particularly the tentative time constant. However, the potential drawback is that the actual time constant for reset will rise very rapidly with increasing R sw . On the other hand, for high A dc , the time constant is longer, yet the circuit can handle a much larger value of K 1 + K 2 . This tradeoff would not change if C fb were instead increased.
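To get a feel for how large K 1 + K 2 must be, the sketch below inverts the reduction factor 1/(1 + K 1 + K 2 ) for a target noise level. It is only an illustration under stated assumptions: the second (C fb ) term is neglected, the function name `required_k_sum` is hypothetical, and because the text does not make explicit whether the factor scales the rms charge or the mean-square charge, both readings are evaluated. The example inputs (30 e- unsuppressed, 8 e- target) are illustrative round numbers.

```python
def required_k_sum(unsuppressed_e: float, target_e: float, applies_to: str) -> float:
    """Return the K1 + K2 needed to reduce kTC noise to target_e.

    Assumptions (not from the paper): the C_fb noise term is ignored, and
    `applies_to` selects how the factor 1/(1 + K1 + K2) is interpreted:
      "variance" -> factor scales the mean-square noise charge
      "rms"      -> factor scales the rms noise charge directly
    """
    ratio = unsuppressed_e / target_e
    if applies_to == "variance":
        return ratio ** 2 - 1.0
    if applies_to == "rms":
        return ratio - 1.0
    raise ValueError("applies_to must be 'variance' or 'rms'")


k_if_variance = required_k_sum(30.0, 8.0, "variance")
k_if_rms = required_k_sum(30.0, 8.0, "rms")
print(f"K1 + K2 if factor scales variance: {k_if_variance:.2f}")
print(f"K1 + K2 if factor scales rms:      {k_if_rms:.2f}")
```

Under either reading the required sum exceeds unity, consistent with the observation that significant reduction needs K 1 + K 2 well above 1.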
HDTV sensor performance
Tapered reset efficacy was studied in a CMOS imaging-SoC (iSoC) sensor specifically designed for high definition television. The sensor floor plan is shown in Fig. 10. Having a die size of 13.1 mm by 9.3 mm in 0.25-µm process technology, the iSoC comprises a 1920 by 1080 array of three-transistor pixels with 5 µm by 5 µm area, upper and lower banks of analogue buffers reading each sensor column, digital signal processing including line-mixing and pixel-binning, pipeline 12-b digitization, programmable state machine and bias generator, and three banks of dual-port SRAM totaling 6 kB. The pipeline ADC has a 7-stage configuration (3-3-3-3-3-2-2) with error correction, is distributed at sensor top and bottom, and consumes 0.1 pJ/DN at 74.25 MHz.
Analogue signal processing includes offset correction and programmable gain amplifiers in the column buffers and at the ADC input, black-level clamp, dynamic noise reduction via threshold-programmable analogue gain management in the column buffers, and pixel kTC noise suppression by means of SoC implementation of photodiode reset via distributed negative feedback. Power dissipation without digital signal processing is ~550 mW for 1080p60 (i.e., 1920×1080 at 60 Hz progressive) imaging.
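As a rough sanity check on the throughput implied by these figures, the sketch below computes the active pixel rate for 1080p60 and compares it with the digitization capacity. It assumes, since the text only says the ADC is distributed at top and bottom, that there are two converters each running at 74.25 MHz; the constant `ASSUMED_ADC_BANKS` is therefore a guess, and blanking intervals are ignored.

```python
ACTIVE_COLUMNS = 1920
ACTIVE_ROWS = 1080
FRAME_RATE_HZ = 60
ADC_CLOCK_HZ = 74.25e6       # per-converter sample rate quoted in the text
ASSUMED_ADC_BANKS = 2        # assumption: one converter at top, one at bottom


def active_pixel_rate(columns: int, rows: int, frame_rate_hz: int) -> int:
    """Active pixels that must be digitized per second (no blanking)."""
    return columns * rows * frame_rate_hz


pixel_rate = active_pixel_rate(ACTIVE_COLUMNS, ACTIVE_ROWS, FRAME_RATE_HZ)
adc_capacity = ASSUMED_ADC_BANKS * ADC_CLOCK_HZ
utilisation = pixel_rate / adc_capacity

print(f"active pixel rate: {pixel_rate / 1e6:.1f} Mpixel/s")
print(f"assumed ADC capacity: {adc_capacity / 1e6:.1f} MS/s")
print(f"utilisation: {utilisation:.1%}")
```

Under this assumption the converters have headroom of roughly one sixth of their capacity, which would be available for horizontal and vertical blanking.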
The sensor's nominal analogue and end-to-end conversion gains are 32 µV/e- and 12.2 e-/DN, respectively. Programmable analogue gain of up to 48 dB is available. As-drawn fill factor for the 3T pixel with photodiode is 50%; with a microlens array having 0.4 µm interpixel gap, the effective fill factor is ~75% after accounting for losses at the optical interfaces of the planarized 4-metal/1-poly stack.
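The conversion gains can be combined into a few derived quantities, sketched below. The calculation assumes that the 12.2 e-/DN figure applies at 0 dB programmable gain, that electrons per DN scale inversely with the applied voltage gain, and that the full 12-bit code range is usable; none of these is stated in the text, and the helper names are hypothetical.

```python
ANALOGUE_GAIN_UV_PER_E = 32.0   # µV per electron, from the text
END_TO_END_E_PER_DN = 12.2      # electrons per digital number, from the text
ADC_BITS = 12


def db_to_voltage_ratio(gain_db: float) -> float:
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


def electrons_per_dn(nominal_e_per_dn: float, gain_db: float) -> float:
    """Electrons per DN after applying extra analogue gain (assumed inverse scaling)."""
    return nominal_e_per_dn / db_to_voltage_ratio(gain_db)


# Signal voltage represented by one DN, referred to where 32 µV/e- is measured.
lsb_uv = ANALOGUE_GAIN_UV_PER_E * END_TO_END_E_PER_DN
# Electrons spanned by the full ADC code range at nominal gain (assumption).
full_scale_e = END_TO_END_E_PER_DN * 2 ** ADC_BITS
# Electrons per DN at the 18 dB gain setting used for the noise measurements.
e_per_dn_18db = electrons_per_dn(END_TO_END_E_PER_DN, 18.0)

print(f"one DN ~ {lsb_uv:.1f} µV")
print(f"full scale ~ {full_scale_e:.0f} e-")
print(f"at 18 dB: {e_per_dn_18db:.2f} e-/DN")
```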
Random noise
The SoC sensor supports several reset modes including hard reset, soft reset, tapered reset and various combinations. Table 1 compares the measured to the predicted values for several key reset modes along with the concomitant image lag at 18 dB gain. Tapered reset operation at 18 dB analogue gain yields minimum random noise of 8 e- with image lag < 0.012%. The current methodology for HDTV cameras resets the pixel within an epoch of ~10 µs; using tapered reset the noise is thus ≈ 1/4 of the predicted kTC level of 30 e- for 5.5 fF detector capacitance at 23°C. Measured noise is flat to the maximum frequency of 225 MHz, as predicted, when the tapered reset clock waveform is properly tuned. While the lowest pixel noise is measured using only soft reset, the associated image lag of 1.5% is unacceptable for video use. Since the measured data agree with the predicted noise levels, a better alternative is perhaps either to use tapered reset with a longer reset time (tapered reset II) or to further tune the various distributed amplifier settings to improve noise reduction efficacy.
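The predicted kTC level quoted above can be reproduced from the standard expression for rms reset noise charge on a capacitor, sqrt(kTC), divided by the electron charge. The short sketch below uses the capacitance and temperature given in the text; the function name is illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23      # Boltzmann constant
ELECTRON_CHARGE_C = 1.602176634e-19   # elementary charge


def ktc_noise_electrons(capacitance_f: float, temp_c: float) -> float:
    """rms reset (kTC) noise on a capacitor, expressed in electrons."""
    temp_k = temp_c + 273.15
    noise_charge_c = math.sqrt(BOLTZMANN_J_PER_K * temp_k * capacitance_f)
    return noise_charge_c / ELECTRON_CHARGE_C


predicted = ktc_noise_electrons(5.5e-15, 23.0)   # 5.5 fF at 23 °C, from the text
measured = 8.0                                   # tapered reset at 18 dB gain
suppression = measured / predicted

print(f"predicted kTC noise: {predicted:.1f} e-")
print(f"measured / predicted: {suppression:.2f}")
```

The result is close to 30 e-, and the measured 8 e- is a little over one quarter of it, in line with the statement in the text.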
Sensor S/N ratio
Video signal to noise ratio is often measured and specified at a prescribed level of illumination. The HDTV application specifically requires measurement at 2000 lux and f8 aperture. We measured minimum sensor S/N ratio ≥ 52 dB using progressive read and no line-mixing to boost the signal beyond the base pixel sensitivity. This compares favourably to competing FIT CCD-based HD cameras where minimum S/N ratio for 1080i60 operation (interlaced 60 Hz video with 2-line mixing for 6 dB signal boost) is typically specified from 54 to 56 dB.
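To make the comparison concrete, the sketch below converts the quoted S/N figures between decibels and linear ratios and removes the line-mixing boost from the CCD specification. It rests on a simplifying assumption that is not stated in the text: that 2-line mixing raises S/N by the full 6 dB, so that subtracting 6 dB gives a like-for-like figure without mixing.

```python
import math


def db_to_ratio(value_db: float) -> float:
    """Convert an amplitude ratio in dB to a linear ratio."""
    return 10.0 ** (value_db / 20.0)


def ratio_to_db(ratio: float) -> float:
    """Convert a linear amplitude ratio to dB."""
    return 20.0 * math.log10(ratio)


LINE_MIX_BOOST_DB = 6.0                      # 2-line mixing boost, from the text

cmos_snr_ratio = db_to_ratio(52.0)           # progressive, no line mixing
ccd_unmixed_db = (54.0 - LINE_MIX_BOOST_DB,  # assumed equivalent without mixing
                  56.0 - LINE_MIX_BOOST_DB)

print(f"52 dB corresponds to about {cmos_snr_ratio:.0f}:1")
print(f"6 dB corresponds to about x{db_to_ratio(LINE_MIX_BOOST_DB):.2f}")
print(f"CCD range without mixing (assumed): {ccd_unmixed_db[0]:.0f} to {ccd_unmixed_db[1]:.0f} dB")
```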
Comparison of noise performance
Since random noise typically increases with frequency at 3 dB per octave for CCD sensors [10], the SoC HDTV sensor's noise levels are arguably better judged by using a metric that normalizes random noise to the video frequency, so that the workaround of using additional video taps can be directly compared to the CMOS sensor. The figure of merit, r, is simply the random noise in electrons divided by the sensor's video frequency. Though astronomy and low-light-level CCDs using either high-gain output amplifiers [11] or shift registers with avalanche gain [12] achieve ~1 e- read noise, their lower useful video rates typically yield higher values of the normalized metric r.
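The metric and the 3 dB per octave trend can be sketched as follows. Two assumptions are made here: the 225 MHz figure quoted earlier is taken as the CMOS sensor's video frequency, and 3 dB per octave is interpreted as an amplitude (20 log) increase, i.e., noise growing roughly as the square root of frequency. The CCD operating point (10 e- at 25 MHz) is a hypothetical value chosen only to illustrate the scaling.

```python
import math


def figure_of_merit(noise_e: float, video_freq_mhz: float) -> float:
    """Normalized metric r: random noise in electrons per MHz of video rate."""
    return noise_e / video_freq_mhz


def scale_noise_3db_per_octave(noise_e: float, f_from_mhz: float, f_to_mhz: float) -> float:
    """Extrapolate noise to another video rate at 3 dB (amplitude) per octave."""
    octaves = math.log2(f_to_mhz / f_from_mhz)
    return noise_e * 10.0 ** (3.0 * octaves / 20.0)


r_cmos = figure_of_merit(8.0, 225.0)                        # assumed video frequency
ccd_at_100mhz = scale_noise_3db_per_octave(10.0, 25.0, 100.0)  # hypothetical CCD point
r_ccd_low_rate = figure_of_merit(1.0, 1.0)                  # hypothetical 1 e- at 1 MHz

print(f"r (CMOS, 8 e- at 225 MHz): {r_cmos:.4f} e-/MHz")
print(f"hypothetical CCD, 10 e- at 25 MHz scaled to 100 MHz: {ccd_at_100mhz:.1f} e-")
print(f"r (hypothetical 1 e- at 1 MHz): {r_ccd_low_rate:.1f} e-/MHz")
```

The last line illustrates why a very low read noise achieved only at a low video rate can still give a larger r.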
It is interesting to compare the SoC implementation with prior CMOS-based results for hybrid infrared FPAs including reported results for source follower per detector (SFD) and capacitive transimpedance amplifier (CTIA) embodiments. Figure 12 hence plots noise vs. detector capacitance including previously measured data [13] . The noise levels of SoC with tapered reset are comparable to SFD with correlated double sampling (CDS) in actual system use. Also, since the SoC feedback amplifier with tapered reset is much like a distributed CTIA, its noise is comparable at similar capacitance.
Conclusions
The theoretical advantages of CMOS-based imagers have been validated on infrared and visible imaging sensors. While the read noise of competing CCD imagers has not improved significantly over the last decade except when the video rate is lowered to rates unacceptable for new cameras, SoC CMOS now yields superior performance including lower read noise at comparable sensitivity.
We have also reported a system-on-chip technique for suppressing kTC noise that is used in a progressive 2/3-inch 1920×1080 sensor to generate 12-bit video with < 10 e- read noise and 2.2 V/lux-s sensitivity at up to 90 Hz frame rate. S/N ratio is > 52 dB at standard scene illumination (2000 lux, f8 and 89% reflectivity). Minimum random noise at 18 dB gain is 8 e-, independent of video frequency, using a SoC distributed amplifier to minimize noise. Maximum frame rate for a contiguous 1280×720 region-of-interest is 180 Hz at SNR ≥ 52 dB. Analogue and digital signal processing limit fixed pattern noise to < 1.8 DN. Minimum horizontal MTF for 625 nm illumination is 50% at Nyquist.
SoC integration provides a way to circumvent physical limitations inherent in high resolution sensors having small pixel pitch by significantly reducing random noise below the level otherwise practically achieved, including by CCDs. Another way to show the impact is to compare S/N using photographic terms, such as ISO speed. Straightforward methodologies exist to directly compare the sensitivity of motion imaging or photographic film to electronic sensors [14]. ISO specification 12232 documents the industry standard. We hence estimate effective sensor read noise by measuring the ISO speed of digital still and video cameras. In general, per the definition of film ISO speed rating, accurate camera exposure is obtained when the shutter speed is set to the reciprocal of the ISO speed with the lens aperture at f16. Following Baer [15], we calculate and plot the theoretical ISO speed curves vs. read noise and pixel pitch at 100% optical fill factor, shown in Fig. 13. At a square pixel pitch of 4 µm per side and 10 e- read noise, the theoretical film speed for an electronic sensor that collects all photons impinging on the 16 µm² pixel area is ~200 ISO. Accurate electronic film exposure at f16 and 200 ISO requires setting the shutter to 1/ISO = 1/200 s. Using tapered reset at 5 µm pitch, measured ISO speed is from 300 to 600 ISO depending on the reset method. Even though the CCD noise data include noise processing in the supporting camera, the SoC method still provides superior or comparable ISO speed.
