IN RECENT years, the emergence of high-performance CMOS image sensors (CISs) has been enlarging the imager market. The performance requirements of next-generation CISs have been increasing in terms of frame rate, read noise, dynamic range (DR), and pixel resolution. In particular, the pixel rate of most applications is rising as pixel resolution and frame rate increase. For instance, an ultrahigh definition television with 33 megapixels and a 240-fps frame rate requires a pixel rate of 8 GHz. For such a specification, high-speed signal readout in the analog domain is not realistic because of wideband noise and signal distortion.
In order to achieve a high pixel rate while maintaining sufficiently low noise, a column-parallel ADC is a key element in state-of-the-art CISs [1]-[5]. However, this architecture leads to side effects such as vertical fixed pattern noise (VFPN), which significantly degrades image quality in low-light imaging. To reduce VFPN, several techniques can be applied, such as digital correlated double sampling (CDS) using a single-slope integrating ADC [6], [7] and preamplified digital CDS using a successive approximation ADC [8]. Although the single-slope ADC has the advantage of good linearity, it may be difficult to achieve both high-speed A/D conversion and high bit resolution because the number of required clock cycles grows as $2^N$ for N-bit resolution. Signal amplification prior to the column A/D conversion is also an effective method for reducing the input-referred noise. However, it limits the amount of detectable signal charge and reduces the DR of the imager. Therefore, it is challenging to reconcile ADC speed and bit resolution while maintaining low noise and high DR.
In this paper, a low-noise, high-DR, and high-speed CIS with a 13-b column-parallel cyclic ADC based on a single-ended architecture is presented. The cyclic ADC needs only 12 cycles for 13-b A/D conversion. The ADC performs identical conversions for the reset and signal levels, within the limited horizontal time period, at a frame rate of over 300 fps, so as to achieve perfect digital CDS and ultralow VFPN. A new single-ended ADC architecture provides sufficient linearity and low-noise characteristics as a 13-b column ADC. Lower total noise and 13-b linearity are achieved using a built-in reference generation and a return-to-zero (RTZ) technique, which removes the multiplying digital-to-analog converter (MDAC) error caused by digital coupling between the sampling capacitor and the digital feedback route in the column. This low-noise high-resolution cyclic ADC brings a high DR of over 70 dB without any signal amplification.
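To make the speed argument concrete, the following Python sketch compares the number of steps needed per sample. It assumes an ideal single-slope ADC that needs one counter clock per code ($2^N$ steps) and a cyclic ADC that resolves N bits in N − 1 cycles, which generalizes the 12-cycle figure quoted above for 13 b; the function names are illustrative only.

```python
def single_slope_clocks(bits: int) -> int:
    """Worst-case counter clocks for an ideal single-slope ADC (one clock per code)."""
    return 2 ** bits


def cyclic_cycles(bits: int) -> int:
    """Cycles for a cyclic ADC, assuming N bits are resolved in N - 1 cycles."""
    return bits - 1


N = 13
print(single_slope_clocks(N))  # 8192
print(cyclic_cycles(N))        # 12
```

Because digital CDS requires two conversions per row (reset and signal), the gap in step count applies twice within each horizontal period.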
Published cyclic ADCs implemented in the column of a CIS have used a fully differential architecture and had a column pitch of 15 μm or larger [9], [10], [13]. The simplified single-ended circuits presented in this paper are composed of far fewer components, so that they fit into a 5.6-μm column pitch. The design can also be applied to a 2.8-μm pixel pitch with a double-side disposition of the column ADCs.
The rest of this paper is organized as follows. In Section II, the sensor architecture and overall operation are described. In Section III, the detailed design and operation of the cyclic ADC and some issues to improve the performance of the ADC in the column implementation are presented. Section IV demonstrates the experimental results.
II. SENSOR ARCHITECTURE
[...] for resetting (RT), charge transfer (TX), and pixel selection (S). High and low voltage levels for RT (VRTH and VRTL) and TX (VTXH and VTXL) can be set by external bias through the vertical driver. Nonoverlapping clock signals for the ADC are generated in the TG.
As described later, the cyclic ADC generates three-state redundant binary (RB) codes expressed with two decision levels (2 b) for each cycle [15]. Therefore, the raw output of the ADC has a 24-b word length to obtain a 13-b resolution. The ADC outputs for the reset and signal levels in the RB code are first stored in two sets of 24-b latch arrays. The ith output is read out during the A/D conversion period for the (i + 1)th input. A low-voltage differential signaling driver is used for horizontal data scanning to attain high-speed and low-noise data transfer to the output digital circuitry.
The digital circuits, with three pipeline stages for real-time signal processing, perform RB-to-binary (B) conversion and the digital CDS. In the digital circuits, only the lower half codes (six RB digits) for the reset and signal levels are converted to non-RB codes; the reset level is then subtracted from the signal level for a partial digital CDS operation. The processed lower half codes have 8-b resolution including a sign bit.
In this experimental chip, the upper half codes (six RB digits) are read out from the imager chip without any preprocessing. The digital CDS operation for all the digits is completed by an external system using an FPGA. To do this, the upper half RB codes are converted to B codes, and the converted reset level is subtracted from the converted signal level, which is the same signal processing as that applied to the lower half RB codes. The final output is then obtained by a bit synthesis of the higher and lower bits followed by a bit expansion operation. The upper half codes are output unprocessed so that a digital error calibration of the capacitor mismatches and the finite gain error of the operational amplifier can be performed if necessary [9]. In this paper, this option is not used.
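The split processing described above can be sketched in Python. This is a minimal illustration, not the on-chip logic: it assumes the usual binary weighting of 1.5-b/stage digits (digit values 0, 1, or 2, most significant digit first), and the function names and example digit sequences are hypothetical. The "bit synthesis" step is modeled as a plain weighted sum of the two half-word differences.

```python
def rb_to_binary(digits):
    """Convert redundant binary digits (each 0, 1, or 2, MSD first) to an integer."""
    value = 0
    for d in digits:
        if d not in (0, 1, 2):
            raise ValueError("RB digit must be 0, 1, or 2")
        value = 2 * value + d   # each digit carries twice the weight of the next
    return value


def digital_cds(reset_digits, signal_digits, split=6):
    """Digital CDS done separately on the upper and lower halves, then merged."""
    lower_len = len(reset_digits) - split
    upper = rb_to_binary(signal_digits[:split]) - rb_to_binary(reset_digits[:split])
    lower = rb_to_binary(signal_digits[split:]) - rb_to_binary(reset_digits[split:])
    # The upper-half difference is weighted by 2**lower_len before merging.
    return upper * 2 ** lower_len + lower


# Hypothetical 12-digit conversions for one pixel.
reset = [0, 0, 0, 0, 0, 1, 1, 0, 2, 1, 0, 1]
signal = [1, 2, 0, 1, 1, 0, 0, 2, 1, 0, 2, 1]
print(digital_cds(reset, signal))
print(rb_to_binary(signal) - rb_to_binary(reset))  # same value, computed in one piece
```

Six RB digits span 0 to 126, so a half-word difference lies between −126 and 126, which fits the 8-b signed width quoted for the processed lower half codes; twelve digits span 0 to 8190, which fits in 13 b.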
III. COLUMN-PARALLEL SINGLE-ENDED CYCLIC ADC

A. ADC Design and Operation
Fig. 2 shows the schematic diagram of the single-ended cyclic ADC with an internal reference generation architecture. It consists of a single-ended amplifier, capacitors ($C_1$ and $C_2$), two comparators, switch transistors, and control logic circuits. The sampling capacitor $C_1$ is divided into $C_{1a}$ and $C_{1b}$ for the internal reference generation.
As shown in Fig. 3(a), all capacitors are initialized in this circuit configuration by turning ON the switch $\Phi_{INI}$ (shown in Fig. 2). In other words, prior to the sampling operation, the charge on the capacitors is reset to remove the residual charge, which depends on the MDAC operation for the final bit. This enables precise digital CDS by eliminating the input dependence of the settling error that results from the residual charge in a short sampling period. Signal sampling is performed with all capacitors and with the operational transconductance amplifier (OTA) whose inverting input and output terminals are connected [Fig. 3(a)]. During the sampling operation, for the most significant digit (MSD) decision by the sub-ADC consisting of two comparators, the sub-ADC is connected to the input terminal, and the input signal $V_{IN}$ is compared with the two comparator references ($V_{RCH}$ and $V_{RCL}$). The sub-ADC generates a three-state digital code (D) with the value of zero, one, or two. Using the MSD, the switches in the DAC are controlled, and the physical bottom plates of the capacitors $C_{1a}$ and $C_{1b}$ are connected to $V_{RH}$ or $V_{RL}$, while the top plates are connected to the OTA's inverting input, as shown in Fig. 3(d). In this operation, the input $V_{IN}$ is multiplied by a gain of two, and a reference level that depends on the digitized output D is subtracted. After the amplification phase, the amplifier output is sampled by capacitors $C_{1a}$ and $C_{1b}$ in the feedback phase [Fig. 3(c)] for the next cycle. The second significant digit is determined by the sub-ADC in the feedback phase. The amplification and feedback phases are repeated until the required resolution is obtained. Eleven amplification phases are necessary to obtain a 13-b resolution.
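The conversion loop described above can be illustrated with an ideal behavioral model in Python. This is a sketch under stated assumptions rather than the circuit itself: the comparator thresholds are placed at 3/8 and 5/8 of the reference span (a common choice for 1.5-b/stage converters, not a value taken from this paper), the amplifier and capacitors are ideal, and all names are hypothetical.

```python
def cyclic_adc(v_in, v_rl=0.0, v_rh=1.0, cycles=12):
    """Ideal 1.5-b/stage cyclic conversion; returns RB digits, MSD first."""
    span = v_rh - v_rl
    v_rcl = v_rl + 3 * span / 8              # assumed lower comparator threshold
    v_rch = v_rl + 5 * span / 8              # assumed upper comparator threshold
    refs = (v_rl, (v_rh + v_rl) / 2, v_rh)   # reference level for D = 0, 1, 2
    digits = []
    v = v_in
    for k in range(cycles):
        d = 0 if v < v_rcl else (1 if v < v_rch else 2)   # sub-ADC decision
        digits.append(d)
        if k < cycles - 1:
            v = 2 * v - refs[d]   # amplification: gain of two minus the reference
    return digits


def rb_to_code(digits):
    """Weighted sum of RB digits, MSD first."""
    code = 0
    for d in digits:
        code = 2 * code + d
    return code


digits = cyclic_adc(0.5)
print(digits)              # twelve digits, all equal to 1
print(rb_to_code(digits))  # 4095
```

With 12 decisions, the loop performs 11 amplifications, matching the count given in the text. In this ideal model the resulting code stays within 2 LSB below the input scaled to 13 b.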
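The ideal transfer curve of the MDAC operation of the single-ended cyclic ADC is shown in Fig. 4. The relationship between the input and output is given by

$$V_{OUT} = 2V_{IN} - V_R \tag{1}$$

where $V_R$ is the reference level determined by the output of the sub-ADC, and it is given by

$$V_R = \begin{cases} V_{RH}, & D = 2 \\ (V_{RH} + V_{RL})/2, & D = 1 \\ V_{RL}, & D = 0. \end{cases} \tag{2}$$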
The three-state digital code in the MDAC, which is widely used in pipelined and cyclic ADCs, greatly relaxes the required comparator precision [11], [12]. In the fully differential implementation of the three-state MDAC, only two external references are required, and precise control of the external references is not necessary. Furthermore, the fully differential configuration is robust to common-mode noise. For these reasons, most pipelined and cyclic ADCs are implemented with fully differential circuits [9].
In the single-ended implementation of the three-state MDAC, three external references are required, and the three references must be equally spaced for the precise analog residue generation given by (1). However, generating a precise reference on chip is not an easy task.
In order to address this problem, the proposed single-ended MDAC shown in Fig. 2 includes a local reference generation circuit for the midpoint reference level, in a manner similar to the fully differential topology. To generate the reference $(V_{RH} + V_{RL})/2$, $C_{1a}$ and $C_{1b}$ are connected to $V_{RH}$ and $V_{RL}$, respectively. The transfer characteristic of this single-ended MDAC is given by

$$V_{OUT} = \left(1 + \frac{C_{1a} + C_{1b}}{C_2}\right) V_{IN} - V_R \tag{3}$$

where $V_R$ is written as

$$V_R = \begin{cases} \dfrac{C_{1a} + C_{1b}}{C_2} V_{RH}, & D = 2 \\ \dfrac{C_{1a} V_{RH} + C_{1b} V_{RL}}{C_2}, & D = 1 \\ \dfrac{C_{1a} + C_{1b}}{C_2} V_{RL}, & D = 0. \end{cases} \tag{4}$$

If $C_{1a} = C_{1b} = C_2/2$, (3) and (4) are identical to (1) and (2), respectively.

Fig. 5 shows the configurations for reference generation circuits and local DACs. Even though the accuracy of the absolute values of the two reference levels $V_{RH}$ and $V_{RL}$ does not affect the linearity of the present ADC, the three reference levels must be equally spaced to obtain high linearity. In Fig. 5(a), three buffered references generated by a resistor string are supplied to the ADC array. The accuracy of the three references is determined by the matching of the resistor string. It is also affected by the offset voltages of the operational amplifiers used for the buffers. In contrast, the accuracy of the midpoint reference generated by the capacitors $C_{1a}$ and $C_{1b}$ [Fig. 5(b)] is much higher than that of the reference level generated by the resistors because capacitor matching is much better than resistor matching. Removing the global route for the common reference also reduces the silicon area.
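As a cross-check, the single-ended transfer characteristic can be sketched from charge conservation at the amplifier's inverting input. The derivation below assumes an ideal OTA (infinite gain, zero offset) and no parasitic capacitance, with all voltages measured relative to the virtual-ground level; $V_a$ and $V_b$ are illustrative names for the levels applied to the bottom plates of $C_{1a}$ and $C_{1b}$ during amplification.

```latex
% Sampling phase: C_{1a}, C_{1b}, and C_2 all hold the input
Q = (C_{1a} + C_{1b} + C_2)\, V_{IN}

% Amplification phase: C_2 in feedback, C_{1a} and C_{1b} tied to V_a and V_b
Q = C_2 V_{OUT} + C_{1a} V_a + C_{1b} V_b

% Equating the two expressions for the conserved charge
V_{OUT} = \left(1 + \frac{C_{1a} + C_{1b}}{C_2}\right) V_{IN}
          - \frac{C_{1a} V_a + C_{1b} V_b}{C_2}

% Midpoint code: V_a = V_{RH}, V_b = V_{RL}, with C_{1a} = C_{1b} = C_2 / 2
V_{OUT} = 2 V_{IN} - \frac{V_{RH} + V_{RL}}{2}
```

The last line shows why the midpoint accuracy depends only on the ratio of $C_{1a}$ to $C_{1b}$ and not on a separately generated third reference.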
B. OTA and Comparator
The OTA used in the column ADC is shown in Fig. 6 . A single-ended two-stage OTA is used to achieve a sufficient dc gain and output swing for 13-b resolution. Conventional Miller compensation is applied to stabilize the two-stage amplifier.
The sub-ADC comprises two comparators to digitize the analog residue. The comparator, as shown in 
The preamplifier is placed before the latch to reduce the kickback noise from the latch circuit [14]. The dc gain of the preamplifier is 24 dB, and its static power consumption is 16.5 μW. The latch, which uses positive feedback, regenerates the analog output of the preamplifier into a digital level. The output of the latch is applied to the NOR-type RS FF. After the latching operation, the LATCH signal goes to VDD, and the latch output is pulled down to ground, so that the following RS FF holds the output until LATCH goes low again.
C. Timing for Digital CDS
Column-to-column variation of physical properties, such as the offset of the OTA, results in VFPN when the ADCs are implemented in the column of the imager. A digital CDS with a high-resolution and high-linearity ADC makes the imager free from VFPN. Fig. 8 shows a timing diagram of one horizontal period. RT(i) and TX(i) represent the control signals in the ith row for the reset transistor and the transfer gate, respectively. The horizontal scanning time is 6 μs at the frame rate of 390 fps with 428 vertical lines. A reset level from the selected pixel is sampled on the capacitors $C_{1a}$, $C_{1b}$, and $C_2$ when the signal $\Phi_S$ is turned on following an initialization pulse $\Phi_{INI}$. Initializing the capacitor charge prior to the sampling operation enhances the performance of the digital CDS by eliminating the charge remaining at the end of the MDAC operation. In the experimental design, the initialization period is 31.25 ns, which is much longer than the time constant of 2 ns given by a sampling capacitance of 2 pF and a switch ON-resistance of 1 kΩ.
The reset level sampled by $\Phi_S$ is converted to a digital signal without analog CDS or signal amplification. TX(i) is then activated to transfer the charge accumulated in the photodiode (PD) into the floating diffusion for signal level sampling. The second A/D conversion, for the signal level, is performed with the same period as the first conversion. The conversion time for each cycle is carefully controlled to boost the conversion speed while keeping the input-referred settling error at a minimum. As a result, the A/D conversion for one sample is performed in 2.3 μs, which enables two A/D conversions within a horizontal period of 6 μs. The digitized output for the reset level is subtracted from that of the signal level before being read out.
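The timing figures quoted in this subsection can be checked with a few lines of Python. The numbers are taken from the text; the single-pole RC settling model used for the residual charge is an assumption made here for illustration, and the variable names are hypothetical.

```python
import math

frame_rate = 390                     # fps
rows = 428                           # vertical lines
t_h = 1.0 / (frame_rate * rows)      # horizontal period in seconds

c_s = 2e-12                          # sampling capacitance, 2 pF
r_on = 1e3                           # switch ON-resistance, 1 kOhm
tau = r_on * c_s                     # RC time constant
t_init = 31.25e-9                    # initialization period
residual = math.exp(-t_init / tau)   # fraction of charge left (single-pole model)

t_conv = 2.3e-6                      # one A/D conversion
slack = t_h - 2 * t_conv             # time left for sampling and charge transfer

print(f"horizontal period: {t_h * 1e6:.2f} us")   # 5.99 us
print(f"time constant: {tau * 1e9:.1f} ns")       # 2.0 ns
print(f"init period / tau: {t_init / tau:.1f}")   # 15.6
print(f"residual fraction: {residual:.1e}")
print(f"slack per row: {slack * 1e6:.2f} us")
```

Under this model the initialization lasts more than 15 time constants, so the remaining charge fraction is far below one part in $2^{13}$, and roughly 1.4 μs of each row period remains for everything other than the two conversions.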
D. Coupling Noise Reduction
The single-ended ADC is more sensitive to noise and nonlinearity than its fully differential counterpart because of interference from digital signals. In column-parallel cyclic ADCs, coupling noise between a sensitive analog node and a digital feedback route may significantly degrade the noise performance. Because of the narrow column pitch, the digital feedback wiring cannot be kept away from sensitive analog nodes such as the amplifier's input. Although a metal shield can reduce this interference, it is not sufficient when 13-b linearity is demanded. Fig. 9(a) shows an equivalent circuit diagram for the single-ended cyclic ADC including the parasitic capacitances. The 2-b outputs $D_1$ and $D_0$ for the three-state ADC codes are fed back to the DAC placed at the top of the ADC layout. As a result, these routes are strongly coupled to the top plates of the capacitors $C_1$ and $C_2$ (the charge summation node) through the parasitic capacitances $C_{C1}$ and $C_{C2}$. The error voltage at the charge summation node due to these couplings is written as
where $V_{DD}$ is the power supply voltage of the digital signal, and $\Delta D_1$ and $\Delta D_0$ are the code transitions of $D_1$ and $D_0$. The error depends largely on the parasitic capacitances and on the transitions $\Delta D_1$ and $\Delta D_0$, each of which can take the value 1 or −1 depending on the digital outputs. In order to avoid the coupling error, an RTZ coding for the feedback digital signals is implemented. Fig. 9(b) shows the proposed feedback architecture with a modulated feedback signal. The three-state digital signal is pulsewidth modulated by an encoder at the sub-ADC, and the modulated signal is then decoded at the DAC. Since the feedback signal always starts from zero and returns to zero, the charge summation node becomes insensitive to the coupling.
The coupling capacitances may degrade the ADC performance even for the fully differential configuration shown in Fig. 9(c). In this configuration, the code transitions do not affect the amplifier's differential inputs in principle. However, a mismatch between the coupling capacitances $C_{C1}$ and $C_{C2}$, caused by different patterns of metal routes in the layout, produces an error. Therefore, the RTZ coding is also effective for the fully differential configuration, as shown in Fig. 9(d). Fig. 10 shows the simulated MDAC errors for the four architectures shown in Fig. 9. In the simulation, all sampling capacitors are set to 1 pF, and the mismatch between the coupling capacitances $C_{C1}$ and $C_{C2}$ is assumed to be 20% in the fully differential topology. The MDAC error using an encoded feedback signal is reduced to 0.02% in the single-ended topology, for an extracted parasitic capacitance between the sampling capacitor and the digital route of less than 2 fF. In the fully differential MDAC, the coupling error is reduced from 6.6 mV to as low as 0.7 μV with a coupling capacitance of 10 fF. The error remaining after the RTZ coding results from the limited settling time for the RTZ operation, because the encoded signal must settle within 15.625 ns, which is much shorter than the settling time for the amplification and feedback operations.

Fig. 10. Simulated MDAC error resulting from the capacitive coupling between the sensitive analog nodes and the digital feedback paths. The sampling capacitors $C_1$ and $C_2$ are set to 1 pF, and a 20% mismatch between the capacitors $C_{C1}$ and $C_{C2}$ is assumed in the simulation.
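The reasoning behind the RTZ coding can be illustrated with a short Python sketch. It is a simplified model, not the encoder used on the chip: the coupling error is assumed to follow a first-order capacitive divider proportional to the net level change on the feedback line, the supply is normalized to 1, and the pulsewidth encoding and decoding are not modeled. All function names and the example bit sequence are hypothetical.

```python
def nrz_net_transitions(bits):
    """Net level change on a feedback line per cycle when levels are held (NRZ)."""
    prev = 0
    out = []
    for b in bits:
        out.append(b - prev)   # +1, 0, or -1 depending on consecutive codes
        prev = b
    return out


def rtz_net_transitions(bits):
    """Net level change per cycle when every pulse returns to zero (RTZ)."""
    # A '1' gives a rising and a falling edge inside the same cycle: +1 - 1 = 0.
    return [(1 - 1) if b else 0 for b in bits]


def coupled_error(net_transitions, c_c, c_total, v_dd=1.0):
    """First-order error per cycle from a capacitive divider (assumed model)."""
    return [c_c / c_total * v_dd * t for t in net_transitions]


d1 = [1, 0, 0, 1, 1, 0]   # hypothetical sequence on one feedback line
print(coupled_error(nrz_net_transitions(d1), c_c=2e-15, c_total=2e-12))
print(coupled_error(rtz_net_transitions(d1), c_c=2e-15, c_total=2e-12))
```

In the held-level case the error changes sign and magnitude with the code sequence, which makes it a nonlinearity; in the return-to-zero case the net level change is zero in every cycle, so only incomplete settling of the pulse can leave a residue.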
IV. IMPLEMENTATION AND EXPERIMENTAL RESULTS
A prototype image sensor is implemented with a 0.18-μm CIS technology. The die micrograph of the fabricated sensor is shown in Fig. 11 . Column-parallel ADC arrays are located on a single side below the pixel array.
Measured differential nonlinearity (DNL) and integral nonlinearity (INL) plots of the ADC are shown in Fig. 12. A 1-Hz sinusoidal signal is applied through a low-pass filter for this measurement. The maximum DNL is less than 0.4 LSB at 13-b resolution without missing codes. The maximum INL is +0.6/−3.2 LSB. A maximum step of 2 LSB is observed in the INL at the output codes of the comparator's decision levels, which are $V_{RCH}$ and $V_{RCL}$ for a 1.5-b/stage algorithm. This step corresponds to 0.024%. The linearity can be further improved by a digital error correction with external processing using the upper 6-b RB output if necessary.

Fig. 13 shows the noise floor as a function of incident light input. In this result, 1 LSB corresponds to 122 μV with a 1-V reference at 13-b resolution. The measurements were performed at room temperature with an IR cut filter. The dark random noise is 2.34 LSB, which corresponds to 285 μV rms. The conversion gain obtained from the cross point where the photon shot noise becomes one electron is 61 μV/e⁻.

The measured VFPN is shown in Fig. 14, including a part of the acquired image, which is averaged over 500 frames and displayed with a digital gain of 256 to emphasize the VFPN. Extremely low VFPN is achieved by the digital CDS using a high-linearity and high-resolution cyclic ADC together with the capacitor initialization technique. The measured VFPN is 6.1 μV rms, or 0.1 e⁻ rms with the pixel conversion gain of 61 μV/e⁻.

The sensor operates with an external clock generated by a PLL on an FPGA, so that it can work over a wide range of frame rates. Fig. 15 shows the measured temporal noise performance of the sensor as a function of its frame rate. The cyclic ADC operates with low noise, as small as 2.5 e⁻ rms, up to 390 fps. At frame rates higher than the design specification of 390 fps, the increase in rms random noise may be caused by insufficient settling time of an internal amplifier.
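The unit conversions used in these measurements can be reproduced with a short Python check. All inputs are the values stated above; the variable names are illustrative.

```python
v_ref = 1.0                      # reference voltage in volts
bits = 13
lsb_uv = v_ref / 2 ** bits * 1e6             # size of 1 LSB in microvolts

conv_gain_uv_per_e = 61.0                    # pixel conversion gain
dark_noise_uv = 2.34 * lsb_uv                # dark random noise in microvolts rms
dark_noise_e = dark_noise_uv / conv_gain_uv_per_e
vfpn_e = 6.1 / conv_gain_uv_per_e            # VFPN in electrons rms
inl_step_pct = 2 / 2 ** bits * 100           # 2-LSB INL step as a percentage

print(f"1 LSB = {lsb_uv:.0f} uV")            # 122 uV
print(f"dark noise = {dark_noise_uv:.1f} uV rms ({dark_noise_e:.1f} e- rms)")
print(f"VFPN = {vfpn_e:.1f} e- rms")         # 0.1 e- rms
print(f"INL step = {inl_step_pct:.3f} %")    # 0.024 %
```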
The noise analysis of the proposed cyclic ADC was performed by means of hand analysis and SPICE simulation. In the noise analysis, the noise power spectral density of each working phase (shown in Fig. 3) is calculated from the equivalent noise model and noise sources. The total input-referred noise was obtained by summing the input-referred noise contributions of all the cycles. Only the thermal noise is considered. The hand-calculated total input-referred noise with a sampling capacitor of 1 pF is 110 μV rms, which agrees well with the simulation. The measured ADC noise of 2.5 e⁻ rms corresponds to 153 μV rms with the conversion gain of 61 μV/e⁻. The difference between the analysis and the measurement may result from the 1/f noise generated in an internal amplifier and the external noise coupled from the power supply lines, which are not taken into account in the analysis.

Fig. 16 shows a raw image taken by the prototype image sensor. For comparison, an image taken at a lower frame rate of 30 fps is overlaid on the image taken at 300 fps. A chopper blade rotating at 5 Hz is captured clearly in the image taken at 300 fps. On the other hand, the image taken at 30 fps shows a large motion blur.

The specifications and characteristics are summarized in Table I. The total area of the chip is 5 mm × 5 mm. The unusual number of vertical pixels, 428, was chosen to fit the imager into the predetermined chip size for the prototype implementation. A low rms random temporal noise of 4.5 e⁻ rms is attained at the frame rate of 360 fps. Even when the frame rate is increased to 390 fps, the noise remains at a low level of 4.9 e⁻ rms, resulting in a wide DR of 71 dB. The measured ADC noise of 2.5 e⁻ rms is much smaller than the noise including the pixel source follower, which suggests that a lower noise design is possible in future developments by optimizing the in-pixel transistors and the sampling impedance for the pixel outputs.
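Two of the figures in this discussion lend themselves to a quick numerical check, sketched below in Python. The quadrature subtraction assumes that the unmodeled noise (1/f and supply coupling) is uncorrelated with the thermal noise, which the text does not state, and the blade-rotation estimate assumes an exposure equal to one full frame period, so it is an upper bound on the blur. The names are hypothetical.

```python
conv_gain = 61e-6                  # volts per electron
adc_noise_v = 2.5 * conv_gain      # measured ADC noise, 2.5 e- rms, in volts
calc_noise_v = 110e-6              # hand-calculated thermal noise in volts

# Assumption: uncorrelated sources add in quadrature, so the unmodeled part is
# the root of the difference of squares.
extra_noise_v = (adc_noise_v ** 2 - calc_noise_v ** 2) ** 0.5


def blade_rotation_deg(rotation_hz, fps):
    """Degrees the blade turns during one frame period."""
    return 360.0 * rotation_hz / fps


print(f"measured ADC noise = {adc_noise_v * 1e6:.1f} uV rms")   # 152.5 uV rms
print(f"unmodeled noise = {extra_noise_v * 1e6:.0f} uV rms")    # 106 uV rms
print(blade_rotation_deg(5, 300))   # 6.0
print(blade_rotation_deg(5, 30))    # 60.0
```

Under these assumptions the unmodeled sources would be comparable in size to the thermal noise, and the blade turns ten times farther per frame at 30 fps than at 300 fps.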
Table II shows a comparison of the primary characteristics for recent imagers with high-resolution column ADCs. Owing to the nature of cyclic ADCs, a high resolution of 13 b is attained with a high frame rate of 390 fps. The highest DR is achieved using a low-noise single-ended cyclic ADC.
V. CONCLUSION
In this paper, a high-speed low-noise high-DR CIS with a column-parallel single-ended cyclic ADC has been presented. Extremely low VFPN of 0.1 e⁻ rms and 0.7 e⁻ p-p has been achieved using the precise digital CDS technique. Several techniques proposed in this paper lead to the low noise of 4.9 e⁻ without any signal preamplification and to the resulting high DR of 71 dB. The cyclic ADC enables a high resolution of 13 b while maintaining a high frame rate of 390 fps. The proposed single-ended topology overcomes the problem of the large pixel pitch required by prior-art cyclic ADCs. The application to a 5.6-μm pixel pitch was demonstrated for the single-side column. The design can be applied to a 2.8-μm pixel pitch with no modification other than a double-side disposition. Furthermore, it should be noted that this cyclic ADC can be implemented in finer pitches for high-resolution image sensors. This is because the dominant contribution to the limiting pitch is the capacitor, which can be narrowed by selecting another material or an advanced process technology.
