ABSTRACT This paper presents a CMOS time-of-flight (ToF) range image sensor using high-speed lock-in pixels with background light canceling capability. The proposed lock-in pixel uses MOS gate-induced lateral electric field control of the depleted potential of a pinned photodiode to implement a multiple-tap charge modulator while achieving high-speed charge transfer for high time resolution. A ToF image sensor with 320 x 240 effective pixels is implemented using a 0.11-µm CMOS image sensor process. The ToF sensor has a range resolution of less than 12 mm without background light and 20 mm under background light for the range from 0.8 to 1.8 m and an integration time of 50 ms. The effectiveness of in-pixel background light canceling with a three-tap output pixel is demonstrated.
I. INTRODUCTION
In the last decade, interest in real-time range imaging has grown in the automobile industry, industrial control, robotics, security, ambient assisted living, medicine, gaming, virtual reality, and the sciences. There are many range imaging methods, including stereo vision, the light-section method and time of flight (ToF). ToF range imaging is a rapidly developing technique for video-rate 3-D imaging because it allows all pixels to calculate the range in parallel. Numerous developments of ToF range imagers have been reported [1]-[5]. Recent developments are often based on CMOS image sensor technology with pinned photodiode options [6]-[8], which is suitable for cost-effective mass production. Reported CMOS ToF range image sensors use single-tap or two-tap lock-in pixels, and two or four sub-frames are needed to produce a background-cancelled range image. These architectures, however, are weak at precise range measurement of moving objects, because the cancelling of background light is not guaranteed for moving objects. Lock-in pixels without charge draining gates also suffer from background light during the readout time. Another important issue for high range resolution in CMOS ToF range imagers is the speed of the lock-in pixels, which must be improved in order to use high-modulation-frequency light or short-duration light pulses.
To address these problems and requirements, this paper presents a CMOS ToF range image sensor using high-speed lock-in pixels with background light cancelling capability [9] . The proposed lock-in pixel structure uses a pinned photodiode and lateral electric field control for photo-charge modulation. This structure is suitable for implementing a multiple-tap charge modulator while achieving high-speed charge transfer for high time resolution. A QVGA-size ToF range image sensor is implemented using the proposed lock-in pixels for the proof of concept. This paper is organized as follows. The operation principle of the pixel and the design of an imager are described in Sections II and III, respectively. The experimental results of the implemented chip are presented in Section IV, and conclusions are given in Section V.
II. LOCK-IN PIXEL WITH LATERAL ELECTRIC FIELD CHARGE MODULATION
Indirect ToF measurement is based on periodic modulation of photo-generated charge by an electric field created by MOS gates. A pixel with this function is called a lock-in pixel or demodulation pixel. Conventional lock-in pixels often use a CCD-like structure [1]-[4], where photon detection and photo-generated charge transfer are done by a set of MOS gates and the direct vertical electric field under the gates. Lock-in pixels can also be implemented with pinned photodiodes [10]-[12] with multiple transfer gates based on standard CMOS image sensor technology [5]-[8]. Fig. 1 shows the structure of the two-tap photo-charge modulator proposed in this paper [9], [13]. In this modulator, called the lateral electric field modulator (LEFM), two sets of gates (G_1 and G_2) for applying a lateral electric field are used for charge transportation in the depleted pinned diodes. The gates are not used for transferring the charge into or out of them, but for controlling the electric field in the X-X direction in Fig. 1(a). To do this, a relatively small positive voltage (e.g., High = 1.8 V) and a negative voltage (e.g., Low = −0.8 V) are used for the operation. As shown in Fig. 2, which gives the cross-section (Fig. 2(a)) and the potential profile (Fig. 2(b)) along the A-A direction of Fig. 1(a), the depleted potential can be modulated as V_well while maintaining the potential barrier V_b to the gates when negative or relatively small positive voltage pulses are applied to the gates. Fig. 3 shows the structure of an LEFM with four taps for three output ports and a drain. Using four sets of gates (G_1, G_2, G_3 and G_D) for applying a lateral electric field in the channel region, the direction of photoelectron flow in the pinned diode is controlled, and time-resolved signal detection and accumulation are carried out in the three floating diffusions (FDs), FD_1, FD_2 and FD_3. Using the LEFM with three-tap outputs, the ToF of a light pulse under background light can be measured.
A similar technique of background light cancelling for a modified pinned photodiode-based ToF imager is reported in [14]. The timing diagram for the operation of the three-tap LEFM for background-cancelled ToF measurement using a small-duty-ratio light pulse is shown in Fig. 4.
where I_bg is the photocurrent generated by the background light.
In the next phase, the G_3 gates are activated and the other gates are deactivated. While G_3 is activated, a part of the signal light pulse is received, and the signal charge Q_3 stored in FD_3 is expressed as
where I_s is the photocurrent generated by the signal light pulse. Similarly, in the next phase, where the G_2 gates are activated and the other gates are deactivated, the remaining part of the light pulse is received and the signal charge Q_2 stored in FD_2 is expressed as
For the rest of the cycle, unwanted photo-generated charge produced in the photodiode by background light, or slow carrier components generated by the signal light, is drained. To do this, the G_D gates are activated and the other gates are deactivated. The number of photons contained in one pulse is very small; therefore, to intensify the signal, this cycle is repeated M times during the exposure period as shown in Fig. 4. The accumulated signals Q_1, Q_2 and Q_3 stored in FD_1, FD_2 and FD_3, respectively, are
Signals stored in the floating diffusions are sequentially read out to the output of the imager. During the readout time, the G_D gates of all the pixels are activated (while the other gates are deactivated) to prevent photoelectrons generated by background light from mixing into the signals stored in FD_1, FD_2 and FD_3. By subtracting the output Q_1 from Q_2 and Q_3, the background light can be cancelled, and from Eqs. (1), (2) and (3), the time of flight T_d can be estimated by
and the range is given by cT_d/2, where c is the speed of light.
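The three-tap estimate described above can be illustrated with a minimal sketch. The charge model used here is an assumption reconstructed from the surrounding text, not a quotation of Eqs. (1)-(4): FD_1 is taken to collect background only, FD_3 the early part of the light pulse plus background, and FD_2 the remaining part plus background, with equal gate windows of width `t_w`. The function and parameter names are hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def simulate_taps(i_s, i_bg, t_d, t0, t_w, m):
    """Assumed charge model for M accumulation cycles (illustrative only).

    i_s, i_bg : signal and background photocurrents
    t_d       : time of flight, assumed to satisfy 0 <= t_d <= t0
    t0        : light pulse width
    t_w       : gate window width, assumed equal for G_1, G_2 and G_3
    """
    q1 = m * (i_bg * t_w)                     # background only
    q3 = m * (i_s * (t0 - t_d) + i_bg * t_w)  # early part of the pulse
    q2 = m * (i_s * t_d + i_bg * t_w)         # remaining part of the pulse
    return q1, q2, q3


def estimate_tof(q1, q2, q3, t0):
    """Subtract the background tap Q_1 from Q_2 and Q_3, then take the ratio."""
    signal_total = q2 + q3 - 2.0 * q1
    if signal_total <= 0:
        raise ValueError("no signal charge above the background level")
    return t0 * (q2 - q1) / signal_total


def tof_to_range(t_d):
    """Range = c * T_d / 2, as stated in the text."""
    return C_LIGHT * t_d / 2.0
```

Under this model the background term cancels exactly, so the estimate does not depend on `i_bg`, `t_w` or the number of cycles; only the ratio of the two signal portions matters.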
III. PIXEL AND TOF IMAGER DESIGN
Lock-in pixels using the LEFM and a ToF imager are designed based on a dedicated 0.11-µm 1P 4M CMOS image sensor technology. To obtain high near-infrared sensitivity and maximize the carrier transfer speed in the LEFM, the pinned diode structure is created on top of a relatively lightly doped and thick p-type epitaxial layer. Photoelectrons generated deep inside the Si substrate move very slowly by thermal diffusion. To minimize these photoelectrons, a highly doped p-substrate is used to increase the recombination rate of electrons. The unit size of the LEFM with four taps shown in Fig , by applying High to G_1 or G_2, and Low to G_3 and G_D, a potential profile that transfers photo-generated carriers in the direction of x_1 is created, and by applying High to G_3 or G_D, and Low to G_1 and G_2, a potential profile that transfers photo-generated carriers in the direction of x_1' is created. The simulated potential profiles along the y-y' direction are shown in Fig. 5(b). By applying the High level to G_1, and the Low level to G_2, G_3 and G_D, the potential profile that attracts photoelectrons toward FD_1 is created, as shown by the red line in Fig. 5(b). Similarly, by applying High to G_2, and Low to the others, the potential profile that transfers photoelectrons to FD_2 is created. The results of Fig. 5(a) and (b) indicate that photoelectrons can be steered to FD_1, FD_2, FD_3 or the drain by changing the combination of the voltages applied to G_1, G_2, G_3 and G_D. Fig. 5(c) shows the potential profile along the x_2-x_2' direction when High (= 1.8 V) and Low (= −0.8 V) are applied to the G_1 gates. The peak depleted potential in the pinned diode is modulated by 0.55 V for a gate voltage difference of 2.6 V (= 1.8 V − (−0.8 V)). A potential barrier of 0.29 V to the gate still remains when the G_1 gates are set to High, which is sufficient to prevent leakage current from the channel to the gate [9]. and Low to the others.
The red dots indicate the initial positions of the electrons before they are moved by the electric field, and the red lines indicate the electron trajectories. The direction of the electron flow is controlled by the gate voltages, and the electrons are transferred to the floating diffusions over a long distance of 4 µm or more. The transfer time from the initial position to the floating diffusion or drain is simulated to be 400 ps.
The pixel circuit schematic of the designed lock-in pixel is shown in Fig. 7. In the pixel size of 16. electric field in the pinned diode for increasing the transfer speed as well as maintaining a sufficient photo receiving area. Each CM has its own microlens to enhance the effective fill factor, and the resulting fill factor of the photo receiving area relative to the pixel size is 50 to 60%, depending on the efficiency of the microlens. The range resolution of the lock-in pixel for ToF measurement strongly depends on the number of signal electrons, and the maximum resolution is dominated by the well capacity of the charge storage connected to the FDs. In order to increase the full well capacity, capacitors are connected to the FD node. Each capacitor is made of a native MOS transistor whose source and drain terminals are tied together. Because its threshold voltage is around 0 V and its capacitance depends only weakly on the gate voltage, the native MOS capacitor has good linearity over the applied voltage range. The unit size of the capacitor is 5.5 fF, and by connecting another capacitor of 11 fF by turning on the switch SW, the full-well capacitance can be increased to 16.5 fF. The total capacitance, including the parasitic capacitance of the 11 CMs connected in parallel, can be increased to approximately 40 fF. The large FD node capacitance is necessary to prevent saturation under background light. The entire ToF imager architecture is shown in Fig. 8. The ToF imager has a 320(H) x 240(V) effective pixel array and a 93(H) x 240(V) pixel array for testing, for a total of 413 x 240 pixels. A driver array for the gates (G_1, G_2, G_3 and G_D) of the LEFM pixels is placed at the upper side of the imager. All of the pixels and the light source operate globally during the exposure time, while the other control signals for the readout operation are not applied at all. After the exposure time, the readout operation is started.
Because electrons are accumulated in the FD nodes, the 3-transistor active pixel type of readout operation is used, i.e., the signal level is read out first and the reset level is read out afterwards. Each of the 3-tap outputs of a pixel is connected to its own column ADC, and the signals are converted to digital codes in parallel using column-parallel high-resolution folding-integration/cyclic ADCs [16]. Each signal is converted to a 17-bit digital code. The circuits and architecture are described in [16] and the details are not repeated here. The ToF imager works with a horizontal data scanning clock of 22.5 MHz. The three 17-bit digital data from each pixel are horizontally scanned and read out of the chip. The time for reading the signals of one row is 55 µs (= 413 x 3 x 44.4 ns) and the total readout time is 13.2 ms (= 55 µs x 240). With a photo-signal exposure time (charge accumulation time) of 50 ms and a readout time of 13.2 ms, the frame rate is 15.7 fps. A frame rate of 30 fps can also be used by setting the charge accumulation time to 20 ms.
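The readout arithmetic above can be checked with a short sketch. It assumes, as a simplification, that one 17-bit word is scanned out per clock period and that exposure and readout are strictly sequential with no other overhead; the function name is hypothetical.

```python
def readout_timing(columns, taps, rows, scan_clock_hz, exposure_s):
    """Return (row time, total readout time, frame rate) for sequential
    exposure followed by readout, ignoring any other overhead."""
    word_time = 1.0 / scan_clock_hz         # about 44.4 ns at 22.5 MHz
    row_time = columns * taps * word_time   # one word per tap per column
    readout_time = row_time * rows
    frame_rate = 1.0 / (exposure_s + readout_time)
    return row_time, readout_time, frame_rate


# Values quoted in the text: 413 columns, 3 taps, 240 rows, 22.5 MHz clock.
row_t, read_t, fps_50ms = readout_timing(413, 3, 240, 22.5e6, 50e-3)
_, _, fps_20ms = readout_timing(413, 3, 240, 22.5e6, 20e-3)
```

With these idealized assumptions the estimate for the 50 ms exposure comes out at roughly 15.8 fps, slightly above the quoted 15.7 fps, which would be consistent with a small amount of per-frame overhead that the sketch does not model.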
IV. EXPERIMENTAL RESULTS
A prototype ToF image sensor is fabricated using a 0.11-µm 1P 4M CMOS image sensor process. As described in Section III, a relatively thick, lightly doped p-type epitaxial layer is used to enhance the sensitivity and response of photo-carriers generated by near-infrared light. The size of the whole chip is 9.0 mm x 9.3 mm. To characterize the sensor chip, the following operating conditions are used. The light pulse width T_0, the cycle time of exposure T_c and the duty ratio R_D in Fig. 4 are 13 ns, 1 µs and 0.03, respectively.
The gate pulse width for G_1, G_2 and G_3 is 30 ns. The number of cycles M_a is about 50000 in one frame at 15.7 fps. The signal light source is composed of 96 near-infrared LEDs with a wavelength of 870 nm. As the background light source, CDM-T150W/942 is used to emulate sunlight. A lens with a focal length of 12.5 mm and an F-number of 2.0 is mounted on the sensor. An infrared bandpass filter is placed between the lens and the sensor. The bandwidth of the filter is 50 nm.
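Two of the operating numbers above follow from simple arithmetic, sketched here. The cycle count is the exposure time divided by the cycle time. The range span is an assumption of this sketch rather than a statement from the text: it supposes that the delay T_d must stay between 0 and the light pulse width T_0 for the charge ratio to remain valid, which gives a span of c T_0 / 2. Function names are hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def cycles_per_frame(exposure_s, cycle_time_s):
    """Number of accumulation cycles that fit into one exposure."""
    return round(exposure_s / cycle_time_s)


def range_span_m(light_pulse_width_s):
    """Range span covered when 0 <= T_d <= T_0 (assumed validity window)."""
    return C_LIGHT * light_pulse_width_s / 2.0


cycles = cycles_per_frame(50e-3, 1e-6)  # 50 ms exposure, 1 us cycle
span = range_span_m(13e-9)              # 13 ns light pulse
```

For the quoted conditions this gives 50000 cycles per frame and a span of a little under 2 m, which is compatible with the 0.8 to 1.8 m measurement range once a time offset is allowed for.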
As a fundamental characteristic of the pixel, the response of the three pixel outputs to constant DC light is measured as a function of the applied gate voltage. When the High levels for G_1, G_2, G_3 and G_D are changed in the range from 1.5 V to 2.5 V, the sensitivity ratios of the outputs, S_2/S_1 and S_3/S_1, change as shown in Fig. 9, where S_1, S_2 and S_3 are the sensitivities of the FD_1, FD_2 and FD_3 outputs, respectively. The Low level is fixed to −0.8 V. In order to cancel the background light component, the sensitivities of the three pixel outputs should be the same. However, as shown in Fig. 9, the sensitivity S_1 used for background light cancelling is larger than the others when the High level is relatively low. By increasing the High level above 2.2 V, the sensitivities of the three outputs become almost the same. For a High level of 2.4 V, S_2/S_1 and S_3/S_1 are 0.97 and 0.95, respectively. The reason for the reduced sensitivity in S_2 and S_3 is imperfect carrier transfer to the FD nodes, which may affect the performance of the ToF sensor, such as linearity, sensitivity and the resulting range resolution. In the following measurements, the High level for the gate voltages of G_1, G_2, G_3 and G_D is set to 2.4 V, although a High level of 1.8 V was sufficient in the simulation described in Section III. A sensitivity deviation still exists between S_2 and S_1 and between S_3 and S_1. To compensate for this sensitivity deviation and for the time offset due to the delay of the electronic circuits, Eq. (4) for estimating the ToF is modified and the range is calculated by
where R_12 and R_13 are S_2/S_1 and S_3/S_1, respectively, and T_os is the time offset.
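One plausible form of this compensated estimate is sketched below. It is an assumption, not a reproduction of the modified equation: the background tap Q_1 is scaled by R_12 and R_13 before being subtracted from Q_2 and Q_3, and T_os is simply added to the estimated delay (the sign and placement of T_os are guesses). The function name is hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def estimate_range_calibrated(q1, q2, q3, t0, r12, r13, t_os):
    """Range estimate with per-tap sensitivity ratios and a time offset.

    r12 = S_2/S_1 and r13 = S_3/S_1 scale the background tap Q_1 to the
    background level expected in FD_2 and FD_3 (assumed model).
    """
    signal_total = q2 + q3 - (r12 + r13) * q1
    if signal_total <= 0:
        raise ValueError("no signal charge above the background level")
    t_d = t0 * (q2 - r12 * q1) / signal_total + t_os  # assumed offset sign
    return C_LIGHT * t_d / 2.0
```

With r12 = r13 = 1 and t_os = 0 the expression reduces to the uncompensated three-tap ratio, which is a useful sanity check on any implementation of this kind.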
To characterize the performance of the ToF imager, 30 frames of raw data are taken in range measurements with a white flat panel placed at distances from 0.8 m to 1.8 m in steps of 10 cm. Fig. 10 shows the relationship between the measured range and the real range. These data are the average of the measured results for 100 pixels in the center region of the pixel array. The slope and offset in Fig. 10 are calibrated at the distances of 0.8 m and 1.8 m by modifying T_0 for the slope and T_os for the offset. The maximum nonlinearity error, which is the maximum difference between the measured range and the real range normalized by the range span (= maximum range − minimum range), is 1.9%. Fig. 11 shows the range resolution, which is the standard deviation of the temporal deviation of the range, as a function of the real range. In this measurement, 30 consecutive range images and 100 center pixels are used, and the range resolutions with and without background light are shown. To obtain a large full well capacity and prevent saturation of the well, MOS capacitors are connected to the FD nodes. By turning on the switch (SW), the total capacitance of the FD nodes is set to 41 fF.
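The two figures of merit defined above can be written down as a minimal sketch. The helper names are hypothetical, and two details are assumptions of the sketch: the sample standard deviation is used, and the per-pixel temporal deviations are averaged over the selected pixels.

```python
from statistics import mean, stdev


def max_nonlinearity(measured, real):
    """Largest |measured - real| normalized by the range span (max - min)."""
    span = max(real) - min(real)
    return max(abs(m - r) for m, r in zip(measured, real)) / span


def range_resolution(frames):
    """Temporal standard deviation of the range, averaged over pixels.

    frames: list of frames, each a list with one range value per pixel.
    """
    per_pixel_sigma = [stdev(series) for series in zip(*frames)]
    return mean(per_pixel_sigma)
```

In the measurement described in the text, `frames` would hold the 30 consecutive range images restricted to the 100 center pixels, evaluated separately at each panel distance.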
From Fig. 11, a range resolution (sigma of range deviation) of 5.5 mm to 12.2 mm is obtained, depending on the distance to be measured, when the background light is turned off. Under a background illumination of 10 klx, the range resolution is from 7.2 mm to 20.3 mm, depending on the distance to be measured. Fig. 12 shows the comparison of the range resolution between measurement and theoretical calculation as a function of signal intensity. In this measurement, the signal intensity is controlled by the number of light pulses. The theoretical calculation is based on a formula that considers readout noise and photon shot noise due to signal and background light [1], [2]. The number of total signal electrons is denoted by N_S, which is the sum of the signal electrons stored in FD_2 and FD_3.
VOLUME 3, NO. 3, MAY 2015
The number of signal electrons stored in FD_2 is proportional to the ToF and is denoted by N_TOF. The number of electrons due to background light stored in FD_2 or FD_3 is denoted by N_B. The readout noise expressed as a number of electrons is denoted by N_R. Then, by considering the factors of background offset cancelling and the demodulation contrast, the formula to calculate the range resolution is expressed as
In the region where the number of light pulses is less than 10,000, the theoretical calculation shows good agreement with the measurement results for both cases, with and without background light. In the region of high signal intensity, where the number of light pulses is larger than 10,000, the measured values are larger than the theoretically calculated ones. The difference between the measurement and the theoretical calculation is due to the photo-response non-uniformity (PRNU) of the 100 pixels in the center region. Fig. 13 shows the range resolution as a function of the background light intensity. This range resolution is measured at a distance of 1 m using the white flat panel. The maximum range resolution is 8.9 mm for a background illumination of 14 klx. The sensor works reliably under a relatively strong background light level.
Figs. 14 and 15 show range images of a moving hand under a background illumination of 4000 lx. In Fig. 14, the hand is moving in the horizontal direction, and in Fig. 15, the hand is moving in the vertical direction. Figs. 14(a) and 15(a) show the case using in-pixel background cancelling, and Figs. 14(b) and 15(b) show the case using background cancelling with two consecutive frames. In the case of in-pixel cancelling, the background cancelling is done by Eq. (4) with the three parallel outputs from each pixel. In the case of cancelling with two frames, one frame receives the signal light and the other frame (the next frame) receives background light only. Therefore, for cancelling the background components, the outputs of FD_2 and FD_3 in the next frame, Q_2' and Q_3', are used. Then Eq. (4) in this case is expressed as
The range image taken with in-pixel cancelling does not have visible errors, whereas the image taken with background cancelling over two consecutive frames shows large motion artifacts due to the time difference between the two frames, as shown in Figs. 14 and 15.
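The difference between the two cancelling schemes can be illustrated with a small sketch. Both formulas are assumptions reconstructed from the description: the in-pixel version subtracts the background tap Q_1 of the same frame, while the two-frame version subtracts Q_2' and Q_3' taken from the following background-only frame. The numbers are arbitrary illustrative charges, not measured data.

```python
def tof_in_pixel(q1, q2, q3, t0):
    """Background taken from tap Q_1 of the same exposure."""
    return t0 * (q2 - q1) / (q2 + q3 - 2.0 * q1)


def tof_two_frame(q2, q3, q2_next, q3_next, t0):
    """Background taken from Q_2' and Q_3' of the next (signal-off) frame."""
    return t0 * (q2 - q2_next) / (q2 + q3 - q2_next - q3_next)


T0 = 13e-9
sig2, sig3 = 30.0, 70.0  # signal portions in FD_2 and FD_3
bg_now = 100.0           # background during the signal frame

# Static scene: the background is the same in both frames.
static = tof_two_frame(sig2 + bg_now, sig3 + bg_now, bg_now, bg_now, T0)

# Moving object: the background seen by this pixel changes before the
# second frame is captured, so the subtraction no longer cancels it.
bg_next = 130.0
moving = tof_two_frame(sig2 + bg_now, sig3 + bg_now, bg_next, bg_next, T0)

in_pixel = tof_in_pixel(bg_now, sig2 + bg_now, sig3 + bg_now, T0)
```

In this toy case the in-pixel estimate and the static two-frame estimate both return the true delay, whereas the two-frame estimate collapses once the background differs between frames, which mirrors the motion artifacts described for Figs. 14(b) and 15(b).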
A pixel that tolerates 150 klx of background light is reported in [17]. The reported pixel erases the common-mode charge by using anti-parallel connected capacitors while keeping the differential-mode charge. In this pixel, the background light component is rejected periodically during the signal integration time. This imager employs a large pixel size of 125 x 125 µm² to include large capacitors of 500 fF.
A similar approach to background light cancelling is reported in [5]. Anti-parallel capacitors are also used for erasing the background light charge. The integration time of the sensor is divided into several short sub-integration times to avoid saturation in the presence of strong background light. The outputs after each sub-integration are read out and combined. Owing to the sophisticated design with on-chip processing, the ToF sensor chip achieves high overall performance, particularly for background cancellation with a small pixel size of 10 x 10 µm². However, the chip consumes a relatively large power of 2.1 W because of the high sub-frame rate and the complicated circuits and processing.
Another technique of background light cancelling using hole injection is reported in [18]. In this technique, when both outputs are below the reference level, the electrons stored in the floating diffusion recombine with the injected holes, which keeps the FD from saturating. The reported sensor obtained correct range data under a background illumination of 40 klx. However, transistor mismatch may introduce some nonlinear range error in each pixel.
Compared with these previous works, the background cancellation of our proposed sensor is not sufficient for strong light such as direct sunlight. The present design of our ToF sensor is suitable for indoor use and a limited range of 1 to 2 meters. If the application allows these limitations, the present ToF sensor provides well-balanced performance in background cancellation capability, low motion artifacts for moving objects, and low power consumption due to the simple processing for range calculation.
Table 1 summarizes the specifications and characteristics of the implemented ToF imager. The performance of the ToF imager developed in this work is compared with that of recently reported ToF imagers using CMOS technology in Table 2. Because of the short light pulse modulation, which has a small duty cycle, the average illumination power is relatively small while a high range resolution of less than 12 mm is achieved. The ToF imager of the present paper cancels the influence of background light using only in-pixel high-speed modulation, leading to accurate and robust imaging of moving objects. This feature is useful for applications such as human interfaces with gesture recognition, robotic guidance and object tracking.
V. CONCLUSION
A CMOS ToF range imager using high-speed lock-in pixels based on pinned photodiode technology and featuring background light cancelling capability has been presented. A new structure for the lock-in pixels using lateral electric field control by MOS gates is used. A prototype with 320 x 240 effective pixels has been implemented using a 0.11-µm CMOS imaging process with a dedicated wafer for high NIR sensitivity and high-speed response. The ToF sensor has a range resolution of less than 12 mm without background light and 20 mm under background light for the range from 0.8 to 1.8 m and an integration time of 50 ms. The effectiveness of the background light cancelling pixel with a 3-tap output for moving objects is demonstrated. A variety of applications can be expected for the ToF imager, with its relatively high spatial (320 x 240 pixels) and range (<10 mm) resolutions, background light cancelling capability up to 14 klx, and accurate measurement of moving objects.
