This article presents a prototype of a CMOS phase sensor for high-accuracy (1 Angstrom) heterodyne interferometry. A switched-integrator realization of a lock-in pixel for the 4-bucket phase detection algorithm is described and illustrated by experimental results. Factors that limit the accuracy of this implementation and possible ways to improve it are discussed.
INTRODUCTION
To increase the density of electronic components on integrated circuits, the semiconductor industry is trying to decrease the wavelength of the light used in the lithography process. Extreme ultraviolet lithography (EUVL) aims to use a wavelength of 13.4 nm in order to manufacture wafers with feature sizes smaller than 50 nm. 1 Most of the materials used for lenses in classical refractive projection optics become almost opaque at this wavelength, which forces a move to reflective projection optics. To perform properly, the mirrors of such a projection system must be made with sub-nanometer figure accuracy. Although tools exist that can measure the mirror profile in the high and middle spatial frequency ranges, none of them is suitable for measuring low (< 1 mm⁻¹) spatial frequencies in an optical shop environment. Such a device, based on heterodyne interferometry, was designed at TU Delft. 2, 3
The operating principle of the interferometer is shown in Fig. 1. A light source system uses acousto-optical modulators to produce the reference and object beams. Each of these beams is amplitude modulated, but at a different frequency. With the help of optical fibers, the reference beam shines directly on the sensor, thus forming a spherical reference wavefront. The light of the object beam first shines on the mirror, which reflects the wavefront back and focuses it near the tip of the reference fiber. The light then propagates further to the sensor. Light reflected from different parts of the mirror reaches different parts of the sensor. Because the object and reference beams interfere, every point of the detection plane sees an AM optical signal with a beat frequency equal to the difference of the modulation frequencies. Because the optical path difference (OPD) between the object and reference beams varies across the detection plane, the phase of the resulting AM signal varies as well.
It can be shown 2 that the resulting phase difference is proportional to the OPD. The profile of the mirror can then be restored using a back-propagation algorithm. The current optical setup, designed by Luke Krieg and described in his Ph.D. thesis, 3 can use different types of sensors. With a Kodak CCD sensor, a measurement accuracy of 4 nm has been achieved. To achieve sub-nanometer accuracy, a special phase sensor should be used. This article describes our progress in the design of such a sensor.
THEORETICAL BACKGROUND OF PHASE SENSING
As described earlier, in the ideal case each pixel of the sensor receives a sinusoidal optical signal. The phase of this sinusoid (relative to some reference, for instance an external clock or one of the pixels) should be calculated with an accuracy of 2π/10⁴ rad to obtain the desired profile measurement accuracy of 0.1 nm. A number of phase sensing algorithms are known in interferometry. 4, 5 The majority of these algorithms can be divided into two groups: global and local. Global algorithms can use information from only one frame of the intensity field, whereas local ones use a series of intensity measurements for each pixel. In our application we are not interested in measuring dynamic events; on the contrary, it is desirable to eliminate the influence of fast-changing factors such as vibration or air turbulence. Thus we have chosen to use local algorithms.
Figure 1. Operational principle of a heterodyne interferometer. An interferometric fringe pattern is formed in the detection plane due to the optical path differences (OPD) traveled by the object and reference beams.
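Any local algorithm can be expressed in terms of phase shifts. 5 Namely, suppose the signal is a periodic function of the phase and is written in the form accepted in interferometry as
$$ I(\varphi) = a\,[1 + \Gamma\, s(\varphi)], \tag{1} $$
where the signal's DC component a is called the bias and the coefficient Γ is the fringe contrast. The function s is a cosine in the simplest case, but may also contain other harmonics. Then any phase calculation formula using N samples I_k with phase shift Δ,
$$ I_k = I(\varphi + k\Delta), \qquad k = 0, \ldots, N-1, \tag{2} $$
can be written as
$$ \tilde{\varphi} = \arg \sum_{k=0}^{N-1} c_k I_k = \arctan \frac{\sum_{k} b_k I_k}{\sum_{k} a_k I_k}, \tag{3} $$
where c_k = a_k + i b_k are some complex numbers. Usually, the coefficients c_k and the phase shift Δ are chosen so as to remove as many harmonics as possible from the Fourier series of the sum
$$ S(\varphi) = \sum_{k=0}^{N-1} c_k\, I(\varphi + k\Delta). \tag{4} $$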
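In heterodyne interferometry, the optical signal is a periodic function of both phase and time, with frequency f = 1/T,
$$ I(\varphi, t) = a\,[1 + \Gamma\, s(\varphi + 2\pi f t)], \tag{5} $$
and thus the phase shift Δ corresponds to a time shift τ = Δ/(2πf), which gives
$$ \tilde{\varphi} = \arg \sum_{k=0}^{N-1} I(\varphi, k\tau)\, e^{-2\pi i k/N}, \qquad \tau = \frac{T}{N}, \tag{6} $$
for the N-bucket algorithm.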
To increase the accuracy of the algorithm without increasing the sampling frequency fN, one can repeat the measurements M times and average the results. For the N-bucket algorithm this is equivalent to continuing the sampling and collecting MN samples with the same time shift:
$$ \tilde{\varphi} = \arg \sum_{k=0}^{MN-1} I(\varphi, k\tau)\, e^{-2\pi i k/N} = \arg \sum_{n=0}^{N-1} e^{-2\pi i n/N} \sum_{m=0}^{M-1} I\bigl(\varphi, (mN+n)\tau\bigr). \tag{7} $$
One can see from this formula that the algorithm effectively averages the samples in N points and then calculates the phase (hence the name N-bucket). For an ideal sinusoidal signal, each bucket receives M copies of the same sample and thus grows linearly with M.
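The bucket accumulation described above can be illustrated with a minimal Python sketch. It assumes an ideal cosine signal, a time shift of T/N between samples, and bucket weights exp(-2πik/N); the function names and all numerical values are hypothetical and serve only as an illustration.

```python
import cmath
import math


def n_bucket_phase(samples, n_buckets):
    """Estimate the phase from M*N samples taken with a time shift of T/N."""
    # Sample number k is accumulated in bucket number (k mod N).
    buckets = [0.0] * n_buckets
    for k, value in enumerate(samples):
        buckets[k % n_buckets] += value
    # Weight bucket n with exp(-2*pi*i*n/N) and take the argument of the sum.
    total = sum(b * cmath.exp(-2j * math.pi * n / n_buckets)
                for n, b in enumerate(buckets))
    return cmath.phase(total), buckets


def ideal_sample(phase, k, n_buckets, bias=1.0, contrast=0.5):
    """Sample k of an ideal signal a * (1 + Gamma * cos(phase + 2*pi*k/N))."""
    return bias * (1.0 + contrast * math.cos(phase + 2.0 * math.pi * k / n_buckets))


true_phase = 0.7   # rad, hypothetical value
N, M = 4, 25       # number of buckets and repetitions, hypothetical values
samples = [ideal_sample(true_phase, k, N) for k in range(M * N)]
estimate, buckets = n_bucket_phase(samples, N)
print(f"true phase {true_phase:.6f} rad, estimated phase {estimate:.6f} rad")
```

For the ideal signal every bucket holds M times the value of a single sample, so the bucket contents grow linearly with M while the recovered phase stays the same.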
LOCK-IN PIXELS WITH FOUR INTEGRATORS
In practice, the phase sensing algorithm described in the previous section can be implemented in two ways: one can either acquire the whole sequence of MN signal samples from a photodetector in each pixel and then process them "off-line", or perform the collection in N buckets directly on chip and then acquire only N values, from which the phase can be calculated. We refer to the first approach as a fast sampling type camera (FSTC) 6, 7 and to the second as a lock-in camera, a term coined by Lange and Seitz. 8, 9 The main advantage of the FSTC approach is its independence of the chosen algorithm, as it provides the researcher with a "snapshot" of the sampled signal. However, it severely limits the range of possible signal frequencies, as the data-transfer speed cannot be higher than the sensor's frame rate. In the second approach this restriction is alleviated, because sampling and collection of the samples in buckets are performed on chip and can be done at a high frequency. Thus the same number of samples can be collected in a shorter period of time, which is important when studying dynamic events or in systems with a light source of low temporal coherence.
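For our phase sensor we have chosen the lock-in approach with the 4-bucket algorithm. In this case, formula (7) simplifies to
$$ \tilde{\varphi} = \arctan \frac{I_3 - I_1}{I_0 - I_2}, \tag{8} $$
where I_n denotes the content of the n-th bucket.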
The choice of N = 4 was made because of the symmetry of the resulting formula, which contains only sums and differences of the signal samples. In integrated circuit design, where the absolute values of parasitic effects (a leakage current, for instance) cannot be known a priori, precautions can be taken to make them equal to a certain extent. Their influence is then removed by the symmetrical formula.
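The cancellation of equal parasitic contributions can be checked numerically. The sketch below assumes the sign convention I_k = a[1 + Γ cos(φ + kπ/2)] for the four buckets; the function names and the offset values are hypothetical.

```python
import math


def four_bucket_phase(b0, b1, b2, b3):
    """Phase from four buckets taken at phase shifts 0, pi/2, pi and 3*pi/2."""
    return math.atan2(b3 - b1, b0 - b2)


def ideal_buckets(phase, bias=1.0, contrast=0.5):
    """Bucket values of an ideal cosine signal (assumed sign convention)."""
    return [bias * (1.0 + contrast * math.cos(phase + k * math.pi / 2.0))
            for k in range(4)]


phase = -1.2   # rad, hypothetical value
clean = ideal_buckets(phase)

# An equal parasitic contribution in every bucket (e.g. matched leakage)
# drops out of both differences.
matched = [b + 0.3 for b in clean]

# A contribution that differs in one bucket does not cancel.
mismatched = [b + d for b, d in zip(clean, (0.30, 0.31, 0.30, 0.30))]

error_matched = four_bucket_phase(*matched) - phase
error_mismatched = four_bucket_phase(*mismatched) - phase
print(f"phase error, matched offsets:    {error_matched:.2e} rad")
print(f"phase error, mismatched offsets: {error_mismatched:.2e} rad")
```

Only the mismatch between the buckets contributes to the phase error, which is why matching matters more than the absolute size of the parasitic effect.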
A schematic of one pixel of the prototype phase sensor is shown in Figure 2. Conversion of the light intensity to a photocurrent is performed by a photodiode D. The photodiode current and its internal capacitance depend on the voltage across its terminals in a non-linear way. Thus, to keep the harmonic distortion of the signal low, the photodiode is held at a constant reverse bias by an operational amplifier. The reverse bias voltage V_pb is set at the non-inverting input of the op-amp. In this configuration the photocurrent is linearly proportional to the optical signal (Fig. 2). The maximum charge that can be stored on each capacitor is defined by the photobias and op-amp saturation voltages. This value limits the maximum measurement time.
The advantage of this schematic is that it uses no current mirroring or copying, which would increase the effect of mismatches due to fabrication tolerances. Sharing one op-amp between the four integrators provides better matching, which is limited only by capacitor matching and parasitic elements on the IC. The disadvantage is that, although the op-amp characteristics are of low importance during any one of the four integration phases, a low slew rate or a poor gain-bandwidth product can affect the performance during switching from one capacitor to another.
This schematic does not include a readout stage, because in the prototype chip we are interested only in continuous monitoring of the op-amp output. The stored photo-charge is read out by setting the clocks Φ_i high in sequence, or simply by considering the pixel output during the last period.
It should be noted that the circuit realizes a slightly different algorithm, because the samples are represented by integrals of the signal over a time interval of finite length τ. This introduces an additional error source into the algorithm. A theoretical error analysis for this case was presented in another paper.
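The effect of replacing point samples by integrals can be illustrated with a short numerical sketch. It is not the error analysis referred to above; it only assumes a pure cosine signal with period T = 1, integration windows of length τ starting at kT/4, and the 4-bucket sign convention I_k = a[1 + Γ cos(φ + kπ/2)]. All names and values are hypothetical.

```python
import math


def integrated_bucket(phase, k, tau, bias=1.0, contrast=0.5, steps=2000):
    """Integrate a*(1 + Gamma*cos(phase + 2*pi*t)) over [k/4, k/4 + tau], T = 1."""
    start = 0.25 * k
    dt = tau / steps
    total = 0.0
    for j in range(steps):
        t = start + (j + 0.5) * dt   # midpoint rule
        total += bias * (1.0 + contrast * math.cos(phase + 2.0 * math.pi * t)) * dt
    return total


def four_bucket_phase(b0, b1, b2, b3):
    """Phase from four buckets taken at phase shifts 0, pi/2, pi and 3*pi/2."""
    return math.atan2(b3 - b1, b0 - b2)


phase = 0.4   # rad, hypothetical value
tau = 0.2     # integration time as a fraction of the period T, hypothetical
buckets = [integrated_bucket(phase, k, tau) for k in range(4)]
offset = four_bucket_phase(*buckets) - phase
expected_offset = math.pi * tau   # pi * f * tau with f = 1/T = 1
print(f"phase offset {offset:.6f} rad, pi*f*tau = {expected_offset:.6f} rad")
```

For a pure sinusoid the integration shifts the recovered phase by the constant πfτ, which is the same for every pixel, and reduces the effective contrast by the factor sin(πfτ)/(πfτ). Higher harmonics of a non-sinusoidal signal are weighted differently, so the resulting error depends on the signal shape.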
EXPERIMENTAL RESULTS
A prototype device was manufactured by DIMES (Delft Institute of Microelectronics and Submicron Technology) of TU Delft in a standard 1.6 μm CMOS (DIMOS-01) process. Compared with CCD technology, which is often used in imaging and light sensing applications, a standard CMOS process is optimized for electronics, not for photocharge collection, and has a higher noise level. On the other hand, CMOS allows integration of different functions on the same chip, and is currently the cheapest and most rapidly evolving technology. A special process combining CCD and CMOS features would be optimal, although more expensive.
The chip has been designed in accordance with the needs of the application. Due to the high asphericity of EUV-lithography mirrors, the fringe pattern produced by interference with a spherical reference wavefront is very dense. The size of the photosensitive part of each pixel should not exceed one half of a fringe width to avoid aliasing. The pixel pitch, however, can be bigger, provided the number of "skipped" fringes is known. 3 For the layout, we have used numbers from Klaver's analysis. 10
The prototype chip is a 4 × 4 matrix of identical lock-in pixels (see Fig. 3). Each pixel measures 791.6 × 333.3 μm² and has an N-well-substrate photodiode of 11 × 33 μm² as its photosensitive part. A well-substrate photodiode has a higher sensitivity than a diffusion-well photodiode for the red light used in the optical setup (He-Ne laser, λ = 631 nm). The measured quantum efficiency η was 39% at this wavelength. Each pixel's output is connected to an output pad. Bias voltages and clocks are common to all pixels. A transmission gate (TG) of twice the minimal transistor size was used as a switch. Two "dummy" TGs of minimal size, driven by inverted clocks, surround every switch to minimize the charge injection effect. Poly-metal1-metal2 "sandwich" capacitors of 1 pF were used as storage capacitors. The rest of the chip area is occupied by a low-noise 3-stage op-amp with p-MOSFET input gates. The whole pixel area except the photodiode is shielded from the optical signal by an additional metal layer.
The chip was mounted on a custom PCB (Fig. 4), which generates the 4-phase clocks from a master clock of the optical setup. The integration time τ of each phase is set by the duty cycle of the master clock. The board can also generate a …
To test the chip performance, a simpler optical setup based on an LED and an optical chopper was used. Depending on the distance from the LED to the chopper's disk, the optical signal generated by the setup varied from rectangular to sinusoidal. The phase difference between the signals seen by different pixels arose from the finite sizes of the LED and the chip. To check the integration linearity, a constant (unmodulated) light source with known intensity was also used.
The chip performance was tested against the two main characteristics that define the accuracy of the phase measurement: charge integration and charge conservation. To validate the charge accumulation mode, a rectangular optical signal (light/no light) was generated. The results show linear integration behavior (see Fig. 5). The slope of each pixel's output, monitored with an oscilloscope, corresponded to the measured light intensity. Linearity was achieved over the whole range of the output voltage.
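The relation between the light intensity and the slope of the integrator output can be sketched as follows, assuming an ideal integrator (slope equal to photocurrent divided by storage capacitance) and the values quoted in this article for the quantum efficiency, the wavelength and the storage capacitor. The optical power on the photodiode is a hypothetical value chosen only for illustration.

```python
PLANCK = 6.62607015e-34             # J*s
LIGHT_SPEED = 299792458.0           # m/s
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def photocurrent(optical_power, wavelength, quantum_efficiency):
    """Photocurrent in A: one electron per absorbed photon, scaled by eta."""
    photon_energy = PLANCK * LIGHT_SPEED / wavelength
    return quantum_efficiency * ELEMENTARY_CHARGE * optical_power / photon_energy


def integrator_slope(current, capacitance):
    """Output slope in V/s of an ideal integrator: dV/dt = I / C."""
    return current / capacitance


power = 1e-12        # W on the photodiode, hypothetical value
wavelength = 631e-9  # m, as quoted in the text
eta = 0.39           # measured quantum efficiency, as quoted in the text
cap = 1e-12          # F, storage capacitor, as quoted in the text

current = photocurrent(power, wavelength, eta)
slope = integrator_slope(current, cap)
print(f"photocurrent {current * 1e15:.1f} fA, output slope {slope:.3f} V/s")
```

Under these assumptions the slope scales linearly with the optical power, which is the behavior the linearity test checks.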
Acceptable charge conservation was obtained for small differences of accumulated charge (Fig. 6). For big voltage differences (> V_pb) and small gaps between the accumulation phases (τ ≈ T/4), the inverting input of the op-amp cannot go lower than ground during the transient, which leads to charge losses. This effect is especially noticeable for a rectangular-wave signal or a sinusoidal signal with high contrast. Increasing the gap between the clocks or decreasing the signal contrast helps to avoid the problem.
DISCUSSION
There are several factors that limit the maximum achievable accuracy of the phase measurement with the described prototype: poor SNR at the readout stage, the finite slew rate of the op-amp, asymmetrical charge injection, and device mismatch. Figure 7 shows experimental data obtained with a constant light source and a 4 kHz master clock frequency. A spurious phase was measured as a result of capacitive coupling of the clock signal to the capacitors. A close-up shows a high noise level at the output, caused by external electromagnetic fields. From the linear growth of the stored voltages we conclude that this noise affects only the read-out values and not the 4-bucket operation. However, the read-out uncertainty limits the phase measurement accuracy to 2π/300. Proper shielding of the chip and PCB can be used to reduce the readout noise. A better solution could be an improved read-out circuit with an on-chip ADC.
The finite slew rate of the op-amp limits the maximum frequency of the measurements. This in turn reduces the total number 4M of samples acquired before the op-amp saturates, and thus reduces the accuracy. An improved output stage of the op-amp is a way to deal with this problem.
The charge injection effect is inevitable in switching analog ICs. The amount of charge injected into a low-impedance node when a MOSFET is switched off depends on a number of factors and is difficult to predict. There exist different techniques to minimize it; we have used dummy transistors in our chip. The experiments were made in the absence of a light signal, with different clock frequencies, to estimate the amount of injected charge. Although the average spurious current produced by charge injection at a 1 kHz frequency is comparable with the photodiode dark current (about 150 fA), it appeared to be asymmetrical for some pixels (see Figure 8). This can be the result of mismatch between the switch transistors, whose dimensions were close to the minimal ones. Careful and symmetrical design of the switches should exclude this factor.
Figure 7. Pixel's output at constant light and 4 kHz sampling frequency. The light intensity was high enough to saturate the sensor in 25 periods. The first plot presents the spurious phase calculation due to master clock coupling to the photodiode biasing. The other two plots show close-ups of the same data. The high noise level due to interfering electromagnetic fields does not affect the charge conservation and the linear growth of the voltage differences.
Figure 8. Output of two pixels with asymmetrical charge injection. The incoming light intensity is zero; the sampling frequency is 1 kHz for the first plot and 2 kHz for the second one. The plot appears as a thick line because of the device mismatch. The fact that the slope is proportional to the frequency indicates that the total growth is due to the charge injection.
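The proportionality between the output slope and the clock frequency follows from a simple model in which every clock period deposits the same net charge on a storage capacitor. The sketch below assumes a spurious current of 150 fA at 1 kHz (taken as equal to the quoted dark current, since the two are described only as comparable) and the 1 pF storage capacitor; the function names and the fixed-charge-per-period model are illustrative assumptions, not measured data.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def charge_per_period(spurious_current, clock_frequency):
    """Net injected charge per clock period, in C, for a given average current."""
    return spurious_current / clock_frequency


def output_slope(charge, clock_frequency, capacitance):
    """Output slope in V/s if the same charge is injected in every period."""
    return charge * clock_frequency / capacitance


spurious_current = 150e-15   # A at 1 kHz, assumed equal to the dark current
cap = 1e-12                  # F, storage capacitor, as quoted in the text

q_inj = charge_per_period(spurious_current, 1e3)
electrons = q_inj / ELEMENTARY_CHARGE
slope_1k = output_slope(q_inj, 1e3, cap)
slope_2k = output_slope(q_inj, 2e3, cap)
print(f"charge per period {q_inj:.2e} C (about {electrons:.0f} electrons)")
print(f"slope at 1 kHz {slope_1k:.2f} V/s, at 2 kHz {slope_2k:.2f} V/s")
```

In this model doubling the clock frequency doubles the slope, whereas a contribution from dark current or stray light would not depend on the clock frequency.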
CONCLUSIONS
We have presented a working prototype of a CMOS phase sensor for heterodyne interferometry, based on a switched-integrator implementation of the 4-bucket algorithm. The prototype proves the appropriateness of the chosen approach. A phase measurement accuracy of 1° was achieved. An improved op-amp output stage and read-out electronics, together with a careful symmetrical layout of the chip, should increase the accuracy.
