We have developed a hybrid Active Pixel Sensor for detecting low energy X-rays. The sensor consists of a silicon diode detector array built on a high resistivity wafer and an SOI CMOS readout circuit, connected together by means of a unique 3D integration technology developed at MIT Lincoln Laboratory. In this paper we describe measurements of the sense node capacitance and the device depletion depth, along with corresponding simulations aimed at optimizing device performance. We also describe a race condition in the column decoder and identify ways to eliminate it in order to reduce fixed pattern noise.
INTRODUCTION
Enormous progress has been achieved recently in the development of Active Pixel Sensors (APS) for visible light imaging, leading to the proliferation of these devices into multiple new applications. They are replacing CCDs in many traditional applications such as imagers for digital cameras and camcorders. APS offer numerous advantages over CCDs: much higher readout rates, lower power consumption, and better radiation tolerance, to name just a few. In high energy astrophysics, though, most instruments that are built to detect soft X-rays still use CCDs as primary detectors. The main reasons for that are the very shallow depletion depth around the charge collecting potential well and the low fill factor of APS, typically much smaller than 100% if light focusing microlenses (efficient at visible light, but nonfunctional in X-rays) are not taken into account.
In order to overcome these deficiencies we are developing an Active Pixel Sensor for X-ray detection that aims to work in the soft X-ray band (0.3-10 keV) with high quantum efficiency. The detector is implemented using the 3-D integration technology developed at MIT Lincoln Laboratory (see Ref. 1 for details of this technology). Many features of the device have been described previously.
DEVICE DESCRIPTION
The device is an array of 256 × 256 pixels. The pixel size is 24 × 24 microns, which is typical for scientific X-ray sensors. It includes two separate tiers: a photodiode made on an n-type high resistivity substrate (3000 Ohm·cm) and an SOI CMOS readout circuit. On the side that is connected to the CMOS readout the diode has circular p+ islands, each of them 10 microns in diameter. They are surrounded by n+ channel stop regions that separate pixels. The wafer is thinned to 50 microns. The illuminated side is doped with phosphorus and then laser-annealed. The p+ region of each diode is connected by a tungsten plug to the gate of the NMOS transistor in the SOI readout circuit, implemented on a different wafer. An electron microscope photograph of the device cross section is shown in Fig. 1. The details of the integration of the photodiode wafer with the readout tier were described elsewhere (Ref. 1).
During operation the n+ regions of the diode are connected to a positive voltage in order to fully deplete the entire volume.
The SOI CMOS is fabricated in a 350-nm gate length technology with a 3.3 V operating voltage. The device schematic is shown in Fig. 2. Its pixel includes two stages. The first contains a sense node reset transistor M1 and a source follower. The second stage, separated by a large capacitor C1, is introduced in order to reduce reset noise, as described in Refs. 3 and 4. This in-pixel capacitor is formed between a doped polysilicon gate and SOI active regions, separated by a thin oxide (7.2 nm). We estimate a value of 787 fF from the layout.
[Figure 1: electron microscope photograph of the device cross section. Labels: connecting plug; SOI; detector tier, 50 micron.]

A significant advantage of the SOI technology is the possibility to use both n-channel and p-channel transistors in the pixel. We used p-channel transistors as reset switches, thus avoiding the problem of dealing with "soft reset" lag. In order to minimize parasitic leakage paths, an H-gate geometry was utilized for all the transistors in the pixel. We have also found that such transistors have substantially lower noise.
MEASUREMENT RESULTS
The entire device is functional, and optical images were taken at room temperature. Achieving satisfactory X-ray performance proved to be a much more challenging task for a number of reasons. A very detailed characterization of device parameters was undertaken in order to better understand the limitations of the current design.
X-ray performance
X-ray detection may be the most demanding test of the quality of a detector. In order to achieve high energy resolution, signal losses of any type have to be negligible. Since the signal is very small, the readout noise level also needs to be very low. When an entire array is clocked out, pixel non-uniformities of all kinds mask the intrapixel X-ray performance and make it difficult to diagnose the limits of the technology, which was the purpose of this work. For this reason we focused mainly on single pixel testing. We have been able to demonstrate very good X-ray performance, and that has allowed us to measure many parameters with an accuracy that is not achievable by other means. For the X-ray tests described here we chose a mode in which one pixel was reset once while the output signal was sampled multiple times. The total number of samples per reset was 260, with 16 of those samples taken during reset. A typical set of data points, covering one full period from one reset pulse to the next and containing an X-ray event, is shown in Fig. 3. The X-ray amplitude was determined by averaging 10 readout samples before the X-ray hit and 10 samples after it, and then taking the difference of the averages. In such a mode the reset noise is suppressed, since all the samples before and after an X-ray event are taken during one reset cycle and the sense node is not reset between those samples. This is analogous to Correlated Double Sampling (CDS) readout, although here the processing of double samples (or, in fact, multiple samples) is done digitally in the computer, not in the analog domain as is traditionally done. Moreover, in this mode most of the 1/f noise is eliminated, because an interval containing 20 samples is short compared to the time corresponding to frequencies below the 1/f knee. The white noise component is also significantly reduced by averaging repeated readout samples before and after an X-ray hit.
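The amplitude extraction described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the measurements: the function name, the synthetic trace, and the sign convention (a positive step for an X-ray hit) are assumptions, and the hit sample itself is excluded from both averages.

```python
def xray_amplitude(samples, hit_index, n_avg=10):
    """Digital CDS: mean of n_avg samples after the hit minus mean of n_avg before it.

    The sample at hit_index (the transition itself) is excluded from both averages.
    """
    if hit_index - n_avg < 0 or hit_index + 1 + n_avg > len(samples):
        raise ValueError("not enough samples around the hit")
    before = samples[hit_index - n_avg:hit_index]
    after = samples[hit_index + 1:hit_index + 1 + n_avg]
    return sum(after) / n_avg - sum(before) / n_avg


# Synthetic reset cycle of 260 samples (arbitrary ADC units, hypothetical values):
# flat baseline, one transition sample, then a flat level after the hit.
trace = [100.0] * 120 + [150.0] + [200.0] * 139
amplitude = xray_amplitude(trace, hit_index=120)
```

Applying the same function to a cycle without an X-ray, at the same sample position, gives one entry of the noise histogram.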
We determined the equivalent noise of this measurement by applying the same procedure that is used for X-ray pulse-height computation to readout sequences without X-rays. Every time an X-ray is detected, the reset cycle 10 reset pulses prior to this hit is analysed: the 10 samples before and the 10 samples after the sample that corresponds to the hit are averaged, and the averages are subtracted. A histogram of these nonexistent events produces a peak around zero, and its width is the noise of such a measurement, which was determined to be 12.6 electrons rms. Conversion into electrons is based on the location of the 5.89 keV line, a result that is described below. Figure 4 shows a histogram of the X-ray events accumulated when the chip was illuminated by an ⁵⁵Fe radioactive source. The characteristic Kα and Kβ emission lines of manganese at 5.89 keV and 6.49 keV are very well resolved. The energy resolution of the Kα line, 181 eV Full Width at Half Maximum (FWHM), is close to the Fano factor limited resolution of 162 eV for the readout noise level of 12.6 electrons, clearly indicating that charge collection efficiency is very high. This is especially remarkable since this is a histogram of all events exceeding a certain threshold; no selection of events has been made based on the analysis of the signal amplitude in the adjacent pixels. Our previous experience with X-ray detection in CCDs suggests that excluding the so-called split events (the ones that have charge in two or more neighboring pixels) can noticeably improve energy resolution.
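The Fano-limited figure quoted above can be checked with a short calculation that adds the Fano variance and the readout noise in quadrature. This is a sketch under stated assumptions: the Fano factor (0.12) and the mean energy per electron-hole pair in silicon (3.65 eV) are commonly used textbook values, not numbers taken from this paper.

```python
import math

FANO_FACTOR = 0.12   # assumption: typical value for silicon
W_EV = 3.65          # assumption: mean energy per electron-hole pair in Si, eV


def fwhm_ev(energy_ev, noise_e_rms, fano=FANO_FACTOR, w_ev=W_EV):
    """FWHM of a line at energy_ev for a given readout noise (electrons rms)."""
    fano_variance = fano * energy_ev / w_ev           # variance in electrons^2
    sigma_e = math.sqrt(fano_variance + noise_e_rms ** 2)
    gaussian_fwhm_factor = 2.0 * math.sqrt(2.0 * math.log(2.0))
    return gaussian_fwhm_factor * w_ev * sigma_e


limit_with_noise = fwhm_ev(5890.0, 12.6)   # Mn K-alpha with 12.6 e- rms noise
limit_fano_only = fwhm_ev(5890.0, 0.0)     # ideal noiseless detector
```

With these assumed constants, `limit_with_noise` comes out near 160 eV, consistent with the 162 eV quoted in the text; the exact value shifts by a few eV with the choice of Fano factor.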
Sense node capacitance and its components
Once the X-ray peak is found, it is easy to calculate the responsivity and the sense node capacitance of the detector. The responsivity was calculated to be 4.6 µV/electron, and the node capacitance is 24.7 fF when the voltage applied to the charge collecting photodiode is 18 V. In order to accurately determine the sense node capacitance one needs to know the gain of the entire chain between the sense node and the output. This was measured by finding the response of the output to a known signal applied to the reset drain while the reset gate is grounded. The sense node capacitance includes two components connected in parallel: the input capacitance of the readout circuit and the capacitance of the charge collecting photodiode. The capacitance of the photodiode is a strong function of the voltage applied to the p-n junction, while the circuit capacitance is decoupled from it (see the pixel schematic shown in Fig. 2). This presents an interesting opportunity to separate the contributions of these components by varying the voltage applied to the photodiode. The result of such a measurement is shown in Fig. 5. Each experimental point on this plot was produced by acquiring a substantial amount of data at a given voltage applied to the photodiode with the ⁵⁵Fe source illuminating the device. Capacitance values were extracted from the location of the 5.89 keV line. The capacitance data points were then approximated by an analytical expression describing capacitance as a function of voltage in the following form:
$$C_{node} = C_{circuit} + \frac{k_0}{(V + V_i)^{1/3}} \qquad (1)$$
where C_node is the total sense node capacitance, C_circuit is the capacitance of the circuit component, k_0 is a constant determined by junction parameters, V is the voltage applied to the photodiode, and V_i is the built-in voltage of the p-n junction. The above equation describes the capacitance of a junction with a linearly graded distribution of dopant (in parallel with the voltage-independent circuit component C_circuit), and it fits the experimental points reasonably well. The free parameters in the fitting process were C_circuit and k_0. The best fit value of the circuit capacitance C_circuit is 13.9 fF. We found that the more common expression for an abrupt junction produces a much poorer fit to the data. This makes sense once the details of the diode structure are taken into account. A schematic illustration of the photodiode cross section is shown in Fig. 6. The central p+ region is surrounded by a phosphorus doped n+ channel stop. The phosphorus implantation was made through a mask that was tapered near the edge in order to reduce the dopant concentration near the junction. That is the reason why Eq. (1) results in a much better fit than the expression for an abrupt junction. Analytical estimates indicate that the capacitance to the channel stop region surrounding the central p+ diode (denoted as C_ch-stop in Fig. 6) entirely dominates the diode capacitance (a total of 10.4 fF at 18 V applied to the diode), while the capacitance between the p+ region and the opposite n+ plate, C_depl in Fig. 6, constitutes approximately 1.5 fF when the diode is fully depleted. This was confirmed by detailed 2-dimensional simulations of the photodiode structure using Sentaurus TCAD. The distribution of the potential in the structure produced by such a simulation is shown in Fig. 7. The very high density of equipotential contour lines between the channel stop region and the p+ diode means that this is indeed the area responsible for most of the diode capacitance.
The simulation indicates that the bulk of the silicon is fully depleted and that the undepleted portion is very thin even at the middle of the channel stop region at the pixel boundary. Using this TCAD tool we studied how the diode capacitance could be reduced by changing the distance between the highly doped regions. The corresponding plot is shown in Fig. 8. Increasing the distance from 1 to 4 microns can reduce the capacitance by approximately a factor of 2, with most of the change occurring when the n+ and p+ layers are very close to touching each other. A more accurate simulation should, of course, take into account the 3D nature of the problem.
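A fit of this kind can be sketched as follows, assuming the standard linearly graded junction form C = C_circuit + k_0 / (V + V_i)^(1/3). With V_i held fixed, the model is linear in C_circuit and k_0, so ordinary least squares is sufficient. The built-in voltage (0.6 V), the bias points, and the synthetic k_0 below are hypothetical values chosen for illustration, not the measured data of Fig. 5.

```python
def graded_junction_capacitance(v, c_circuit, k0, v_i):
    """Node capacitance (fF) for a linearly graded junction plus a fixed circuit term."""
    return c_circuit + k0 / (v + v_i) ** (1.0 / 3.0)


def fit_c_circuit_and_k0(voltages, capacitances, v_i):
    """Least-squares fit of C = c_circuit + k0 * x, with x = (V + v_i)^(-1/3)."""
    xs = [(v + v_i) ** (-1.0 / 3.0) for v in voltages]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(capacitances) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, capacitances))
    k0 = sxy / sxx
    c_circuit = mean_y - k0 * mean_x
    return c_circuit, k0


V_I = 0.6                                  # assumption: built-in voltage, V
bias_v = [2.0, 4.0, 6.0, 10.0, 14.0, 18.0]  # hypothetical bias scan, V
# Synthetic, noiseless data generated from the model itself (k0 = 28.0 is hypothetical).
data_fF = [graded_junction_capacitance(v, 13.9, 28.0, V_I) for v in bias_v]
c_circuit_fit, k0_fit = fit_c_circuit_and_k0(bias_v, data_fF, V_I)
```

Replacing the exponent 1/3 with 1/2 in both functions gives the abrupt-junction model, so the two fits can be compared through their residuals.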
Analysis of the sense node voltage as a function of time allows one to find the gate-source capacitance of the reset transistor M1 by measuring the reset feedthrough amplitude. We have noticed that this amplitude depends on the bias voltage applied to the photodiode. This is the result of the sense node capacitance being a function of the photodiode voltage, a fact that was confirmed by calculating the gate-source capacitance at each step of the photodiode voltage scan. Since reset feedthrough results from a capacitive divider, one component of which is the reset transistor gate-source capacitance, its value can be calculated once the total sense node capacitance is determined. We found the gate-source capacitance to be approximately 0.15 fF, a reasonable value for a 2 micron wide transistor.
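The capacitive divider relation behind this estimate can be illustrated with a small calculation. It is only a sketch: it assumes the simple divider model in which the feedthrough step equals the reset gate swing multiplied by C_gs / C_node, with C_node taken as the total sense node capacitance, and the 3.3 V gate swing is a hypothetical value rather than a measured one.

```python
def feedthrough_amplitude_v(gate_swing_v, c_gs_fF, c_node_fF):
    """Voltage step coupled onto the sense node by the reset gate edge."""
    return gate_swing_v * c_gs_fF / c_node_fF


def gate_source_capacitance_fF(feedthrough_v, gate_swing_v, c_node_fF):
    """Invert the divider: C_gs from the measured feedthrough step."""
    return c_node_fF * feedthrough_v / gate_swing_v


GATE_SWING_V = 3.3   # assumption: full-rail reset gate swing
C_NODE_FF = 24.7     # total sense node capacitance at 18 V, from the text

step_v = feedthrough_amplitude_v(GATE_SWING_V, 0.15, C_NODE_FF)
c_gs_fF = gate_source_capacitance_fF(step_v, GATE_SWING_V, C_NODE_FF)
```

Because C_node falls as the photodiode bias rises, the same C_gs produces a larger feedthrough step at higher bias, which is the dependence noted in the text.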
Depletion depth
Depletion depth is one of the most important parameters of an X-ray sensor, determining quantum efficiency and the splitting of individual events between pixels. It is especially critical for backside-illuminated devices, because if the device is not fully depleted, the response at low energies can deteriorate quite dramatically. At the same time, finding the depletion depth is not an easy task: the transition to full depletion at higher energies (several keV) is gradual and can be masked by other effects. C-V tests would not work for this purpose, because the capacitance of the depleted layer in the bulk is far exceeded by the capacitance to the channel stop region in the vicinity of the p+ diode, as explained in the previous section.
We made such a measurement by using the same scan of the photodiode bias voltage as in the previous section and analyzing the shape of the response to the ⁵⁵Fe X-ray source. The device was illuminated by the source for a fixed amount of time at each voltage, and a histogram of the response was plotted. Then the number of counts in both the Mn Kα and Kβ lines was determined, as well as the number of counts in the low energy tail. We assumed that events in the low energy tail are produced by X-ray photons that interact with silicon in the undepleted portion of the device and lose some signal charge due to uncontrolled charge diffusion. In contrast, interactions in the depleted bulk result in full charge collection, and the corresponding events contribute to the characteristic line peak. In this analysis we assumed that the number of events in the tail due to splitting of charge between pixels is negligibly small. While probably not a very accurate assumption, it is a reasonable first order approximation.
Thus, the ratio of counts in the peak to counts in the tail is a measure of the depletion depth. It is an especially sensitive parameter because counts in the peak and in the tail move in opposite directions when the depletion boundary shifts. Once the bulk of the silicon is fully depleted, this ratio as a function of bias voltage should become flat. This is indeed what we observe in Fig. 9, which shows the corresponding experimental data. The roll-off indicating the transition between fully and partially depleted bulk occurs at approximately 14 V applied to the photodiode.
RACE CONDITION IN THE ADDRESS DECODER
We found that while scanning the pixels of a single row, some pixels exhibit a large spike before the equilibrium signal level is reached (see Fig. 10). A detailed examination revealed that such spikes occur at the moments when multiple column address lines simultaneously switch from one state to another. Moreover, the larger the number of address lines that are switching, the higher the spike in the output signal. An analysis of the column decoder circuit led us to the conclusion that this is caused by a certain feature of its design, described below.
Some transitions in the decoder can cause both the COL and $\overline{\mathrm{COL}}$ gates in all 256 column readout blocks (Fig. 2) to shut off, while the current source that mirrors the ICMP current keeps pumping current into the node connected to the gate of the chip output transistor. This causes the voltage at this node to rise until the column selected for the signal readout enables its column readout gates COL and $\overline{\mathrm{COL}}$. The mechanism of simultaneous shut-off of all the column-enabling gates is clarified in the next paragraph. Interestingly enough, this allows the capacitance of this node to be measured by finding the slope dV/dt of the rising edge of the spike, since the ICMP current is known. We found the capacitance to be approximately 0.42 pF, in excellent agreement with the estimates from the device topology, confirming our explanation for the spike's origin.
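The capacitance estimate from the spike's rising edge follows from C = I / (dV/dt). A minimal sketch is given below; the current value, the sampling interval, and the synthetic ramp are hypothetical, since the ICMP current is not quoted in the text.

```python
def slope_v_per_s(times_s, volts):
    """Least-squares slope dV/dt of the rising edge."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_v = sum(volts) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_s, volts))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den


def node_capacitance_f(current_a, times_s, volts):
    """C = I / (dV/dt) for a node charged by a constant current."""
    return current_a / slope_v_per_s(times_s, volts)


I_CHARGE_A = 1.0e-6                       # assumption: 1 uA mirrored current
true_slope = I_CHARGE_A / 0.42e-12        # slope that a 0.42 pF node would show
times = [i * 10e-9 for i in range(6)]     # hypothetical 10 ns sampling
ramp = [1.0 + true_slope * t for t in times]
c_est = node_capacitance_f(I_CHARGE_A, times, ramp)
```

Fitting the slope over several samples, rather than differencing two points, makes the estimate less sensitive to noise on individual samples.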
The column decoding circuit is shown in Fig. 11. It uses ratioed logic, which allows for a very simple one-stage decoder for eight address lines C0-C7 with a single load transistor M_load. The more common complementary static decoding logic for eight input signals would need at least two stages with a much higher number of transistors. In order to keep the low signal level at the decoder output below the threshold voltage of the n-channel transistor, the p-channel load transistor M_load (Fig. 11) should have high resistance, so its W/L is 0.5/2. As a result, the transition from the low level to the high level at the decoder output node D is a slow process, as it is determined by the current flowing through M_load. This transition occurs when all address lines supplying the given decoder block go to 0 (low state). In contrast, for the transition from 1 to 0 at node D, which occurs when at least one of the address lines switches to 1, the discharge of node D is fast due to the high conductivity of the n-type transistors. The larger the number of address lines switching to 1, the faster the discharge of node D, as shown in Fig. 11(b). During column readout, the D nodes in all the columns except the one that is being selected are discharged to the low level. When the next column is selected for readout, its node D starts to charge slowly through the corresponding load M_load. Meanwhile, the column that was selected before it quickly discharges its node D, because discharge is a faster process. If multiple address lines are switching to select this particular column, the discharge of the previous column's node D happens even faster, and the delay between disabling the previous column and enabling the current one gets more pronounced. This means that for a short period of time all the columns become deselected, resulting in the spike at the chip output.
The largest number of address lines switch when the transition occurs from column 127 (binary 01111111) to column 128 (binary 10000000), and that is indeed where we observe the largest spike, with successively smaller spikes at columns 64, 32, and so on.
When only one address line is switched, the spike is insignificant, as can be seen in Fig. 10. This suggests one solution to the problem: using Gray code for the address sequence. Gray-coded addresses have only one binary digit changed for every transition from one address to the next, thus making the spikes negligibly small. We have tested this approach by applying a Gray-coded address sequence to the existing decoder; the result was total disappearance of the spikes in the output signal. In this case the order of pixel selection did not correspond to their physical location order, but it worked very well as a proof of concept. Future versions of the decoder will be redesigned to incorporate the logic for on-chip Gray code implementation. The other possible solution is to use dynamic logic that precharges each D node before switching addresses, although this approach would require additional clocks to synchronize chip operations.
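The number of address lines that toggle on each binary column transition can be counted directly, which reproduces the ordering of spike sizes described above. This is an illustrative sketch; the function name is hypothetical and the count covers only the eight true address bits, not any complementary lines the decoder may also use.

```python
def switching_lines(prev_col, next_col, n_bits=8):
    """Number of address bits that differ between two column addresses."""
    changed = (prev_col ^ next_col) & ((1 << n_bits) - 1)
    return bin(changed).count("1")


# Toggle count for every transition (c - 1) -> c in a sequential binary scan.
toggles = {c: switching_lines(c - 1, c) for c in range(1, 256)}

# Columns ranked by how many lines switch when they are selected.
ranked = sorted(toggles, key=toggles.get, reverse=True)
```

In this count the transition into column 128 toggles all eight lines, the transition into column 64 toggles seven, and the transition into column 32 toggles six, matching the observed sequence of decreasing spikes.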
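A Gray-coded address sequence of the kind described above can be generated with the standard binary-reflected Gray code. The sketch below is illustrative and is not the on-chip logic; it simply confirms that every step of the sequence changes exactly one address bit while still visiting all 256 columns.

```python
def gray_code(n):
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Order in which physical columns would be addressed during one row scan.
sequence = [gray_code(i) for i in range(256)]

# Largest number of address lines toggling between consecutive addresses.
max_toggle = max(bits_changed(sequence[i], sequence[i + 1]) for i in range(255))
```

Since the addresses are visited out of physical order, the readout software has to map sample i back to column `gray_code(i)` when assembling the image.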
