Abstract-This paper presents the detailed characterization of a single photon counting chip, named CHASE Jr., built in a CMOS 40-nm process, operating with synchrotron radiation. The chip utilizes an on-chip implementation of the C8P1 algorithm. The algorithm eliminates the charge sharing related uncertainties, namely, the dependence of the number of registered photons on the discriminator's threshold, set for monochromatic irradiation, and errors in the assignment of an event to a certain pixel. The article presents a short description of the algorithm as well as the architecture of the CHASE Jr., chip. The analog and digital functionalities, allowing for proper operation of the C8P1 algorithm are described, namely, an offset correction for two discriminators independently, two-stage gain correction, and different operation modes of the digital blocks. The results of tests of the C8P1 operation are presented for the chip bump bonded to a silicon sensor and exposed to the 3.5-µm-wide pencil beam of 8-keV photons of synchrotron radiation. It was studied how sensitive the algorithm performance is to the chip settings, as well as the uniformity of parameters of the analog front-end blocks. Presented results prove that the C8P1 algorithm enables counting all photons hitting the detector in between readout channels and retrieving the actual photon energy.
I. INTRODUCTION
HYBRID pixel detectors, working in a single photon counting mode, open new possibilities in fields such as medical imaging, biology, material science, or synchrotron radiation experiments. In the last 15 years, several research groups focused on hybrid pixel detector developments [1]-[8]. The popularity of the new generation of photon counting hybrid pixel detectors has increased recently because of their unique features, such as good spatial resolution, high dynamic range, high count rate, adjustable energy thresholds, and noiseless imaging [9]. All that allows using them in spectrometry applications. However, with a decreasing pixel size, detectors suffer more from charge sharing, which occurs when charge generated in the detector volume is divided between two or more adjacent pixels, mainly due to diffusion. This may result in a loss of detection efficiency due to errors such as missing some events, counting extra events, or incorrect photon energy detection [10]. As an example of the consequences, a higher patient dose must be delivered in medical applications in order to compensate for the errors [11]. The charge sharing phenomenon is illustrated in Fig. 1.
The recent studies of charge sharing lead to circuit implementations that aim at solving the issue of losing information about a photon hit and its energy. Known integrated circuits (ICs) designed for dealing with charge sharing effects are Medipix3RX [12] built in a CMOS 130-nm technology and PIXIE-III [13] built in a CMOS 160-nm technology. These solutions allow working in both the single photon counting mode and the charge summing mode. The latter is dedicated to mitigation of charge sharing. However, studies of the count rate linearity show that achieving high operation speed is a significant issue for chips working in the charge summing mode [14] .
The first studies done by the FNAL and AGH groups on the charge reconstruction algorithms, introducing the C8P1 algorithm, were presented in 2011 [15]. Using the C8P1 algorithm, an IC was implemented in a CMOS 40-nm technology, and it is presented in this paper. The first prototype of the chip was already described in a previous paper [7]; however, at that time, the tests focused on the STanDard single photon counting mode (STD). The second version of this IC had some digital issues that were corrected for efficient elimination of the charge sharing related problems. The chip is called CHASE Jr., (from: CHArge Sharing Elimination) and it was extensively tested with an 8-keV X-ray microbeam. The results obtained in the tests are given in this paper. The architecture of the chip and the idea of the C8P1 algorithm, with emphasis on the interpixel communication, are described in Section II. Section III focuses on the correction procedures and synchrotron measurements. Results of the count rate linearity test are presented in Section IV.
Fig. 1. Illustration of the charge sharing phenomenon in a hybrid pixel detector. If a photon A hits the detector close to the pixel center, all charge is collected by one pixel. Whereas, if a photon B hits the detector on the border between four pixels, the charge is divided between four readout channels.
0018-9499 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
II. C8P1 ALGORITHM AND THE CHASE Jr., CHIP FUNCTIONALITY
The proposed solution for dealing with the charge sharing effects is the hardware-implemented C8P1 algorithm, which requires interpixel communication for both analog and digital signals. The aim of the algorithm is to retrieve actual information about each photon's hit position and its energy. The situation in which a photon interacts with a detector and charge is collected by four adjacent pixels P1, P2, P3, and P4 due to diffusion is presented in Fig. 2(a). The red dot represents the accurate hit position (in the pixel P3) and the circle represents the charge cloud. In this case, the charge is divided unevenly between four pixels, which is visualized by the differences in the color intensity of the P1-P4 pixels (the higher the intensity, the more charge was collected). To reconstruct the information about the total photon energy, the algorithm uses signal rebuild hubs located in the corners of each group of four neighboring pixels, which rebuild signals from these pixels. The signal rebuild hubs are marked with the "+" signs in Fig. 2(b). Then, the signals from rebuild hubs are shaped and undergo discrimination. If a signal in any of the four rebuild hubs surrounding a pixel exceeds a given threshold, that pixel is activated. In the case shown in Fig. 2(b), signals from four signal rebuild hubs exceed the threshold (marked with the big red "+" symbols); thus, nine pixels are activated (marked gray). Each pixel has eight neighbors and there are eight comparators (one per pixel border) on the borders between the pixels. Each comparator is shared by two neighboring pixels. If a pixel is activated and all eight comparators point to it as the pixel with the highest amplitude, it is chosen to register a hit by the C8P1 algorithm. Fig. 2(c) shows nine active pixels and the arrows indicate the pixel P3 pointed to be "the winner" by eight comparators.
The implementation of the C8P1 algorithm is presented in Fig. 3(a) from the point of view of a single readout channel. When a charge cloud is collected by the readout channel [ Fig. 3(b) ], a signal is processed by the charge sensitive amplifier (CSA) first. It transforms a current pulse from the detector to a voltage step of an amplitude proportional to the input charge. The CSA output is connected to two independent signal processing paths: fast for total charge reconstruction, and slow, for hit allocation [7] .
A signal in the fast processing path is reconstructed from four adjacent pixels (a given pixel and its North->N, North-West->NW and West->W neighbors) using the signal rebuild hub to retrieve the information about the total energy deposited by a photon [Fig. 3(c)]. Then, the signal from the rebuild hub is amplified and filtered by the shaper fast (SH FAST). The resulting signal is discriminated. Since a signal from one pixel contributes to four surrounding signal rebuild hubs, four discrimination results are logically ORed to determine if a pixel should be activated [Fig. 3(d)].
In parallel, a fractional charge deposited in a given pixel is processed by the slow path. The output signal of the CSA is amplified and filtered by the shaper slow block (SH SLOW) and then compared with eight corresponding shaper slow signals from adjacent pixels, to determine if the signal amplitude in a given pixel is the highest among the neighbors. The eight comparators are located one on each border of a given pixel with its eight neighbors-N, NW, W, SW, S, SE, E, NE [ Fig. 3(e) ]. If all the comparators point at a given pixel, the output "preliminary winner" signal is set, making this pixel a candidate to register a hit.
The output signals from the fast path ("activate pix") and from the slow path ("preliminary winner") are used by the C8P1 digital block to assess if a given pixel should be the one in which the hit is registered [ Fig. 3(f) ]. Registration of a hit increments a counter. To illustrate the signal processing in the fast and slow path, a timing diagram is presented in Fig. 4(a) . The shaper fast signal is discriminated. Logically ORed discriminator outputs of the neighbor pixels, resulting in the signal "activate pix," determine the time when the shaper slow signals are compared. The exact time of comparison t LATCH can be adjusted and is digitally controlled by the 5-bit register called "latch delay" (LD). This functionality is realized in the C8P1 block, as it is presented in Fig. 4 (b). The C8P1 block enables stretching of the pulse "activate pix". The stretched signal is used to latch the comparison result. Consequently, a hit is registered. The LD signal controls the time of comparison for the neighboring shaper slow signals.
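The decision logic described in this section can be summarized in a purely behavioral sketch. It is only an illustration under stated assumptions, not the chip's implementation: the function name `c8p1_winners`, the use of ideal noiseless charges expressed in keV, the treatment of charge outside the matrix as zero, and the strict "greater than" comparison are all hypothetical choices, and the analog shaping and the latch-delay timing are ignored.

```python
def c8p1_winners(charge, threshold):
    """Return (row, col) of pixels that would register a hit.

    charge    -- 2-D list of per-pixel collected charge (here in keV)
    threshold -- discriminator threshold applied to the rebuild-hub sums
    """
    rows, cols = len(charge), len(charge[0])

    def q(r, c):
        # Assumption: charge outside the matrix is treated as zero.
        return charge[r][c] if 0 <= r < rows and 0 <= c < cols else 0.0

    def hub(r, c):
        # Rebuild hub of pixel (r, c): the pixel plus its N, NW and W neighbors.
        return q(r, c) + q(r - 1, c) + q(r - 1, c - 1) + q(r, c - 1)

    winners = []
    for r in range(rows):
        for c in range(cols):
            # Fast path: OR of the four hubs located at the pixel's corners.
            active = any(hub(r + dr, c + dc) > threshold
                         for dr in (0, 1) for dc in (0, 1))
            # Slow path: the pixel must beat all eight neighbors.
            prelim = all(q(r, c) > q(r + dr, c + dc)
                         for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                         if (dr, dc) != (0, 0))
            if active and prelim:
                winners.append((r, c))
    return winners


# Hypothetical 8-keV event shared unevenly between four pixels.
grid = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.5, 2.0, 0.0],
    [0.0, 1.5, 3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(c8p1_winners(grid, threshold=4.0))
```

In this example no single pixel exceeds the 4-keV threshold on its own, yet the hub shared by the four pixels sums to 8 keV and the hit is assigned to the pixel holding the largest fraction. With exactly equal shares the strict comparison in the sketch yields no winner; in the real chip the outcome in such cases is decided by noise and residual mismatch.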
The readout channel, presented in Figs. 3 and 4, works in the C8P1 mode, however, more modes of operation, including the STD mode, are available for calibration and testing. Additional circuitry, allowing trimming of offsets at discriminators and gains in the CSA and the SH SLOW, is not shown in Fig. 3 . The details are given elsewhere [7] , [16] . Moreover, the calibration pulse injection circuit was implemented. This allows for bench tests before the final tests of the chip with an X-ray beam at a synchrotron facility.
III. CHARACTERIZATION OF THE CHASE Jr., WITH SYNCHROTRON RADIATION
The C8P1 algorithm has been tested both at the simulation stage and experimentally using the calibration pulse circuitry [7] , [16] . However, synchrotron measurements are necessary to precisely determine the accuracy of the system and assess the C8P1 performance, especially for photons interacting in known positions, e.g., on pixel borders.
A. Measurement Setup
The CHASE Jr., chip consists of an array of 24 × 18 pixels of the size of 100 μm × 100 μm each and is bump bonded to a silicon, 320-μm-thick detector produced by Hamamatsu. The module was tested on the 1BM-B beam-line at the Advanced Photon Source at Argonne National Laboratory. The 8-keV energy beam was used. This energy has been the target for the detector development for which the CHASE Jr., chip was built as a small-scale prototype [17] . A pinhole collimator of the diameter equal to 3.5 μm was mounted in front of the detector and the beam intensity was tuned to register 10-30 kphotons/s after the pinhole. The chip operated far from the pile-up conditions. A crystal monochromator was used to select the energy of X-rays. The module was positioned perpendicular to the beam and the XY positions were adjusted using the ESP301 motion controller and the Newport CM25A X-Y stages. The measurement setup is presented in Fig. 5 .
B. Correction Procedures
It was found that equalization of thresholds and gains significantly improves the detection efficiency of the charge sharing compensation algorithm [16]. Thus, the correction procedures for threshold dispersions and for CSA gain and shaper slow gain dispersions were executed before the tests. The analog chain can be configured for correction procedures using a multiplexer. The CSA output, shaper fast output, and shaper slow output can be redirected via the multiplexer to a discriminator [16]. During the correction procedures, the chip was operating in the STD mode and the analog chain was configured according to the parameter chosen for each correction type.
In the first step of the trimming process, the dc offsets at the discriminator inputs are adjusted by performing threshold scans without any input signal. The response consists of noise hits registered from the analog front-end, with the maximum count at the threshold equal to the pixel baseline. Such scans were performed for all pixels and for different codes of the trimming digital-to-analog converters (DACs) (6-bit DACs were implemented in the CHASE Jr., chip). The trimming DAC value was optimized for each pixel, which led to the minimum offset spread in the whole pixel matrix. As a result of the discriminator offset trimming, the global threshold for the whole matrix could be set properly. Trimming DACs are used only for discriminators. Comparators use an auto-zero technique to reduce offsets.
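The offset trimming step can be illustrated with a short sketch. The text does not specify how the optimum DAC code was computed, so the approach below (taking the threshold at the noise-count peak as the baseline and picking the code whose baseline lies closest to a common target) is an assumption, as are the function names and all numbers.

```python
def baseline_from_scan(thresholds, counts):
    """Threshold at which the noise-hit count peaks (taken as the pixel baseline)."""
    peak_index = max(range(len(counts)), key=counts.__getitem__)
    return thresholds[peak_index]


def best_trim_code(baseline_by_code, target):
    """Pick the trim DAC code whose measured baseline is closest to the target."""
    return min(baseline_by_code,
               key=lambda code: abs(baseline_by_code[code] - target))


# Hypothetical noise scan of one pixel at one trim code (thresholds in volts).
scan_thresholds = [1.10, 1.11, 1.12, 1.13]
scan_counts = [0, 40, 900, 35]
print(baseline_from_scan(scan_thresholds, scan_counts))

# Hypothetical baselines of the same pixel for three of the 64 possible codes.
baselines = {30: 1.118, 31: 1.121, 32: 1.124}
print(best_trim_code(baselines, target=1.120))
```

Repeating the selection for every pixel with the same target brings the baselines together, which is what allows a single global threshold to be used for the whole matrix.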
The next step of the trimming procedure was the CSA gain trimming for equalizing the response of the fast paths. The gain can be adjusted by switching feedback capacitors connected in parallel, which is controlled with a 3-bit register. The threshold scans were performed while the full area of the detector was illuminated with the 8-keV X-rays. This was done for all possible gain settings, and for each pixel the register value resulting in the gain closest to the mean gain was chosen. Similarly, the shaper slow gain was trimmed in the whole pixel matrix. If all pixels had exactly the same gain, the signal from each pixel would be exactly proportional to the charge it collected, and the C8P1 algorithm would be able to choose the pixel that collected most of the charge. However, in the case of equally shared charge, the noise contribution needs to be considered.
The shaper fast design provides only two gain modes: high gain and low gain. The shaper fast is used only for triggering the resolution of an event. Thus, equalizing the shaper fast gains is not critical.
Using the correction procedures described above, the CSA gain spread was reduced from 11.5% to 3.8% (calculated as the ratio of the standard deviation to the mean gain) and, correspondingly, the shaper slow gain spread was reduced from 10.4% to 2.3%. The total average gain in the fast path equals 27.6 μV/e⁻. The effective threshold spread, referred to the input, was reduced from 543 e⁻ rms to 36 e⁻ rms. The noise, measured for the corrected pixel matrix connected to a detector, equals ENC = 117 e⁻ rms in the STD mode. The power consumption of the pixel analog part was about 35 μW. The results of the correction procedures for one module tested with the synchrotron radiation are shown in Fig. 6. The spread of the analog parameters before and after correction is presented for the CSA and slow path gain.
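The selection rule for the gain registers, and the spread figure used to quantify it (standard deviation divided by mean), can be sketched as follows. The gain table is invented for illustration, and taking the target as the mean gain at one default register code is an assumption, since the text does not state how the mean was defined.

```python
from statistics import mean, pstdev


def trim_gains(gain_table, target):
    """gain_table[pixel][code] -> measured gain; return the best code per pixel."""
    return [min(range(len(gains)), key=lambda code: abs(gains[code] - target))
            for gains in gain_table]


def relative_spread(gains):
    """Spread expressed as standard deviation over mean."""
    return pstdev(gains) / mean(gains)


# Hypothetical gains (uV/e-) of three pixels for the eight codes of a 3-bit register.
gain_table = [
    [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0],
    [11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5],
    [9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
]
default_code = 3  # assumed starting code, identical for all pixels
before = [gains[default_code] for gains in gain_table]
codes = trim_gains(gain_table, target=mean(before))
after = [gains[code] for gains, code in zip(gain_table, codes)]

print(codes)
print(f"spread before: {relative_spread(before):.1%}, after: {relative_spread(after):.1%}")
```

Because the register has only eight steps, the residual spread after trimming is limited by the step size, which is in line with the nonzero spread that remains after correction.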
C. Measurement With 8-keV Photon Beam
The main aim of the measurements was to verify whether the C8P1 algorithm registers events correctly when the impacts occur near pixel borders in a detector. Hence, two types of tests were performed. First, the threshold scans were measured when the charge was shared between two and four pixels, to verify if the information about the photon energy is reconstructed correctly by the algorithm. Second, chosen regions of the detector of 250 μm × 250 μm and 700 μm × 700 μm size were scanned by the beam with the threshold value set to half of the photon energy, i.e., 4 keV. The beam position on the detector was changed in 5-μm steps along the x-axis. After scanning the whole line in the x-direction, the beam was moved to the next Y position. The total accumulated number of events registered in the two regions of interest (ROIs) was measured and plotted as a function of the beam position on the chip.
The measurements with an 8-keV source, a typical energy for XPCS experiments, are challenging for photon counting systems. The lower the energy of the incoming photons, the more difficult testing the algorithm for charge sharing effects elimination becomes. This is because noise and parameters spread are approaching or are comparable to fractional signals. Energy of 15 keV was for example used for studies of the charge collection process in the Medipix3RX characterization [12] .
To illustrate the problem of low energy photon detection, the test with 8-keV photons was performed by hitting the detector near the pixel border. In the case of the measurements from Fig. 7, the beam was positioned so that one pixel received the majority of the charge carriers. In the case of Fig. 8, the beam was positioned almost exactly at the pixel corner, so the charge was shared more equally. The threshold scans in the STD and C8P1 modes were measured. The measurement in the C8P1 mode was restricted to register the events of energy larger than 4 keV and it is represented with the red line in Figs. 7(b) and 8(b).
When hits occur between two pixels, like in Fig. 7(a) , two neighbor pixels register signals of the lower energy in the STD mode. The beam position is slightly shifted to the left, so more charge is collected by the pixel P1. The total energy of the incoming photon is successfully reconstructed in the C8P1 mode and the hit is allocated to one pixel that is selected by the algorithm [Fig. 7(b) ].
In the case shown in Fig. 8(a), the charge is divided nearly equally between four neighboring pixels P1-P4 and each pixel registers the signal corresponding to a photon of energy about 2 keV in the STD mode. However, a signal of such low energy is very difficult to distinguish from the noise. Thus, it is hard to set the threshold properly while operating without any charge sharing compensation algorithm. Fig. 8(b) presents the threshold scans registered with the synchrotron beam, showing signals in four pixels in the STD mode and the recovered signals in these four pixels in the C8P1 mode. The results prove the feasibility of recovering the total energy of 8 keV when the chip works in the C8P1 mode. Since the events occur near the pixel corner, each time the C8P1 algorithm allocates a hit to one of four pixels. The choice of the winning pixel depends on the actual hit position and, as a consequence, on the proportions of the charge collected by the P1-P4 pixels. However, the actual registration also depends on the readout channel noise, gain dispersions, and threshold dispersions.
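A rough estimate shows why the 2-keV fractional signals are so hard to separate from noise. The conversion factor of about 3.6 eV per electron-hole pair in silicon is an assumption not quoted in the text; the noise figure is the ENC reported above for the STD mode, and the function name `electrons` is illustrative only.

```python
EV_PER_PAIR = 3.6  # assumed mean energy per electron-hole pair in silicon (eV)
ENC_E = 117.0      # noise quoted in the text for the STD mode (e- rms)


def electrons(energy_kev):
    """Approximate number of signal electrons for a given deposited energy."""
    return energy_kev * 1000.0 / EV_PER_PAIR


for label, energy in (("full photon", 8.0),
                      ("two-pixel share", 4.0),
                      ("four-pixel share", 2.0)):
    n = electrons(energy)
    print(f"{label}: {energy} keV ~ {n:.0f} e-, {n / ENC_E:.1f} x ENC")
```

Under these assumptions a full 8-keV photon gives roughly 2200 electrons, whereas an equal four-way share leaves each pixel with fewer than five times the rms noise, before any gain or threshold dispersion is taken into account.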
D. Intensity Correction
To evaluate the C8P1 algorithm, numbers of counts registered by the system should be compared for both modes, i.e., the STD and C8P1 modes. When interactions occur in the center of a pixel, there is no charge sharing there. If there is no difference in the mean number of counts in the pixel center in both cases, it means that all the hits registered in the STD mode can be registered in the C8P1 mode and the C8P1 algorithm does not decrease the detection efficiency. Thus, the total number of registered events was measured and plotted as a function of the beam position on the chip. However, during the experiments typically lasting more than 3 h, changes in the beam intensity were observed. The changes were due to the slowly decreasing temperature of the crystal monochromator that was heated in the preceding experiment by another group. Fig. 9(a) presents the scan results for the chip operating in the C8P1 mode, for the 700 μm × 700 μm ROI. In the intensity plot in Fig. 9(a) , a black square is visible. It corresponds to a bad pixel, which does not register any counts due to the faulty bonding. The pulse injected by the calibration circuit is registered for this pixel, however, it is blind to X-ray photons.
The two cross sections along the indicated lines parallel to the x- and y-axes are shown in Fig. 9(b). A slope in the profile along the y-axis is clearly visible. Two scans, one in the C8P1 mode and one in the STD mode, were performed one after another, 20 min apart. It was observed that during the first scan, the cooling of the monochromator crystal resulted in a significant decrease in the number of counts (due to detuning of the monochromator). The scan lasted about 160 min and the average number of the registered counts decreased by almost 9%. The cooling process, as well as the dependence of the intensity on the temperature of the monochromator crystal, is nonlinear; however, it is assumed to be well approximated by a linear function over the period of testing. Fig. 10 presents the results obtained during two experiments. The first one is for the 700 μm × 700 μm ROI in the C8P1 mode and the second one is for the 250 μm × 250 μm ROI in the STD mode. The experiments lasted about 200 min altogether. In each experiment, the total number of counts in the whole pixel matrix was measured for different beam positions. A steady decrease of the number of counts in time is visible. A linear fit was performed for the first experiment and the intensity correction coefficient (equal to −3.2 counts/min) was estimated. The results presented in the next paragraphs have been corrected according to the estimated drop of intensity with time. Fig. 11 shows a plot allowing the comparison of the results for the same 250 μm × 250 μm ROI scan performed for the module after the dc offset and gain correction procedures.
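The drift correction can be sketched as a least-squares line fit followed by a rescaling of each measurement. The text gives only the fitted slope, so the rescaling to the fitted intensity at t = 0 is one plausible way of applying it, not necessarily the one used; the function names and the starting level of 5700 counts are hypothetical, chosen so that a slope of −3.2 counts/min gives a drop of about 9% over 160 min.

```python
def linear_fit(t, y):
    """Ordinary least-squares fit y = slope * t + intercept."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
             / sum((ti - t_mean) ** 2 for ti in t))
    return slope, y_mean - slope * t_mean


def correct_for_drift(t, y, slope, intercept):
    """Rescale each count to the fitted beam intensity at t = 0."""
    return [yi * intercept / (intercept + slope * ti) for ti, yi in zip(t, y)]


# Hypothetical counts at the pixel center, sampled every 40 min.
times_min = [0.0, 40.0, 80.0, 120.0, 160.0]
counts = [5700.0 - 3.2 * ti for ti in times_min]

slope, intercept = linear_fit(times_min, counts)
corrected = correct_for_drift(times_min, counts, slope, intercept)
print(f"slope = {slope:.2f} counts/min")
print([round(c) for c in corrected])
```

With real data the fit would be made on points where no charge sharing occurs, so that the slope reflects the beam intensity rather than the detector response.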
E. C8P1 Experimental Results
The total number of counts in the matrix is plotted. The results show that the events are lost for the STD mode when the beam is at the pixel borders. The situation is even worse for the corners, since the charge is divided there between four pixels. In comparison, pixel borders are nearly not distinguishable in the C8P1 mode. The performance of the C8P1 algorithm was assessed in the further analysis concentrating on the variations of the registered intensities along the pixel borders. Fig. 12 presents the cross section along x-axis for Y = 150 μm for both 2-D intensity plots from Fig. 11 . The number of counts drops significantly for the pixel borders in the STD mode, while the number of counts remains stable for all beam positions in the C8P1 mode. In the next step of the analysis, the whole ROI was divided into two regions, i.e., the pixel borders and the pixel central areas, according to the results from the scan in the STD mode. Fig. 13 shows how the areas were chosen. 
F. Dependence of the C8P1 Performance on Corrections
Most commonly, the correction of the dc offsets at the discriminators is performed to improve the matrix response uniformity [18] . However, especially in the measurements with low energy photons, the high gain spread between the channels may introduce problems with global threshold settings and, also, may lead to errors in the comparison between the neighbor pixel signals. Preliminary tests with the injection of test calibration pulses [16] proved that the gain uniformity is crucial for C8P1 algorithm performance in addition to the uniformity of the dc offsets at the discriminators input. To examine the dependence of the performance of the algorithm on the gain spread between channels, a scan in the C8P1 mode was performed for the uncorrected pixel matrix using the synchrotron radiation collimated beam.
The dc offsets at the discriminators were corrected with the procedure described in Section III.B. However, the CSA gains were not equalized, and all the CSA gain registers in the pixel matrix were set to the value of 5, which corresponded to the average gain of 10.89 μV/e⁻ for the tested ROI of 700 μm × 700 μm. The threshold for the discriminator was set to the value of 1.165 V. The same test as that presented in Fig. 9(a), aimed at measuring the total number of events in the pixel matrix for different beam positions on the chip, was performed. The results of the scan are presented in Fig. 15.
It can be noticed that a large CSA gain spread and average gain values lower than those obtained from the correction procedure [10.89 μV/e⁻ in comparison with 13.06 μV/e⁻ for the corrected matrix presented in Fig. 6(a)] result in improper device functioning. The total average number of registered counts is lower than in the corrected pixel matrix, the borders between the pixels are clearly recognizable, and the number of counts varies from pixel to pixel. These issues may have several causes, including a lack of activation of the C8P1 algorithm due to the low gain in the fast path or a wrongly resolved comparison between pixels in the slow path. Since the comparison block was affected by the large gain spread, a more detailed investigation of this topic is underway.
The experimental results show that the algorithm for charge sharing effects elimination enables retrieving total photon energy and registering correct number of hits only if the correction procedures are performed to the satisfactory level in advance.
IV. HIGH COUNT RATE TESTS
The first tests concerning the count rate linearity for the module operating in the C8P1 mode under the high flux conditions were also performed. The measurements were done using a high-power X-ray generator of 8-keV photons, with the 40-kV voltage applied and the current changed within a range of 10-200 mA. The IC was corrected using the aforementioned procedures and the standard parameter settings were applied. The module was tested both in the STD and C8P1 modes.
A paralyzable detector model, given by (1), was used to fit the count rate data

$$N_{\mathrm{out}} = N_{\mathrm{in}}\, e^{-\tau N_{\mathrm{in}}} \qquad (1)$$

where $N_{\mathrm{out}}$ is the measured output count rate (counts/pixel/s) and $N_{\mathrm{in}}$ is the input count rate (counts/pixel/s). The dead time τ was estimated from the model. The intensity varied due to the nonuniform flux delivered by the tube, so the whole module was not uniformly illuminated. Thus, a representative group of pixels for the STD and C8P1 modes was taken into consideration for which the beam intensity was high enough to observe a nonlinear behavior of the output count rate as a function of the input count rate. Fig. 16 presents the count rate measurement results. The average dead time for the C8P1 mode, extracted from the paralyzable detector model, was equal to τ = 1.01 μs, and for the STD mode τ = 0.21 μs.
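To give a feeling for what the two dead times imply, the paralyzable model of (1) can be evaluated numerically. The sketch below only applies the formula to the reported values of τ; the function names and the example input rate of 10⁵ counts/pixel/s are illustrative choices, and the peak rate follows from setting the derivative of (1) to zero, which gives an input rate of 1/τ.

```python
import math


def paralyzable_rate(n_in, tau):
    """Output count rate of the paralyzable model, eq. (1)."""
    return n_in * math.exp(-tau * n_in)


def peak_output_rate(tau):
    """Maximum of eq. (1), reached at n_in = 1/tau, equal to 1/(e*tau)."""
    return 1.0 / (math.e * tau)


for mode, tau in (("C8P1", 1.01e-6), ("STD", 0.21e-6)):
    n_in = 1.0e5  # illustrative input rate, counts/pixel/s
    loss = 1.0 - paralyzable_rate(n_in, tau) / n_in
    print(f"{mode}: loss at 1e5 cps = {loss:.1%}, "
          f"peak output = {peak_output_rate(tau):.3g} counts/pixel/s")
```

Under this model the longer dead time of the C8P1 mode translates into a peak output rate of roughly 3.6 × 10⁵ counts/pixel/s, compared with about 1.75 × 10⁶ counts/pixel/s in the STD mode.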
V. CONCLUSION
The C8P1 algorithm, implemented in the CHASE Jr., IC, was characterized using synchrotron radiation. The architecture of the IC, including the C8P1 algorithm implementation, was described in detail. The presented experimental results, taken at the Advanced Photon Source at Argonne National Laboratory, prove that the C8P1 algorithm enables the chip to register the proper number of low energy (8 keV) photons, even when the events occur at the pixel's edge. A direct comparison to the standard mode, when no charge sharing takes place because a very narrow beam is directed at the very center of the pixel, proves that no photons are lost due to the C8P1 algorithm implementation. The quality of the algorithm operation depends on the uniformity of the parameters of analog and digital blocks in the whole pixel matrix. The uniformity of the offsets at discriminators and of the gain of the preamplifier and the shaper can be trimmed in the CHASE Jr., chip. The procedures for calculating values of trimming DACs were proposed and verified in the experiment. The energy of the incoming photon, in the case of charge sharing, is properly reconstructed due to precise gain trimming in the charge preamplifier, just before the charge is summed up from four neighboring pixels. The correction of the second stage amplifier's gain is required for proper allocation of an event to a certain pixel, and the offset trimming is essential in order to set a single discriminator threshold value for all the pixels in the matrix. It was also proven that the IC with the C8P1 algorithm implemented can work under high flux conditions, and the dead time extracted from the paralyzable model for the C8P1 mode is τ = 1.01 μs.
