Abstract-An on-chip implementable algorithm for allocation of an X-ray photon imprint, called a hit, to a single pixel in the presence of charge sharing in a highly segmented pixel detector is described. Its proof-of-principle implementation is also given, supported by the results of tests using a highly collimated X-ray photon beam from a synchrotron source. The algorithm handles asynchronous arrivals of X-ray photons. Activation of groups of pixels, comparisons of peak amplitudes of pulses within an active neighborhood and, finally, latching of the results of these comparisons constitute the three procedural steps of the algorithm. A grouping of pixels into one virtual pixel, which recovers composite signals, and event-driven strobes, which control comparisons of fractional signals between neighboring pixels, are the actuators of the algorithm. The circuitry necessary to implement the algorithm requires an extensive inter-pixel connection grid of analog and digital signals that are exchanged between pixels. A test-circuit implementation of the algorithm was achieved with a small array of 32 × 32 pixels, and the device was exposed to an 8 keV X-ray beam highly collimated to a diameter of 3 µm. The results of these tests are given in this paper, assessing the physical implementation of the algorithm.
(e-mail: farah@fnal.gov, jimhoff@fnal.gov; holm@fnal.gov, trimpl@fnal.gov; tzimmer@fnal.gov)
I. INTRODUCTION
PROVIDING position and performing timing and amplitude spectroscopy of incoming photons are typical features of pixelated X-ray detectors. In highly granular systems, charge clouds drifting towards detector electrodes end up collected on more than one electrode, i.e. fractional charges are distributed among several neighboring terminals [1]. If the pixel pitch is significantly smaller than the thickness of the sensor, a substantial number of events result in charge shared among adjacent pixels at a practically achievable polarization of the sensor. Hence, fractions of the liberated charge are processed individually in adjacent electronic channels of a pixel detector. In classical systems, charge sharing has been dealt with in one of two ways. First, the signal discrimination threshold could be lowered to capture as many of the fractional charges as possible, but this leads to a high readout bandwidth due to an excess of false (noise) and duplicate (accounted fractional signals) registrations of X-ray photon imprints, called hits. Second, the signal discrimination threshold could be raised to capture only those hits that result from substantial individually collected charge, but this leads to a loss of some real hit data. In addition, splitting of charge clouds between multiple electrodes may cause inaccurate photon time of arrival (ToA) measurements due to the time walk effect. ToA measurements can be made insensitive to signal amplitudes by using, for example, the constant fraction discrimination (CFD) method [2]. However, this technique requires delay elements and extensive electronics that cannot fit into a small pixel footprint. Instead, an alternative approach, endorsed in this paper for mono-energetic X-ray photons, is to reconstruct the full signals from the charge fractions and to use the leading-edge technique for the ToA measurement [3].
The emphasis of this paper is on the introduction of an algorithm for allocation of X-ray photon hits that is named C8P1, a shorthand for the "compare eight if one virtual pixel is above threshold" algorithm [4]. It simultaneously allows for ToA measurements and handles allocation of hits to single, dominant pixels in the presence of charge sharing. The paper also covers a test implementation of the algorithm that was achieved with the miniVIPIC chip designed in a 130 nm CMOS process. Detection of monochromatic 8 keV X-ray photons, which is the typical case for experiments using synchrotron radiation, underpins this effort. The design hosts a small array of 32 × 32 pixels, which is a study towards the future steps in the VIPIC (Vertically Integrated Photon Imaging) project [5]. The work has been done for a planar sensor with a matrix of charge collecting electrodes placed on a square grid. Yet, the reasoning can be generalized for other arrangements of the electrodes and different sizes and forms of their groups needed for rebuilding of the full charge signals.
The paper is organized in seven sections. The first section is an introduction. The second section discusses comparisons of pulses based on their peak amplitudes vs. durations of time-over-threshold (ToT) in the presence of noise [6]. This analysis contrasts the efforts presented in this work with a few other approaches known from the literature [7]-[9]. The third section provides a description of distinctive features of the hardware realizable C8P1 algorithm, including the terms for navigating in its flow. The fourth section discusses the flow of the algorithm, analyzing practical aspects of forming activated groups of pixels for selected scenarios of photon impacts. The fifth section is about the physical implementation of the C8P1 algorithm in the miniVIPIC chip and discusses block and circuit level representations of the critical analog and digital parts. The sixth section presents the results of tests achieved with a miniVIPIC chip bonded to a 280 μm thick Si sensor using the Sn-Pb bump-bonding technology and exposed to an 8 keV, highly collimated X-ray beam. The last section of the paper concludes and summarizes the material presented.
II. COMPARISON OF PEAK AMPLITUDES VS. DURATIONS OF TIME-OVER-THRESHOLD IN PRESENCE OF NOISE
Typical analog signals on an output of a continuous time front-end block in a radiation detector possess some form of a semi-Gaussian waveform, e.g. CR-RC^n [10]. When a charge cloud, drifting in a sensor towards collecting electrodes of adjacent pixels, is divided, charge sensitive amplification circuits generate pulses in these pixels. Pixels with activity form a cluster. Amplitudes or durations of the pulses are functions of how the charge cloud splits. The single pixel to which a hit is allocated has to be found in a process in which signals from the pixels forming a cluster are compared. Weighing of signals through comparison of their amplitudes, preferably at their peaks, or of the time-lengths they stay above a threshold can be used for this purpose. Considering which of the two approaches should perform better in a noisy system helps to justify selecting one over the other for the concept of the algorithm.
Two CR-RC^2-type pulse waveforms, exemplifying signals of two pixels, differing in their amplitudes and both peaking at 250 ns, are depicted in the upper part of Fig. 1. Below the pulse waveforms in Fig. 1, two results of discrimination of the pulse waveforms at the same, arbitrarily chosen threshold are shown. A peaking time of 250 ns is a representative value for an analog front-end in a pixel detector. The amplitudes and the threshold level, which is set at 20% of the larger signal peak amplitude as an example, are given in arbitrary units. It is assumed that the total charge of about 2,220 e−/h+, resulting from a conversion of an 8 keV photon in silicon, sums up to 1.5 signal units. Both pulse waveforms have equal noise envelopes added, representing a 3 × σ level, with 1 × σ being equal to 100 e−. The noise envelopes introduce uncertainties in the times when the pulse waveforms return below the threshold. This fact is marked by rectangles, defining zones where real edges of the discriminator signals can occur. There is an overlap range, meaning that the smaller-amplitude signal, amp1, can be misidentified as the larger-amplitude signal, amp2, because its time-over-threshold measurement, tot1, may come out longer than that of the second signal, tot2. On the other hand, the peak amplitudes of the pulse waveforms can be unambiguously separated, yielding the expected relation of amp2 being greater than amp1.
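The argument above can be checked numerically with a minimal sketch. It assumes the normalized CR-RC^n pulse shape A·x^n·exp(n(1 − x)) with x = t/t_peak, two hypothetical amplitudes of 0.6 and 0.9 signal units (the text only states that they sum to 1.5), independent noise in both pixels and at both discriminator edges, and the usual small-noise linearization in which the edge jitter is σ_v divided by the pulse slope at the crossing. All function and variable names are illustrative only.

```python
import math

def pulse(x, amp, n=2):
    """CR-RC^n pulse with peak amplitude `amp` at x = 1 (x = t / peaking time)."""
    if x <= 0.0:
        return 0.0
    return amp * x**n * math.exp(n * (1.0 - x))

def slope(x, amp, n=2):
    """Analytic derivative d(pulse)/dx."""
    return pulse(x, amp, n) * n * (1.0 / x - 1.0)

def crossing(amp, thr, lo, hi, n=2, iters=80):
    """Bisection for pulse(x) == thr on a bracket where (pulse - thr) changes sign."""
    f_lo = pulse(lo, amp, n) - thr
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = pulse(mid, amp, n) - thr
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tot_and_sigma(amp, thr, sigma_v, n=2, x_max=20.0):
    """Time-over-threshold and its linearized 1-sigma uncertainty (units of t_peak)."""
    rise = crossing(amp, thr, 1e-9, 1.0, n)   # leading edge, before the peak
    fall = crossing(amp, thr, 1.0, x_max, n)  # trailing edge, after the peak
    s_rise = sigma_v / abs(slope(rise, amp, n))
    s_fall = sigma_v / abs(slope(fall, amp, n))
    return fall - rise, math.hypot(s_rise, s_fall)

PEAKING_NS = 250.0               # peaking time used in the text
UNITS_PER_E = 1.5 / 2220.0       # 2,220 e- correspond to 1.5 signal units
SIGMA = 100.0 * UNITS_PER_E      # 1 sigma = 100 e-
AMP1, AMP2 = 0.6, 0.9            # hypothetical split of the 1.5 units
THR = 0.2 * AMP2                 # threshold at 20% of the larger peak

tot1, s1 = tot_and_sigma(AMP1, THR, SIGMA)
tot2, s2 = tot_and_sigma(AMP2, THR, SIGMA)

# Separation of the two pixels in units of the combined 1-sigma uncertainty.
z_amp = (AMP2 - AMP1) / (SIGMA * math.sqrt(2.0))
z_tot = (tot2 - tot1) / math.hypot(s1, s2)

print(f"ToT1 = {tot1 * PEAKING_NS:.0f} ns, ToT2 = {tot2 * PEAKING_NS:.0f} ns")
print(f"separation by amplitude: {z_amp:.2f} sigma, by ToT: {z_tot:.2f} sigma")
```

Under these assumptions the two pulses are separated by noticeably more standard deviations in peak amplitude than in time-over-threshold, because the shallow trailing edge converts voltage noise into a large timing uncertainty. Other amplitude splits or thresholds change the numbers, so the sketch illustrates the trend rather than reproducing Fig. 1.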
This qualitative analysis illustrates that weighing peak amplitudes of signals is more immune to noise when signals are compared between pixels. In addition, the typical constraints on pixel sizes for X-ray pixel detectors (between 50 and 100 μm) and on the process nodes that can be afforded for pixel readout circuits force the comparison of signals to be carried out entirely in the analog domain using precise, offset-free comparators. This approach has been postulated earlier [11], [12] and is applied in the current work to the concept of the C8P1 algorithm.
III. THE C8P1 ALGORITHM
The objective of the C8P1 algorithm is to assign each individual photon hit to a particular location in an array of pixels in the presence of charge sharing. Additionally, a chip implementing the C8P1 algorithm should be able to stamp hits with an accurate time.
A. Introduction of the Basic Concepts
A description of the key terms of the C8P1 algorithm makes use of the following glossary:
• Physical Pixel - smallest addressable element that reflects the actual segmentation of a sensor in an array of charge collecting electrodes,
• Adjacent Pixels - Physical Pixels that share sides or corners with a given Physical Pixel,
Pixel is to be unambiguously determined.
The cloud of charge generated by a photon impact is initially point-like; it then grows with time due to diffusion, and its magnitude decreases with radial distance from the impact center. Therefore, the Chosen Pixel would generally be the pixel that has the largest Fractional Charge deposited upon itself. Equally, so-called 'corner' hits (at equal distance from the centers of four Physical Pixels) or 'side' hits (on the border between two Physical Pixels) that split their charge evenly between multiple Physical Pixels are possible. In either case, the goal of the C8P1 algorithm is to assign a hit to a single Physical Pixel, even if the charge cloud is evenly shared between Physical Pixels. The circuitry for this purpose needs to fit into a typical footprint of a Physical Pixel for detection of X-rays, which is on the order of 100 × 100 μm² or less.
The circuitry needs to be capable of starting the algorithm and of progressing through its steps asynchronously, as X-ray photons to be detected may be spontaneously emitted from some sources and times of arrival of X-ray photons are random. This also implies a capability of simultaneous detection of multiple hits in a detector. To meet the mentioned requirements, the C8P1 algorithm is based upon a concept of creation of groups of so-called 'activated' pixels for every X-ray photon. The groups of the minimum required sizes are distinguished from the rest of the array of pixels by having the NA signals activated by all Physical Pixels in a group. Such an activated group forms a topologically connected set of Physical Pixels.
A group is designed to grow around an impact point, and, then, to collapse inwards, leaving always one Physical Pixel with the assigned hit.
Considerations in the paper are carried out for a square layout of an array of pixels and a range of ratios of pixel grid size to sensor thickness such that dissipation of a cloud of the liberated charge occurs over a distance smaller than the size of a pixel. Thus, it is always possible to find at least one Signal Sum in the whole detector that corresponds to the full charge liberated in interaction with radiation in a sensor. Therefore, a time-walk-free leading-edge ToA measurement is possible. On the other hand, there is no restriction on the range of the charge cloud dissipation for mere registration of hits with the C8P1 algorithm.
B. Flow of the C8P1 Algorithm
Charge collecting electrodes of a sensor unceasingly sense Fractional Charges. The first stage of the front-end electronics continuously processes these Fractional Charges in every Physical Pixel. The resulting signals are replicated and, on the one hand, are appropriately added in every group of four Physical Pixels that share one corner to form the Signal Sums, and, on the other hand, are further processed in every Physical Pixel.
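The formation of Signal Sums can be sketched as a 2 × 2 sliding sum over the array of Fractional Charges. The example below is only an illustration: the function name and the mapping of the pixel labels used later in the paper (3C, 3D, 4C, 4D) to zero-based row and column indices are assumptions, and the charges are the uneven-sharing values quoted in Section IV, expressed in electrons.

```python
def signal_sums(charge):
    """Sum every 2x2 sub-array of Fractional Charges.

    result[r][c] belongs to the vertex shared by pixels
    (r, c), (r, c+1), (r+1, c) and (r+1, c+1).
    """
    rows, cols = len(charge), len(charge[0])
    return [
        [charge[r][c] + charge[r][c + 1] + charge[r + 1][c] + charge[r + 1][c + 1]
         for c in range(cols - 1)]
        for r in range(rows - 1)
    ]

# 5x5 array; assumed mapping: label "3C" -> row index 2, column index 2.
charge = [[0] * 5 for _ in range(5)]
charge[2][2] = 1056   # 3C, 48% of 2,200 e-
charge[2][3] = 462    # 3D, 21%
charge[3][2] = 462    # 4C, 21%
charge[3][3] = 220    # 4D, 10%

sums = signal_sums(charge)
for row in sums:
    print(row)
```

Only the vertex shared by all four charged pixels sees the full 2,200 e−; the neighboring vertices see partial sums such as 1,518 e− or 1,056 e−.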
The actual operation of the C8P1 algorithm begins at this point. Its flow has two phases. The first phase is the Activation Phase, during which the time at which a photon arrived is estimated and all Physical Pixels that are involved in an event are determined. The second phase is the Evaluation Phase, during which signals resulting from the Fractional Charges are compared between the Physical Pixels and the Chosen Pixel is revealed as the single Physical Pixel that is to store a hit. Due to the asynchronous nature of X-ray photons impacting on a pixel detector, an internally generated signal is necessary to start the Activation Phase. Another signal is required to conclude the Evaluation Phase. Both goals are achieved by assigning different purposes to the rising and falling edges, respectively, of the NA control strobe signal.
1) Activation Phase:
A Virtual Pixel can be seen as a block containing a fast shaping filter with a short pulse peaking time, called the Trigger Shaper, followed by a discriminator. Owing to the fact that Signal Sums are generated from all 2 × 2 sub-arrays of Physical Pixels in a detector, Virtual Pixels can be seen as if they were located in vertices of Physical Pixels, with each Virtual Pixel assigned in an orderly way to one Physical Pixel. The role of the Virtual Pixel is to continuously process its Signal Sum and to detect whether a discrimination criterion is met. When this happens, a rising edge on the RTE signal is generated and sent to the logic circuitry of the Physical Pixel that is associated with the Virtual Pixel. The latter receives the RTE signal and replicates it to its Adjacent Pixels. The RTE signals that may be coming from the associated Virtual Pixel or from the Adjacent Pixels are logically summed in every Physical Pixel, and the result is the NA signal. If it is high, the Physical Pixel is included in the group of active pixels for the X-ray impact currently being processed. The rising edge of the NA signal starts the Activation Phase. A Physical Pixel belongs to an activated group as long as its NA signal stays high.
A rising edge of the NA signal is synonymous with the fastest RTE signal in a group of activated Physical Pixels due to the logical sum of all RTE signals. The first RTE signal is generated by the Virtual Pixel, whose Signal Sum is identified with the full charge of the event. Using it for ToA measurement consists in starting measuring time of an event in every Physical Pixel that gets its NA signal activated in the Activation Phase and latching the result of this measurement only in the Chosen Pixel at the end of the Evaluation Phase.
2) Evaluation Phase: The Evaluation Phase starts in each Physical Pixel with a rising edge of the NA signal. The major goal of the Evaluation Phase is the comparison of signals, resulting from processing of Fractional Charges, between Physical Pixels. The rising edge of the NA signal starts these comparisons and they keep going until its falling edge. The Evaluation Phase is simply an all-way comparison. Physical Pixels that are included in activated groups compare signals between them. For example, an eight-way comparison is conducted in a square geometry, i.e. a comparison between a Physical Pixel and the Adjacent Pixels immediately to the north (above - N), the north-east (above-right - NE), the east (right - E), the south-east (below-right - SE), the south (below - S), the south-west (below-left - SW), the west (left - W), and the north-west (above-left - NW). Physical Pixels that are not included in activated groups permanently signal that their Adjacent Pixels feature higher signal amplitudes, forcing election of the Chosen Pixel always inside activated groups.
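A minimal sketch of the eight-way election follows. It assumes that the activated group is already known (passed in as a Boolean mask) and that a tie between two equal signals is broken by pixel index; this tie-break is purely hypothetical and merely stands in for the fact that a single shared comparator gives one definite outcome per pair of pixels. Function names and the index mapping are illustrative only.

```python
# Offsets of the eight Adjacent Pixels: N, NE, E, SE, S, SW, W, NW
# (assumed convention: north = decreasing row index).
NEIGHBOURS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def beats(signal, active, p, q):
    """Outcome of the comparison of pixel p against its Adjacent Pixel q."""
    if not active[q[0]][q[1]]:
        return True  # inactive pixels always report the neighbour as larger
    sp, sq = signal[p[0]][p[1]], signal[q[0]][q[1]]
    if sp != sq:
        return sp > sq
    return p < q     # hypothetical tie-break, one definite result per pair

def chosen_pixels(signal, active):
    """Active pixels that win all comparisons with their Adjacent Pixels."""
    rows, cols = len(signal), len(signal[0])
    winners = []
    for r in range(rows):
        for c in range(cols):
            if not active[r][c]:
                continue
            wins_all = all(
                beats(signal, active, (r, c), (r + dr, c + dc))
                for dr, dc in NEIGHBOURS
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            )
            if wins_all:
                winners.append((r, c))
    return winners

def make_grid(entries, size=5, fill=0):
    """Build a size x size grid from a {(row, col): value} dictionary."""
    grid = [[fill] * size for _ in range(size)]
    for (r, c), value in entries.items():
        grid[r][c] = value
    return grid

# Equal sharing between four pixels, with a 2x2 activated group.
block = [(2, 2), (2, 3), (3, 2), (3, 3)]
equal_signal = make_grid({p: 550 for p in block})
equal_active = make_grid({p: True for p in block}, fill=False)
print(chosen_pixels(equal_signal, equal_active))
```

With four equal signals exactly one pixel is still elected, which mirrors the requirement that a hit is assigned to a single Physical Pixel even for evenly shared charge.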
3) Adjustment of the Size of the Activated Group of Physical Pixels: A pixel is elected for assignment of an X-ray photon hit to become the Chosen Pixel from a connected group of Physical Pixels with the NA signal activated. The size of an activated group should be as small as possible and all Physical Pixels in such a group should have some non-zero Fractional Charges collected for unambiguous comparisons of signals.
Allowing Physical Pixels with zero Fractional Charges into a group and having such pixels adjacent to each other may result in comparisons of noise levels being carried out and, consequently, in ghost Chosen Pixels being produced. Such ghosts can particularly pop up when activated Physical Pixels with zero Fractional Charge are not directly neighbored by ones with non-zero Fractional Charges. In order to prevent such situations, a directional discernment of sending of RTE signals is introduced. Per the earlier description, a Virtual Pixel instructs one Physical Pixel in a 2 × 2 sub-array of pixels to issue RTE signals to itself and to its Adjacent Pixels if a Signal Sum passes a threshold. It can be noticed that such an arrangement introduces a diagonal, virtual shift, with a step equal to up to half of the pixel pitch in both directions, of the actual hit position with respect to the center from which an active group of pixels is built. Therefore, sending of RTE signals by a Physical Pixel has been adjusted to be done only to three of the Adjacent Pixels, i.e. those located on the side of the Virtual Pixel. So, assuming location of a Virtual Pixel in a common vertex of a 2 × 2 sub-array of Physical Pixels, and the bottom-right Physical Pixel in this group being responsible for activating its Adjacent Pixels, RTE signals are sent to the W, NW, and N directions only.
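The directional activation can be sketched as follows, under the assumptions that north corresponds to a decreasing row index, that the bottom-right Physical Pixel of each 2 × 2 sub-array hosts the RTE signal, and that Virtual Pixels at the array border are ignored. The charges are the three example scenarios of Section IV in electrons, with a threshold just above 1,100 e− modeled by a strict comparison; the function name is illustrative.

```python
def activated_group(charge, threshold):
    """Return the NA mask built from directional RTE signals."""
    rows, cols = len(charge), len(charge[0])
    na = [[False] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            signal_sum = (charge[r][c] + charge[r][c + 1]
                          + charge[r + 1][c] + charge[r + 1][c + 1])
            if signal_sum > threshold:
                host_r, host_c = r + 1, c + 1  # bottom-right pixel issues RTE
                # RTE goes to the host itself and to its W, NW and N neighbours,
                # i.e. exactly the four pixels around the Virtual Pixel.
                for dr, dc in ((0, 0), (0, -1), (-1, -1), (-1, 0)):
                    na[host_r + dr][host_c + dc] = True
    return na

def grid5(entries):
    """5x5 grid of charges from a {(row, col): electrons} dictionary."""
    grid = [[0] * 5 for _ in range(5)]
    for (r, c), value in entries.items():
        grid[r][c] = value
    return grid

def group_size(na):
    return sum(cell for row in na for cell in row)

THRESHOLD = 1100  # electrons; "above 1,100 e-" is modeled by the strict '>'

single = grid5({(2, 2): 2200})
equal = grid5({(2, 2): 550, (2, 3): 550, (3, 2): 550, (3, 3): 550})
uneven = grid5({(2, 2): 1056, (2, 3): 462, (3, 2): 462, (3, 3): 220})

for name, charge in (("single", single), ("equal", equal), ("uneven", uneven)):
    na = activated_group(charge, THRESHOLD)
    print(name, group_size(na))
    for row in na:
        print("".join("#" if cell else "." for cell in row))
```

The sketch yields a 3 × 3 group for the single-pixel hit, a 2 × 2 group for the evenly shared hit and an irregular eight-pixel group for the uneven case, since the Signal Sum of 1,056 e− at the vertex farthest from the impact stays below the threshold.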
IV. PROCESSING OF EVENTS FOLLOWING THE C8P1 ALGORITHM

A. RTE Signals and Construction of Active Groups
An illustration of all categories of events and of how the C8P1 algorithm would react to every distribution of the collected charge shared between Adjacent Pixels is not feasible. Thus, three examples of impacts of 8 keV photons that produce the total charge of 2,200 e−/h+ in a silicon sensor are presented. The scenarios, where charge resulting from photon conversions in a sensor is collected entirely by one pixel (3C - 100%), is shared equally by four pixels in a 2 × 2 sub-array (3C, 3D, 4C and 4D - 25% each) and is shared unevenly by four pixels (3C - 48%, 3D and 4C - 21% each, and 4D - 10%), are illustrated in Fig. 2, Fig. 3 and Fig. 4, respectively. Charge distributions between Physical Pixels and Signal Sums processed by Virtual Pixels for each of the analyzed photon impact cases are shown first in Fig. 2a, Fig. 3a and Fig. 4a. Second, Fig. 2b, Fig. 3b and Fig. 4b illustrate generation of RTE signals and construction of groups of pixels with active NA signals. For the sake of a simple illustration, a square 5 × 5 (A-E × 1-5) array of Physical Pixels is depicted and Virtual Pixels are marked as semi-transparent diamonds located in vertices of 2 × 2 arrays of Physical Pixels. Numbers expressed as percentages and marked in Physical Pixels refer to the division of the total charge, and numbers of electrons that are marked in Virtual Pixels represent the respective Signal Sums. Red ovals with the RTE lettering mark Physical Pixels that issue their RTE signals to their Adjacent Pixels located on their W, NW and N sides. Those Physical Pixels that receive RTE signals and become members of activated groups are marked with NA-labeled green ovals. For the purpose of the presented examples, it is assumed that the discrimination threshold for the Signal Sums is set to above 1,100 e−.
It can be observed that the group of activated pixels is a 3 × 3 sub-array of Physical Pixels for an X-ray photon impinging on the center point of the center Physical Pixel of that sub-array, and a 2 × 2 sub-array for a photon impinging on the shared vertex of that sub-array. The activation sizes are optimum. One extra row and column of activated pixels, which would be activated at the bottom and at the right side of the groups, respectively, without the adjustment of the size of the activated group, would create patterns of two Physical Pixels with zero Fractional Charges being adjacent to one another and not adjacent to any Physical Pixel with non-zero Fractional Charges. By elimination of these extra elements from an activated group, ghosts that might occur next to the real Chosen Pixels are prevented.
The form of an activated group can be irregular, as shown in Fig. 4. In this scenario, the 3C pixel collects only 48% of the total charge, resulting in a Signal Sum equal to only 1,056 e−, which is below the 1,100 e− threshold level.
Small flashes that are marked in Fig. 2-4 illustrate the fact that comparisons occurring at the boundaries of active groups always yield winning results towards the inside of any group.
B. Time Flow of the C8P1 Algorithm
Amplification and filtering of analog signals, comparisons, generation of RTE and NA signals and latching of Chosen Pixels occur asynchronously in a sequence and this sequence needs to be conveyed without any external control signals individually for every X-ray photon impact. A time progression 
V. CIRCUIT REALIZATION OF THE C8P1 ALGORITHM
A simplified view of the block/schematic diagram of a pixel implementing the C8P1 algorithm is given in Fig. 6 . A Physical Pixel is shown as a blue square, while four Virtual Pixels are depicted as magenta diamonds in each corner of the Physical Pixel. The circuit description is given for the miniVIPIC chip that is a test-circuit built using the Low Power version of a 130 nm CMOS process with the following features: 1 poly and 8 metals, deep-nwell, normal threshold and low threshold FETs. The main specification items of the miniVIPIC chip are summarized in Table I .
A. The Physical and Virtual Pixel
The first stage of the signal processing is a charge-sensitive pre-amplifier (CSA). It performs active charge integration, converting the input charge into a voltage step. The signal from the pre-amplifier is replicated into five branches. Four of the branches go to Virtual Pixels through the capacitive couplings, C_C, resulting in capacitive addition of the outputs of the CSAs from four Physical Pixels sharing a corner. The fifth branch feeds a Compare Shaper filter that further amplifies the signal and band-pass filters it. Injection of signals from the first stage amplifier across series capacitors onto an active integrator or band-pass filter was chosen as the best solution for the implementation of the C8P1 algorithm for achieving high speed operation. Using capacitive coupling was preferred over, for example, ganging of directly-coupled replicas of signals in the current mode. The latter comes naturally, but achieving high speed operation with it requires large bias currents, whose offsets may be larger than the processed signal.
Each Virtual Pixel constructs its Signal Sum from the outputs of CSAs from four Physical Pixels. This Signal Sum is passed to a Trigger Shaper filter. A discriminator, comparing the Trigger Shaper signal with a threshold, generates the RTE signal. This RTE signal is broadcast from the Physical Pixel to its three Adjacent Pixels in the W, NW and N directions, and each Physical Pixel receives RTE signals from another three Adjacent Pixels from the S, SE and E directions. These three incoming RTE signals are logically summed together with the RTE signal generated inside the Physical Pixel and the NA signal is produced. Every Physical Pixel possesses its own NA signal. Physical Pixels with the NA signal active that are adjacent to each other form an active group. The presence of the NA signal in any Physical Pixel depends on the statuses of other pixels in the group. However, at the current stage of the implementation of the C8P1 algorithm, individual members of an active group may fall out despite others still remaining in such a group. Nevertheless, Physical Pixels with smaller Fractional Charges are deactivated first, due to the shorter times that Trigger Shaper signals stay above a threshold for such pixels, and this allows Physical Pixels with larger Fractional Charges to stay in an active group to the end to elect one Chosen Pixel.
B. Comparisons of Signals - Procedure
The processing of signals in each Physical Pixel requires eight comparisons. All eight comparisons are performed for every Physical Pixel, while each one houses internally only four comparators, as shown in Fig. 6. This optimizes the use of the real estate and circuit resources. The output of the Compare Shaper is driven to four internal comparators. These four comparators also receive analogous Compare Shaper signals from Adjacent Pixels that are located in the SW, W, NW and N directions. Similarly, the Compare Shaper signal of the pixel shown in Fig. 6 is also driven to four external comparators that are located in the Physical Pixels in the NE, E, SE and S directions. Outputs of these four external comparators are available in the corresponding locations and are also returned to the Physical Pixel that is shown in Fig. 6, which is marked as the CompIN bus. Analogously, four results of the comparisons, achieved internally, are sent out as CompOUT to the pixels located in the NE, E, SE and S directions. In this manner, all eight of the required comparisons of a Physical Pixel with the eight Adjacent Pixels are available for the Evaluation Phase.
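That four comparators per pixel suffice can be verified by enumeration. The sketch below assumes that each Physical Pixel owns the comparators facing its SW, W, NW and N neighbors, as stated for the internal comparators, and checks that every pair of Adjacent Pixels in a 32 × 32 array is then served by exactly one comparator. The names and the row/column convention (north = decreasing row index) are illustrative only.

```python
# Directions of the four comparators housed in each Physical Pixel.
OWNED = {"SW": (1, -1), "W": (0, -1), "NW": (-1, -1), "N": (-1, 0)}

def comparator_owners(rows, cols):
    """Map each unordered pair of Adjacent Pixels to the pixels owning its comparator."""
    owners = {}
    for r in range(rows):
        for c in range(cols):
            for dr, dc in OWNED.values():
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    pair = frozenset(((r, c), (rr, cc)))
                    owners.setdefault(pair, []).append((r, c))
    return owners

def adjacent_pairs(rows, cols):
    """All unordered pairs of pixels sharing a side or a corner."""
    pairs = set()
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) != (0, 0) and 0 <= rr < rows and 0 <= cc < cols:
                        pairs.add(frozenset(((r, c), (rr, cc))))
    return pairs

owners = comparator_owners(32, 32)
pairs = adjacent_pairs(32, 32)
print(len(pairs), "adjacent pairs,", len(owners), "comparators")
print("each pair covered once:", all(len(v) == 1 for v in owners.values()))
```

Because each pair is covered once, the pixel on the other side of a comparator can reuse the complemented result instead of housing a comparator of its own.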
The timing of the comparison process is controlled by the NA signal in every Physical Pixel. Active comparisons are ongoing and the results may keep changing when the NA signal is active, i.e. when at least one of the RTE signals from the 2 × 2 sub-arrays of Physical Pixels is active. When the NA signal is active, a Physical Pixel sends the current states of its four comparators to the outside. A falling edge of the NA signal latches the state of all eight comparator outputs in a Physical Pixel. As the RTE signal is derived from the Trigger Shaper signal, adjusting the pulse responses of the Trigger Shaper and Compare Shaper so that the results of comparisons are latched when the compared signals reach their peak amplitudes assures optimal comparison. If a latched result is such that all eight comparators indicate that a Physical Pixel has the largest signal, an X-ray photon hit is attributed to this pixel. When the NA signal deactivates, a Physical Pixel starts sending the constant state of being the one whose signal is lower than the others, and, as described later, latching of the comparison results is followed by auto-zeroing of the comparators [12].
C. Comparisons of Signals - Circuit Designs of the Discriminator and Comparator
The discriminator that operates on Signal Sums and the comparator, looking at the differences between signals from Adjacent Pixels, need to be designed using a method for cancellation of static offsets to provide a uniform response of the entire matrix of pixels. The difference between the discriminator and the comparators is that the former needs to be continuously active, while the latter may activate only with the activation of the NA signal. The major constraint for implementation of the offset cancellation is the real estate in a pixel. Thus, the discriminator has its static offset trimmed by a 7-bit digital-to-analog converter (DAC), and the comparator uses auto-zeroing. Mixing of these two techniques was recognized as the optimum choice. Schematic diagrams of the discriminator and comparator are shown in Figs. 7 and 8, respectively.
The discriminator is built with the differential pair, composed of the transistors M1 and M2. The output conductance of the differential pair is decreased by the transistors M3 and M4 that are connected in the common gate configuration. The asymmetrical load of the differential pair is the cascode current mirror, built with the transistors M5-M8. The loaded differential pair drives the single-ended amplifier with the transistor M9 in the common source configuration and the inverter, built with the transistors M11 and M12, that allows the output signal to achieve a rail-to-rail swing. Both inputs of the differential pair are driven by two source followers, built with the transistors M13 and M14 on the reference side, and M16 and M17 on the signal input side. The source followers provide the global threshold for the discriminator, which is the difference between the ThP and ThN levels, and the currents of both source followers are also adjusted by the DAC, allowing offsets of each discriminator to be cancelled individually. The source followers are connected to the differential pair via the transistors M15 and M18 that are configured as large resistances, and the signal from the Trigger Shaper is connected via the 500 fF coupling capacitor, C_C, to the signal side of the differential pair. The static current, flowing in the two stages of the discriminator, is 1.2 μA. The adjustment of DAC values can be easily obtained through the so-called "threshold scan" procedure [13], in which each discriminator can have its characteristic of the threshold level vs. DAC value scanned by applying a common threshold level to all pixels in an array and sending the discriminator output to an in-pixel event counter.
The circuit network of the comparator has a topology very similar to that of the discriminator. The differences are the absence of the two source followers with adjustable bias currents, the presence of AC-coupled driving of both transistors M1 and M2 of the differential pair, and the presence of a set of switches MR1-MR6 that are used for auto-zeroing. The first step of auto-zeroing occurs when the switches MR1, MR2 and MR3 are closed and the switches MR4 and MR5 are open. During this step, the second stage of the comparator is disconnected from the first stage and the output of the second stage is shorted with its input. Additionally, the gate voltage of the transistor M1 is forced to equal, with a difference amounting to the static offsets, the output voltage of the second stage. The open switch MR5 prevents a high static current from flowing in the common source amplifier that acts as an inverter stage. In the second step of auto-zeroing, the comparator is armed. In this step, the switches MR1, MR2 and MR3 open and the switches MR4 and MR5 close. An inverter chain generates a timed sequence of the rstb, rst, rst2b and rst2 signals that assures storing of the offset voltage on the gate of the transistor M1. The Reset signal is the inverted NA signal. The differential pair is driven by the Compare Shapers from two Adjacent Pixels via the small 40 fF coupling capacitors C_C1 and C_C2 and two source followers, providing separation between each Compare Shaper and the appropriate comparator. The total static current, flowing in the two stages of the comparator, is 0.45 μA.
D. Sources of Signals for Comparison and Discrimination - Compare and Trigger Shapers
Both the Compare Shaper and the Trigger Shaper are active filters that provide pulses having defined timing properties in response to charge signals integrated by the CSAs. Both shapers are built using the same circuit topology, i.e. a folded-cascode-based operational transconductance amplifier that is followed by two source followers and a tunable R-C feedback network. The first source follower drives the feedback network and the second source follower sends the output signal to the stages located next in the processing chain. The second source follower is loaded with an R-C network to further decrease the bandwidth and thus to improve the noise performance. A schematic diagram of the filter circuit network with the component values for the Compare Shaper is shown in Fig. 9. Tuning of the R-C network, defining the timing properties of the pulse response of the Compare and Trigger Shapers, is achieved with the voltage levels fedshP and fedshN. The nominal settings for these signals set the peaking time and gain to 85 ns and 25 μV/e− for the Compare Shaper and to 40 ns and 37 μV/e− for the Trigger Shaper. The gain values include the gain of the CSA.
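As a worked illustration of the nominal gains, the short sketch below converts charges in electrons to expected peak amplitudes. It assumes a perfectly linear response and uses the full charge of about 2,220 e− quoted for an 8 keV photon in Section II, together with the 48% fraction of the 2,200 e− example from Section IV; the function and dictionary names are hypothetical.

```python
# Nominal gains from the text, in microvolts per electron (CSA gain included).
GAIN_UV_PER_E = {"compare": 25, "trigger": 37}
PEAKING_NS = {"compare": 85, "trigger": 40}

def peak_amplitude_uv(electrons, shaper):
    """Peak amplitude in microvolts, assuming a linear response."""
    return electrons * GAIN_UV_PER_E[shaper]

full_charge = 2220   # e-, full charge of an 8 keV photon in silicon
fraction = 1056      # e-, the 48% Fractional Charge of the 2,200 e- example

for shaper in ("compare", "trigger"):
    mv = peak_amplitude_uv(full_charge, shaper) / 1000
    print(f"{shaper} shaper: {mv:.2f} mV peak after about {PEAKING_NS[shaper]} ns")

print(f"48% fraction at the Compare Shaper: {peak_amplitude_uv(fraction, 'compare') / 1000:.1f} mV")
```

The amplitudes come out in the tens of millivolts, which indicates the signal scale that the discriminator and the comparators have to resolve.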
The design of the CSA is based on the same topology of the gain stage as both the Compare and Trigger Shapers; however, the feedback network is simpler in the CSA. It includes only a capacitor connected in parallel with a PMOS transistor acting as a tunable resistance. Thus, the presentation of the CSA circuit is skipped.
E. The Digital Section of a Physical Pixel
The role of the digital section of a Physical Pixel is to register hits in Selected Pixels following the designation of the C8P1 algorithm. In imaging applications, hits are typically registered by counting [14]. Thus, the designed circuit uses a counter that is fed by pulses generated in the logic that determines whether a hit should be allocated to a given Physical Pixel. In order to meet the requirement of dead-time-less operation, counting of hits in the first time interval, called Frame0 (F0), needs to be interleaved with reading out of hits from the second time interval, called Frame1 (F1). Therefore, two counters per Physical Pixel are required. A schematic diagram of the digital section of a Physical Pixel, modified with respect to the actual implementation in the miniVIPIC chip but functionally identical, is shown in Fig. 10. There are three major components identifiable in the presented schematic diagram:
1) the neighborhood logic that produces the NA signal, 2) the C8P1 (pixel selection) logic that processes the results of inter-pixel comparisons, and 3) the frame/counter logic storing a hit. The local RTE signal is generated from the discriminated Trigger Shaper signal, and the NA signal is then produced by summing the RTE signals delivered from the S, SE and E directions. The NA signal is used to gate the logical conjunction of the results of the comparisons of the local Compare Shaper signal with its counterparts from the N, NW, W and SW directions. If there is no NA signal active, a Physical Pixel concludes that it cannot be allowed to register a hit. [...] This connection produces a short pulse that is used to increment a counter to register a new hit. As the toggle flip/flop should be prevented from switching between the frames while an event is still being processed, the NA signal is also used to delay the rising edge of the Frame_Clk signal using an R-S flip/flop. Toggling of the frame-changing flip/flop may occur only after the arrival of a rising edge of the Frame_Clk signal, but not while the NA signal is active.
One of the two counters is used for registering new hits, while the other counter may have its contents read out on the common OUT_BUS bus. Reading out of the counter is achieved with the Read_Release signal, which enables the buffers. Then, the Read_Ack signal resets the counter that was read out, making it ready for counting hits after the next switching of the frames.
Because the circuit topology of the digital section of a Physical Pixel presented here is greatly simplified, some blocks, e.g. those generating the actual signals used for reading out the counters or interfacing with the zero-suppressed (sparsified) readout engine [15], are not shown. Also, the presented circuitry omits anything not related to counting of hits, while, in reality, the frame logic is much more complicated. It allows selectable operation, first, in the counting mode (counting of photon impact events in a given time frame), and, second, in the timing mode (ToA measurements of photon impact) with a counter used as a part of a time-to-digital converter (TDC).
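The counting behavior described in this subsection can be summarized with a small behavioral model. This is only a sketch under stated assumptions, not the miniVIPIC logic: the class and method names are hypothetical, each event is treated as atomic, and the model simply ORs whatever RTE signals the caller passes in, so the choice of contributing directions is left open.

```python
class PixelDigitalModel:
    """Behavioral sketch of hit allocation and dual-frame counting.

    All names are hypothetical; timing is not modeled.
    """

    def __init__(self):
        self.counters = [0, 0]        # Frame0 and Frame1 counters
        self.active = 0               # index of the counter that is counting
        self.toggle_pending = False   # Frame_Clk edge seen while NA was active

    def register_event(self, rte_signals, comparison_wins):
        """Count a hit only if NA is active and every comparison is won."""
        na = any(rte_signals)
        selected = na and all(comparison_wins)
        if selected:
            self.counters[self.active] += 1
        return selected

    def frame_clk_rising_edge(self, na_active):
        """Toggle frames now, or defer the toggle while NA is active."""
        if na_active:
            self.toggle_pending = True
        else:
            self.active ^= 1

    def na_released(self):
        """Apply a deferred frame toggle once the event is finished."""
        if self.toggle_pending:
            self.toggle_pending = False
            self.active ^= 1

    def read_out(self):
        """Read the idle counter (Read_Release), then clear it (Read_Ack)."""
        idle = self.active ^ 1
        value = self.counters[idle]
        self.counters[idle] = 0
        return value


pixel = PixelDigitalModel()
# Event won against all four compared neighbors: counted in Frame0.
pixel.register_event([True, False, False], [True, True, True, True])
# Event that loses one comparison: allocated elsewhere, not counted here.
pixel.register_event([True, True, False], [True, False, True, True])
pixel.frame_clk_rising_edge(na_active=True)  # deferred, event in progress
pixel.na_released()                          # now switch to Frame1
frame0_hits = pixel.read_out()
print(frame0_hits)
```

In this model the frame switch requested during an active neighborhood takes effect only after NA is released, so the counter being incremented is never the one being read out.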
F. Inter-Pixel Connections
The degree of inter-pixel connectivity in the C8P1 algorithm is significant. As can be seen in Figs. 6 and 10, multiple analog and digital signals must be transmitted over relatively long distances, and crossing of these signals is unavoidable. Distributing analog signals is particularly challenging because, with the goal of maintaining power consumption at a level comparable to conventional pixel detectors [16], assuring low output impedance of buffers is impossible. Thus, routing of analog signals was done in vertical and horizontal shielded routing channels going across every Physical Pixel. The routing channels were about 10 μm wide, and special care was taken to minimize and control stray capacitances of the connections. Stray capacitances of the connections were also accounted for as components of the analog filters. Routing of digital signals was simplified by the symmetry of the interconnects, i.e. all signals coming to a Physical Pixel are analogous to those that are broadcast by it. Significant care was taken to have all digital control signals run vertically over the digital section only. Additional routing channels were also used around the Physical Pixel borders to allow distribution of the digital signals.
G. Layout of the Physical Pixel
The large number of circuit blocks and dense interconnects resulted in a highly compact layout of a Physical Pixel. The layout, with marked main components of the analog section, occupying the left side of the pixel, and of the digital section, occupying the right side of the pixel, is shown in Fig. 11. Its size is 100 × 100 μm², and it was done entirely as a full-custom layout. A pad for the bump-bonding connection to a silicon sensor has a conservative diameter of 60 μm. The substrates of the digital and analog sections are separated by placing the digital section in a deep n-well, in addition to having both sections in different parts of the Physical Pixel layout. The digital circuitry could be fitted into the available space thanks to a highly compact, custom digital standard cell library that was developed in-house. The developed cells provided area savings of 30% compared to the cells distributed by the technology vendor.
VI. ATTACHMENT OF A SILICON SENSOR AND TESTS WITH A SYNCHROTRON X-RAY BEAM
The miniVIPIC chip was fabricated through a multi-project-wafer service and featured external dimensions of 5.5 × 5.5 mm². Several miniVIPIC dies were assembled with silicon sensors using the solder (SnPb) bump-bonding technology. The sensors were p-on-n (collecting holes), 280 μm thick devices with a resistivity of 5 kΩ·cm. The sensors featured an external size of about 4 × 4 mm² that included the 32 × 32-pixel array, matching the miniVIPIC chip, and bias rings for better distribution of the gradient of high voltage on the sensor surface side. The assembly was done by first depositing electroless nickel-gold under-bump metallization on both the chip and the sensor dies, followed by the deposition of Sn-Pb bumps and flip-chip bonding [17] of the two parts.
The purpose of the tests was to assess the effectiveness of the C8P1 algorithm in counteracting the performance degradation of a pixelated X-ray detector due to charge sharing. An efficient approach to this goal is raster scanning the surface of a sensor bonded to a miniVIPIC chip with a beam of monoenergetic X-ray photons, collimated to a size of only a few micrometers, and registering the responses for different beam positions. The beam intensity should be such that, on the one hand, sufficient counts for each position are obtained in a short time and, on the other hand, the miniVIPIC front-end circuit is not saturated. In order to limit statistical fluctuations to a few percent at most, at least 1,000 photons should be registered for each beam position. Then, in order to keep the durations of the scans practical, an acquisition should last on the order of 100 ms for each point. Such experimental conditions can only be met at a synchrotron radiation facility. A concept diagram of the experimental arrangement that was built on the 1BM-B beamline at the Advanced Photon Source at Argonne National Laboratory [18] is shown in Fig. 12a. The detector was installed on an X-Y precision movement stage, and a beam stopper with a 3 μm pin-hole, placed about 5 mm in front of the detector, passed 8 keV X-ray photons onto the detector. The primary beam intensity was high enough to yield a flux of 10⁴ to 10⁵ X-ray photons per second after the collimator. A picture of the test setup, with the pin-hole mounting in front of the miniVIPIC chip with a sensor attached on the daughter PCB that is plugged into the mother PCB connected to the data acquisition system, is shown in Fig. 12b. The daughter PCB with the miniVIPIC chip bump-bonded to the silicon sensor, which has a high-voltage wire connected to it for biasing of the sensor, is shown in Fig. 12c. The described experimental arrangement was used to perform raster scans in steps of 5 μm in both directions over arrays of several adjacent pixels.
Fig. 12. An experiment consisting of raster scanning with a high-intensity, highly collimated 8 keV X-ray beam using X-Y precision movement stages: a) a concept of the experimental arrangement, b) the test setup with the pin-hole mounting in front of the miniVIPIC chip with a sensor attached on the daughter PCB that is plugged into the mother PCB connected to the data acquisition system, c) the daughter PCB with the miniVIPIC chip bump-bonded to the silicon sensor.
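The relation between the quoted flux, the acquisition time and the statistical fluctuations can be checked with a few lines of arithmetic. The sketch below assumes pure Poisson counting statistics and that every photon passing the collimator is registered; the function names are illustrative.

```python
import math


def expected_counts(flux_per_s, dwell_s):
    """Mean number of photons collected at one beam position."""
    return flux_per_s * dwell_s


def relative_poisson_error(counts):
    """Relative one-sigma fluctuation of a Poisson-distributed count."""
    return 1.0 / math.sqrt(counts)


dwell_s = 0.1  # acquisition on the order of 100 ms per point

low_counts = expected_counts(1e4, dwell_s)    # lower end of the quoted flux
high_counts = expected_counts(1e5, dwell_s)   # upper end of the quoted flux

err_low = relative_poisson_error(low_counts)
err_high = relative_poisson_error(high_counts)

print(f"{low_counts:.0f} counts -> {100 * err_low:.1f} % fluctuation")
print(f"{high_counts:.0f} counts -> {100 * err_high:.1f} % fluctuation")
```

With these inputs a 100 ms acquisition collects between about 1,000 and 10,000 photons, which corresponds to relative fluctuations of roughly 3% down to 1%.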
Before performing raster scans, a detailed characterization of the DACs used for trimming of static threshold offsets of the discriminators was carried out. This step provided DAC settings that were programmed into the chip for the tests with X-ray photons. Two examples of scans of the dispersions of static threshold offsets, before any adjustments and after trimming of thresholds, are shown in Fig. 13. It can be noticed that the initial one-sigma dispersions, amounting to 11.8 mV (∼350 e⁻), were reduced to about 1.5 mV (∼45 e⁻).
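The paired values in millivolts and electrons imply a conversion factor that can be compared with the nominal shaper gains. The following sketch only reuses the approximate numbers quoted above; the function name is illustrative and the result inherits the rounding of the quoted values.

```python
def implied_gain_uv_per_e(sigma_mv, sigma_e):
    """Conversion factor implied by a dispersion given in mV and in electrons."""
    return sigma_mv * 1000.0 / sigma_e  # mV -> uV, then per electron


gain_before = implied_gain_uv_per_e(11.8, 350)  # before trimming
gain_after = implied_gain_uv_per_e(1.5, 45)     # after trimming
reduction = 11.8 / 1.5                          # improvement factor from trimming

print(f"implied gain: {gain_before:.1f} and {gain_after:.1f} uV/e-")
print(f"dispersion reduced by a factor of {reduction:.1f}")
```

Both pairs give a conversion factor of about 33 to 34 μV/e⁻, and the trimming reduces the one-sigma dispersion by a factor of nearly eight.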
Examples of the results of the raster scans with a collimated X-ray beam are given in Fig. 14. The tests were performed by moving the X-Y table in steps in the plane normal to the beam, exposing a 2 × 2-pixel sub-array. For each step, the total number of hits, summed over a sub-array of 5 × 5 pixels extending radially from the pixel with the highest individual count, was recorded. For each beam position, the exposures were repeated for varied threshold settings, going from the point at which spurious noise hits entirely occupied the detector to the level at which no more 8 keV signal was observed.
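The per-position summation described above can be sketched as follows. This is an illustration rather than the analysis code used for the measurements: the function name is hypothetical, and clipping of the window at the array edges is an assumption, since the text does not say how edge pixels were handled.

```python
def summed_counts(frame, half_width=2):
    """Sum counts in a (2*half_width + 1)-wide square window centered on
    the pixel with the highest count; the window is clipped at the edges."""
    rows, cols = len(frame), len(frame[0])
    # Locate the pixel with the highest individual count.
    r0, c0 = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: frame[rc[0]][rc[1]],
    )
    total = 0
    for r in range(max(0, r0 - half_width), min(rows, r0 + half_width + 1)):
        for c in range(max(0, c0 - half_width), min(cols, c0 + half_width + 1)):
            total += frame[r][c]
    return (r0, c0), total


# Synthetic 32 x 32 frame: beam near the border between two pixels.
frame = [[0] * 32 for _ in range(32)]
frame[10][12] = 600
frame[10][13] = 380
frame[11][12] = 20
frame[20][20] = 5   # isolated noise hit outside the 5 x 5 window

peak, total = summed_counts(frame)
print(peak, total)
```

Summing over a window centered on the hottest pixel makes the recorded total independent of which pixel a shared hit was allocated to, while excluding isolated noise hits elsewhere in the array.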
The results given in Fig. 14 consist of two plots, showing the registered number of hits for selected positions in the scanned 2 × 2-pixel sub-array, and six intensity plots showing the recorded counts for the scans over the exposed 2 × 2-pixel sub-array for six selected threshold levels. The upper plot in Fig. 14 is for the centers of the pixels marked as NE, NW, SE and SW, and the second plot is for the N, E, W and S borders between the pixels and the corner C between all four pixels, according to the marking added to the upper-center intensity plot. It can be noticed from the plots that below the 10 mV threshold noise hits start to dominate, while above the 70-75 mV threshold no more 8 keV signal is present. The signal beyond the latter threshold level is due to a higher-order harmonic of the 8 keV energy, which a monochromator always passes in a small quantity. It can be seen that the effective gain is lowest at the C position, a bit higher at the N, E, W and S border positions, and highest at the central NE, NW, SE and SW positions. This can be explained by the presence of some mismatch-driven differences in timing between the Fractional Signals, translating to smearing of the peaks of the Summed Signal. Nevertheless, there is a range of threshold levels for all analyzed points where the number of recorded hits is constant. This is expected from a correctly working circuit that uses the C8P1 algorithm [19]. Confirmation can be found in the intensity plots, where the "pixel borders" and even the C point are not distinguishable, as all positions register the same numbers of hits. It is worth highlighting that the color scales overemphasize dark areas even where differences between the actual numbers in the intensity plots are as small as 10%.
VII. CONCLUSIONS
The C8P1 algorithm for allocation of hits to single pixels in the presence of charge sharing in a highly segmented pixel detector was presented, together with its first hardware implementation in the miniVIPIC chip and the results of tests that assessed the effectiveness of the C8P1 algorithm. The miniVIPIC chip was designed in a 130 nm CMOS process. This led to a relatively large 100-μm-side pixel, in which charge sharing affects only a small number of events. The use of scaled-down process nodes and three-dimensional integration allows smaller pixels and the integration of more blocks in the small footprint of a pixel. Thus, this work sets the foundation for future developments. The algorithm was conceived for square pixels; however, other pixel layouts are possible. Another variant of the C8P1 algorithm, in which the property governing the latching of the comparisons (the duration of the RTE signal) is defined with a user-adjustable delay, was implemented in a 40 nm CMOS process [20]. Both designs rely on highly dense interconnections between the pixels.
In the course of the analysis, it was concluded that equalization of the gains of the Trigger Shaper path is an important factor for the robustness of operation of the circuitry implementing the C8P1 algorithm. The relative gain variations were measured and are presented in Fig. 15, where the one-sigma dispersion was found to be about 4.2%. Unequal gains force, as can be seen from Fig. 14, operation with threshold levels closer to the noise, which increases the lowest detectable X-ray energy for which the device with the C8P1 algorithm can be used. Thus, gain trimming, in addition to trimming of static discriminator offsets, is foreseen for future improvements.
