# ASYNCHRONOUS FRONT-END ASIC FOR X-RAY MEDICAL IMAGING APPLICATIONS IMPLEMENTED IN CMOS 0.18μm TECHNOLOGY

**R. DLUGOSZ** 

<sup>1</sup> UNIVERSITY OF NEUCHÂTEL, SWITZERLAND <sup>2</sup> UNIVERSITY OF ALBERTA, CANADA

**KEYWORDS:** X-ray medical imaging, Parallel data processing, Analog asynchronous circuits

**ABSTRACT:** This paper presents the concept and a CMOS implementation of a novel multi-channel readout front-end ASIC for nuclear X-ray imaging systems. The circuit has been designed in an example configuration with eight identical channels, but its modular structure allows easy realization of larger systems with even hundreds of channels. Several new circuit solutions proposed by the author are used in the circuit: an asynchronous output multiplexer, a pulse shaper, and a peak detector with a built-in clock generator that activates the circuit only when a new impulse occurs at the input. This technique allows for very low power dissipation. In the worst-case scenario, i.e. when all channels are active at the same time, the power dissipation is kept below 2 mW. An efficient RESET mechanism that turns off a given channel just after its information has been read out increases the counting rate of a single channel to about 3 MSps. The proposed circuit solutions also result in a very small chip area, equal to 0.021 mm<sup>2</sup> for a single channel and 0.17 mm<sup>2</sup> for the whole chip.

### **INTRODUCTION**

Application-specific integrated circuits (ASICs) are integral parts of the modern apparatus used in medical X-ray imaging diagnostic tools in nuclear medicine.

Systems of this type detect X or gamma photons emitted by radionuclides previously introduced into the patient's body together with pharmaceuticals. These photons are then converted to equivalent analog signals using specialized multi-channel chips. Finally, the analog signals are converted to digital signals that are further processed by computers. Analog-to-digital conversion may be performed inside the chip, e.g. using a number of slow, low-power ADCs, or externally using one very fast ADC.

In typical applications of this kind, the full data processing path consists of several steps, including conversion of the X or gamma photons into visible light, e.g. by a scintillator, and then into an electrical signal by a photomultiplier. Multi-step conversion in such systems is a source of errors that significantly reduce the system's accuracy. In recently introduced solid-state detectors the X or gamma photons may be converted directly into electrical signals, which increases accuracy [1]. Front-end readout ASICs, which are the core blocks in such systems, have been of interest for many years [2, 3]. The observed development in this area is partially due to continuous progress in microelectronics, but mostly due to the development of new circuit solutions.

This paper is devoted to the development of such circuits. A new multi-channel readout ASIC has been designed by the author in CMOS technology. An example system of this type is presented in Fig. 1 [2]. Particular building blocks have been reported by the author earlier as separate blocks. All these circuits have now been significantly optimized and equipped with new mechanisms that allow for low-power operation and a small chip area. The experimental chip contains 8 identical channels that operate in parallel, as well as an asynchronous output multiplexer. In typical commercial systems of this type, the number of channels can vary from several dozen to several hundred. The system described in this paper can easily be redesigned for a higher number of channels.



Fig. 1. Typical front-end ASIC for multi-element detectors [2]

In a typical system of this type, a single channel usually consists of a charge amplifier (CHA), a pulse shaper (PS) and a peak detector (PD), as shown in Fig. 1. The signal processing scheme in the channel relies on detection of incident radiation by an associated sensor, which generates an equivalent charge. The problem here is that this charge is very small, on the level of several dozen aC for 1 keV X-rays, and that it is randomly distributed over time. The task of the CHA block is to amplify this charge and to integrate it on the associated capacitor, which generates a voltage proportional to the amount of charge. When this process is quick enough, the PS, which is the subsequent block in the chain, receives a signal that may be approximately described as a Heaviside step function. The PS block is a continuous-time band-pass filter that converts this function into a voltage pulse with a given peaking time and with an amplitude proportional to the step value. The task of the subsequent peak detector is to properly determine the peaking time, to capture the peak's amplitude at this moment and finally to set the logic (FLAG) signal, which is a request directed to the output multiplexer to read out the peak's value.
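The order of magnitude quoted above can be checked with a short calculation. The following is a minimal sketch only: the pair-creation energy (4.6 eV, roughly that of a CZT detector) and the 50 fF feedback capacitance are assumptions introduced for illustration and are not values taken from the described chip.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs


def detector_charge(photon_energy_ev, pair_energy_ev):
    """Charge (C) released by one photon: number of created pairs times e."""
    return photon_energy_ev / pair_energy_ev * E_CHARGE


def cha_step_voltage(charge_c, feedback_cap_f):
    """Ideal integrator: the output voltage step equals Q / C."""
    return charge_c / feedback_cap_f


# Assumed values, for illustration only (not from the described chip).
PAIR_ENERGY_EV = 4.6    # energy per electron-hole pair, roughly CZT
C_FEEDBACK = 50e-15     # hypothetical 50 fF integrating capacitor

q = detector_charge(1_000, PAIR_ENERGY_EV)
v = cha_step_voltage(q, C_FEEDBACK)
print(f"charge for a 1 keV photon: {q * 1e18:.1f} aC")
print(f"ideal CHA voltage step:    {v * 1e3:.3f} mV")
```

Under these assumptions a 1 keV photon releases about 35 aC, which agrees with the "several dozen aC" level mentioned above.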

# IMPLEMENTATION OF PARTICULAR BUILDING BLOCKS

#### **Pulse shaper**

There are several problems associated with the pulse shaping operation [4]. The first problem is that the input step signal is usually accompanied by noise that alters the peak amplitude. The noise spectrum can be limited by decreasing the signal bandwidth at the PS output, but in practice this increases the width of the peak and lowers the count rate of a single channel, which leads to a trade-off between these parameters.



*Fig. 2. The pulse shaping CR-RC filter with transistors that enable the RESET and the programming functions* 

Various techniques have been proposed to overcome this problem; they are summarized in [4]. One of the described solutions enables a relatively long peaking time while shortening the decay time of the pulses, thus increasing the count rate. In this solution the PS block is realized as a series connection of a single high-pass CR filter and several low-pass RC filters, with amplifiers between particular filters that compensate the signal amplitude. This solution suffers from several disadvantages. One of them is the necessity of using many passive filters that contain relatively large elements, e.g. the average value of the capacitors is about 1 pF, while the resistors have values of hundreds of k$\Omega$. This significantly increases the chip area of the entire block. The other problem is the power dissipated by the amplifiers.

An additional problem associated with the shaping operation is the overlapping of adjacent impulses when the falling edges of the pulses are long. To overcome this problem, a sufficiently long time must be allocated to each impulse to make them independent events. When this time is too short, a new radiation event triggers a transient state with non-zero initial conditions, which is a source of additional errors. Furthermore, resetting the CHA circuit in practice means a negative step function at the PS input, which also starts a new transient state with non-zero initial conditions.

To overcome the described problems, a new circuit, shown in Fig. 2, has been proposed by the author. It contains only one CR and one RC filter and no amplifiers. In this solution the falling edge is quickly reset, just after the peaking time, by the FLAG signal, which discharges both capacitors to the baseline level ($V_{BL}$). It is worth noting that the pulse shaper is blocked as long as the FLAG signal is equal to logical "1". Just after the peak has been read out by the multiplexer, the pulse shaper is ready for a new shaping operation with zero initial conditions. Two additional voltages, $V_{ctr1}$ and $V_{ctr2}$, are used to control the time constants of the CR and the RC filters. This mechanism has been introduced to enable testing the chip under different conditions.
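The shaping and reset behaviour described above can be illustrated with the textbook step response of a CR-RC filter. This is a sketch under stated assumptions: the two stages are treated as ideally buffered (in the described circuit there are no amplifiers, so the stages load each other), the baseline is taken as 0, and the function names and time constants are hypothetical.

```python
import math


def crrc_step_response(t, tau_cr, tau_rc):
    """Unit-step response of an ideally buffered CR high-pass + RC low-pass."""
    if t < 0:
        return 0.0
    if math.isclose(tau_cr, tau_rc):
        # Equal time constants: v(t) = (t / tau) * exp(-t / tau)
        return (t / tau_cr) * math.exp(-t / tau_cr)
    gain = tau_cr / (tau_cr - tau_rc)
    return gain * (math.exp(-t / tau_cr) - math.exp(-t / tau_rc))


def peaking_time(tau_cr, tau_rc):
    """Time at which the step response reaches its maximum."""
    if math.isclose(tau_cr, tau_rc):
        return tau_cr
    return tau_cr * tau_rc / (tau_cr - tau_rc) * math.log(tau_cr / tau_rc)


def shaped_pulse_with_reset(t, step, tau_cr, tau_rc):
    """Pulse that is forced to the baseline (0 here) right after the peak,
    mimicking the FLAG-driven discharge of both capacitors."""
    if t > peaking_time(tau_cr, tau_rc):
        return 0.0
    return step * crrc_step_response(t, tau_cr, tau_rc)


# Hypothetical time constants, chosen only for illustration.
TAU_CR, TAU_RC = 200e-9, 50e-9
t_peak = peaking_time(TAU_CR, TAU_RC)
print(f"peaking time: {t_peak * 1e9:.1f} ns")
print(f"peak / step:  {crrc_step_response(t_peak, TAU_CR, TAU_RC):.3f}")
```

In this model the peaking time depends only on the two time constants and not on the step amplitude, while the peak value scales linearly with the step.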

#### **Peak detector**

Peak detectors reported in the literature may be generally classified as sampled or asynchronous. Sampled PDs theoretically offer higher precision but suffer from high circuit complexity. An example sampled peak detector described in [3] contains a delay line composed of 112 delay elements. As a result, the chip with 128 channels designed in a BiCMOS 0.8 µm process occupies an area of 64 mm<sup>2</sup>. Asynchronous PDs offer a simpler structure, but high precision is possible only at relatively large power dissipation. An example asynchronous PD reported in [5] achieves an accuracy of 99.8% while dissipating 3.5 mW.

The peak detector used in the system described in this paper is a significantly modified version of the sampled solution proposed initially in [6]. This circuit is very precise and offers a very low power dissipation of only several µW. In the previous approach the delay line had two memory elements, while now it contains three delay elements. However, the general principle in both cases remains the same. In the proposed PD the input voltage, after conversion to a current, is sampled successively in the delay line. Two samples from the delay line are compared in a current-mode comparator. A special configuration of switches ensures that the more recent sample $I_{IN}(n)$ is always directed to the comparator as a positive current, while the previous sample $I_{IN}(n-1)$ is directed as a negative current. If the input signal is rising, the comparator's output is positive. After the peak is reached, the input signal starts falling and the comparator's output becomes negative, which stores the peak value in the sample-and-hold element. In the previous PD approach the comparison between both samples was performed while the $I_{\rm IN}(n)$ sample was being stored in the delay line, which sometimes resulted in flipping of the comparator's output when sampling the signal near the peak value, especially in the presence of large noise. In the current approach the comparison is performed between two samples already stored in the previous clock phases, and the circuit is more stable.
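The sampling principle described above can be sketched behaviourally as follows. This is an illustration only, not the circuit itself: it assumes that, while the newest sample is being written into one of the three delay elements, the comparator sees the two samples stored earlier, and that the older of these two is held as the peak once the difference turns negative. The function name and sample values are hypothetical.

```python
from collections import deque


def detect_peak(samples):
    """Return (index, value) of the detected peak, or None if no fall is seen.

    The three-element deque models the delay line: the last entry is the
    sample currently being stored, the first two are already settled and
    are the ones compared.
    """
    line = deque(maxlen=3)
    for n, sample in enumerate(samples):
        line.append(sample)
        if len(line) < 3:
            continue  # the delay line is not filled yet
        older, newer, _being_written = line
        comparator = newer - older  # positive while the signal is rising
        if comparator < 0:
            # The signal started falling: 'older' was the largest sample.
            return n - 2, older
    return None


# Hypothetical samples of a shaped pulse (arbitrary units).
pulse = [0.0, 0.2, 0.5, 0.7, 0.6, 0.3, 0.1]
print(detect_peak(pulse))
```

In this model the decision is delayed by one clock phase with respect to a two-element line, which is the price paid for comparing only settled samples.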

An important element of the proposed peak detector is the MN6 transistor, which operates as a diode. The amplitude of the input signal can vary over a relatively wide range, between 0.4 and 1.5 V, but because the MN6 transistor limits the signal on the capacitors in the delay elements, the output stage of the PD works properly over this wide range of input signals. The delay elements contain dummy switches (DS) that compensate for the charge injection effect. Since the signals in the delay elements vary over a relatively small range, this compensation is very precise.





Fig. 4. Internal 3-phase clock generator that controls the peak detector

The PD circuit may be additionally controlled using two bias currents, $I_{level1}$ and $I_{level2}$.

In the previous approach the PD was sampled with an external clock generator. Now each channel contains its own clock circuit, shown in Fig. 4, which is activated just after a new impulse occurs at the input. As a result, the PD circuit works only during the shaping operation and is turned off after the FLAG signal has been set, i.e. after the peak has been detected and stored.

The proposed clock circuit is power-efficient and features a simple structure. It consists of two blocks: an impulse generator that generates two complementary signals, ck1 and ck2, and a digital 1-bit delay line that is controlled by these signals and generates the three clock phases N1 to N3 that control the PD block. The frequency of the ck1/ck2 signals is controlled by the $V_{ctr3}$ voltage, which charges the capacitor. When the voltage across this capacitor becomes larger than the threshold voltage of the first NOT gate, all gates change their states and the RST signal discharges the capacitor through transistor MN2, which starts a new cycle. When the generator is turned off, i.e. when ENBL=0, the signals at particular points are B=0, D=1, E=0, F=1, G=0, ck1=0, ck2=1, A=0, z1=1, z2=0 and z3=0. Appropriate placement of transistors MN3, MN4 and MP2-MP4 ensures that just after the ENBL signal becomes 1, the clock phase N1 is equal to 1, so the PD circuit can start immediately and always under the same conditions. Each following clock cycle (ck1/ck2) moves the "1" value in the delay line by two NOT gates, generating in turn the clock phases N2 and N3. When both the z1 and the z2 signals become logical 0, i.e. when G=1, the NOR gate generates a new logical 1 at the point A.
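The circulation of the single "1" through the delay line can be sketched behaviourally. The model below is an assumption-laden simplification: it treats each ck1/ck2 cycle as one update step, takes the standby state z1=1, z2=0, z3=0 given above as the starting point, re-injects a "1" through a NOR of z1 and z2, and assumes that the phases N1 to N3 follow z1 to z3 directly. Gate delays and the analog impulse generator are not modelled.

```python
def nor(a, b):
    """Two-input NOR on 0/1 integers."""
    return int(not (a or b))


def three_phase_clock(cycles):
    """Return the (z1, z2, z3) states for the given number of clock cycles."""
    z1, z2, z3 = 1, 0, 0  # state held while ENBL = 0, so N1 starts first
    phases = []
    for _ in range(cycles):
        phases.append((z1, z2, z3))
        # Shift the token by one position; a new '1' enters only after
        # both z1 and z2 have returned to 0.
        z1, z2, z3 = nor(z1, z2), z1, z2
    return phases


for state in three_phase_clock(6):
    print(state)
```

In this model exactly one phase is active in every cycle, and the sequence always starts from the same phase, which is what allows the PD to start under the same conditions for every impulse.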

The clock circuit is synchronized with the ENBL signal, i.e. with the beginning of an impulse. Since the PS ensures the same peaking time independently of the amplitude of the input step function, impulses are always stored at the same moment. This allows for a very high precision of the proposed circuit, better than 99 %.

### Logic control circuit and output S&H

Proper cooperation between particular blocks in the channel is ensured by special control logic circuitry that consists of several building blocks, shown in Fig. 5. When a step function occurs at the channel's input (PULSE), the initialization block sets several signals. The first element is an asymmetrical NOT gate, which flips already for small values of the step function. This element is additionally controlled by the $V_{\rm ctr}$ voltage, which sets the minimal value of the step function that is recognized by the system as useful information.



Fig. 5. Components of the control logic circuitry: (a) initialization block, (b) output sample-and-hold element, (c) two-stage Miller OA, (d) latch circuit [d], (e) flag generation circuit [5]

When the PULSE signal passes the threshold voltage of this gate, the ENBL signal becomes 1. This signal is then latched in the latch circuit, which generates the SVB signal that turns on the operational amplifier (OA) in the sample-and-hold element. The SVB signal turns on the bias voltage $V_{\text{B1}}$ as well as the output stage of the classic two-stage Miller OA, shown in Fig. 5 (c) [7]. The ENBL signal is equal to 1 until the FLAG is set by the flag generation circuit. Once this happens, both the PD and the PS circuits are immediately blocked and zeroed, and the FLAG signal does not allow a new shaping operation. The flag generator may be reset only by an external signal, after the peak's value has been read out. It is worth noting that this reset is allowed only after the input step function has itself been reset. This prevents the situation in which the same step function is processed more than once by a single channel. The SVB signal is maintained by the latch circuit until the information is read out. After that the channel goes to the standby mode and dissipates only very small power, below 0.3 $\mu$W.
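The sequence of events described above can be summarized as a small state machine. This is a behavioural sketch only: the three state names, the class `ChannelControl` and its arguments are hypothetical, the PULSE level is expressed relative to the baseline, and a single threshold stands in for the asymmetrical NOT gate.

```python
from enum import Enum


class State(Enum):
    STANDBY = "standby"   # channel powered down, waiting for an impulse
    SHAPING = "shaping"   # ENBL = 1, PS and PD are running
    FLAGGED = "flagged"   # FLAG = 1, peak stored, PS and PD blocked


class ChannelControl:
    def __init__(self, threshold):
        self.threshold = threshold
        self.state = State.STANDBY

    @property
    def enbl(self):
        return self.state is State.SHAPING

    @property
    def flag(self):
        return self.state is State.FLAGGED

    @property
    def svb(self):
        # The OA stays on from the start of the impulse until readout.
        return self.state is not State.STANDBY

    def step(self, pulse, peak_found=False, reset=False):
        """Advance the control logic by one event and return the new state."""
        if self.state is State.STANDBY and pulse > self.threshold:
            self.state = State.SHAPING
        elif self.state is State.SHAPING and peak_found:
            self.state = State.FLAGGED
        elif self.state is State.FLAGGED and reset and pulse <= self.threshold:
            # The external reset is honoured only after the input step
            # itself has returned to the baseline.
            self.state = State.STANDBY
        return self.state


channel = ChannelControl(threshold=0.1)
print(channel.step(pulse=0.5))                   # impulse arrives
print(channel.step(pulse=0.5, peak_found=True))  # peak detected
print(channel.step(pulse=0.5, reset=True))       # reset ignored
print(channel.step(pulse=0.0, reset=True))       # reset accepted
```

The third call shows why the same step cannot be processed twice: while the input is still above the threshold, the reset request leaves the channel flagged.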

### Channel

The particular components of the channel have been described in detail in the previous subsections. It is also worth noting that all analog blocks in the channel, i.e. the PD, PS and S&H, share the same baseline voltage $V_{\rm BL}$, which is needed because the transistors in the current-mode delay elements of the PD circuit must be biased with a DC value. Adding a DC voltage to the $y_{\rm PS}$ signal enlarges the range of input amplitudes that can be recognized by the system as useful information.

The operation of a single channel is illustrated in Fig. 6 with postlayout Hspice simulations. The particular signals shown in this figure have been explained earlier and are marked in the schematics presented in Figs. 2-5.

Note that the RESET signal is effective only when the input PULSE signal is reset to the baseline voltage.



Fig. 6: Postlayout simulation results of a single channel: (top) analog useful signals (middle) clock generator in PD (bottom) logic control signals



Fig. 7. Accuracy of a single channel (PS +PD) (top) voltage in the point MEM vs. the PULSE signal together with ideal characteristic, (bottom) error between the simulated characteristic and this ideal case

The accuracy of a single channel, i.e. between the PS input and the PD output, has been verified for different amplitudes of the input signal. The results are shown in Fig. 7. The error stays within 0.5 % over a wide range of amplitudes and exceeds 2 % only for small values. This is because for small amplitudes the PD circuit is too slow and stores a value from the falling edge.

## IMPLEMENTATION OF ENTIRE READOUT SYSTEM

The experimental chip contains 8 channels working in parallel, connected to an asynchronous multiplexer. Two concepts of such a multiplexer have been proposed earlier [8, 9], and both have been included in this experimental chip for verification. In this paper only one solution is presented [8], but the chip works properly in both cases. The general idea of this circuit is that only one channel can be connected to the output at a given moment, as otherwise the information from two or more channels would overlap. The problem here is that pulses at the inputs of particular channels can occur randomly and sometimes at the same time. The multiplexer must be able to detect the channel that became active and, when two or more channels are active at the same time, must hold the information in the other channels to enable serial readout. This multiplexer is based on the binary tree concept shown in Fig. 8 (a). The main block in this circuit is the active channel detection block (ACDB), presented in Fig. 8 (b), which groups two input signals and generates three output signals. One of them (F) goes to the next layer in the tree, while the two others (X1 and X2) are used to generate the address (see Fig. 8 c) of the channel that will be read out.



Fig. 8. Synchronized MUX used in experimental chip to detect the active channel: (a) general binary tree structure, (b) a single active channel detection block (ACDB) (c) circuit that generates address of the active channel, (d) internal clock



Fig. 9. Layout of the experimental chip with 8 channels. The die area of the 8 channels and both multiplexers is equal to 450 × 380 µm

The ACDB is controlled by an internal clock generator, which works on a similar principle to the clock generator shown in Fig. 4. This circuit is activated only when one of the input signals (F1 or F2) is active and the clk1 or the clk2 signal does not indicate it. In this situation the EN signal becomes "1", switching over a DFF. As a result, the EN signal becomes "0", which deactivates the clock. When both inputs are "1" at the same time, the DFF already indicates one of these signals and the clock generator does not need to be activated. After the associated channel has been read out, this input signal is reset and the EN signal becomes "1", switching over the DFF to point at the second input. Details of this circuit and the simulation results have been reported in [6].
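The arbitration principle of the binary tree can be sketched behaviourally as follows. This is a simplified model, not the reported circuit: each ACDB is reduced to one pointer bit (standing in for the DFF) that toggles only when the input it points at is inactive while the other one is active, the internal clock is not modelled, and the address is read by following the pointers from the root of the tree down to a channel. All names are hypothetical.

```python
class ACDB:
    """Active channel detection block: two request inputs and a pointer bit."""

    def __init__(self):
        self.pointer = 0  # which of the two inputs is currently indicated

    def update(self, f1, f2):
        inputs = (f1, f2)
        # Toggle only if a request exists and the pointer misses it.
        if (f1 or f2) and not inputs[self.pointer]:
            self.pointer ^= 1
        return int(f1 or f2)  # request (F) passed to the next layer


def build_tree(n_channels):
    """Layers of ACDBs for a power-of-two number of channels."""
    layers, width = [], n_channels // 2
    while width >= 1:
        layers.append([ACDB() for _ in range(width)])
        width //= 2
    return layers


def select_channel(flags, layers):
    """Return the address of the channel to read out, or None if idle."""
    level = list(flags)
    for layer in layers:
        level = [node.update(level[2 * i], level[2 * i + 1])
                 for i, node in enumerate(layer)]
    if not level[0]:
        return None
    index = 0
    for layer in reversed(layers):  # walk from the root down to a leaf
        index = 2 * index + layer[index].pointer
    return index


# Worst case: all 8 channels request readout at the same time.
tree = build_tree(8)
flags = [1] * 8
order = []
channel = select_channel(flags, tree)
while channel is not None:
    order.append(channel)
    flags[channel] = 0  # readout followed by reset of that channel
    channel = select_channel(flags, tree)
print(order)
```

Each channel is served exactly once, one at a time, and a pointer changes only when its currently indicated input has been reset, which corresponds to the clock being activated only when needed.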

The layout of the designed system is shown in Fig. 9. The experimental chip consists of 8 channels connected to the MUX and one additional channel that can be tested in detail. In this channel different internal signals are additionally available.

Selected postlayout simulation results are presented in Fig. 10 for the worst-case scenario, in which all channels become active at the same time. After being read out, particular channels are reset and the multiplexer immediately generates the address of the next active channel. The current consumption is shown in the last panel. As particular channels are subsequently reset, the supply current $I_{DD}$ decreases, and finally, when the entire system is in the standby mode, it is at the level of below 1 $\mu$A only.




# CONCLUSIONS

An experimental front-end ASIC for medical imaging, designed in a CMOS 0.18 $\mu$m process, has been described in this paper. The designed system is based on many new circuit solutions that allow for a very small chip area and very low power dissipation with a very efficient power-down mode. The chip area of 8 channels and two multiplexers is equal to 0.17 mm<sup>2</sup>, which means that in the case of e.g. 128 channels a single chip will occupy an area of only about 3 mm<sup>2</sup>. The power dissipation in the worst-case scenario, with all 8 channels active at the same time, is equal to only 2 mW. An effective reset function used in the PS and PD blocks simplifies the circuit's structure and allows the counting rate of a single channel to reach a level of more than 5 MHz.
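The area figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the area scales linearly with the number of channels, i.e. that the multiplexer grows in proportion to the channels and that pads and routing are neglected; the function name is hypothetical.

```python
def scaled_area_mm2(measured_area_mm2, measured_channels, target_channels):
    """Linear extrapolation of chip area to a different channel count."""
    return measured_area_mm2 / measured_channels * target_channels


# Die dimensions reported for the 8 channels and both multiplexers.
die_area_mm2 = 0.450 * 0.380           # 450 um x 380 um expressed in mm
area_128_mm2 = scaled_area_mm2(0.17, 8, 128)

print(f"die area:            {die_area_mm2:.3f} mm^2")
print(f"128-channel estimate: {area_128_mm2:.2f} mm^2")
```

Both results agree with the rounded values given in the text: about 0.17 mm<sup>2</sup> for the experimental chip and about 3 mm<sup>2</sup> for a 128-channel version.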

## **THE AUTHOR**

Dr. Rafał Długosz

University of Neuchâtel, Institute of Microtechnology, Rue A.-L. Breguet 2, CH-2000, Neuchâtel, Switzerland; University of Alberta, Department of Electrical and Computer Engineering, 114 St – 89 Ave, Edmonton Alberta, T6G 2V4, Canada, <u>rdlugosz@ualberta.ca</u>; *fellow of the MC EU Outgoing International Fellowship* 

### REFERENCES

- [1] G. Knoll, *Radiation Detection and Measurement*, Wiley, 2000.
- [2] G. De Geronimo, P. O'Connor, and J. Grosholz, "A generation of CMOS readout ASIC's for CZT detectors", *IEEE Transactions on Nuclear Science*, vol. 47, pp. 1857–1867, Dec. 2000.
- [3] F. Anghinolfi, W. Dabrowski, E. Delagnes, J. Kaplon, U. Koetz, P. Jarron, F. Lugiez, C. Posch, S. Roe, P. Weilhammer, "CTA-a rad-hard BiCMOS Analogue Readout ASIC for the ATLAS Semiconductor Tracker", *IEEE Nuclear Science Symposium*, vol. 1, Nov. 1996, pp. 46–50.
- [4] H. Spieler, *Semiconductor Detector Systems*, Oxford University Press.
- [5] G. De Geronimo, A. Kandasamy, P. O'Connor, "Analog peak detector and derandomizer for high-rate spectroscopy", *IEEE Transactions on Nuclear Science*, vol. 49, Issue 4, Part 1, August 2002, pp. 1769–1773.
- [6] R. Długosz, K. Iniewski, "High precision analog peak detector for X-ray imaging applications", *Electronics Letters*, 2007.
- [7] A. Dąbrowski, R. Długosz, P. Pawłowski, "Design and Optimization of Operational Amplifiers for FIR SC GSM Channel Filter Realized in CMOS 0.8 μm technology", *Signal Processing Workshop*, Poznan, Poland, 2004.
- [8] R. Długosz, K. Iniewski, "Synchronous and Asynchronous Multiplexer Circuits for Medical Imaging Realized in CMOS 0.18um Technology", *SPIE International Symposium on Microtechnologies for the New Millennium*, Gran Canaria, Spain, May 2007.
- [9] R. Długosz, K. Iniewski, "Hierarchical Asynchronous Multiplexer for Readout front-end ASIC for multi-element detectors in medical imaging", *International Conference Mixed Design of Integrated Circuits and Systems (MIXDES)*, June 2007, Poland.