Abstract: This paper presents an area- and power-efficient application-specific integrated circuit (ASIC) for 3-D forward-looking intravascular ultrasound imaging. The ASIC is intended to be mounted at the tip of a catheter, and has a circular active area with a diameter of 1.5 mm on the top of which a 2-D array of piezoelectric transducer elements is integrated. It requires only four micro-coaxial cables to interface 64 receive (RX) elements and 16 transmit (TX) elements with an imaging system. To do so, it routes high-voltage (HV) pulses generated by the system to selected TX elements using compact HV switch circuits, digitizes the resulting echo signal received by a selected RX element locally, and employs an energy-efficient load-modulation datalink to return the digitized echo signal to the system in a robust manner. A multi-functional command line provides the required sampling clock, configuration data, and supply voltage for the HV switches. The ASIC has been realized in a 0.18-µm HV CMOS technology and consumes only 9.1 mW. Electrical measurements show 28-V HV switching and RX digitization with a 16-MHz bandwidth and 53-dB dynamic range. Acoustical measurements demonstrate successful pulse transmission and reception. Finally, a 3-D ultrasound image of a three-needle phantom is generated to demonstrate the imaging capability.
I. INTRODUCTION

Coronary artery disease is caused by atherosclerosis of the coronary arteries of the heart [1]. It has become one of the most common causes of death worldwide [2]. Intravascular ultrasound (IVUS) imaging, using an ultrasound transducer mounted at the tip of a catheter, is an important tool for the visualization, diagnosis, and treatment of atherosclerosis [3].
Conventional IVUS catheters are side-looking (SL) devices that provide a 2-D cross-sectional image of the vessel wall. An ultrasound transducer mounted at the tip of the catheter is excited by high-voltage (HV) pulses to generate an acoustic pulse, and the resulting echo signals are processed to form the image. IVUS catheters employ either a mechanically rotating single-element transducer [4] , or 64 elements folded around the tip of the catheter [5] , [6] . Mechanical rotation is complex to implement, and relatively slow, leading to motion artifacts in the image [7] , while the use of a transducer array comes with an electrical interconnect challenge due to the limited number of cables that can be accommodated in a catheter shaft.
A severe case of coronary artery disease is a chronic total occlusion, a condition in which atherosclerotic plaque completely blocks the vessel. Successful recanalization of such occlusions using guidewires is associated with improved left ventricular function and reduced mortality [8], [9]. For this type of lesion, forward-looking (FL) imaging is required, since imaging ahead of the catheter tip can help to distinguish between plaque and normal vessel wall during the crossing procedure, hence reducing the risk of dissection and vessel perforation [3].
Early implementations of FL-IVUS catheters are based on the scanning motion of an FL single-element transducer [10], [11] or a rotating single element oriented at a 45° angle [12]. Both implementations require rotation and multiple acquisitions to construct a 2-D image, and are sensitive to motion artifacts. In order to achieve real-time FL 3-D imaging without rotation, a circular transducer array [13], [14] or a 2-D transducer array [15] can be placed at the tip of the catheter. However, connecting the resulting relatively large number of elements (50-100) using micro-coaxial cables within the catheter diameter of <2 mm is extremely challenging.
Application-specific integrated circuits (ASICs) for FL-IVUS have been reported that employ pulsers and multiplexers to reduce the number of cables, but these still require at least 13 connections [13], [14], [16]. Moreover, they communicate the received echo signals to the imaging system in analog form, which is relatively susceptible to interference and not amenable to digital multiplexing or data-reduction approaches. The use of wireless data transmission has also been proposed to reduce the cable count [17], but the integration of the required antenna on a catheter is challenging, and successful wireless operation on a catheter is yet to be demonstrated.
This paper presents a front-end ASIC that requires only four micro-coaxial cables to interface with a total of 80 piezoelectric transducer elements fabricated on the top of the ASIC: 64 receive (RX) elements and 16 transmit (TX) elements [18]. In contrast with prior work, the ASIC digitizes the received echo signals locally, allowing their transmission to an external imaging system in a robust form, and demonstrating the feasibility of in-probe digitization within the stringent size, power, and interconnect constraints of an FL-IVUS probe. The ASIC has been designed for FL-IVUS, but the presented approaches to in-probe digitization, cable-count reduction, and HV switching are equally applicable in an SL-IVUS probe, or in other miniature probes, such as intra-cardiac echography (ICE) catheters.
Fig. 1 illustrates conceptually how the ASIC will be mounted at the tip of a catheter. The ASIC will be laser cut into a donut shape, with an outer diameter of 1.5 mm and an inner hole of 0.5 mm for the guide wire. The ASIC enables synthetic aperture imaging, in which acoustic pulses are transmitted using one or multiple TX elements, and the resulting received echoes are recorded by one RX element at a time.
Expanding on our earlier publication [18] , which reports preliminary results obtained using a test transducer array that is connected to the ASIC using wire bonds, this paper presents a detailed description of the design and new experimental results obtained with transducer elements integrated on the top of the ASIC. This paper is organized as follows. Section II describes the proposed system architecture. Section III discusses the details of the circuit implementation. Sections IV and V present the experimental results and conclusions.
II. SYSTEM ARCHITECTURE
A. Transducer Array
FL-IVUS probes have been reported that apply a ring-shaped transducer array based on capacitive micromachined ultrasonic transducers (CMUTs) [13], [14] or a 2-D matrix array based on piezoelectric (PZT) transducers [15]. These designs either have a relatively low signal-to-noise ratio (SNR) or require a large number of cables to be integrated. To overcome these limitations, we have developed a PZT-based matrix transducer with a total of 80 elements that needs only four coaxial cables to be integrated inside a catheter. The transducer elements for transmit and receive are separated (16 for TX and 64 for RX), as illustrated in Fig. 1. The transducer array built on the top of the ASIC uses the approach described in [19].
We employ 80 μm × 80 μm transducer elements with a pitch of 100 μm. The center frequency of these elements is 13 MHz. This is lower than the frequencies typically used in SL-IVUS devices, which operate at 20 MHz or above [5], [6]. The chosen frequency is comparable to earlier FL-IVUS designs, e.g., [14], [20], and trades off resolution for the larger penetration depth required in FL imaging. The −3-dB bandwidth of the elements is 44% (10-16 MHz). The impedance characteristic of the transducer elements has been simulated using finite-element analysis software (PZFlex LLC, Cupertino, CA, USA), as shown in Fig. 2. This simulation includes the whole transducer stack (PZT, matching layer, ASIC, glue, ground foil). Since the element thickness of 80 μm is close to the wavelength (100 μm), various vibration modes occur in different layers, resulting in fluctuations of the impedance as a function of the frequency. To obtain a model suitable for circuit simulation, we approximate the electrical impedance around resonance by a Butterworth-van Dyke lumped-element model (also shown in Fig. 2), which captures the main resonance mode and the element's electrical capacitance. The parameters of this model are obtained by means of least-squares curve fitting on the simulated impedance data. The transducer's impedance at resonance is approximately 0.7 pF in parallel with 5 kΩ. The elements have a measured transmit efficiency of 0.4 kPa/V at 6 mm from the transducer. Their measured receive sensitivity is around 4 μV/Pa. To generate sufficient acoustic pressure to be able to detect the low backscattered signal from blood [21] at an imaging depth suitable for FL-IVUS, transmit voltages on the order of 30 V are required. Table I summarizes the key parameters of the applied transducer.
While a matrix of transducer elements covering the full ASIC would be best for imaging, we leave out elements to make space for five bond pads, which provide electrical connections (one of which is a ground) to four micro-coaxial cables. In the prototype presented in this paper, the ASIC is connected using wire bonds on the same side as the transducer elements, but these can be replaced in the future by through-silicon vias to realize more convenient connections on the back side of the ASIC. Elements are also omitted in the center of the ASIC to make room for the catheter's guide wire.
The 80 elements are divided into 16 transmit (TX) elements and 64 receive (RX) elements. The transmit elements are located around the guide wire hole, while the receive elements cover a larger aperture and determine the lateral imaging resolution. This partitioning allows the receive circuitry to be implemented using compact low-voltage circuitry, while the number of HV circuits associated with the transmit elements, which occupy a relatively large die area, is kept limited.
B. ASIC Architecture
Fig. 3 shows the top-level architecture of the proposed ASIC. The ASIC consists of three main parts: 1) HV switches; 2) a receive signal chain; and 3) a clock and data recovery circuit.
The HV switches are responsible for exciting the 16 TX elements using HV pulses in order to generate enough acoustic pressure. Several implementations of on-chip HV pulsers have been reported that can serve this purpose [22], [23]. Pulsers employing a resistive pull-up to an HV supply [13] are simple and area efficient, but are relatively power hungry due to the current flowing through the pull-up resistor. A push-pull architecture [22] is more power efficient, but requires more HV transistors to implement a level shifter to control the pull-up transistor and thus occupies more die area. An alternative to using on-chip pulsers is to use on-chip switches to route an externally generated HV signal (HV TX PULSE in Fig. 3) to selected transducer elements [24]. This approach reduces the on-chip power dissipation and allows the use of arbitrary transmit waveforms. We present an area-efficient HV switch implementation, by means of which one or more of the TX elements can be connected via a TX cable to the imaging system. This allows for the implementation of a synthetic-aperture transmit scheme, in which each of the 16 TX elements is pulsed in succession, or, alternatively, a plane-wave transmit scheme, in which multiple TX elements are pulsed simultaneously.
The receive signal chain is responsible for transferring the received echo signals to the imaging system. In contrast with prior analog approaches [13], [14], we digitize the echo signals locally. In order to do so, we connect one of the 64 RX elements via an analog multiplexer and an analog front-end (AFE) circuit to an ADC. This allows for the implementation of a synthetic-aperture receive scheme, in which the echo signals received by the 64 RX elements are digitized in 64 successive pulse-echo sequences. Combined with the synthetic-aperture transmit scheme, a complete synthetic-aperture image can be obtained in 1024 (16 × 64) sequences. The estimated volume frame rate is on the order of 100 fps for an imaging depth of 8 mm. While this is sufficient for our application, higher frame rates could be achieved by using multiple AFEs and ADCs and less multiplexing. This, however, comes with the challenges of increased die size and output data rates.
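The frame-rate estimate above can be checked with a short calculation. The sketch below assumes a speed of sound of 1540 m/s (a typical soft-tissue value that is not stated in the text) and neglects any dead time between pulse-echo sequences, so it gives an upper bound rather than the exact rate of the system.

```python
# Rough volume-rate estimate for a full synthetic-aperture acquisition.
SPEED_OF_SOUND_M_S = 1540.0   # assumption: typical soft-tissue value
IMAGING_DEPTH_M = 8e-3        # imaging depth quoted in the text
N_TX, N_RX = 16, 64           # TX and RX element counts from the text


def volume_rate_fps(depth_m=IMAGING_DEPTH_M, c=SPEED_OF_SOUND_M_S,
                    n_tx=N_TX, n_rx=N_RX):
    """Volumes per second when every TX/RX pair needs one pulse-echo sequence."""
    round_trip_s = 2.0 * depth_m / c          # time for the echo from full depth
    return 1.0 / (n_tx * n_rx * round_trip_s)


print(f"{volume_rate_fps():.0f} volumes/s")
```

Under these assumptions the result is a little below 100 volumes per second, consistent with the order of magnitude quoted above.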
The AFE circuit amplifies the echo signal with a programmable gain to match it to the input range of the ADC. To detect the low backscattered signal from blood, an ultrasound system requires a dynamic range (DR) of at least 70 dB [21]. The DR of the receive signal chain can be lower than this, because the beamforming operation that combines the 16 × 64 receive signals to form the image enhances the DR. In the case of full synthetic phased array imaging, the DR will increase by a factor of √(16 · 64) = 32, or 30 dB, while other imaging schemes can provide even more [25]. To reach an overall DR of 80 dB, we aim for a DR of 50 dB per channel. Considering this DR requirement and the 13-MHz transducer center frequency, a 60-MS/s 10-bit SAR ADC is adopted. The ADC's output data is transmitted serially to the imaging system using a load-modulation-based data link (RX DATA).
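The DR budget described above can be reproduced numerically. The following is a minimal sketch that assumes the noise in the individual receive signals is uncorrelated, so that coherent summation improves the amplitude signal-to-noise ratio by the square root of the number of signals; the function name is illustrative only.

```python
import math


def beamforming_gain_db(n_tx, n_rx):
    """DR improvement from combining n_tx * n_rx signals with uncorrelated noise."""
    return 20.0 * math.log10(math.sqrt(n_tx * n_rx))


gain_db = beamforming_gain_db(16, 64)
per_channel_dr_db = 50.0                      # per-channel target from the text
print(f"gain = {gain_db:.1f} dB, overall DR = {per_channel_dr_db + gain_db:.1f} dB")
```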
The clock and data recovery circuit is responsible for extracting the ADC sampling clock and the pulsewidth-encoded configuration data, including the switch and gain settings, from a COMMAND signal generated by the imaging system during the RX phase. This same signal is also used to distinguish between the TX and the RX phases. During TX, the line is pulled to 5 V by the system, providing a supply voltage for the HV transmit circuit, as will be discussed in Section III-D.
In addition to the three mentioned cables (HV TX PULSE, RX DATA, and COMMAND), a fourth cable provides the ASIC with a supply voltage (V DD ) of 1.9 V.
III. CIRCUIT IMPLEMENTATION
A. Analog Front End
Fig. 4(a) shows the circuit diagram of the AFE, which converts the signal current of the selected RX element to a differential input voltage for the ADC. It consists of a trans-impedance amplifier (TIA), a buffer stage (BUF), a programmable-gain amplifier (PGA), and an ADC driver. Note that the multiplexer at the input has been omitted in Fig. 4(a) for simplicity.
The TIA is used to amplify the signal current I in produced by the small PZT elements. Considering, as shown in Fig. 2, the relatively high source resistance (∼5 kΩ) and the small capacitance (∼0.7 pF), a TIA is a power-efficient circuit topology. We opt for a single-ended TIA in view of the inherently single-ended nature of the transducer elements, due to the common ground electrode shared by all elements. A fully differential implementation would require more area, while the benefit of lower harmonic distortion is not critical in our application, since harmonics are out-of-band and will be filtered out. The TIA senses the signal current I in and converts it into an output voltage. A feedback resistor of 115 kΩ sets the trans-impedance gain, while a feedback capacitor of 77 fF ensures stability and sets the −3-dB bandwidth to 18 MHz.
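The quoted TIA bandwidth follows from the feedback network alone if the OTA is treated as ideal. The sketch below makes that assumption (a single pole set by the feedback resistor and capacitor); the helper name is illustrative.

```python
import math


def rc_bandwidth_hz(r_ohm, c_farad):
    """-3-dB frequency of a single pole formed by R and C."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


R_F = 115e3      # feedback resistor from the text
C_F = 77e-15     # feedback capacitor from the text
print(f"TIA bandwidth ~ {rc_bandwidth_hz(R_F, C_F) / 1e6:.1f} MHz")
```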
To make the dc biasing of the TIA's virtual ground independent of that of the transducer, a coupling capacitor C ac of 5.6 pF is used. This allows the transducer to be shorted to ground during the TX phase and connected to the TIA during the RX phase, without voltage transients that lead to parasitic acoustic transmissions. During the TX phase, the TIA's feedback network is shorted, thus keeping the amplifier in unity-gain feedback and preventing parasitic on-chip coupling of the HV transmit pulse from affecting the receive path.
A current-reuse operational transconductance amplifier (OTA) is employed to increase the power efficiency of the TIA, as shown in Fig. 4(b) [26] . The applied differential architecture allows the use of a tail-current source that isolates the signal-path from interfering signals superimposed on the supply voltage. The OTA is biased at 220 μA to ensure that the TIA's bandwidth covers the transducer's bandwidth and that the input-referred noise of the receive signal chain is dominated by the transducer (whose in-band rms noise is around 4.4 nA) rather than the TIA. The OTA's non-inverting input is biased at a mid-supply reference voltage V cm , using a resistive divider (not shown).
The PGA is implemented as an inverting amplifier with a switchable capacitive feedback network, which provides a gain of 6, 12, or 18 dB. This programmability allows the amplitude of the received signal to be adjusted to the input range of the ADC. The PGA has a constant input capacitance of 616 fF. The programmable gain is realized by switching the feedback capacitors, from 308 fF at the lowest gain setting to 77 fF at the highest gain setting. The architecture of the OTA is similar to that used in the TIA. It is biased at 220 μA for gain accuracy and linearity. In order to prevent the PGA's input capacitance (616 fF) from loading the TIA, a flipped voltage follower (BUF) biased at 110 μA [27] , shown in Fig. 4(c) , is employed as a buffer stage.
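The PGA gain settings follow from the ratio of the input capacitance to the selected feedback capacitance. In the sketch below, the 308-fF and 77-fF values are taken from the text, while the 154-fF value for the middle setting is inferred from the 12-dB gain and is not stated explicitly; ideal closed-loop behavior is assumed.

```python
import math

C_IN_F = 616e-15                                  # PGA input capacitance (text)
C_FB_F = {"low": 308e-15, "mid": 154e-15, "high": 77e-15}   # "mid" is inferred


def pga_gain_db(c_fb_farad, c_in_farad=C_IN_F):
    """Magnitude of the ideal inverting gain C_in / C_fb, expressed in dB."""
    return 20.0 * math.log10(c_in_farad / c_fb_farad)


for name, c_fb in C_FB_F.items():
    print(f"{name}: {pga_gain_db(c_fb):.1f} dB")
```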
Finally, a fully differential amplifier is implemented to convert the single-ended voltage to a differential voltage and drive the ADC's input sampling capacitors of 2 pF. To relax the bandwidth requirements on this amplifier, as will be discussed below, the ADC employs two pairs of time-interleaved sampling capacitors, so that one sampling clock period, i.e., roughly 16.6 ns, is available for settling, rather than only a fraction of the sampling period. This still requires a bandwidth of 66 MHz, which is realized in a power-efficient way by adopting a modest gain of 2 and using a fully differential current-reuse OTA, shown in Fig. 4(d), biased at 400 μA. This OTA provides a differential output voltage to the ADC with a peak-to-peak range of 1.2 V.
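One way to arrive at a bandwidth requirement of this size is a single-pole settling estimate. The sketch below assumes that the step error must decay to one part in 2^10 within one sampling period; this criterion is an assumption made for illustration, not a statement of how the 66-MHz figure was derived, although it yields a similar number.

```python
import math


def settling_bandwidth_hz(t_settle_s, n_bits):
    """-3-dB bandwidth of a single-pole stage whose step error decays
    to 2**-n_bits of the step within t_settle_s."""
    n_tau = n_bits * math.log(2.0)        # required number of time constants
    tau_s = t_settle_s / n_tau
    return 1.0 / (2.0 * math.pi * tau_s)


T_SAMPLE_S = 1.0 / 60e6                   # one 60-MS/s sampling period
print(f"required bandwidth ~ {settling_bandwidth_hz(T_SAMPLE_S, 10) / 1e6:.0f} MHz")
```

If only a fraction of the period were available for settling, as would be the case without time interleaving, the same estimate scales up in inverse proportion to that fraction.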
Like the TIA, the PGA and the ADC driver are both switched into unity-gain feedback during the TX phase. This blocks interference from the TX signal and provides a well-defined dc biasing for the capacitive feedback networks. The complete receive path consumes approximately 950 μA and provides an overall trans-impedance gain of 112, 118, or 124 dB for the three PGA gain settings.
B. ADC
Local digitization of the received echo signals helps to provide a robust digital output. Pipelined ADCs and delta-sigma ADCs have been reported for ultrasound applications with suitable power consumption levels [28]-[30]. However, these designs require a die area that exceeds what is available in our application [28], [29], or are implemented in advanced technology nodes that do not offer the HV devices required in our design for the transmit switches [29]. Considering the stringent power and area requirements, and the relatively modest DR requirement, an SAR ADC is a good choice for our application in 0.18-μm HV CMOS technology. A 60-MS/s 10-bit SAR ADC is employed, which can be implemented in a power-efficient manner and is well matched to the 50-dB DR. Moreover, it allows asynchronous operation, which eliminates the need for an external oversampled clock and thus reduces the system complexity.
In a conventional charge-redistribution SAR ADC, the voltage on the capacitive DAC (CDAC) needs to settle to the required accuracy (errors within ±0.5 LSB) before the comparator makes a decision. At 60 MS/s, the 16.6-ns sample period divided over 10 bit decisions leaves less than 1.7 ns per bit for the CDAC to settle to the reference voltage and for the comparator and SAR logic to finish the decision. Allocating more of this time to CDAC settling would reduce the power consumption of the reference buffer, but would require faster and more power-hungry digital circuitry (comparator and SAR logic).
A 10-bit charge-sharing SAR ADC, in contrast, can be implemented using only 67 unit capacitors, and relaxes the settling requirements of the reference buffer [31]. Therefore, we employ a charge-sharing SAR ADC in this paper. Fig. 5 shows a simplified circuit diagram of this SAR ADC. The differential analog input signal is first sampled on the sampling capacitors C s . In the meantime, the CDAC is pre-charged by an on-chip reference buffer to generate nine binary-scaled reference charges, which will be used to quantize the signal charge. To determine the most significant output bit, the comparator evaluates the polarity of the voltage on C s . Based on this, the largest CDAC capacitor is connected either in parallel (CP) or anti-parallel (CN) to C s , causing the associated reference charge to be added to or subtracted from the charge on C s . The comparator then again evaluates the polarity of the voltage on C s to determine the next bit. This successive approximation process continues until all 10 bits have been determined, and is controlled by the SAR logic, whose differential outputs CP1-CP9 and CN1-CN9 are used to control the switching of the CDAC. By using sampling capacitors C s of the same size as the total CDAC capacitance and a reference voltage of 0.3 V, the desired differential input voltage range of 1.2 V is obtained [31].
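The successive-approximation procedure described above can be illustrated with a behavioral model that works purely in the charge domain. This is a simplified sketch, not the circuit of Fig. 5: charges are expressed in units of one unit-capacitor reference charge, the voltage attenuation caused by connecting additional capacitors is ignored (it does not change the polarity seen by the comparator), and noise and mismatch are omitted. The function names are illustrative.

```python
# Nine binary-scaled reference charges, MSB first, in unit reference charges.
REF_CHARGES = [32, 16, 8, 4, 2, 1, 0.5, 0.25, 0.125]


def sar_convert(q_in):
    """Return the 10-bit output code for a signal charge q_in in (-64, +64)."""
    q = float(q_in)
    code = 0
    for q_ref in REF_CHARGES:
        bit = 1 if q >= 0 else 0            # comparator evaluates the polarity
        code = (code << 1) | bit
        q += -q_ref if bit else q_ref       # subtract or add the reference charge
    # The tenth bit is the polarity of the remaining charge.
    return (code << 1) | (1 if q >= 0 else 0)


def code_to_charge(code):
    """Mid-point reconstruction of the input charge from a 10-bit code."""
    return code / 8.0 - 63.9375


for q_test in (-63.9, -10.3, 0.0, 17.77, 63.9):
    c = sar_convert(q_test)
    print(q_test, c, code_to_charge(c))
```

In this model one LSB corresponds to 0.125 unit charges, and the reconstruction error stays within half an LSB for inputs inside the full-scale range.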
The implemented CDAC consists of 67 unit capacitors (not shown in detail in Fig. 5), 63 of which are pre-charged to the reference voltage to create the six most-significant binary-scaled reference charges (32:16:8:4:2:1). Of the remaining four unit capacitors, one is pre-charged to the reference voltage, after which successive charge redistribution with the other three is used to create the three least-significant reference charges (0.5:0.25:0.125) [31]. A unit capacitor C u of 32 fF is used to meet the 10-bit matching requirement, taking into account both matching between the unit capacitors and the effect of parasitic capacitance of the switches controlled by CP and CN shown in Fig. 5. This leads to a total CDAC capacitance of about 2 pF. As noted, the sampling capacitors C s are of the same size. The associated kT/C noise is well below the LSB step of the ADC.
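The capacitor sizing and the kT/C statement above can be checked with a few lines of arithmetic. The sketch assumes a temperature of 300 K and treats the two 2-pF sampling capacitors as independent noise sources; both are simplifying assumptions for illustration.

```python
import math

K_B = 1.380649e-23            # Boltzmann constant, J/K
C_UNIT_F = 32e-15             # unit capacitor from the text
C_SAMPLE_F = 2e-12            # sampling capacitor from the text
V_RANGE_PP = 1.2              # differential input range from the text


def ktc_noise_vrms(c_farad, temp_k=300.0):
    """RMS sampled thermal noise voltage on a capacitor."""
    return math.sqrt(K_B * temp_k / c_farad)


c_dac_f = 67 * C_UNIT_F
lsb_v = V_RANGE_PP / 2**10
noise_diff_v = math.sqrt(2.0) * ktc_noise_vrms(C_SAMPLE_F)

print(f"CDAC total: {c_dac_f * 1e12:.2f} pF")
print(f"LSB: {lsb_v * 1e3:.2f} mV, differential kT/C noise: {noise_diff_v * 1e6:.0f} uV rms")
```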
A self-timed architecture is implemented for the SAR logic to avoid the need for an oversampled clock. Via the COMMAND line, an external 60-MHz clock is used to trigger the start of the conversion. The comparator clock is generated by asynchronous SAR logic. In order to increase the linearity of the input sampling switches, the drive signal of these NMOS switches is boosted to achieve an over-drive voltage of ∼1.3 V [32]. The power consumption of the ADC is fully dynamic and amounts to 2.8 mW. To reduce the power consumption of the ADC driver, a two-path time-interleaved sampling scheme is employed, which relaxes the driver's bandwidth requirement. The associated schematic and timing diagram are shown in Fig. 7. In every clock cycle, one pair of sampling capacitors is connected to the ADC driver, making a full 16.6 ns available for settling, while the charge stored on the other pair is digitized.
C. Load-Modulation Datalink
To transmit the ADC's 10-bit output code through a single cable, a conventional implementation would serialize the 10-bit code using an oversampled clock generated on-chip with a power-hungry phase-locked loop (PLL) or delay-locked loop (DLL). Instead, to simplify the circuitry and reduce the on-chip power dissipation, the ADC output code is transmitted asynchronously by means of load modulation. Fig. 8 shows the principle. The comparator's asynchronous differential outputs COMP and COMN represent the output code. Two differently sized NMOS load-modulation switches are driven by the comparator outputs, resulting in three different load resistances. Together with a 50-Ω pull-up resistor on the system side, this forms a resistive divider on which a three-level waveform can be observed, from which the ADC bits can be recovered. The supply voltage to which the pull-up resistor is connected is a tradeoff between signal swing and power consumption. A voltage of 0.7 V was chosen experimentally (see Section IV-A), which leads to an on-chip power consumption associated with the data transmission of about 2.7 mW, comparable to the power consumption of the remaining building blocks.
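The three-level signaling can be illustrated with a simple resistive-divider model. The pull-up supply and resistor values below are taken from the text, but the two switch on-resistances and the assignment of each switch to a comparator output are hypothetical placeholders chosen only to make the example concrete; cable effects and switching transients are ignored.

```python
V_PULL = 0.7        # pull-up supply on the system side (from the text)
R_PULL = 50.0       # pull-up resistor on the system side (from the text)
# Hypothetical on-resistances of the two differently sized NMOS switches.
R_ON = {"COMP": 50.0, "COMN": 150.0}
STATES = (None, "COMP", "COMN")   # None: both switches off


def line_voltage(active=None):
    """Static voltage on the RX DATA line for a given active switch."""
    if active is None:
        return V_PULL                       # no load current flows
    r_on = R_ON[active]
    return V_PULL * r_on / (r_on + R_PULL)  # resistive divider


def classify(v_sample):
    """Map a sampled line voltage to the nearest of the three states."""
    return min(STATES, key=lambda s: abs(line_voltage(s) - v_sample))


for state in STATES:
    print(state, round(line_voltage(state), 3))
```

With these placeholder values the three levels are well separated, so a nearest-level decision recovers which comparator output was active.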
D. High-Voltage Bootstrapped Switch
The HV switches serve to pass a transmit pulse generated by the imaging system to selected transducer elements during the TX phase. Earlier implementations of HV switches employ either level-shift circuits [22] , [23] or bootstrapped capacitors [24] to provide overdrive to turn on an HV MOS transistor. A level-shift circuit translates a low-voltage switch-control signal to the voltage level at which the switch operates, and tends to require an HV supply, which is not available in our system. A bootstrapped switch is a better match. It employs a capacitor at the gate of an HV MOS transistor which is charged up to turn on the switch when the voltage level is low. The capacitor then maintains the overdrive when the voltage level increases, thus keeping the switch ON. In [24] , a switch is described that employs back-to-back connected transistors to provide bi-directional isolation. However, to turn on and off these transistors, two additional HV MOS transistors and several HV diodes are needed per switch, which requires significant die size.
In this paper, we limit ourselves to unipolar pulses, so that back-to-back isolation is not required, allowing an implementation using only one HV switch transistor MH1 to connect the transmit element (V el ) to the off-chip HV pulser (V TX ), as shown in Fig. 9(a) . To turn on MH1, a bootstrap capacitor C gs is charged to 5 V through transistor MH2. This happens when V TX is still low, at the start of the TX phase. To turn on MH2, its gate voltage V g2 is pumped to approximately 10 V, while its source voltage V s2 is pushed to 5 V, as shown in Fig. 9(b) . The 10 V at the gate is generated by charging a capacitor C boost to 5 V and then pushing the voltage V boost at its bottom plate to 5 V. Because MH2 is now turned on, C gs will be charged to V s2 (5 V). Once C gs has been charged, MH2 is switched OFF by dropping V boost , thus isolating the charge on C gs and keeping MH1 switched ON. HV transmit pulses on the TX line can then be passed to the transducer element. The bootstrap capacitor C gs of 1.7 pF is sized such that MH1 maintains enough overdrive even if some charge is lost when the TX line goes high due to parasitic capacitance at node V g1 . A Zener diode D2 protects the gate of MH1. After the TX phase, MH1 is turned off by dropping V s2 , which causes C gs to be discharged through MH2. A diode D1 then precharges C boost for the next TX cycle. To prevent the switch from turning on during the TX phase, V boost and V s2 are simply kept low, as illustrated in the second TX phase in Fig. 9(b) .
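The sizing argument for C gs can be made concrete with a charge-conservation estimate. Once MH2 is off, the gate node of MH1 floats, so when the source side rises by the pulse amplitude, part of the pre-charged overdrive is lost to the parasitic capacitance at V g1. The sketch below assumes the parasitic is a single capacitor to ground with a hypothetical value of 0.1 pF (not given in the text) and that the source of MH1 follows the full pulse amplitude.

```python
def gate_overdrive_v(v_pre=5.0, v_tx=30.0, c_gs=1.7e-12, c_par=0.1e-12):
    """Gate-source voltage of MH1 after the TX line has risen by v_tx.

    Charge conservation on the floating gate node gives
    Vgs = v_pre - v_tx * c_par / (c_gs + c_par).
    c_par is a hypothetical parasitic capacitance to ground.
    """
    return v_pre - v_tx * c_par / (c_gs + c_par)


print(f"remaining overdrive: {gate_overdrive_v():.2f} V")
print(f"with a 0.5-pF bootstrap capacitor: {gate_overdrive_v(c_gs=0.5e-12):.2f} V")
```

The comparison shows why a larger bootstrap capacitor preserves more overdrive for a given parasitic, at the cost of die area.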
In order to generate the 5-V control signals V boost and V s2 needed for the HV switch, a control signal generator is required, shown in Fig. 10 . The required 5-V level is derived from the COMMAND line. When the voltage V cmd on that line is pulled to 5 V by the system to identify the start of the TX phase, M4 pulls the V 5V line up to 5 V. If V cmd is between V SS and V DD , i.e., during the RX phase, this is detected by the circuit consisting of M1-3, which turns on M5 to pull V 5V to ground, completely turning off the HV switch circuit.
As noted, the HV switch MH1 is turned on by pulling V boost and V s2 to 5 V during the TX phase. Whether this happens is determined by a 1.8-V logic enable signal EN, provided by the chip's configuration shift register. With the help of a 1.8-V to 5-V level shifter consisting of M6-9, this enable signal drives transistor M10 to pull up V s2 . The shorter pulse on V boost is generated by M12-15, R1, and C1, where the time required to charge C1 through R1 determines the duration of this pulse, which is set to approximately 20 ns. When EN is low, V s2 and V boost are pulled down by M11 and M12, respectively, preventing the switch from turning on. Although V 5V is pulled down to 0 V during the RX phase, the diode D1 prevents C boost from being discharged.
Fig. 11. Circuit diagrams of (a) clock and data recovery circuit, (b) continuous-time comparator, and (c) associated timing diagram.
E. Clock and Data Recovery Circuit
Besides providing a 5-V level during the TX phase, the COMMAND line also provides a clock for the ADC, and, in the form of pulsewidth modulation (PWM)-encoded data, configuration bits that are loaded into a shift register to control the RX multiplexer, the gain of the AFE, and the EN signals of the TX switches. In order to recover the clock and data from the COMMAND line, the 5-V pulses are first clamped to protect the following low-voltage circuitry, and then the extracted PWM signal is demodulated into a clock and data, as shown in Fig. 11(a). A 5-V NMOS transistor M0 limits the V cmd signal to V DD − V th , where V th is the transistor's threshold voltage, after which a simple continuous-time comparator, shown in Fig. 11(b), turns the signal into proper logic levels. The comparator is biased at 190 μA to make sure that the latency of the comparator will not affect the duty cycle of the recovered PWM signal. The rising edges of the resulting signal are used to trigger the ADC.
To decode the data bits, the signal level is sampled at half a clock cycle after the rising edge, as shown in Fig. 11(c) . To do so, a delayed version of the signal is used to clock a flip-flop. This delayed version is also used as the clock for the chip's 30-bit shift register. In order to prevent the current state of the chip from being affected by the loading of new data into the shift register, the shift register output is buffered by a second set of flip-flops, which are clocked by the TX signal [obtained from the HV switch circuit (see Fig. 10)] . Thus, new configuration data only becomes active at the start of the succeeding TX phase.
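The decoding rule described above, in which rising edges act as the clock and the level half a clock cycle later gives the data bit, can be sketched on a sampled waveform. The duty cycles used for a logic 0 and a logic 1 (25% and 75%) and the number of samples per period are hypothetical choices for illustration; the text does not specify them.

```python
def pwm_encode(bits, samples_per_period=8, short=2, long=6):
    """Build a sampled PWM waveform. Every period starts with a rising edge;
    the high time is `long` samples for a 1 and `short` samples for a 0
    (hypothetical duty cycles)."""
    wave = []
    for bit in bits:
        high = long if bit else short
        wave += [1] * high + [0] * (samples_per_period - high)
    return wave


def pwm_decode(wave, samples_per_period=8):
    """Recover the bits by sampling half a period after each rising edge."""
    bits = []
    prev = 0
    for i, level in enumerate(wave):
        if level == 1 and prev == 0:              # rising edge: recovered clock
            j = i + samples_per_period // 2       # half a clock cycle later
            if j < len(wave):
                bits.append(wave[j])
        prev = level
    return bits


config_bits = [1, 0, 1, 1, 0, 0, 1]
print(pwm_decode(pwm_encode(config_bits)) == config_bits)
```

Because the clock is carried by the rising edges regardless of the data, the same waveform can trigger the ADC and load the configuration shift register.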
IV. EXPERIMENTAL RESULTS
A. Experimental Setup
The ASIC has been realized in a 0.18-μm HV CMOS process with a total area of 2 × 2 mm², as shown in Fig. 12(a). The circular layout has a 1.5-mm outer diameter and a central hole with a 0.5-mm diameter, so that it can be laser cut into a donut shape to fit at the tip of a catheter. Five bond pads provide electrical connections for the four micro-coaxial cables; 80 bond pads are positioned to connect to transducer elements. The die area around the donut is used for test circuits, which are not connected in the acoustical measurements reported below. These test circuits contain digital buffers that provide a parallel 10-bit ADC output (D out in Fig. 5), as well as a test bond pad through which an electrical test signal can be applied to the AFE.
Fig. 12(b) shows a fabricated prototype with the transducer array built on the top of the ASIC using the approach described in [19]. The bond pads on the ASIC that provide electrical connections to the transducer elements are equipped with gold bumps using a wire-bonding tool. After this, an epoxy layer is applied to the ASIC and ground down to expose the gold, thus providing reliable electrical contacts for the transducer elements. The acoustic stack, consisting of a backing layer, a piezoelectric material (PZT), and a matching layer (with a total thickness of approximately 160 μm), is glued on the top of the ground epoxy layer and cut into the desired 100-μm-pitch array pattern using a diamond saw. Finally, the array is covered with an aluminum foil that forms the common ground electrode of the elements.
Fig. 13 shows a block diagram of the experimental setup. The ASIC is wire-bonded to a daughter board PCB. The ground foil is connected to the ground of the ASIC and the PCB. The ASIC's four cable connections are connected to four headers on the daughter board, which are then connected through 1.5-m-long micro-coaxial cables (AWG 42) to a mother board.
The serial output of the ADC is captured by a high-speed oscilloscope with 1-GHz bandwidth (DL9710L, Yokogawa) and processed in MATLAB on a PC. The test signals are connected to the mother board through headers. The HV pulse is provided by an external pulser (AVTech, AVR40, Avtech Electrosystems Ltd., Ottawa, ON, Canada). The mixed-voltage multi-functional command line is generated by a field-programmable gate array (FPGA) and an analog switch (ADG719, Analog Devices), which is used to pull the line to 5 V. The switch control signal and the 60-MHz PWM signal are generated by the FPGA.
B. Electrical Measurements
To characterize the receive signal chain, an external test voltage was applied to the ASIC's analog test input, which is connected on-chip via a 31-kΩ resistor to the TIA input to produce a test current, as shown in Fig. 14. This ensures that the parasitic capacitance (C_p) associated with the PCB trace and the on-chip bond pad does not affect the test current applied to the TIA. Fig. 15 shows the measured transfer function of the receive signal chain for the three gain settings. The measured trans-impedance gain ranges from 108 to 119 dB with a gain step of 5.7 dB, and is 4-5 dB lower than the design target because of the limited open-loop gain of the OTAs in the AFE.
The measured input-referred noise spectrum is shown in Fig. 16. The slight increase at higher frequencies is due to the roll-off of the transfer function (see Fig. 15). The total input-referred in-band (10-16 MHz) rms noise is 4.8 nA, which includes the thermal noise of the 31-kΩ resistor. While comparable to the 4.4-nA noise level of the transducer, this noise level is larger than the design target. A possible cause of this is coupling of the transient supply currents drawn by the ADC and the logic to the input of the LNA via the ground connection of the chip, which in this design also serves as the connection to the transducer's ground foil. This coupling can be reduced in the future by connecting the ground foil to the ground bond pad on the chip, rather than via the PCB.
A sinusoidal input signal was applied to evaluate the SNR and the linearity of the receive signal chain. The measured spectrum shown in Fig. 17 illustrates a 42-dB in-band (10-16 MHz) SNR and ∼37-dB HD2. The DR measurement results are shown in Fig. 18. The measured DR, accounting for the programmable gain, is around 53 dB, sufficient for the IVUS application.
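To put the noise and DR figures in perspective, the sketch below estimates the thermal-noise contribution of the 31-kΩ test resistor in the 10-16 MHz band and removes it in quadrature from the measured 4.8 nA. It also adds the programmable gain range to the 42-dB SNR as a rough cross-check of the DR. The temperature of 300 K, the assumption of uncorrelated noise sources, and the rule "DR ≈ single-setting SNR + gain range" are simplifications introduced here, not statements from the measurements.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def resistor_noise_rms(r_ohm, bandwidth_hz, temp_k=300.0):
    """RMS thermal noise current of a resistor over a given bandwidth."""
    return math.sqrt(4 * K_B * temp_k * bandwidth_hz / r_ohm)


# Values from the text; the 300-K temperature is an assumption.
BANDWIDTH_HZ = 16e6 - 10e6
I_TOTAL_RMS = 4.8e-9
I_RESISTOR_RMS = resistor_noise_rms(31e3, BANDWIDTH_HZ)

# Assuming uncorrelated sources, the powers add, so subtract in quadrature.
I_CHAIN_RMS = math.sqrt(I_TOTAL_RMS**2 - I_RESISTOR_RMS**2)

# Rough DR estimate: SNR at one gain setting plus the span of the gain steps.
PEAK_SNR_DB = 42.0
GAIN_STEP_DB = 5.7
N_GAIN_SETTINGS = 3
DR_ESTIMATE_DB = PEAK_SNR_DB + (N_GAIN_SETTINGS - 1) * GAIN_STEP_DB

if __name__ == "__main__":
    print(f"resistor noise: {I_RESISTOR_RMS * 1e9:.2f} nA rms")
    print(f"chain noise without resistor: {I_CHAIN_RMS * 1e9:.2f} nA rms")
    print(f"estimated DR: {DR_ESTIMATE_DB:.1f} dB")
```

With these assumptions, the resistor contributes somewhat less than 2 nA rms, and the DR estimate lands close to the reported value of around 53 dB.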
The HV switch was designed to transmit 30-V pulses at a frequency of 13 MHz (pulsewidth of 38 ns) for a PZT device with a parasitic capacitance of 0.7 pF. Fig. 19 shows the measured HV pulse input and HV pulse output of one selected channel in the HV switch array. A 380-ns HV pulse is applied to show the highest voltage amplitude that the chip can provide, which is 29.3 V. A 38-ns pulse is also applied. The lower amplitude of the resulting HV pulse output is caused by incomplete settling due to the large PCB parasitics. However, the transducer array used in the acoustical measurement is directly integrated on top of the ASIC, so that no additional parasitics are introduced and the HV pulse should be able to fully settle.
Compared with the waveform shown in Fig. 8, the captured datalink signal has clearly been low-pass filtered, partially due to the finite bandwidth of the oscilloscope. Nevertheless, the data bits can still be correctly extracted from the captured waveform. To do so, the captured waveform is first synchronized to the 60-MHz input clock and divided into chunks that correspond to the 10 data bits of an individual sample. The data bits are then extracted by detecting the peaks in the waveform using MATLAB. By comparing the extracted data with the data captured from the parallel test outputs, a bit error rate (BER) of approximately 0.2% is found, which hardly affects the SNR.
Fig. 21 shows the power breakdown of the ASIC. The total power consumption is 9.1 mW, which is dominated by the power consumed by the ADC, the analog circuits (including the receive signal chain, the comparator in the clock and data recovery circuits, and the reference buffer), and the load-modulation datalink.
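The bit-extraction procedure described above for the datalink (clock alignment, division into 10-bit chunks, peak detection, and comparison with the parallel test outputs) can be sketched as follows. This is an illustrative Python version, not the MATLAB processing actually used. It assumes that the waveform is already aligned to the clock, that a peak above a threshold within a bit slot marks a logic 1, and that bits are sent MSB first. The function names, the threshold, and the number of oscilloscope samples per bit are all hypothetical.

```python
def extract_bits(waveform, samples_per_bit, threshold):
    """Slice a clock-aligned waveform into bit slots and decide each bit by its peak."""
    n_bits = len(waveform) // samples_per_bit
    bits = []
    for k in range(n_bits):
        slot = waveform[k * samples_per_bit:(k + 1) * samples_per_bit]
        # Assumption: a peak above the threshold inside the slot encodes a 1.
        bits.append(1 if max(slot) > threshold else 0)
    return bits


def group_samples(bits, bits_per_sample=10):
    """Pack consecutive bits into ADC words (MSB-first order is an assumption)."""
    words = []
    for i in range(0, len(bits) - bits_per_sample + 1, bits_per_sample):
        word = 0
        for b in bits[i:i + bits_per_sample]:
            word = (word << 1) | b
        words.append(word)
    return words


def bit_error_rate(received, reference):
    """Fraction of bit positions in which the two bit streams differ."""
    errors = sum(r != t for r, t in zip(received, reference))
    return errors / len(reference)
```

In such a scheme, the reference stream passed to `bit_error_rate` would come from the parallel test outputs, so that the serial link can be checked bit by bit against data that bypasses it.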
C. Acoustic Measurements
To perform acoustic measurements, a water bag was mounted on top of the fabricated prototype, as shown in Fig. 14. Selected channels of the HV transmit chain were excited and the pressure was measured using a hydrophone (SN1302, Precision Acoustics) suspended in the water 5 mm above the transducer array. The measured TX pressure for an increasing number of excited elements is shown in Fig. 22(a), showing, as expected, a peak pressure that increases roughly proportionally with the number of excited elements.
To test the receive functionality acoustically, a single-element 5-MHz transmit transducer (PA865, Precision Acoustics) was placed at an angle in the water above the transducer array, generating short acoustic pulses. Two receive elements were selected to receive the acoustic signal, as shown in Fig. 22(b). The time delay between the recorded signals is consistent with the fact that the transmitter was placed at an angle, leading to different arrival times of the acoustic pulse on the two elements.
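The relation between the transmitter angle and the arrival-time difference can be illustrated with a simple plane-wave model, in which a wavefront arriving at an angle θ from the array normal reaches two elements separated by a distance d with a delay of d·sin(θ)/c. The sketch below assumes a plane wave and a speed of sound in water of 1480 m/s; the element separation and angle in the example are hypothetical, since the actual values are not given here.

```python
import math

SPEED_OF_SOUND_WATER = 1480.0  # m/s, assumed value for water


def arrival_delay(separation_m, angle_deg, c=SPEED_OF_SOUND_WATER):
    """Arrival-time difference between two elements for an incident plane wave.

    angle_deg is measured from the array normal; 0 degrees means normal incidence.
    """
    return separation_m * math.sin(math.radians(angle_deg)) / c


if __name__ == "__main__":
    # Hypothetical example: elements five 100-um pitches apart, 30-degree incidence.
    dt = arrival_delay(5 * 100e-6, 30.0)
    print(f"arrival-time difference: {dt * 1e9:.1f} ns")
```

At normal incidence the delay vanishes, so a nonzero measured delay is what indicates the tilted transmitter position.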
To show the 3-D imaging capability of the prototype, a three-needle phantom was placed in the water above the transducer array. A 3-D image [Fig. 23(b)] was reconstructed by means of delay-and-sum beamforming of the received echo signals. The three needle heads can be clearly recognized in the image. For the central needle, sidelobe artifacts are visible close to the main signal. These artifacts are present for all the needles, but since the central needle reflects most of the signal, the artifact is visible only there in the 10-dB DR image. Note that better image quality can be obtained by employing synthetic aperture also in TX, and by applying more sophisticated reconstruction techniques. This, however, is beyond the scope of this paper.
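For readers unfamiliar with delay-and-sum beamforming, a minimal sketch of the principle is given below. For each image point, the round-trip time from the transmitter to the point and back to each receive element is computed, each recorded trace is sampled at that time, and the samples are summed. This is not the reconstruction code used for Fig. 23; it assumes a single point-like transmit origin, a speed of sound of 1480 m/s, traces that start at the transmit instant, and linear interpolation between samples. All names are hypothetical.

```python
import math


def delay_and_sum(rf, elem_pos, point, fs, c=1480.0, tx_pos=(0.0, 0.0, 0.0)):
    """Beamformed value at one image point.

    rf       -- list of per-element traces, sample 0 taken at the transmit instant
    elem_pos -- list of (x, y, z) receive-element positions in meters
    point    -- (x, y, z) image point in meters
    fs       -- sampling rate of the traces in Hz
    """
    t_tx = math.dist(tx_pos, point) / c  # transmit path (point-source assumption)
    total = 0.0
    for trace, pos in zip(rf, elem_pos):
        t = t_tx + math.dist(pos, point) / c  # add the receive path
        idx = t * fs
        i0 = int(idx)
        if 0 <= i0 < len(trace) - 1:
            frac = idx - i0
            # Linear interpolation between the two neighboring samples.
            total += (1.0 - frac) * trace[i0] + frac * trace[i0 + 1]
    return total
```

Evaluating this function over a 3-D grid of points yields a volume in which echoes that are consistent across elements add coherently, while inconsistent contributions tend to cancel.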
A comparison of our ASIC performance and characteristics with the prior art is provided in Table II. The designs described in [13] and [14] employ ring-shaped CMUT transducer arrays with separate RX and TX elements. The on-chip circuitry includes per-element receive amplifiers, multiplexers and buffers that provide four parallel receive outputs, as well as pulsers with the associated control logic. The design described in [6] is for an SL IVUS probe, in which four ASICs are connected to a 64-element PVDF array. Each ASIC includes 16 receive amplifiers and transmit pulsers, with a single differential current-mode receive output shared by the four ASICs. A unique feature of our work is that it not only includes a receive front end, but also an ADC and a load-modulation datalink. Rather than employing on-chip pulsers, we use on-chip HV switches, which connect selected TX elements to a pulser on the system side, thus reducing the on-chip power dissipation. Our ASIC features the lowest cable count and the lowest power consumption, and is the first to provide a digitized output signal.
V. CONCLUSION
This paper has presented a front-end ASIC for 3-D IVUS imaging which interfaces with 16 transmit elements and 64 receive elements using only four 1.5-m micro-coaxial cables. The chip has a 1.5-mm-diameter donut-shaped layout to facilitate placement around the guide-wire of a catheter. A multi-functional mixed-voltage command line is used to transmit clock, data, and a power supply for the HV switches through only one cable. A load-modulation-based data transmission scheme is used to transfer the ADC's 10-bit output asynchronously through one cable. An HV switch array with a compact and power-efficient circuit-level implementation allows 16 transmit transducers to be excited through one cable. To meet the tight area and power constraints, an inverter-based AFE and a charge-sharing SAR ADC have been implemented. The effectiveness of these techniques has been successfully demonstrated in a 3-D ultrasound imaging experiment.
