Abstract-Intravascular ultrasound (IVUS) and intracardiac echography (ICE) catheters with real-time volumetric ultrasound imaging capability can provide unique benefits to many interventional procedures used in the diagnosis and treatment of coronary and structural heart diseases. Integration of capacitive micromachined ultrasonic transducer (CMUT) arrays with front-end electronics in single-chip configuration allows for implementation of such catheter probes with reduced interconnect complexity, miniaturization, and high mechanical flexibility. We implemented a single-chip forward-looking (FL) ultrasound imaging system by fabricating a 1.4-mm-diameter dual-ring CMUT array using CMUT-on-CMOS technology on a front-end IC implemented in 0.35-µm CMOS process. The dual-ring array has 56 transmit elements and 48 receive elements on two separate concentric annular rings. The IC incorporates a 25-V pulser for each transmitter and a low-noise capacitive transimpedance amplifier (TIA) for each receiver, along with digital control and smart power management. The final shape of the silicon chip is a 1.5-mm-diameter donut with a 430-µm center hole for a guide wire. The overall front-end system requires only 13 external connections and provides 4 parallel RF outputs while consuming an average power of 20 mW. We measured RF A-scans from the integrated single-chip array which show full functionality at 20.1 MHz with 43% fractional bandwidth. We also tested and demonstrated the image quality of the system on a wire phantom and an ex vivo chicken heart sample. The measured axial and lateral point resolutions are 92 µm and 251 µm, respectively. We successfully acquired volumetric imaging data from the ex vivo chicken heart at 60 frames per second without any signal averaging. These demonstrative results indicate that single-chip CMUT-on-CMOS systems have the potential to produce real-time volumetric images with image quality and speed suitable for catheter-based clinical applications.
I. Introduction
Generating true volumetric ultrasound images in front of a flexible catheter would be beneficial in the diagnosis and treatment of arterial diseases such as chronic total occlusions (CTOs) and in complex transcatheter interventions in the heart [1]. Several different approaches have been developed for implementing volumetric intravascular ultrasound (IVUS) and intracardiac echography (ICE) catheters based on mechanical rotation [2], [3]. 2-D solid-state arrays provide a more robust and compact solution. Ring-shaped annular arrays, which allow a center opening for a guide wire in IVUS applications and a port for interventional devices in ICE, are especially suitable for forward-looking (FL) volumetric imaging for guiding interventions [4], [5]. The first attempt to realize these arrays used a single-ring piezoelectric transducer array [6]. This structure combined a side-looking (SL) IVUS ring array with FL-IVUS by utilizing different vibration modes of the piezoelectric elements and relied on the same integrated electronics as the SL-IVUS system. The small sizes of the transducer elements, strong cross coupling, and difficulties in reliable fabrication have limited the success of this approach. More recent piezo-based ring arrays may have similar issues [7]. Capacitive micromachined ultrasonic transducer (CMUT) technology has potential to overcome most of these limitations because it offers flexibility to fabricate arrays of different shapes and sizes [4], [5], [8]–[10]. In addition, it enables monolithic or flip-chip-bonding-based electronics integration [11]–[14]. Single-ring 64-element annular CMUT arrays operating at 15.5 MHz, 19 MHz, and 13.5 MHz have been demonstrated for FL-IVUS [4], [5], [15]. A 64-element 10-MHz single-ring CMUT array was integrated in a multi-chip, flip-chip-bonded package for FL-ICE [10], [16].
More recently, a single-ring array was connected to a commercial ultrasound system to demonstrate and compare its imaging capabilities in different modes, including full phased array processing and synthetic aperture processing with spatially coded excitation for high-frame-rate imaging with improved SNR [17]. These systems use the same array of elements for both transmit (Tx) and receive (Rx), which results in suboptimal noise performance because of the protection/switching circuits [18]. Furthermore, because each element is still connected to the outside system, the number of cables in the catheter is quite large.
Dual-ring annular CMUT arrays with separate Tx and Rx rings enable separate optimization of Tx and Rx element geometry and electronics for higher SNR with negligible loss in resolution [18]–[20]. For successful realization of FL-IVUS imaging catheters with small-sized elements, close integration of front-end electronics and the transducer array within the catheter is critical. Compared with a multi-chip integration scheme that requires many chip-to-chip interconnects, a system that integrates the transducer array with Tx and Rx electronics on a single chip has several advantages. This approach can significantly reduce the interconnection complexity and the number of steps required to manufacture the FL-IVUS probe. In addition to these manufacturing advantages, one can multiplex receive channels after front-end pre-amplification, and thus the ultimately miniaturized single-chip FL-IVUS system can be as thin as 1 mm with ~10 cables. This, along with a through-silicon via for each electrical connection, can lead to a flexible catheter tip for easy navigation through tortuous arteries, as shown schematically in Fig. 1. FL-ICE catheters should also enjoy similar benefits with more relaxed size constraints.
Manuscript received May 7, 2013; accepted October 16, 2013. The project described was supported by award number R01EB010070 from the National Institute of Biomedical Imaging and Bioengineering (NIBIB). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIBIB or the National Institutes of Health (NIH).
G. Gurun
We recently realized a single-chip volumetric imaging system using monolithic CMUT-on-CMOS integration, in which dual-ring CMUT arrays were fabricated directly on top of pre-processed custom-designed CMOS wafers, and presented some initial results at conferences [21], [22]. Here, we present the design, testing, and quantitative characterization of the overall single-chip front-end system suitable for volumetric IVUS and ICE imaging.
The paper is organized as follows: Section II briefly describes the monolithically fabricated dual-ring CMUT arrays. We present the design and characterization of integrated front-end electronics in Section III. We then describe the imaging setup and present detailed results on array characterization in Section IV. We demonstrate volumetric imaging results and provide quantitative analysis in Section V. Finally, we discuss the results and future improvements in Section VI.
II. Design and Fabrication of Dual-Ring CMUT-on-CMOS Array
Monolithic fabrication of CMUT arrays on top of pre-processed CMOS wafers reduces parasitic capacitances [11], [18] and the complexity of design while optimizing the use of silicon area. The 0.35-µm CMOS electronics for the system are fabricated by the Taiwan Semiconductor Manufacturing Company (TSMC) foundry on 200-mm silicon wafers, which contain 48 repeating dies per wafer. The wafers are subsequently diced into 6 rectangular reticles of approximately 4 × 7 cm (3 × 2 die) to allow for CMUT fabrication using standard micromachining tools designed for 100-mm (4-in) wafers. A picture of the CMOS wafer with custom-designed ICs is shown in Fig. 2. To fabricate CMUT arrays on the CMOS electronics, a low-temperature process is used. Detailed information about the particular CMUT-on-CMOS fabrication process can be found in [11].
The dual-ring arrays used in this study consist of 56 Tx (outer ring) and 48 Rx (inner ring) elements with 1.31 mm and 1.13 mm center diameters, respectively. The device was first fabricated as full rectangular pieces and then etched to form a donut shape which is suitable for guide wires [Fig. 3(right)]. In the particular array shown, the Tx and Rx elements are identical. Each element contains 4 individual membranes and is approximately 70 × 70 µm in size. The CMUT element capacitance is calculated to be 90 fF. Table I summarizes the physical parameters of the fabricated CMUT array. The membrane thickness and lateral size are determined by the center frequency and the trade-off between the coupling coefficient and fractional bandwidth, which is well balanced with a 50% fractional bandwidth in this case [23]. With a gap thickness of 120 nm, chosen for ease of fabrication, a collapse voltage of 140 V is obtained. Overall, these geometrical parameters are not optimized in terms of overall CMUT performance, but provide a good balance between bandwidth, coupling coefficient, and ease of fabrication.
III. Single-Chip System for FL-IVUS
To implement the necessary receive and transmit electronics in a single chip, we custom-designed a 200-mm (8-in) wafer reticle in a 0.35-µm CMOS process. The ICs on this wafer are custom-designed for monolithic integration of an FL-IVUS array with 1.4-mm diameter. Fig. 3(left) shows a micrograph of the IC designed for the FL-IVUS dual-ring array. This IC incorporates 48 low-noise receiver amplifiers and 56 pulsers, dedicated to the receive and transmit elements in the array, respectively. The chip also includes buffers and digital control circuitry designed to synchronize the transmit and receive sequences during data acquisition. Fig. 3(right) shows the external electrical connections to the imaging device. The RF data from 4 receive channels (out1-out4) are collected in parallel. The clk input has two functions. Its main function is to increment the counter in the digital control circuitry, which synchronizes the chip. It is also used to generate the pulse trigger signal that is routed to the active pulser circuitry. clr_ctr is the clear signal for the digital counter. The V_pulse voltage input controls the magnitude of the high-voltage pulse. ctrl1 and ctrl2 are the two control voltages used in the preamplifiers. Two separate CMUT bias signals (V_Rx and V_Tx) are provided for the separate receive and transmit CMUT rings.
Note that this single-chip system requires only 13 external connections, including the vdd and gnd connections. This number can be reduced to 8 while still keeping 4 parallel RF outputs. This would be achieved by generating the CMUT and transmit pulser dc levels on-chip from a single dc bias input, and by eliminating the amplifier tuning capability, which was included for testing purposes. Considering that the current 64-element SL-IVUS catheter [24] requires more than 200 chip-to-chip and chip-to-transducer electrical interconnect bonds and only provides a single output channel, the enormous advantage of this novel single-chip approach can be better appreciated.
Note that some areas in the center and at the perimeter of the IC are left free of any metal traces or active CMOS circuitry to enable etching through the silicon substrate to create the final donut shape suitable for placement on the tip of a circular catheter. The diameter of the gap at the center reserved for the guide wire is 430 µm. All the active circuitry and the CMUT array fit under a 1.5-mm-diameter silicon donut. The connection areas outside the diameter of the CMUT array are placed for initial testing of the IC with wire-bonding and would be omitted in the final catheter implementation.
In catheter-based applications, the power budget is tight to prevent overheating of the dense single-chip system. For instance, in [25], the average power budget for solid-state IVUS catheters is noted as 100 mW to make sure that the temperature of the catheter does not increase to damaging levels when the catheter is powered and allowed to dry. In this work, the power requirement is addressed in two ways. First, to reduce the power consumption, a power on-off capability is added to the receive amplifiers. The amplifiers that are not actively used are biased off by the digital logic, and at any given time only the four amplifiers that are connected to the outputs are kept active. Second, to further reduce the chip power consumption, the receive amplifiers are designed for low power consumption (0.8 mW) without significantly compromising their performance.
A. Preamplifier Design
To measure RF echo signals from CMUTs integrated with FL-IVUS front-end chips, we designed two low-noise receive amplifiers based on two different architectures, namely the resistive-feedback transimpedance amplifier (TIA) architecture and the capacitive-feedback TIA architecture. The resistive-feedback TIA implemented here is a revised version of the amplifier that was presented in [18]. A detailed discussion of the modifications, gain, bandwidth, and noise performance of this amplifier design was given in [26] and [27]. For brevity, here we only discuss the details of the capacitive-feedback TIA design (Fig. 4).
This capacitive-feedback TIA architecture employs current amplification to generate I_OUT, which is multiplied by R_D to obtain a voltage output [28]. This architecture is especially promising for low-noise detection because it does not involve a noisy resistor in the feedback network. The total input capacitance, including the amplifier input capacitance (C_IN,AMP), any parasitic interconnect capacitance (C_PAR), and the CMUT capacitance (C_CMUT), is referred to as C_IN in the following expressions. Assuming C_1 A_0 ≫ C_IN + C_1 and C_2 ≫ C_1, the feedback system in Fig. 4 has the following transfer function:
where
In these expressions, s is the complex frequency (Laplace variable); g_m1 is the transconductance of M_1.
In this expression, ω_dom represents the dominant pole and ω_2 is the second (nondominant) pole of the closed-loop system. Further assuming that ω_0 ≫ g_m1/C_2, the 3-dB bandwidth and the nondominant pole of the system can be given as
It should also be noted that to ensure stability,
In this work, the simulated open-loop voltage gain (A_0) and the bandwidth of the core amplifier are 44 and 380 MHz, respectively. The simulated transconductance of M_1 is g_m1 = 10 µS; the total input capacitance is C_IN = 150 fF, which includes the CMUT capacitance, C_CMUT = 90 fF, and the input capacitance of the core amplifier, C_IN,AMP = 60 fF. The effective C_1 includes the drawn capacitance, which is 25 fF, and the drain-to-source capacitance of M_5, which is around 30 fF. Therefore, in this implementation, the effective C_1 is around 55 fF.
The input-referred spectral density of the current noise is expressed as
where g_m is the transconductance and i_d^2 is the spectral density of the squared current noise of the input transistor of the core amplifier, which dominates the core amplifier noise. Similarly, i_dbias^2 represents the spectral density of the squared current noise of the bias circuitry that provides the bias current (I_bias) for M_1. Although not shown explicitly in (6), M_1 also contributes some noise. However, the voltage noise at the gate of M_1 is divided by the large A_0 when referred to the input, and therefore the noise of M_1 can be neglected. From (6), it can be seen that the current noise terms of the bias circuitry and the load resistance (R_D) are divided by the current gain when referred to the input, which is advantageous for achieving a low input-referred noise. Note that the input dc node floats when the feedback network contains only capacitances. Therefore, a method is required to apply a dc bias to the input node. In this implementation, a MOS-bipolar device (M_5) is used in parallel with the feedback capacitance [29]. This acts as a very large (on the order of a gigaohm) pseudo-resistor and provides a dc path to the input node. Because of its very high effective resistance, the noise of M_5 is negligible. The design occupies a 25 × 55 µm area and the simulated current consumption is 240 µA. Note that most of the current is consumed by the core amplifier (M_2-M_3).
B. Buffer Design
Each receiver set contains a buffer to drive the interconnect cable and scope capacitances. The designed buffer is a push-pull buffer including two source-follower stages. The buffer is designed to have a bandwidth wider than 50 MHz with a 50-Ω resistive load. With a 50-Ω load (i.e., the case when driving a 50-Ω input of a network analyzer), the gain of the buffer stage is around 0.28 V/V. Note that a buffer gain of less than unity does not have a negative effect on system SNR because both signal and noise are scaled by this gain. The buffer consumes 1.2 mA of dc current and covers an area of 35 × 50 µm.
C. Measured Amplifier Characteristics
To measure the transimpedance characteristics of the designed amplifier, we used the setup shown in Fig. 5. For gain testing, the CMUT element itself was utilized to mimic an on-chip high impedance to convert the test voltage input to a current input into the amplifier. The details of this method of testing an integrated TIA-CMUT couple can be found in [18]. To extract the TIA gain from the overall voltage gain measurement, 90 fF is used as the CMUT capacitance. Fig. 6 shows the gain measurement of the designed capacitive-feedback amplifier. The measurement result demonstrates a transimpedance gain of 200 kΩ with a 40-MHz bandwidth. The simulated closed-loop transimpedance gain is 160 kΩ and the bandwidth is 90 MHz on nominal process parameters. On a slow process corner, the simulated bandwidth drops to 35 MHz. Therefore, the measured 40-MHz bandwidth is within the simulated range.
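As a rough illustration of this extraction step, the sketch below treats the CMUT as an ideal 90-fF series capacitor, so that a test voltage V produces an input current of magnitude 2πfC·V, and divides the overall voltage gain by that admittance and by the buffer gain. The ideal-capacitor model, the inclusion of the 0.28-V/V buffer gain in the measurement path, and all function names are assumptions made for illustration; the actual procedure is the one described in [18].

```python
import math

C_CMUT = 90e-15      # series CMUT capacitance quoted in the text, in farads
BUFFER_GAIN = 0.28   # buffer gain into a 50-ohm load quoted in the text, in V/V


def series_cap_admittance(freq_hz, cap_f=C_CMUT):
    """Admittance magnitude of an ideal series capacitor, in siemens."""
    return 2.0 * math.pi * freq_hz * cap_f


def tia_gain_from_voltage_gain(voltage_gain, freq_hz,
                               cap_f=C_CMUT, buffer_gain=BUFFER_GAIN):
    """Transimpedance (ohms) implied by an overall voltage gain (V/V)."""
    return voltage_gain / (series_cap_admittance(freq_hz, cap_f) * buffer_gain)


def voltage_gain_from_tia_gain(tia_gain_ohm, freq_hz,
                               cap_f=C_CMUT, buffer_gain=BUFFER_GAIN):
    """Overall voltage gain (V/V) expected for a given transimpedance."""
    return tia_gain_ohm * series_cap_admittance(freq_hz, cap_f) * buffer_gain


if __name__ == "__main__":
    # Voltage gain that a 200-kOhm TIA would show at 20 MHz under this model.
    print(voltage_gain_from_tia_gain(200e3, 20e6))
```

Under these assumptions the overall voltage gain at 20 MHz is below unity even for a 200-kΩ transimpedance, because the 90-fF capacitor presents a series impedance of roughly 88 kΩ and the buffer attenuates the signal.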
To measure the noise characteristics of the designed TIA, we connected one of the outputs of the post-processed IC to an Agilent 4395A (Agilent Technologies Inc., Santa Clara, CA) in spectrum analyzer mode. To measure the front-end electronics noise only, we did not apply any bias to the receiver CMUT. The input noise value of the capacitive-feedback TIA was obtained by dividing the measured output noise value by the buffer gain and the TIA gain (Fig. 7). Table II outlines the measured characteristics of the capacitive-feedback preamplifier design. The measured 310-fA/√Hz input-referred noise level at 20 MHz agrees with the simulation results. Note that the measured input-referred current noise level of the amplifier is on the order of the thermal-mechanical noise of the CMUT elements, which is critical for the SNR of the system [18], [30]. It should also be noted that the input-referred current noise performance of this particular design is limited by the available area. The noise level can be significantly improved with a more relaxed area requirement, which enables increasing the values of C_2 and R_D.
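The input-referral step described above is a single division, sketched below with the measured 200-kΩ transimpedance gain and 0.28-V/V buffer gain as defaults. The function name and the example output-noise value are hypothetical; the example value is simply the output level that would correspond to 310 fA/√Hz under these gains, not a number reported in the measurements.

```python
def input_referred_current_noise(output_noise_v_per_rthz,
                                 tia_gain_ohm=200e3,
                                 buffer_gain=0.28):
    """Refer an output voltage noise density (V/sqrt(Hz)) to the TIA input (A/sqrt(Hz))."""
    return output_noise_v_per_rthz / (buffer_gain * tia_gain_ohm)


if __name__ == "__main__":
    # Hypothetical output noise density of 17.36 nV/sqrt(Hz) at 20 MHz.
    i_n = input_referred_current_noise(17.36e-9)
    print(f"{i_n * 1e15:.0f} fA/sqrt(Hz)")
```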
D. High-Voltage Pulser
The breakdown voltages of regular devices in the standard 0.35-µm CMOS technology in which the wafer is fabricated are less than 10 V. To achieve higher pulse voltages, a high-voltage NMOS based on an extended-drain design approach is used [31]. Fig. 8 shows the pulser circuit implemented on-chip based on this high-voltage NMOS design. The resistor (R) is implemented with a poly resistor. The pulser converts the 3.3-V unipolar input pulse into a unipolar high-voltage pulse. The width of the output pulse is controlled by the width of the low-voltage trigger pulse. To reduce the power consumption, the steady-state output voltage of the pulser is kept at the high voltage. The output switches from high to low when the input trigger pulse arrives.
To eliminate a dedicated external connection, the low-voltage pulse trigger signal is generated from the clk input. The clk signal is internally delayed by around 10 ns and then routed to the active pulser circuitry by the digital logic. The 10-ns delay is long enough for the switching transient to settle, which ensures that the intended pulser to which the trigger signal needs to be routed is properly selected. Note that the pulse width of the low-voltage clk input determines the width of the output pulse. Each pulser occupies a 35 × 50 µm area.
Fig. 5. Setup for testing the transimpedance amplifier (TIA) transimpedance characteristics [18].
E. Digital Control and Power Management
A digital control block is designed to synchronize the operation of the transmitter and receiver elements in the array. During the initial pulsing stage, a single transmitter pulses, and during the receive sequence, four receive amplifiers are connected to the outputs. The digital block controls which four amplifiers are the active receivers and which particular pulser is the active transmitter at any given time. It changes the active elements during the data collection with a single clock, and the whole imaging process is completed in 1024 clock cycles.
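The 1024-cycle figure is consistent with control logic sized for 64 transmit and 64 receive elements read out four receivers at a time (64 × 64 / 4 = 1024). The sketch below counts the firings for an arbitrary array size and enumerates one possible firing order. The ordering (transmitter in the outer loop, adjacent receivers grouped together) and the function names are hypothetical; the text does not specify how the on-chip counter maps to elements. The 40-kHz pulse repetition rate used in the example is the one reported for the imaging experiments in Section V.

```python
import math


def firings_per_frame(n_tx, n_rx, parallel_outputs=4):
    """Clock cycles (one firing each) needed to cover all Tx-Rx pairs."""
    return n_tx * math.ceil(n_rx / parallel_outputs)


def scan_schedule(n_tx, n_rx, parallel_outputs=4):
    """Yield (cycle, tx, rx_group) tuples; this ordering is a hypothetical choice."""
    cycle = 0
    for tx in range(n_tx):
        for first_rx in range(0, n_rx, parallel_outputs):
            last_rx = min(first_rx + parallel_outputs, n_rx)
            yield cycle, tx, tuple(range(first_rx, last_rx))
            cycle += 1


def frame_rate_hz(n_tx, n_rx, prf_hz, parallel_outputs=4):
    """Volume rate when one firing is issued per pulse repetition period."""
    return prf_hz / firings_per_frame(n_tx, n_rx, parallel_outputs)


if __name__ == "__main__":
    print(firings_per_frame(64, 64))       # design size of the control logic
    print(firings_per_frame(56, 48))       # array used in this work
    print(frame_rate_hz(56, 48, 40e3))     # volumes per second at 40 kHz
```

For the 56 × 48 array this gives 672 firings per volume, which at a 40-kHz repetition rate corresponds to roughly 60 volumes per second.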
For development and testing purposes, we have fabricated arrays with different element counts of up to 64 Tx and 64 Rx elements. To use the front-end IC for such different array sizes, we designed the electronics for an array size of 64 Tx and 64 Rx elements. In this design, the control circuitry does not reset the pulse repetition. However, one can reconfigure the design for a fixed-size array configuration. Fig. 9 shows a simulated timing diagram depicting the operation flow for the power control. The transmit trigger signal is generated approximately 10 ns after the receiver is enabled. In this figure, the pulse repetition period is 20 µs and the pulse width is 20 ns, similar to normal operating conditions. A small signal is applied to the TIA input to show that the output is generated only when the power is turned on. When the receive amplifier bias voltage is switched to the on position, it takes around 100 ns for the amplifier output to settle to the proper operating range. Peaking occurs during the transition of the amplifier output, but it stays within the safe voltage limits of the transistors. With only 4 TIAs active at any time, and with the negligible duty cycle of the transmit pulsers, the average power consumption of the chip is about 20 mW, dominated by the output buffers.
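The 20-mW figure can be cross-checked with a simple budget built from the per-block currents quoted earlier in this section (240 µA per capacitive-feedback TIA and 1.2 mA per buffer, with four of each active). The 3.3-V supply is an assumption: it is not stated explicitly, but it matches the 3.3-V logic level of the pulser input and reproduces the quoted 0.8 mW per amplifier. Digital logic and pulser power are neglected, and the variable names are illustrative.

```python
SUPPLY_V = 3.3            # assumed supply voltage (not stated explicitly in the text)
TIA_CURRENT_A = 240e-6    # simulated current of one capacitive-feedback TIA
BUFFER_CURRENT_A = 1.2e-3 # dc current of one output buffer
ACTIVE_CHANNELS = 4       # receive channels powered at any given time


def power_budget_mw(supply_v=SUPPLY_V, n_active=ACTIVE_CHANNELS):
    """Return (tia_mw, buffer_mw, total_mw) for the active receive channels."""
    tia_mw = n_active * TIA_CURRENT_A * supply_v * 1e3
    buffer_mw = n_active * BUFFER_CURRENT_A * supply_v * 1e3
    return tia_mw, buffer_mw, tia_mw + buffer_mw


if __name__ == "__main__":
    tia_mw, buffer_mw, total_mw = power_budget_mw()
    print(f"TIAs: {tia_mw:.2f} mW, buffers: {buffer_mw:.2f} mW, total: {total_mw:.2f} mW")
```

Under these assumptions the buffers account for more than 80% of the total, in line with the statement that the output buffers dominate the average power.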
IV. System Characterization

A. Experimental Setup
To demonstrate the imaging performance of the single-chip dual-ring CMUT-on-CMOS arrays, a custom data collection setup was constructed. The fabricated CMUT array was first wire-bonded to a ceramic dual in-line package (DIP) chip holder with 13 connections and placed at the bottom of a small Petri dish. The chip holder was placed in a custom printed circuit board (PCB) for data acquisition and control signal connections. The RF output channels were digitized using 2 dual-channel digitizer cards (Spectrum UltraFast M3i.4142) with 14-bit ADCs and a 250 MS/s sampling rate. The clock signal for the digital control of the IC was generated by the external digital I/O of the digitizer card. During each clock cycle, a transmit element was fired and 4 parallel Rx output channels were digitized simultaneously. The acquired data were transferred from the digitizer's memory to the local hard drive for further data processing and image reconstruction.
B. Analysis of the Pulse-Echo Data
Fig. 8. Pulser element that is capable of generating 25-V pulses as narrow as 2 ns with the 90-fF CMUT loading in a CMUT-on-CMOS implementation.
Fig. 9. Circuit simulation results with a timing diagram that shows the Rx enable, which powers up the receiver circuitry, the Tx trigger signal (delayed clk input), and the amplifier current consumption. Note that the type of the receive amplifier that is used in this timing simulation is the resistive-feedback transimpedance amplifier (TIA) design.
To characterize the performance of the dual-ring array elements before the volumetric imaging experiments, we acquired pulse-echo data from an oil-air interface nearly 10 mm above the array, where all RF A-scans from 56 × 48 Tx-Rx element combinations were collected. Both Tx and Rx arrays were operated in conventional mode and with the same dc bias at 90% of the collapse voltage. We used 25-V unipolar pulses with a 30-ns pulse width, which was experimentally optimized for maximum echo amplitude. We processed the raw RF A-scan data with a digital band-pass filter with a 10 to 30 MHz passband to eliminate out-of-band noise. Fig. 10 shows the pulse-echo and frequency response of a single pair of array elements from the oil-air interface, indicating a 20 MHz center frequency and about 50% 6-dB fractional bandwidth. The spurious response following the main echo is 15 dB lower and is due to acoustic crosstalk in the array. A properly designed filter can remove this spurious response, which is low frequency in nature. The distributions of the center frequency and bandwidth for different Tx and Rx elements are shown in Fig. 11, where the 51st and 9th elements were the reference Tx and Rx elements, respectively. The average center frequency is 20.1 MHz with a 2% standard deviation, and the average fractional bandwidth (FBW) is 43%, showing uniformity suitable for array imaging.
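One common way to obtain a center frequency and 6-dB fractional bandwidth from a pulse-echo A-scan is to locate the band edges where the magnitude spectrum falls 6 dB below its peak. The sketch below follows that approach; it is an illustrative assumption, not necessarily the procedure used for Fig. 11. It takes the outermost −6-dB crossings, so it presumes a single-lobed spectrum, and the function name and zero-padding length are hypothetical.

```python
import numpy as np


def center_frequency_and_fbw(ascan, fs, level_db=-6.0, nfft=8192):
    """Estimate center frequency (Hz) and fractional bandwidth from an A-scan.

    The band edges are the lowest and highest frequencies whose magnitude is
    within `level_db` of the spectral peak; the center frequency is their mean.
    """
    ascan = np.asarray(ascan, dtype=float)
    n = max(nfft, len(ascan))                  # zero-pad for finer frequency bins
    spectrum = np.abs(np.fft.rfft(ascan, n=n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    mag_db = 20.0 * np.log10(spectrum / spectrum.max() + 1e-20)
    in_band = np.flatnonzero(mag_db >= level_db)
    f_lo, f_hi = freqs[in_band[0]], freqs[in_band[-1]]
    f_center = 0.5 * (f_lo + f_hi)
    return f_center, (f_hi - f_lo) / f_center


if __name__ == "__main__":
    # Synthetic Gaussian pulse: 20-MHz center frequency, 50% fractional bandwidth.
    fs, f0, fbw = 250e6, 20e6, 0.5
    sigma_f = fbw * f0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    sigma_t = 1.0 / (2.0 * np.pi * sigma_f)
    t = (np.arange(1024) - 512) / fs
    pulse = np.exp(-t**2 / (2.0 * sigma_t**2)) * np.cos(2.0 * np.pi * f0 * t)
    print(center_frequency_and_fbw(pulse, fs))
```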
The SNR of the A-scans, calculated as the ratio of the RMS amplitude in a time window including the echo signal from the plane reflector (oil-air interface) to the RMS amplitude in a time window including no echo from the reflector, is shown over the transmit and receive arrays in Fig. 12. The average SNR for a single-element pulse echo from the plane reflector 10 mm away from the array is 19 dB without any averaging. In this particular array, one of the transmit channels was not functional because of a missing connection in the CMOS electronics layout. All other system components worked with full functionality.
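The A-scan SNR definition above translates directly into a ratio of RMS values over two sample windows. The following is a minimal sketch of that calculation; the window boundaries are left to the caller, and the function names and the synthetic example signal are hypothetical.

```python
import numpy as np


def rms(x):
    """Root-mean-square value of a 1-D signal segment."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x**2)))


def ascan_snr_db(ascan, echo_window, noise_window):
    """SNR in dB from an echo window and an echo-free window.

    Each window is a (start, stop) pair of sample indices, stop exclusive.
    """
    ascan = np.asarray(ascan, dtype=float)
    echo = ascan[echo_window[0]:echo_window[1]]
    noise = ascan[noise_window[0]:noise_window[1]]
    return 20.0 * np.log10(rms(echo) / rms(noise))


if __name__ == "__main__":
    # Synthetic A-scan: unit-amplitude background with a 10x larger echo segment.
    ascan = np.tile([1.0, -1.0], 500)
    ascan[400:500] *= 10.0
    print(ascan_snr_db(ascan, echo_window=(400, 500), noise_window=(0, 300)))
```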
V. Volumetric Imaging Results
The current chip sequentially uses single Tx-Rx pairs, so that all Tx and Rx combinations are available for offline processing of data with different synthetic aperture beamforming strategies. Although this arrangement limits the data collection speed, the imaging potential of the single-chip CMUT-on-CMOS dual-ring array was evaluated using a wire phantom and by performing ex vivo imaging on a chicken heart sample. For image reconstruction, all A-scans from 56 × 48 Tx-Rx combinations were collected to perform offline processing and beamforming. Each A-scan was recorded for 25 µs at a sampling rate of 250 MS/s with 14-bit resolution and a pulse repetition rate of 40 kHz. Note that a sufficient amount of time was reserved between successive firings to ensure the waves inside the medium were attenuated (~60 dB in oil and tissue). With 4 parallel RF channels, this results in a 60 frames/s data acquisition rate, suitable for real-time volumetric imaging. Custom RF beamforming software was written to process the A-scan data. Following band-pass filtering of the data in the 10 to 30 MHz band, we applied synthetic phased array beamforming using the standard delay-and-sum method to calculate the image intensity in each image voxel by using dynamic transmit and receive focusing [4], [32]:
where u = (x, y, z) is the 3-D Cartesian coordinate vector of the image voxel; s(·) is the sampled RF A-scan; m is the sample index corresponding to the total flight time in number of samples; w_a(·) is the apodization coefficient; w_n(·) is Norton's weighting coefficient for ring array apertures; the first and second sums are over the N_t transmit and N_r receive elements, respectively; and f_s is the sampling frequency.
Fig. 10. Pulse-echo and frequency response of a single A-scan.
where u_i and u_j are the 3-D position vectors of the transmit and receive elements, respectively. To suppress the side lobes in the reconstructed images, a cosine apodization function was applied radially on the aperture. In addition, Norton's weighting function defined for ring arrays was applied to obtain full circular aperture resolution [5], [6], [33]. This expression is a direct realization of synthetic phased array image reconstruction, where the dynamic transmit and receive focusing delays are included implicitly in the flight times. For envelope detection, a time-delayed sampling technique was used to extract the quadrature component from the received signal [34]. Following envelope detection, logarithmic compression for a desired display dynamic range was applied to produce the final image.
An imaging phantom of four 100-µm-diameter metal wires was used for quantitative testing of point resolution. The wires were immersed in oil and diagonally located on planes parallel to the xy plane at different depths (4, 5.2, 6.6, and 8.2 mm). The reconstructed B-scan image for the xz plane is shown with 40-dB dynamic range in Fig. 13(left). Simulated ideal 2-D point spread functions (PSFs) for the wire target locations were produced with 50% FBW Gaussian pulses without any noise and plotted with 40-dB dynamic range in Fig. 13(right). The grating lobe artifacts for the first and second wire targets are seen at nearly 90° from the targets in the B-scan. This is expected because the inter-element distance of this 20-MHz test array was 74 µm, which is close to the wavelength at the center frequency. It should be noted that the grating lobe level for the simulated PSFs is 45 dB below the main lobe, and hence the grating lobes are not observed in the simulated PSF image. These grating lobes can be avoided by implementing ring arrays with reduced inter-element spacing.
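The position of these grating lobes can be estimated with the standard periodic-array relation sin θ_g = sin θ_0 + nλ/d, where d is the element pitch, θ_0 the steering angle, and n a nonzero integer. The sketch below applies it to the 74-µm pitch with a 75-µm wavelength (20 MHz at an assumed sound speed of 1500 m/s). Treating the ring locally as a uniformly spaced linear array is an approximation, and the function name is hypothetical.

```python
import math


def grating_lobe_angles_deg(pitch_m, wavelength_m, steer_deg=0.0, max_order=2):
    """Angles (degrees) of grating lobes that fall in the visible region.

    Uses sin(theta_g) = sin(theta_0) + n * wavelength / pitch for n != 0 and
    keeps only solutions with |sin(theta_g)| <= 1.
    """
    s0 = math.sin(math.radians(steer_deg))
    angles = []
    for n in range(-max_order, max_order + 1):
        if n == 0:
            continue
        s = s0 + n * wavelength_m / pitch_m
        if -1.0 <= s <= 1.0:
            angles.append(math.degrees(math.asin(s)))
    return sorted(angles)


if __name__ == "__main__":
    pitch, wavelength = 74e-6, 75e-6
    for steer in (0.0, 1.0, 5.0):
        print(steer, grating_lobe_angles_deg(pitch, wavelength, steer))
```

With the pitch slightly smaller than the wavelength, no grating lobe is visible for an exactly on-axis focus, but even a small steering angle brings one in close to the edge of the field of view, which is consistent with artifacts appearing nearly 90° away from the targets.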
Fig. 14 compares the experimental and simulated lateral (1-D) PSFs for the second wire target, which was located on axis. The experimental lateral PSF was calculated by taking the axial average over a window covering the axial spread. The experimental wire image and simulated PSF show close agreement around the main lobe. The noticeable differences between the two PSFs are the multiple reflections in the axial direction and the near-peak side lobe levels. The finite wire thickness and acoustic crosstalk are the main reasons for these differences. The axial multiple reflections are monotonically decreasing and measured as 20 dB lower than the main lobe peak, which is consistent with the secondary echo level shown in the A-scan (Fig. 10). The peak side lobe level in the experimental PSF is 5 dB higher than in the simulated PSF, whereas the far side lobes remain under −30 dB in both PSFs. The experimental wire images also show asymmetry around the target because of the nonuniformity of array element responses. Table III summarizes the measured image parameters and compares them to the simulated values. The 100-µm wire target, larger than the wavelength (75 µm), does not ideally represent the point response. However, it is adequate for the purpose of obtaining the 2-D lateral resolution (PSF), which is ~250 µm for the second wire (see Table III). Overall, the values are within ±10% of each other in terms of resolution metrics.
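The synthetic phased array reconstruction and the noise-free point-target simulation discussed in this section can be illustrated with a minimal delay-and-sum sketch. It is not the beamforming software used in this work: it assumes ideal point elements equally spaced on rings of 1.31-mm and 1.13-mm diameter, a sound speed of 1500 m/s, nearest-sample delays, and uniform weights in place of the cosine apodization and Norton's weighting. All function names are hypothetical.

```python
import numpy as np

C_SOUND = 1500.0  # m/s, assumed; gives a 75-um wavelength at 20 MHz
FS = 250e6        # sampling rate quoted in the text, Hz
F0 = 20e6         # center frequency, Hz
FBW = 0.5         # fractional bandwidth of the simulated Gaussian pulse


def ring_positions(n_elements, diameter_m):
    """Element centers equally spaced on a ring in the z = 0 plane, shape (N, 3)."""
    phi = 2.0 * np.pi * np.arange(n_elements) / n_elements
    r = diameter_m / 2.0
    return np.stack([r * np.cos(phi), r * np.sin(phi), np.zeros(n_elements)], axis=1)


def flight_times(tx_pos, rx_pos, point):
    """Tx-to-point-to-Rx flight time in seconds for every pair, shape (Nt, Nr)."""
    d_tx = np.linalg.norm(tx_pos - point, axis=1)
    d_rx = np.linalg.norm(rx_pos - point, axis=1)
    return (d_tx[:, None] + d_rx[None, :]) / C_SOUND


def simulate_point_target(tx_pos, rx_pos, target, n_samples):
    """Noise-free A-scans s[i, j, m] from an ideal point target (Gaussian pulses)."""
    sigma_f = FBW * F0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    sigma_t = 1.0 / (2.0 * np.pi * sigma_f)
    t = np.arange(n_samples) / FS
    tau = t[None, None, :] - flight_times(tx_pos, rx_pos, target)[:, :, None]
    return np.exp(-tau**2 / (2.0 * sigma_t**2)) * np.cos(2.0 * np.pi * F0 * tau)


def das_voxel(rf, tx_pos, rx_pos, voxel, weights=None):
    """Delay-and-sum value at one voxel: sum of s[i, j, m_ij] over all pairs."""
    m = np.rint(FS * flight_times(tx_pos, rx_pos, voxel)).astype(int)
    m = np.clip(m, 0, rf.shape[-1] - 1)
    samples = np.take_along_axis(rf, m[:, :, None], axis=2)[:, :, 0]
    if weights is None:
        weights = np.ones_like(samples)  # uniform weights (assumption)
    return float(np.sum(weights * samples))


if __name__ == "__main__":
    tx = ring_positions(56, 1.31e-3)
    rx = ring_positions(48, 1.13e-3)
    target = np.array([0.0, 0.0, 4e-3])
    rf = simulate_point_target(tx, rx, target, n_samples=1500)
    for x_mm in (0.0, 0.1, 0.2, 0.5, 1.0):
        voxel = np.array([x_mm * 1e-3, 0.0, 4e-3])
        print(x_mm, abs(das_voxel(rf, tx, rx, voxel)))
```

Evaluating `das_voxel` over a grid of voxels and applying envelope detection and logarithmic compression would yield a simulated PSF image; the loop above only samples a few lateral positions at the target depth.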
We computed the experimental image SNR for the second wire target by calculating the ratio of the rms value of image pixels within the 6-dB window of the target main lobe to the rms of image pixels in a window near the target without including any target echo. The image SNR for the nearest wire, which has the maximum image SNR, as expected, is calculated as 48 dB. We also calculated an average image SNR by measuring the image SNR values of all four wire targets. As a result, the single-chip device produces 43-dB average image SNR for the wire phantom. The theoretical image SNR produced by synthetic phased array beamforming using RF data is expressed as [35]:
where a n,m is the channel gain corresponding to nth transmit and mth receive operation. Here, a n,m is estimated as the rms of rF a-scan data in a time window including the target echo by using the oil-interface data.
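The experimental image SNR measurement described above (rms of pixels inside the 6-dB window of the main lobe, divided by the rms of a target-free window) can be sketched as below. This is an illustrative reading of that description, not the authors' code. It assumes the image has been cropped so that it contains a single target, that pixel values are linear envelope magnitudes, and that the noise window is supplied by hand; the function names and the window format are hypothetical.

```python
import math


def rms(values):
    """Root-mean-square of a flat list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def image_snr_db(image, noise_window, drop_db=6.0):
    """Image SNR in dB for a single-target image.

    image        : list of rows of linear envelope magnitudes (assumed layout)
    noise_window : (row_start, row_stop, col_start, col_stop), a region that
                   is assumed to contain no target echo
    drop_db      : pixels within `drop_db` of the peak form the signal set
    """
    peak = max(max(row) for row in image)
    threshold = peak * 10.0 ** (-drop_db / 20.0)
    # Signal pixels: everything within drop_db of the peak. With a single
    # target in the crop, this is the main lobe.
    signal = [v for row in image for v in row if v >= threshold]

    r0, r1, c0, c1 = noise_window
    noise = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]

    return 20.0 * math.log10(rms(signal) / rms(noise))
```

For a multi-target image such as the four-wire phantom, one would crop around each wire, call `image_snr_db` on each crop, and then average the per-target values.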
To demonstrate 3-D volumetric imaging on a medically relevant sample, we used an ex vivo chicken heart phantom. The phantom was immersed in oil inside the Petri dish and placed approximately 2.5 mm above the array surface, as shown in Fig. 15. We reconstructed cross-sectional B-scan images in the xz and yz planes as well as a 3-D rendered image from the chicken heart phantom without any signal averaging. The xz and yz plane B-scan images with 40-dB dynamic range are presented in Fig. 16. The image size is 10 mm in both dimensions. The brightest spot in the image is the reflection from the wall of the chicken heart phantom closest to the array. Although the aperture size of the CMUT array is very small compared with the phantom, the apex of the phantom is clearly visible in the B-scan images. The computed experimental image SNR for the brightest location of the image is 44 dB. We also calculated the contrast-to-noise ratio (CNR) using the following expression:
$$\mathrm{CNR} = \frac{\mu_b - \mu_s}{\sigma_s}$$
where $\mu_b$ is the mean intensity in decibels of the bright spot of the chicken heart image, and $\mu_s$ and $\sigma_s$ are the mean and standard deviation of the intensity in decibels within the speckle [32]. The measured CNR from the chicken heart image is 2.6. These images also show that 14 dB image SNR is obtained after nearly 3 mm of propagation in oil and 4 mm of sound penetration in heart tissue at 20 MHz at an angle of 35°.
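A CNR computation consistent with the symbol definitions above can be sketched as follows. The sketch assumes the commonly used form CNR = (µ_b − µ_s)/σ_s, with all intensities already in decibels. The function name, the way the two regions are passed in, and the choice of the population standard deviation are assumptions made here for illustration.

```python
import statistics


def cnr_from_db(bright_db, speckle_db):
    """Contrast-to-noise ratio from two lists of pixel intensities in dB.

    bright_db  : pixels in the bright region (assumed to be picked by hand)
    speckle_db : pixels in a speckle-only region
    Uses the population standard deviation of the speckle region; using the
    sample standard deviation instead is an equally defensible choice.
    """
    mu_b = statistics.mean(bright_db)
    mu_s = statistics.mean(speckle_db)
    sigma_s = statistics.pstdev(speckle_db)
    return (mu_b - mu_s) / sigma_s
```

Because both means and the standard deviation are taken on decibel values, the result is a dimensionless ratio rather than a quantity in decibels.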
VI. Discussion
The imaging results presented here were obtained using the first-generation CMUT-on-CMOS single-chip system. These results demonstrate the viability of the designed front-end system for volumetric FL-IVUS and ICE imaging. Several further improvements can be explored to enhance the image SNR and imaging speed. By reducing the gap thickness from 120 to 50 nm, the receive SNR can be improved by about 4 dB [36]. With the same 50 nm gap thickness and 25 V available pulse amplitude, a 5 dB improvement in output pressure is predicted based on a detailed nonlinear model [37]. Further SNR improvements of up to 14 dB should be possible using coded excitation schemes without increasing data collection time [38]. These SNR improvements can easily overcome a 16 dB diffraction loss when extending the imaging range from ~7 mm to 4 cm. Considering that the ultrasound attenuation in blood (0.2 dB/(MHz·cm)) is about 5 to 7 times less than that of heart muscle [39]-[41], these front-end systems can be used for several ICE applications at 20 MHz [3]. In addition, these improvements can be combined with the flexibility of the CMUT-on-CMOS approach for placing array elements at any desired location on the CMOS electronics. Using this flexibility, instead of the dual-ring annular array, the locations of the Tx and Rx elements can be determined to collect spatial information with fewer Tx-Rx combinations [20]. Using spatially and temporally coded signals, a powerful defocused Tx array can be implemented [32], [42]. These methods can be used to decrease the number of firings significantly for increasing imaging speed, especially for ICE applications. Finally, given the computational power offered by the latest GPUs, it seems quite feasible to implement a real-time 3-D imaging system based on the CMUT-on-CMOS systems described here [43], [44].
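The comparison between the projected SNR gains and the 16 dB diffraction loss can be made explicit with a small tally. The sketch below treats the three quoted improvements as independent contributions that add in decibels, which is an assumption of this illustration, and it takes the 14 dB coded-excitation figure at its stated upper bound. The dictionary labels are descriptive only.

```python
# Projected gains quoted in the text, in dB. Treating them as additive is
# an assumption of this sketch.
gains_db = {
    "receive SNR, gap 120 nm -> 50 nm": 4.0,
    "output pressure, 50 nm gap at 25 V": 5.0,
    "coded excitation (upper bound)": 14.0,
}

# Loss quoted in the text for extending the range from ~7 mm to 4 cm.
losses_db = {
    "diffraction, ~7 mm -> 4 cm": 16.0,
}

total_gain_db = sum(gains_db.values())
total_loss_db = sum(losses_db.values())
margin_db = total_gain_db - total_loss_db

print(f"total projected gain : {total_gain_db:.1f} dB")
print(f"diffraction loss     : {total_loss_db:.1f} dB")
print(f"remaining margin     : {margin_db:.1f} dB")
```

This tally covers diffraction only; attenuation in the propagation medium is not included and would have to be subtracted separately for a given imaging depth.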
VII. Conclusion
The objective of this work was to show a monolithic CMUT array and circuitry suitable for 3-D imaging FL-IVUS and FL-ICE catheters. We successfully implemented a 20-MHz single-chip CMUT-on-CMOS front-end system with a dual annular ring structure for this purpose. This integrated front-end system includes both high-voltage pulser and low-noise receiver circuitry dedicated to each Tx and Rx array element on the array. The single-chip integration reduces the interconnect complexity significantly, enables separate Tx and Rx circuitry for low-noise operation, and leads to the ultimate miniaturization. The front-end system, which can be further reduced in size, currently fits into a 1.5-mm-diameter, 300-µm-thick donut shape suitable for placement at the tip of a catheter with only 13 external cables; its total power consumption is 20 mW. Overall system characterization indicates CMUT uniformity and performance suitable for real-time volumetric imaging.
Acknowledgment
The authors thank the anonymous reviewers for their constructive and insightful comments.
Fig. 15. Experimental setup of the wire-bonded CMUT array in a chip holder combined with a modified Petri dish for imaging the chicken heart phantom.
