Abstract-ΣΔ beamforming is a promising technique for shrinking the analog front-end (AFE) of medical ultrasound imaging systems. Nonetheless, the high data rate from the ΣΔ modulators in the AFE and the high-Q reconstruction filter place harsh requirements on the digital beamforming circuits. Although the BScan-sample-based ΣΔ beamformer structure with an FIR reconstruction filter reduces the speed requirement on the multipliers, making the ΣΔ beamformer implementable on conventional digital platforms, it still requires a large area for high-speed adders. In this work, a new BScan-sample-based ΣΔ beamformer structure with an IIR reconstruction filter is developed.
I. INTRODUCTION
The ΣΔ beamforming technique is a promising way to reduce the size of the analog front-end (AFE) of a medical ultrasound imaging system, as it uses simple ΣΔ modulators in the AFE instead of Nyquist-rate analog-to-digital converters (ADCs), such as pipeline ADCs [1]. However, the fast data stream from the ΣΔ modulators requires efficient ΣΔ beamformers.
Conventional beamformers perform beamforming continuously at the sampling rate of the Nyquist-rate ADCs. This ADC-sample-based beamforming architecture works well with Nyquist-rate ADCs because the ADC sampling rate is normally four times the ultrasound frequency. ΣΔ modulators, however, sample at 64 or 128 times the ultrasound frequency, so this architecture requires extremely fast multipliers that cannot be readily implemented in state-of-the-art CMOS processes. Another challenge of the ΣΔ beamformer is that it requires a very narrow low-pass reconstruction filter, which in turn requires a high quality factor.
ΣΔ beamformer structures have been studied extensively at the signal processing level to remove the distortion due to dynamic focusing [1]-[3], without addressing implementation issues. In the only ΣΔ beamformer implementation so far, Tomov generated beamformed samples at the sampling rate required on the beamline [4], which is usually much lower than the ADC sampling frequency. This BScan-sample-based architecture allows multipliers with a much lower processing speed than conventional ADC-sample-based beamformers. Using a finite-impulse-response (FIR) reconstruction filter, the ΣΔ beamformer could be implemented in field-programmable gate arrays (FPGAs). Nonetheless, normal medical ultrasound imaging systems require very long FIR filters, which still demand many multipliers and particularly fast adders.
In this paper, we improve the BScan-sample-based ΣΔ beamforming technique by adopting infinite-impulse-response (IIR) reconstruction filters. The IIR filter greatly simplifies the ΣΔ beamformer structure, so that fewer and slower multipliers and adders are needed. As a result, the new design makes the ΣΔ beamformer small enough for practical medical ultrasound imaging systems. The new ΣΔ beamformer is implemented in an FPGA and a digital IC for a 128-channel, 5 MHz ultrasound system and compared with the FIR beamformer.
II. CONVENTIONAL BEAMFORMER
A conventional beamformer is shown in Fig. 1. Time-gain-compensation (TGC) amplifiers compensate for the distance-dependent attenuation of the received echo. Nyquist-rate ADCs sample the echo signals at a sampling frequency ($f_s$) four or eight times higher than the carrier frequency. The coarse delay $t_{coarse}$ is set by selecting samples in a first-in-first-out (FIFO) buffer. The fine delay $t_{fine}$ is obtained through interpolation. The delayed echoes are multiplied by apodization weights and summed to form the beam. A matched filter is used to enhance the detection signal-to-noise ratio (SNR).
All digital operations are performed at the ADC sampling frequency. For an N-element transducer array, a beamformer requires 3N multipliers and a pipelined adder. The bit widths of the digital units depend on the precision of the beamforming parameters.
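The delay-and-sum chain described above can be sketched in a few lines. This is a minimal illustration, not the architecture of Fig. 1: it assumes linear interpolation for the fine delay (the text only says "interpolation"), and the function and parameter names are hypothetical.

```python
import math

def delay_and_sum(channels, delays, weights):
    """Form one beamformed sample from per-element echo samples.

    channels: list of per-element sample lists (at the ADC rate)
    delays:   per-element delay in samples, possibly fractional
    weights:  per-element apodization weight
    """
    total = 0.0
    for samples, delay, w in zip(channels, delays, weights):
        coarse = int(math.floor(delay))    # coarse delay: pick a FIFO tap
        frac = delay - coarse              # fine delay: fraction of one sample
        a = samples[coarse]
        b = samples[coarse + 1]
        interpolated = a + frac * (b - a)  # assumed linear interpolation
        total += w * interpolated          # apodize, then accumulate
    return total
```

Each element contributes one tap selection, one interpolation, and one weighting per output sample, all at the ADC rate, which is why the multiplier count grows linearly with the number of elements.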
III. BEAMFORMING PARAMETER SIZE
Normally, beamforming parameters are designed to suppress the quantization grating lobes below -60 dB. For the conventional medical ultrasound imaging system in Tab. 1, simulations in Fig. 2 show that 7-bit delay quantization is needed [5]. Simulations in Fig. 3 show that at least 10-bit amplitude quantization is needed. Simulations in Fig. 4 show that 6-bit apodization coefficient quantization is needed. The pulse used in the simulations has a 100% fractional bandwidth and is given by $\cos(2\pi\lambda)\,e^{-0.5\lambda^2}$.
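For readers who want to experiment with the simulation pulse, the sketch below evaluates it and applies a uniform amplitude quantizer. It assumes that λ is time measured in carrier periods and that the quantizer is a symmetric rounding quantizer over [-1, 1]; neither detail is stated in the text, and the function names are illustrative.

```python
import math

def pulse(lam):
    """Gaussian-windowed cosine pulse; lam is assumed to be in carrier periods."""
    return math.cos(2.0 * math.pi * lam) * math.exp(-0.5 * lam * lam)

def quantize(value, bits):
    """Round a value in [-1, 1] to a symmetric signed grid (assumed quantizer)."""
    levels = 2 ** (bits - 1) - 1   # largest positive code, e.g. 511 for 10 bits
    return round(value * levels) / levels

# Worst-case rounding error of the assumed 10-bit quantizer is half a step.
half_step_10bit = 0.5 / (2 ** 9 - 1)
```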
IV. ΣΔ BEAMFORMING
A direct implementation of a ΣΔ beamformer is shown in Fig. 5. The ΣΔ beamformer can select the delay without interpolation. Although previous ΣΔ beamformer designs suggested implementing the apodization in the analog TGC, it is much more desirable to implement it as digital weights because the apodization can differ from beamline to beamline. In order to achieve 60 dB SNR on the beamformed signal for an N-element array, the SNR of every channel needs to be

$$\mathrm{SNR}_{channel} = 60 - 10\log_{10}(N) \qquad (1)$$

For a 128-element transducer array, the SNR of every channel should be about 39 dB. A 2nd-order 1-bit ΣΔ modulator provides enough dynamic range. In order to achieve 7-bit delay quantization, the over-sampling ratio (OSR) of the ΣΔ modulator would be $2^7/2 = 64$. The output single-bit data rate is 640 MHz.
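The sizing arithmetic of this section can be reproduced with a short script. It is a sketch under stated assumptions: Eq. (1) is read as a 60 dB target minus the summation gain $10\log_{10}(N)$, the 7-bit delay quantization is read as $2^7$ delay steps per carrier period, and the carrier is the 5 MHz of the target system. All function names are illustrative.

```python
import math

def channel_snr_db(target_snr_db, n_elements):
    """Per-channel SNR needed when summing n_elements channels (Eq. (1) as read here)."""
    return target_snr_db - 10.0 * math.log10(n_elements)

def oversampling_ratio(delay_bits):
    """OSR relative to the Nyquist rate of two samples per carrier period."""
    return 2 ** delay_bits // 2

def modulator_rate_hz(carrier_hz, osr):
    """Single-bit output rate: Nyquist rate (2 * carrier) times the OSR."""
    return 2 * carrier_hz * osr

osr = oversampling_ratio(7)
rate_hz = modulator_rate_hz(5_000_000, osr)
```

With these inputs `osr` is 64 and `rate_hz` is 640 MHz, matching the figures quoted in the text.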
V. DESIGN OF ΣΔ BEAMFORMER

A. Apodization Weight
The single-bit data stream reduces the apodization weighting to a switch, as shown in Fig. 6. CMOS switches can select the 6-bit weights at the high data rate.
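The reason no multiplier is needed can be seen in a small sketch. It assumes the modulator bit stands for the values -1 and +1, so weighting reduces to selecting the weight or its negative; the actual encoding used in Fig. 6 is not given in the text, and the names below are hypothetical.

```python
def apodize_bit(bit, weight):
    """Weight one 1-bit modulator output by selection instead of multiplication.

    bit:    0 or 1, read here as -1 / +1 (an assumption)
    weight: integer apodization weight (6-bit in the text)
    """
    return weight if bit else -weight

def weighted_sum(bits, weights):
    """Sum the selected weights over all channels for one modulator sample."""
    return sum(apodize_bit(b, w) for b, w in zip(bits, weights))
```

For example, `weighted_sum([1, 0, 1], [10, 20, 31])` selects +10, -20 and +31 and returns 21.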
B. Pipeline Adder
A pipeline adder can be used to sum the signals from the N channels at high speed. The first layer consists of 6-bit adders and the second layer of 7-bit adders. The width grows in this way until the sum reaches 10 bits, after which all remaining layers use 10-bit adders. Overall, the pipeline adder requires $(N/2+1)\times\log_2 N/2$ adders.
C. Reconstruction Filter
In a ΣΔ beamformer, a low-pass filter is usually used for reconstruction. The multipliers in the filter are difficult to implement at the 640 MHz ADC sampling rate, even with state-of-the-art CMOS processes. However, the required sample generation rate of the Bscan image is much lower than the ADC sampling rate.
Assume a Bscan image includes $N_a$ lines with $N_r$ samples on every line, the frame rate of high-quality Bscan images is $f_{hq}$, and the number of ultrasound transmissions needed to form one high-quality Bscan image is $N_t$. Then, the required Bscan sample generation rate is

$$f_{s,bf} = f_{hq}\, N_t\, N_r \qquad (2)$$
If $f_{hq}$ is 40 Hz, $N_t = 256$, and $N_r = 512$, $f_{s,bf}$ is only about 5 MHz. If the reconstruction filter is designed to generate samples at this rate, the speed requirement of the multipliers is much lower.
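As a quick check of these numbers, the sketch below assumes that the Bscan sample generation rate is the product of the frame rate, the number of transmissions per frame, and the number of samples per line. The function and parameter names are illustrative.

```python
def bscan_sample_rate_hz(frame_rate_hz, transmissions_per_frame, samples_per_line):
    """Beamformed samples per second needed to sustain the given frame rate."""
    return frame_rate_hz * transmissions_per_frame * samples_per_line

f_s_bf = bscan_sample_rate_hz(40, 256, 512)   # values quoted in the text
slowdown = 640_000_000 / f_s_bf               # relative to the 640 MHz modulator rate
```

With the values in the text this evaluates to about 5.24 MHz, which the text rounds to 5 MHz, more than a hundred times slower than the modulator output rate.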
1) FIR filter:
The filter was designed as an FIR filter in [4]. To implement the filter for the spectrum in Tab. 1 at a 640 MHz sampling rate, the FIR filter needs M = 400 taps to keep the stopband below -50 dB, which requires a tremendous number of multipliers. The number of multipliers can be reduced by serializing the FIR filter as shown in Fig. 7. With 500 MHz multipliers and adders, four 6×10 multipliers and four 10-bit adders are needed for the first stage of the FIR filter. The sums of the segments are stored in registers and added up by another 100 10-bit adders in the second stage to form the Bscan sample. The apodization weighting and the pipeline adder need to operate at $M \cdot f_{s,bf}$, which is 2 GHz for the typical ultrasound system. If the adders are limited to 500 MHz, four parallel pipeline adders are needed, which requires 508 adders. This makes the FIR-filter ΣΔ beamformer very large.
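The front-end cost quoted for the FIR design follows from simple rate arithmetic, sketched below. The sketch assumes that one pipeline adder for N channels is a binary tree of N - 1 adders, which is consistent with the 508 adders quoted for four parallel trees; the function name and return layout are illustrative.

```python
import math

def fir_front_end_cost(taps, f_bf_hz, unit_clock_hz, n_channels):
    """Return (required rate, parallel units, total tree adders) for the FIR design."""
    required_rate_hz = taps * f_bf_hz                  # M * f_s,bf
    parallel = math.ceil(required_rate_hz / unit_clock_hz)
    tree_adders = n_channels - 1                       # assumed binary adder tree
    return required_rate_hz, parallel, parallel * tree_adders

rate_hz, parallel_units, total_adders = fir_front_end_cost(
    taps=400, f_bf_hz=5_000_000, unit_clock_hz=500_000_000, n_channels=128
)
```

The required rate scales with the number of taps, so the long FIR filter drives up the cost of the circuits in front of the filter as well as the filter itself.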
2) IIR filter:
We develop an IIR filter for reconstruction. Because the signal band is very narrow compared to the sampling frequency, the phase within the signal band can be mostly linear even for IIR filters.
An inverse Chebyshev IIR filter is designed with a 7.5 MHz passband edge, a 12 MHz stopband edge, 1 dB passband ripple, and 50 dB stopband suppression. Following a filter synthesis routine, the filter is found to be of 9th order. The full-precision coefficients of the derived IIR filter are then quantized to estimate the required coefficient bit width. With the coefficients quantized to 6 bits, the filter still achieves -50 dB suppression in the stopband, as shown in Fig. 8. Although the phase is not perfectly linear within the 7.5 MHz passband, the phase linearity is acceptable for the ultrasound imaging application.
The IIR filter can be implemented in direct form I, as shown in Fig. 9. The left part implements the zeros, while the right part implements the poles. Initially, the summed echoes fill the left shift register. The filter takes 19 cycles to generate a beamformed sample. With the Bscan-sample-rate design, the minimum speed of the multipliers is $19 f_{s,bf}$, which is 95 MHz for the target ultrasound system. If 500 MHz multipliers and adders are used, four 6×10 multipliers and four 10-bit adders are needed. The required speed for the apodization weighting and the pipeline adder is also 95 MHz, well below the 2 GHz required with the FIR filter. With 500 MHz adders, the pipeline adder needs only $\lceil 127/5 \rceil = 26$ 10-bit adders. Therefore, the IIR filter design reduces the hardware cost of both the filter and the circuits in front of it.
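A software sketch of the direct form I structure and of the cost arithmetic may help here. It makes two assumptions that go beyond the text: the 19 cycles are read as 10 feed-forward plus 9 feedback multiplications of a 9th-order filter, and a 500 MHz adder is assumed to be time-shared over as many additions as fit into one 95 MHz period. The function names are hypothetical, and the code is not the hardware of Fig. 9.

```python
import math

def direct_form_1(b, a, x):
    """Filter x with feed-forward taps b and feedback taps a (a[0] must be 1)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):       # zeros: weighted current and past inputs
            if n - k >= 0:
                acc += bk * x[n - k]
        for k in range(1, len(a)):       # poles: weighted past outputs
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def iir_cost(order, f_bf_hz, unit_clock_hz, n_channels):
    """Return (multiplications per sample, required rate, pipeline adders)."""
    macs = (order + 1) + order                  # assumed: 10 + 9 for order 9
    required_hz = macs * f_bf_hz
    reuse = unit_clock_hz // required_hz        # additions one adder can time-share
    tree_adders = math.ceil((n_channels - 1) / reuse)
    return macs, required_hz, tree_adders

macs, required_hz, tree_adders = iir_cost(9, 5_000_000, 500_000_000, 128)
```

With the values of the target system this gives 19 multiplications per sample, a 95 MHz requirement, and 26 pipeline adders.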
D. Coefficients
Apodization weights are designed for every Bscan sample. The weights are loaded from memory when a complete Bscan is formed after one transmission. 96 kB of memory is needed to store the weights.
The IIR filter coefficients are the same for all beamformers. Hence, only 16 B of memory is needed for the coefficients.
VI. IMPLEMENTATION RESULTS
Due to its flexibility, the FPGA is still the most popular implementation platform for ultrasound beamformers. In particular, the Xilinx Virtex5 FPGA includes DSP cells. The ADC-sample-based ΣΔ beamformer requires too many high-speed multipliers to be accommodated by FPGAs. Two Bscan-sample-based ΣΔ beamformer structures are designed in this work: one with an FIR filter and one with an IIR filter. In order to compare logic gate counts, the two beamformers are also synthesized to digital ICs in a 0.18 μm CMOS logic process.
The beamformer designs are described in VHDL and implemented in a Virtex5 FPGA. The highest processing speed of the FPGA is 600 MHz. The FIR beamformer is designed with 500 MHz adders and multipliers. After synthesis, every FIR beamformer requires 6725 look-up tables (LUTs), which need about 1681 Virtex5 slices, and 4 DSP48E cells. Every IIR beamformer requires 809 LUTs, which need about 203 Virtex5 slices, and 4 DSP48E cells. To implement the 256 beamformers of the target ultrasound system, the FIR ΣΔ beamformer would require 9 XC5VLX330T chips, while the IIR ΣΔ beamformer requires only 2 XC5VSX240T chips.
The VHDL beamformer designs are also synthesized with the Synopsys tool into a digital IC using a 0.18 μm CMOS logic process with the built-in library. In this process, the minimum delays of a 10-bit adder and a 6×10 multiplier are 1.96 ns and 4.3 ns, respectively. Hence, the process allows 500 MHz adders and 200 MHz multipliers. With these speed constraints, every FIR beamformer is synthesized into 74089 basic logic gates, which cover 915,922 μm² of die area. Every IIR beamformer is synthesized into 9052 basic logic gates, which cover 105,330 μm² of die area. With the IIR ΣΔ beamformer, the beamformer array for the target ultrasound system can be designed into one 5.2 mm × 5.2 mm digital IC.
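The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this section; the dictionary keys and variable names are illustrative, and the area estimate simply multiplies the per-beamformer logic area by 256, ignoring routing, memory, and pad overhead.

```python
fir = {"luts": 6725, "gates": 74089, "area_um2": 915_922}
iir = {"luts": 809, "gates": 9052, "area_um2": 105_330}

# FIR-to-IIR resource ratio for each reported metric.
savings = {key: fir[key] / iir[key] for key in fir}

# Logic area of 256 IIR beamformers, converted from um^2 to mm^2.
array_area_mm2 = 256 * iir["area_um2"] / 1e6
die_area_mm2 = 5.2 * 5.2

for key, value in savings.items():
    print(f"{key}: {value:.2f}x")
print(f"256 IIR beamformers: {array_area_mm2:.2f} mm^2 of {die_area_mm2:.2f} mm^2")
```

All three ratios fall between 8 and 9, and the summed logic area of the 256 IIR beamformers fits within the quoted die size.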
VII. CONCLUSIONS
In this paper, a new ΣΔ beamformer structure is introduced. The new structure forms the beam at the BScan sample rate with an IIR reconstruction filter. Compared to the existing ΣΔ beamformer with an FIR filter, the IIR structure reduces the hardware requirement by about 8 times. Both beamformers are implemented in an FPGA and a digital IC. Implementation results show that, with the new ΣΔ beamformer structure, the 256-beamformer array can be implemented in two FPGAs or in a 5.2 mm × 5.2 mm digital IC in a 0.18 μm CMOS logic process.
