VLSI Circuits for adaptive digital beamforming in ultrasound imaging by Karaman, M. et al.
IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 12, NO. 4, DECEMBER 1993, p. 711
VLSI Circuits for Adaptive Digital 
Beamforming in Ultrasound Imaging 
Mustafa Karaman, Abdullah Atalar, and Hayrettin Koymen 
Abstract-For phased-array ultrasound imaging, alternative
beamforming techniques and their VLSI circuits are studied
to form fully digital receive front-end hardware. In order
to increase the timing accuracy in beamforming, a computationally
efficient interpolation scheme to increase the sampling
rate is examined. For adaptive beamforming, a phase aberration 
correction method with very low computational complexity is 
described. Image quality performance of the method is examined 
by processing the non-aberrated and aberrated phased-array 
experimental data sets of an ultrasound resolution phantom. A 
digital beamforming scheme based on receive focusing at the 
raster focal points is examined. The sector images of the resolu- 
tion phantom, reconstructed from the phased-array experimental 
data by beamforming at the radial and raster focal points, are 
presented for comparison of the image resolution performances 
of the two beamforming schemes. VLSI circuits and their imple- 
mentations for the proposed techniques are presented. 
I. INTRODUCTION 
PHASED-ARRAY ultrasound imaging techniques have been extensively used in modern medicine for diagnostic
purposes. In reconstruction of phased-array ultrasound images, 
short bursts of ultrasound are transmitted, and echoes 
reflected from internal structures of body are received by 
a phased-array transducer. The imaging plane is scanned 
by beamforming (electronically steering and focusing the
array) both in transmit and receive modes. In transmit mode,
the beamforming process is performed at every scan angle,
while in receive mode, it is dynamically repeated for every
image point [1]. Design of the transmit beamforming circuitry
is relatively easy, since each transducer can be fired by digital 
timing, while design of the receive beamforming circuitry is 
an involved task and has been the subject of considerable 
research [2], [3]. 
Real-time phased-array ultrasound beamforming involves a 
significant amount of electronic signal processing at video 
rates. Receive beamforming hardware based on analog circuitry
is bulky and expensive. Recent developments in digital
integrated circuit technology motivate research on advanced
digital beamforming techniques based on special purpose
VLSI circuits [3]-[6]. In the design of such circuits, it is
important to examine the beamforming algorithm performance,
which critically affects image resolution [7], and the feasibility
of the algorithm for VLSI implementation [8]. In this study,
we consider three major problems in beamforming: delay 
quantization, phase aberration, and receive focusing used for 
image reconstruction. We propose new digital beamforming 
schemes, and present VLSI circuits for their realizations. 
Delay quantization has a significant undesired effect on the 
transmit and receive responses of the system, which results in 
an increase in the side lobe levels of the array response [9]. 
Accuracy of timing information used in beamforming can be
increased using fast analog-to-digital converters (ADC), at
relatively high cost. It can also be increased by employing digital
upsampling (increasing the sampling rate by interpolation)
techniques [10]. Since the timing accuracy in beamforming
is much more critical than the amplitude accuracy, a simple
upsampling scheme can be used for this purpose. As a cost-effective
solution for high timing accuracy in beamforming,
we examine a linear interpolation scheme with coefficients in
discrete powers-of-two space [11].
Computation of the timing information required for the 
beamforming is based on the assumption that the image plane 
is composed of uniform soft tissue. This assumption, however, 
is not valid in general, and causes significant phase errors [12], 
and hence degradation in image resolution. The solution of 
this problem involves phase error correction using an adaptive 
beamforming technique. The algorithm used for this purpose 
must be computationally efficient for real-time applications. 
As a solution to this problem, we study a phase aberration 
correction technique with very low computational complexity, 
based on the time delay estimation via the minimization of 
sum of absolute differences between the samples of adjacent 
array elements. 
In conventional ultrasound imaging, receive beamforming 
is carried out at the radial focal points. The radial data is 
converted to the raster data which correspond to the display 
pixels in rectangular coordinates. The conversion process, 
called scan conversion [13], imposes a significant hardware 
overhead and may degrade the image quality. In order to 
eliminate the scan conversion, we examine an alternative 
digital beamforming scheme based on receive focusing at the 
raster points. 
In the next section, a digital front-end hardware architecture 
is outlined together with the description of the linear interpolation
scheme. In Section III, the phase aberration correction
method is presented. Section IV examines the beamforming at
the raster focal points. In all cases, the hardware architectures
are also discussed in detail. The VLSI implementations are
presented in Section V.

Manuscript received July 14, 1992; revised May 17, 1993. This work was
supported by the Turkish Scientific and Research Council, TUBITAK.
The associate editor responsible for coordinating the review of this paper and
recommending its publication was Dr. R. W. Martin.
The authors are with the Department of Electrical and Electronics Engineering,
Bilkent University, Ankara, 06533 Turkey.
IEEE Log Number 9213399.
0278-0062/93$03.00 © 1993 IEEE
Fig. 1. Radial and raster focal points.
II. DIGITAL FRONT-END HARDWARE
In phased-array digital receive beamforming, radio fre- 
quency (RF) echo signals received by the phased-array ele- 
ments are sampled using ADC's, and then all signal processing 
operations are handled by digital electronics. An architecture 
showing the processing units for the digital front end is 
depicted in Fig. 2. The timing accuracy in beamforming and 
in phase aberration correction is increased by upsampling. The 
phase aberration estimation is performed on the sampled data 
and then it is used to correct the beamforming timing computed 
for the receive focal points. The re-sampling process prior to 
beamforming is required to select the samples corresponding 
to the focal point. Finally, the samples corresponding to 
the focal point are synchronized and added to complete the 
beamforming. The signal value of the focal point is obtained
at the output of the beamformer unit. 
The sampling rate can be increased using baseband demodulation
or bandpass interpolation techniques [10], [3]. The
realization of the former technique requires mixing and low- 
pass filtering, while the latter requires only interpolation (zero 
padding and filtering). Since the level of time quantization is 
much more critical than the level of the amplitude quantization, 
a simple upsampling scheme such as a linear interpolation can 
be used for upsampling to increase the timing accuracy. In 
the linear interpolation, a new sample, point C, between two
existing samples, points A and B, can be interpolated as

$$C = \frac{T_1 A + (T_2 - T_1) B}{T_2} \tag{1}$$

where $T_1$ and $T_2$ are the time distances between C and
B, and A and B, respectively. For hardware realization,
this expression can be further simplified by expressing the
coefficients $T_1$ and $T_2$ in the sum of powers-of-two form
[11]. To do that we need to quantize $T_1$ and $T_2$. Choosing 8
quantization levels for $T_2$ is tolerable for both the beamforming
and phase aberration correction purposes [14]. Thus, the
coefficients $T_1$ and $T_2$ can be expressed as

$$T_1 = 2^{p_1} + (-1)^{s_1} 2^{q_1}, \qquad T_2 = 2^{p_2} + (-1)^{s_2} 2^{q_2} \tag{2}$$

where $p_1$, $q_1$, $s_1$, $p_2$, $q_2$, and $s_2$ are non-negative integer
numbers. As a result, the realization of the interpolation is
reduced to three shift and three addition operations.

Fig. 2. Block diagram of the digital receive front-end hardware.
The hardware structure shown in Fig. 3 is developed for 
the linear interpolation scheme. It takes two input samples, 
A, and B, and the value of time distance to the point at 
which the interpolation will be performed. It generates the 
interpolated sample, C, as its output. The power coefficients 
for representation of $T_1$ and $T_2 - T_1$ are generated by decoding
$T_1$. The resultant signals, $p_1$, $q_1$, $s_1$, $p_2$, $q_2$, and $s_2$, are
used as the control signals for shift-left operations on the
input samples. Then, the appropriately shifted versions of
each sample are added to realize the multiplications,
$T_1 A = (2^{p_1} + (-1)^{s_1} 2^{q_1}) A$ and $(T_2 - T_1) B = (2^{p_2} + (-1)^{s_2} 2^{q_2}) B$.
Then, the interpolated sample is obtained by adding these two
results and performing a shift right operation on the output.
The interpolator scheme is designed in five pipelined levels to
achieve a high speed.
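The shift-and-add interpolation described above can be sketched in software as follows. This is only an illustrative model, not the chip's logic: it assumes a fixed $T_2 = 8$ (so the final division is a 3-bit right shift), and the helper names `decompose`, `shift_add_multiply`, and `interpolate` are hypothetical.

```python
def decompose(t):
    """Return (p, q, s) with t == 2**p + (-1)**s * 2**q, for 0 <= t <= 8."""
    for p in range(4):
        for q in range(4):
            for s in (0, 1):
                if (1 << p) + (-1) ** s * (1 << q) == t:
                    return p, q, s
    raise ValueError(f"{t} has no two-term powers-of-two form")


def shift_add_multiply(x, t):
    """Multiply integer sample x by t using two shifts and one add or subtract."""
    p, q, s = decompose(t)
    return (x << p) - (x << q) if s else (x << p) + (x << q)


def interpolate(a, b, t1, log2_t2=3):
    """C = (T1*A + (T2 - T1)*B) / T2, with T2 = 2**log2_t2 (assumed fixed)."""
    t2 = 1 << log2_t2
    total = shift_add_multiply(a, t1) + shift_add_multiply(b, t2 - t1)
    return total >> log2_t2  # division by T2 is a right shift
```

For example, `interpolate(80, 16, 2)` weights A by 2/8 and B by 6/8. Because $T_1$ is the distance from C to B, `t1 = 0` returns B itself.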
III. PHASE ABERRATION CORRECTION
The change in the sound velocity causes significant phase 
errors in the beamforming. These errors are further aggravated 
by the existence of near field inhomogeneities. The phase 
aberration can result in increase in side lobes of beam pattern, 
degradation in lateral and range resolution, and also range and 
lateral shifts [12].
Various phase aberration correction methods have been
reported for the nonuniform aberrating layers with random
reflector distributions [15]-[18]. Recently, two-dimensional
phase aberration correction studies have been presented
[19]-[21]. Also, there exist time delay estimation techniques
[22] which can also be used in off-line correction of
phase aberration in ultrasound imaging. However, real-time
ultrasound imaging necessitates on-line correction of
phase aberrations. Hence, the computational efficiency of the
correction algorithm is crucial in real-time applications.

Fig. 3. Hardware structure of the interpolator.
The proposed technique is based on time delay estimation 
via the minimization of sum of absolute differences (SAD) 
between the sampled echo signals received at adjacent element 
pairs. For the sampled echo signals, the error measure is
expressed as

$$\epsilon(k) = \sum_{m=1}^{W} \left| s_{n-1}(m + k) - s_n(m) \right| \tag{3}$$

where W is the total number of samples, and $s_n(m)$ is the
mth sample of the echo signal received by the nth element of the
array. The time shift index minimizing this expression, k, is
the sum of relative delays for focusing and aberration. Since
the relative delays for focusing, g, are known, the aberration
delay pattern can be obtained by a cumulative summation of
the relative aberration delays:

$$\Delta t_n = T_s \sum_{i=1}^{n} (k_i - g_i) \tag{4}$$

where $T_s$ is the sampling period. To further increase the
computational efficiency of the SAD technique the word length 
can be made shorter. The simulation results indicate that 1- 
bit word length is sufficient for phase aberration correction 
without a significant loss in the accuracy. Hence, the technique 
can be used on RF samples properly quantized to a single bit, 
where each addition reduces to a bit level exclusive-or (XOR). 
Thus, the error expression becomes 
$$\epsilon(k) = \sum_{m=1}^{W} a_{m+k} \;\mathrm{XOR}\; b_m \tag{5}$$

where $a_i$ and $b_i$ are the one-bit representations of echo
samples $s_{n-1}(i)$ and $s_n(i)$, respectively, and the indexes used
for element number are skipped. Obviously the hardware
realization of this scheme is much simpler than that of the 
full word SAD. Reduction of the word length to a single bit 
can cause poor convergence in phase estimation from speckle 
noise. This effect can be avoided, however, by averaging the 
estimated phase patterns over a number of scan angles [18]. 
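The delay estimation steps of this section can be illustrated with a small software sketch. It is written under assumptions that are not in the paper: samples are indexed from 0, the single-bit quantization is taken to be the sign bit, only non-negative shifts are searched, and all function names are hypothetical.

```python
def sad(prev, cur, k, window):
    """Full-word measure: sum over m of |s_{n-1}(m + k) - s_n(m)|."""
    return sum(abs(prev[m + k] - cur[m]) for m in range(window))


def one_bit_sad(prev, cur, k, window):
    """1-bit measure: number of sign-bit mismatches (XOR) at shift k."""
    return sum((prev[m + k] < 0) ^ (cur[m] < 0) for m in range(window))


def best_shift(prev, cur, max_shift, window, measure=sad):
    """Shift index in 0..max_shift-1 that minimizes the chosen measure."""
    return min(range(max_shift), key=lambda k: measure(prev, cur, k, window))


def aberration_delays(shifts, focus_delays, sampling_period):
    """Cumulative sum of (k_i - g_i), scaled by the sampling period."""
    delays, total = [], 0
    for k, g in zip(shifts, focus_delays):
        total += k - g
        delays.append(total * sampling_period)
    return delays
```

Passing `measure=one_bit_sad` to `best_shift` swaps the absolute-difference sum for the XOR count without changing the search itself, which mirrors the way the 1-bit variant only simplifies the per-sample operation.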
In order to compare the performances of the phase aberra- 
tion correction techniques, the images of a standard graphite- 
gel AIUM resolution phantom are reconstructed using the 
aberrated and non-aberrated (control) data sets. The data¹
was acquired with a conventional phased-array transducer 
where the radio frequency (RF) A-scans were recorded from 
every possible combination of transmitter and receiver for all 
elements in a 64-element, 3.3-MHz array [15], [23]. Each A-scan
was digitized after appropriate time gain compensation at
a sampling rate of 17.76 MHz with a 10-bit ADC. The aberrated
data set was obtained using a distortion plate, made of RTV 
silicone rubber, which was placed between the transducer 
array and the phantom. The image reconstruction is carried 
out by digitally processing the recorded data to simulate the 
operation of a real-time digital imaging system. Since the
complete data set is recorded, it is possible to simulate any
image reconstruction and beamforming technique.

¹ Provided by Prof. Matthew O'Donnell of the University of Michigan.

Fig. 4. Ultrasound images of a section of the phantom: (a) control, (b) aberrated, corrected using (c) cross-correlation, (d) SAD, (e) 1-bit SAD, and (f) 1-bit SAD averaged over 5 scan angles.
Fig. 4 depicts the control, aberrated and the corrected images 
(a portion of the full sector image shown in Fig. 6). In 
reconstruction of these images, the sampling rate of the RF
data sets is increased by 8 times using the linear interpolation
technique discussed previously. Thus the phase aberration
correction and beamforming operations are performed with
about 142 MHz sampling rate which corresponds to a phase
accuracy of about 2π/43. For each corrected image, the phase
aberration pattern used in correction is estimated from the 
diffuse scatterers in three iterations on the central scan angle 
of the sector. The window length used for correlation and SAD 
is 4096 samples which corresponds to 20 mm. It is observed 
from the images shown in Fig. 4 that the performance of the 
1-bit SAD is as good as the full word SAD and full word 
correlation. It is also noted that the averaging maintains the 
image quality performance of the 1-bit SAD while it eliminates 
the convergence problems. 
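The figures quoted in this paragraph can be checked with a few lines of arithmetic. The sound speed of 1540 m/s is an assumption (the text does not state it), and the 3.33 MHz center frequency is the value given later in Section V.

```python
ADC_RATE_HZ = 17.76e6       # sampling rate quoted in the text
UPSAMPLING = 8              # linear-interpolation factor quoted in the text
CENTER_FREQ_HZ = 3.33e6     # f0 as given in Section V
SOUND_SPEED_M_S = 1540.0    # assumed soft-tissue sound speed (not in the text)
WINDOW = 4096               # SAD / correlation window length in samples

rate = ADC_RATE_HZ * UPSAMPLING                 # effective sampling rate
samples_per_cycle = rate / CENTER_FREQ_HZ       # phase steps per RF cycle
# Round-trip propagation: depth covered is half the distance travelled.
window_mm = WINDOW / rate * SOUND_SPEED_M_S / 2 * 1e3
```

With these numbers the effective rate is 142.08 MHz and there are roughly 43 samples per cycle, which is where the 2π/43 phase step comes from. The window depth comes out near 22 mm under the assumed sound speed, in line with the approximately 20 mm quoted.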
Hardware realization of the phase aberration detection using
the 1-bit SAD method requires delay elements and processing
units for computation of the SAD terms for different shift 
indexes. The shift index of the minimum of the SAD terms 
corresponds to the time shift between the two channels. 
A processing unit consisting of three pipelined stages is 
developed (see Fig. 5) to form a flexible 1-bit SAD architecture 
by simply cascading a number of these units. 
In the processing unit shown in Fig. 5, the binary data
vectors in the SAD window, $\{a_m\}_{m=1}^{W}$ and $\{b_m\}_{m=1}^{W}$ in (5),
are serially entered to the unit from inputs A and B. The unit
also takes a binary reset input, R, and cumulative data inputs,
SAD_MIN, SI_MIN, and SI. These are the minimum of SAD
terms, the shift index corresponding to the minimum SAD, and
the shift index indicating the shift count of the previous unit,
respectively. The shift index is incremented and sent out to
the succeeding unit. At any time, the shift index of the unit is
regenerated. The computation of the SAD term corresponding
to a different shift index is carried out using a counter which is
updated by the result of XOR of the binary inputs. The counter
is an upcounter which increments its value if the XOR of the
input is 1. When the last data inputs are processed, the
output of the counter becomes the result of the SAD of all
binary data within the window. This is compared with
SAD_MIN, and the smaller one and its corresponding index
are sent out.

Fig. 5. Processing unit for the 1-bit SAD hardware.
The unit outlined above computes the SAD term, $\epsilon(k)$, and
then compares it with the input SAD_MIN = $\min\{\epsilon(k-1), \epsilon(k-2), \ldots, \epsilon(0)\}$.
It outputs the minimum of the SAD terms,
$\{\epsilon(k), \epsilon(k-1), \ldots, \epsilon(0)\}$, and the corresponding index. One can
cascade K stages of the unit to form a SAD network of K 
shift indexes. Therefore, the minimum of K SAD terms and 
associated indexes are computed in a pipelined manner and 
become available at the output of the last unit. The maximum 
number of the cascaded stages (i.e., the maximum value of 
shift index, SI, which equals K), is determined by the
size of the adder used for updating shift index, whereas the 
maximum number of input data (i.e., the length of the SAD 
window, W) is determined by the counter size. 
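A behavioral sketch of the cascade may help to see how the running minimum and its index travel from unit to unit. It is not cycle-accurate and ignores the pipelining and reset signal. The function names, and the choice to initialize the running minimum to W + 1, are assumptions made for illustration.

```python
def sad_unit(a_bits, b_bits, sad_min, si_min, si):
    """One unit: count XOR mismatches at shift `si`, then keep the minimum."""
    count = 0                                  # up-counter, cleared per window
    for m in range(len(b_bits)):
        count += a_bits[m + si] ^ b_bits[m]    # increment when the XOR is 1
    if count < sad_min:                        # compare with incoming minimum
        sad_min, si_min = count, si
    return sad_min, si_min, si + 1             # incremented index goes onward


def sad_network(a_bits, b_bits, num_units):
    """Cascade `num_units` units; returns (minimum SAD, its shift index)."""
    sad_min, si_min, si = len(b_bits) + 1, 0, 0   # start above any possible count
    for _ in range(num_units):
        sad_min, si_min, si = sad_unit(a_bits, b_bits, sad_min, si_min, si)
    return sad_min, si_min
```

In this model the counter width bounds the window length and the width of the index bounds the number of cascaded units, matching the two limits stated above.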
IV. DIGITAL BEAMFORMING 
In conventional phased-array ultrasound imaging, the radial 
data are converted to the raster format using digital scan- 
conversion techniques. A typical digital scan converter maps 
the radial data to the nearest display pixel. This corresponds 
to the quantization of the radial coordinates to the nearest 
rectangular coordinates (see Fig. 1), where the data is forced
into alignment with the display grid. Thus, it results in an
annoying artifact called the Moiré pattern, which is a well-defined
pattern of holes in the image corresponding to the unaddressed
pixels. Artifacts can be decreased by using two-dimensional
interpolation techniques [13], [6]. This significantly increases
the computational cost of the scan conversion process. 
The conversion process can be completely eliminated by 
performing the receive beamforming operations directly at the 
raster focal points of the imaging plane instead of the radial 
focal points. For this purpose, the timing data required for the 
receive beamforming must be computed for every raster focal 
point within the sector. Since the number of raster points in 
the sector is about 2.5 times the number of radial points, the 
number of receive beamforming operations increases by the 
same amount. This is a major reason why the analog systems 
employed beamforming at the radial focal points. However, it 
is possible to perform very fast beamforming operations using 
digital VLSI circuits. Here, we present a digital beamformer 
circuit and its VLSI implementation. 
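As an illustration of the timing data involved, the receive delays for one focal point can be computed from simple geometry. The sketch assumes uniform soft tissue at 1540 m/s, a linear array on the x axis, and one-way (receive-only) path differences. The function names and parameters are hypothetical.

```python
import math

SOUND_SPEED = 1540.0  # m/s, assumed uniform soft tissue


def receive_delays(x, z, element_x, sampling_rate):
    """Relative receive delays (in samples) for a focal point at (x, z)."""
    paths = [math.hypot(x - ex, z) for ex in element_x]
    farthest = max(paths)
    # Delay the earlier arrivals so all echoes line up with the latest one.
    return [round((farthest - p) / SOUND_SPEED * sampling_rate) for p in paths]


def radial_point(r, theta):
    """Cartesian position of a radial focal point (theta from array normal)."""
    return r * math.sin(theta), r * math.cos(theta)
```

A raster focal point is passed to `receive_delays` directly as a pixel position, whereas a radial focal point is first converted with `radial_point`. The delay computation itself is the same in both cases; only the set of points differs.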
In order to make a comparison of the image quality perfor- 
mance of the receive beamforming at the radial and raster focal 
points, the sector images of the phantom are reconstructed 
using the two beamforming techniques. For this purpose the 
control (non-aberrated) RF data set, outlined in the previous 
section, is used. The sampling rate of the RF data is increased 
by 8 times using the linear interpolation scheme discussed pre- 
viously, and the quadrature signal components are generated. 
The beamforming is performed using timing data computed for 
radial and raster receive focal points. The envelope detection is 
realized by taking the square root of the sum of the squares of 
in-phase and quadrature samples. The scan conversion which is 
employed in beamforming at the radial points, is realized using 
a first-order two-dimensional interpolation technique presented 
in [13]. Finally, a purely logarithmic compression is applied
on the images to obtain a 60 dB dynamic range. 
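The envelope detection and logarithmic compression steps can be sketched as below. The mapping to 256 display levels and the normalization to the image maximum are assumptions for illustration; the text only specifies the square-root-of-squares envelope and the 60 dB dynamic range.

```python
import math


def envelope(i_sample, q_sample):
    """Envelope as the square root of the sum of squares of I and Q."""
    return math.hypot(i_sample, q_sample)


def log_compress(env, env_max, dynamic_range_db=60.0, levels=256):
    """Map an envelope value to a display level over the dynamic range."""
    if env <= 0:
        return 0
    db = 20.0 * math.log10(env / env_max)     # 0 dB at the image maximum
    db = max(db, -dynamic_range_db)           # clip values below the range
    return round((db + dynamic_range_db) / dynamic_range_db * (levels - 1))
```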
In the image reconstruction, the separation of the raster and
radial focal points is chosen to be λ0/2 = 0.23 mm. The
number of scan-slices/90°-sector is 128. The images consist of
512 x 512 pixels (Fig. 6). The transmit focal length is fixed at 
80 mm away from the array, which corresponds to the position 
of the point reflector at center of the sector. It is observed 
from these images that the image obtained by beamforming at 
radial points is blurred due to the scan conversion. A detectable 
resolution improvement is achieved by performing the receive 
beamforming at the raster focal points. 
In digital receive beamforming, the samples corresponding 
to a focal point are not synchronous. In order to find the 
signal value of a focal point, the samples must be suitably 
delayed and then added. This requires coherent addition of 
signals received by the array elements. However, a simple 
synchronization scheme designed for a regular data flow cannot
solve the synchronization problem in receive beamforming
for sector imaging. This is due to the fact that the arrival time 
pattern of samples varies depending on the location of focal 
point within the sector. 
In a straightforward approach, a “global” coherent summa- 
tion scheme can be employed by using FIFO type registers at 
the adder front-end [4], [3], [24]: the samples from all channels 
are stacked in FIFO registers at each channel for synchronization,
and then all of them are added. However, since the FIFO
and adder sizes increase dramatically with the number of array
elements, the scheme is not very feasible for implementation
at the board level or in VLSI. Alternatively, the samples can
be added recursively using partial sum registers [6]. But this 
technique is not particularly practical, because of the adder 
speed requirement. 
An efficient receive beamforming hardware structure can be 
obtained by employing a “local” coherent addition technique 
where the total coherent summation of all samples correspond- 
ing to a focal point is obtained by a sequence of pairwise 
partial coherent summations [25]. For an N channel system, 
at first, N/2 partial coherent summations are obtained. Then,
adjacent pairs of these partial sums are coherently added
resulting in N/4 new partial coherent sums. This procedure
is repeated until the number of new partial coherent sums
becomes unity, which is the total coherent summation of all
samples. This approach results in an inverse binary tree like
architecture for the receive beamformer. For an N-element array,
the network consists of $\log_2(N)$ stages, and the kth stage has
$N/2^k$ ($k = 1, 2, \ldots, \log_2(N)$) processing units (see Fig. 7).
Each unit consists of FIFO registers and a full adder for 
coherent summation of its two input data. FIFO length for each 
stage is different; since there is no regularity in the arrival 
times of the signals corresponding to different focal points, 
the worst case FIFO lengths for the stages are determined by 
means of the computer simulation of the phased-array imaging 
system with a 90° sector scanning.
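The stage structure of the tree can be illustrated with a short sketch that sums samples pairwise and records how many adder units each stage uses. It assumes the samples are already time-aligned, so the FIFO synchronization is left out. The function name is hypothetical.

```python
def tree_sum(samples):
    """Pairwise partial sums; returns (total, units used in each stage)."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("channel count must be a power of two")
    units_per_stage = []
    level = list(samples)
    while len(level) > 1:
        # Each unit adds one adjacent pair of partial sums.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        units_per_stage.append(len(level))
    return level[0], units_per_stage
```

For 16 channels the stages use 8, 4, 2, and 1 units, that is, 15 units in four stages.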
The processing unit takes two data inputs, A and B, along 
with two status bits, SA and SB, and generates the coherent
summation of the inputs, DO, with a corresponding output 
status bit, SO (Fig. 8). Each status bit indicates that the 
data is valid. For a reliable real-time operation, the unit is 
designed in three pipelined stages: cross switch, FIFO, and
adder stages. The cross switch is a finite state machine which
feeds the earlier of the inputs, A or B, to the subsequent
FIFO. After synchronization by FIFO, the data are fed to the
full adder. The unit is reset by an external reset signal (R)
at the beginning of operation so that cross switch and FIFO
pointers are set appropriately. Two non-overlapping clocks (φ1
and φ2) are used to control the pipelined operations and data
flow.

Fig. 6. Ultrasound sector images of the phantom reconstructed by receive beamforming at (a) radial and (b) raster focal points.
Fig. 7. Hardware structure of the beamformer.
Fig. 8. Functional structure of one processing unit.
Fig. 9. Layouts of the designed chips: (a) Interpolator. (b) Phase estimator. (c) Beamformer.
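The behavior of one processing unit can be approximated in software as follows. This is a simplified model, not the circuit: it uses two queues instead of a cross switch feeding a single FIFO, and it ignores the two-phase clocking. The class name and the `max_depth` bookkeeping are assumptions added to show how worst-case FIFO lengths could be found by simulation.

```python
from collections import deque


class CoherentAdder:
    """Buffer whichever input arrives first; add when both are present."""

    def __init__(self):
        self.fifo_a = deque()
        self.fifo_b = deque()
        self.max_depth = 0          # worst-case queue occupancy seen so far

    def clock(self, a, sa, b, sb):
        """Feed one clock cycle; returns (DO, SO)."""
        if sa:                      # status bit marks input A as valid
            self.fifo_a.append(a)
        if sb:                      # status bit marks input B as valid
            self.fifo_b.append(b)
        self.max_depth = max(self.max_depth, len(self.fifo_a), len(self.fifo_b))
        if self.fifo_a and self.fifo_b:
            return self.fifo_a.popleft() + self.fifo_b.popleft(), 1
        return 0, 0                 # no valid output this cycle
```

Running such a model over the arrival patterns of every focal point in a sector and reading `max_depth` is one way to size the queues per stage.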
V. VLSI IMPLEMENTATION
The proposed hardware structures are implemented in 1.5-µm
CMOS technology using full-custom VLSI design
techniques. In VLSI design of the chips, Magic, Hspice, and
Irsim are used for layout editing, timing, and logic simulations,
respectively. The layouts of the chips are shown in Fig.
9, and the characteristic parameters are outlined in Table I.
The testing of the designed chips is easily accomplished
by the functional test techniques, since the operations of the
units are selectively probed by issuing proper test vectors
[26].
The linear interpolator structure shown in Fig. 3 is implemented
as a single chip. Each of the input samples and
the interpolated output sample is 16 bits, whereas the $T_1$
input is 3 bits. The throughput of the chip is 40 Mega
interpolations/s, which can meet the speed requirements of 
adaptive beamforming applications. 
A phase aberration estimation chip, consisting of 8 cascaded
units depicted in Fig. 5, is implemented. The word lengths
are chosen sufficiently long to allow the chip to be used
for sufficiently large window sizes and to be cascaded for
larger shift indexes. The single chip can be used for window
sizes up to 4096 samples with 8 shift indexes, whereas
for shift indexes up to 128, the number of chips to be
cascaded is the smallest integer not less than K/8. A single
chip can be used in phase aberration detection, where the
shift index of adjacent array element pairs does not exceed
8 for sampling rates up to 50f0, and for phase aberration
variation within a cycle. On the other hand, the window
size is sufficiently long for phase aberration correction from
both point reflectors and the diffuse scatterers for sampling
rates up to 50f0. As an example, the SAD window of size
4096 corresponds to about 20 mm for f0 = 3.33 MHz and
a sampling rate of 43f0, which are the parameters used in
the phase aberration correction of the images shown in Fig.
4.

TABLE I
CHARACTERISTIC PARAMETERS OF THE DESIGNED CHIPS

Parameter                          Interpolator   Phase Estimator   Beamformer
Die size (mm^2)                    2.8 x 3.0      4.4 x 6.0         6.6 x 6.7
Transistor count (K)               5              28                40
Pin count                          64             64                100
Max. clock freq. (MHz)             40             50                40
Max. throughput (Mega outputs/s)   40             50/W              40

Fig. 10. High throughput architecture for 1-bit SAD.
The computation at each pipelined stage is completed in 
about 20 nsec resulting in a pipelined speed of 50 Mega 
operations/s. Thus, the throughput of the chip is about 50/W 
Mega outputs/s. This enables the phase aberration detection
of about 12 K/s with W = 4096. In sector imaging, the
phase aberration correction can be applied at each frame
where the phase correction iterations can be carried out on
the consecutive frames. Also, the angle dependent correction
can be carried out for a number of scan angle groups on each
frame. For 128 scan angles/frame and 30 frames/s, the number
of the phase aberration correction operations is 3840/s. Hence,
the speed of the designed chip is adequate for phase aberration 
correction at each scan angle for every frame. 
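The throughput argument in this paragraph reduces to a short calculation, shown here with the values quoted in the text. The variable names are illustrative only.

```python
PIPELINE_RATE = 50e6        # operations/s, from the ~20 ns stage time
WINDOW = 4096               # SAD window length W
SCAN_ANGLES = 128           # scan angles per frame
FRAMES_PER_S = 30

estimates_per_s = PIPELINE_RATE / WINDOW        # one output per W input pairs
required_per_s = SCAN_ANGLES * FRAMES_PER_S     # one correction per scan angle
headroom = estimates_per_s / required_per_s
```

The chip delivers a little over 12 000 estimates per second against 3840 required, a margin of roughly a factor of three.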
A receive beamformer for N = 16 is realized by connecting 
15 processing units in an inverse binary tree structure to form 
the core of the chip. The chip consists of the lowest four 
stages of the network for N = 256, outlined by a dashed 
rectangle in Fig. 7. Since the lowest four stages have the 
longest FIFO's, one can connect the chips to form a network 
for N ≤ 256. Maximum throughput of the chip is 40 Mega
beamforming operations/s. The chip has 16 multiplexed inputs
and 1 multiplexed output, 16 bits plus 1 status bit each. Overall 
result of coherent addition is truncated to 16 bits from 20 bits, 
while the maximum dynamic range requirement is about 12 
bits. 
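The 20-bit intermediate word length follows from the number of channels being summed, as the sketch below shows. It assumes signed two's-complement words and truncation by dropping the least significant bits. Both helper names are hypothetical.

```python
import math


def sum_word_length(input_bits, channels):
    """Bits needed to hold the sum of `channels` signed `input_bits` words."""
    return input_bits + math.ceil(math.log2(channels))


def truncate(total, from_bits=20, to_bits=16):
    """Drop the least significant bits to return to the output word length."""
    return total >> (from_bits - to_bits)
```

Summing 16 words of 16 bits can grow the result by log2(16) = 4 bits, giving the 20 bits mentioned above.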
VI. DISCUSSION AND CONCLUSION 
In this paper, we present cost-effective digital beamforming 
techniques with high timing accuracy, phase aberration cor- 
rection, and efficient receive focusing. Performances of these 
techniques are tested and illustrated using the phased-array 
experimental RF data sets. Special purpose VLSI circuits based 
on these techniques are proposed. Single-chip realizations of 
the circuits using full-custom CMOS VLSI design techniques 
are outlined. 
In the digital front-end hardware as depicted in Fig. 2, 
the processing units can be realized using the designed chips 
(interpolator, phase estimator, and beamformer). Both of the 
re-sampling and the beamforming operations can be handled 
by the beamformer chip(s), where the input status entry 
controls the sample selection. For an N-channel phased-array
imaging system, the number of chips for upsampling, phase
estimation, and beamforming are N, N - 1, and the smallest
integer not less than N/16, respectively.
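The chip counts stated above can be tabulated for a given channel count with a small helper. The function name and the dictionary keys are illustrative, and the phase estimator count assumes one chip per adjacent element pair.

```python
import math


def chip_counts(n_channels, channels_per_beamformer=16):
    """Chip counts per function, following the figures given in the text."""
    return {
        "interpolator": n_channels,
        "phase_estimator": n_channels - 1,
        "beamformer": math.ceil(n_channels / channels_per_beamformer),
    }
```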
The proposed front-end circuits involve sampling of RF 
echo signals received from all array channels. This is a major 
limitation in all digital beamforming schemes as well as the 
one discussed in this work. The speed of the ADC's, however, 
can be at the Nyquist rate, since a special interpolation circuit 
is also employed to decrease the effects of delay quantization. 
The phase estimator chip can meet the speed requirements 
of one- and two-dimensional angle-dependent phase aberration 
correction applications. Should a faster phase-error estimation 
be required, such as for correction of both angle- and range- 
dependent phase errors, the proposed 1-bit SAD hardware 
structure can be modified to increase the throughput further. 
This can be achieved in a systolic scheme by employing bit 
level pipelining and parallelism [27]. Such an architecture 
with very high throughput is depicted in Fig. 10. In this
architecture, the binary data, $\{a_m\}_{m=1}^{W}$ and $\{b_{m+k}\}_{m=1}^{W}$, are
applied to the inputs in parallel. There are three bit-level
processing elements: delay, half adder, and XOR elements.
Since all operations are pipelined at bit-level, the computation
time of a SAD term is determined by a bitwise XOR operation.
All of the SAD terms, $\{\epsilon(i)\}_{i=1}^{K}$, are computed in K pipelined
clocks. The minimum of the SAD terms can be found using
a fast comparator or a sorter. VLSI design complexity of the
architecture given in Fig. 10 is very low. Since the area of
the architecture increases with $W^2/2$, it is suitable for the
realizations with a reasonably small W.
The proposed receive beamforming scheme is an efficient structure
that possesses the following advantages: 1) small storage
for data synchronization, 2) flexibility in applications with
different number of array elements, 3) operating capability at
real-time rate, and 4) feasibility for custom chip(s) implementation.
VLSI implementation for large N is not very feasible
because of the large chip area and large number of I/O's.
The modularity of the structure can be employed effectively,
however, by proper choice of FIFO sizes, and by using the
beamformer chips with N = 16. Furthermore, the designed
beamformer can perform real-time receive beamforming at the
radial or raster focal points.

[12] G. E. Trahey, P. D. Freiburger, L. F. Nock, and D. C. Sullivan, “In vivo
measurements of ultrasonic beam distortion in the breast,” Ultrasonic
Imaging, vol. 13, pp. 71-90, 1991.
[13] S. Leavitt, B. F. Hunt, and H. G. Larsen, “A scan conversion
ACKNOWLEDGMENT 
The authors would like to thank to Prof. Dr. Matthew 
O’Donnell of University of Michigan, for kindly providing the 
ultrasound phase array RF data sets of the phantom, and for 
his valuable helps in digital processing of the data. The authors 
would like to acknowledge the cooperations of C. Aydin, A. 
E. Kolagasioglu, M. $.Toygar, 1. A. Baktir, R. Tahboub, E. 
Enin, F. ICdig, and M. H. Asyali, in VLSI implementation of 
the beamformer chip. 
REFERENCES 
[ l ]  J. F. Havlice and J. C. Taenzer, “Medical ultrasound imaging: An 
overview of principles and instrumentation,” in Pmc. IEEE, vol. 67, 
pp. 620-641, April 1979. 
[2] M. E. Schafer and P. A. Lewin, ‘The influence of front-end hardware 
on digital ultrasonic imaging,” IEEE Trans. Sonics. Ultrason., vol. 31, 
[3] M. O’Donnell, “Applications of VLSI circuits to medical imaging,” in 
Proc. IEEE, vol. 76, pp. 1106-1 114, Sept. 1988. 
[4] J. P. Stonestorm and W. A. Anderson, “Custom NMOS chip for medical 
ultrasound,” VLSI Design, pp. 44-49, May 1982. 
[5] M. O’Donnell et al., “Real-time phased-may imaging using digital beam 
forming and autonomous channel control,” in Proc. 1990 IEEE Ultrason. 
Symp., pp. 1499-1502, 1990. 
[6] R. M. Lutolf, A. Vieli, and S. Basler, “Ultrasonic phased-may scanner 
with digital echo synthesis for Doppler echocardiography,” IEEE Trans. 
Ultrason. Ferroelec. Freq. Contr., vol. 36, pp. 496506, Sept. 1989. 
[7] R. A. Harris et al., “Ultimate limits in ultrasonic image resolution,” 
Ultrasound in Med. and Biol., vol. 17, pp. 547-558, 1991. 
[8] M. J. Foster and H. T. Kung, “The design of special purpose VLSI 
chips,’’ IEEE Computer, pp. 26-40. Jan. 1980. 
[9] D. K. Peterson and G. S .  Kino, “Real-time digital image reconstruction: 
A description of imaging hardware and an analysis of quantization 
errors,’’ IEEE Sonics. Ultrason., vol. 3 1, pp. 337-35 1, July 1984. 
[lo] R. G. Pridham and R. A. Mucci, “Digital interpolation beamforming for 
low-pass and bandpass signals,” in Proc. IEEE, vol. 67, pp. 904-919, 
April 1979. 
[11] Y. C. Lim and S. R. Parker, “FIR filter designed over a discrete 
powers-of-two coefficient space,’’ IEEE Trans. Acoust., Speech, Signal 
Processing,” vol. 31, pp. 583-590, June 1983. 
pp. 295-306, July 1984. 
algorithm for displaying ultrasound images,” Hewlett-Packard J., vol. 
34, pp. 30-34, Oct. 1983. 
[14] 0. T. von Ramm and S. W. Smith, “Beam steering with linear arrays,” 
IEEE Trans. Biomed. Eng., vol. 30, pp. 4 3 8 4 2 ,  Aug. 1983. 
[15] S. W. Flax and M. O’Donnell, “Phase-abemtion correction using signals 
from point reflectors and diffuse scatterers: Basic principles,” IEEE 
Trans. Ultrason. Ferroelec. Freq. Contr., vol. 35, pp. 758-767, Nov. 
1988. 
[I61 D. Zhao and G. E. Trahey, “Comparisons of image quality factors for 
phase aberration correction with diffuse and point targets,” IEEE Trans. 
Ultrason. Ferroelec. Freq. Contr., vol. 38, pp. 125-132, Mar. 1991. 
[17] F. Wu, etal., “Optimal focusing through aberrating media: A comparison 
between time reversal mirror and time delay correction techniques,” in 
Proc. I991 IEEE Ultrason. Symp., pp. 1195-1199, 1991. 
[ 181 M. Karaman, A. Atalar, and H. Koymen, “Adaptive digital beamforming 
for real-time phased array ultrasound imaging,” in Proc. I991 IEEE 
Ultrason. Symp., pp. 1207-1210, 1991. 
[19] G. E. Trahey and P. D. Freiburger, “An evaluation of transducer 
design and algorithm performance for two dimensional phase 
aberration correction,” in Proc. I991 IEEE Ultrason. Symp., pp. 
[20] M. O h n n e l l  and P. C. Li, “Aberration correction on a two-dimensional 
anisotropic phased array,” in Proc. I991 IEEE Ultrason. Symp., pp. 
[21] R. Kanda, Y. Sumino, K. Takamizawa, and H. Sasaki, “An investigation 
of wavefront distortion correction: Correction using averaged phase 
information and the effect of correction one and two dimensions,” in 
Proc. 1991 IEEE Ultrason. Symp., pp. 1201-1206, 1991. 
[22] K. Scarbrough, N. Ahmed and G. C. Carter, “On the simulation of a 
class of time delay estimation algorithms,” IEEE Trans. Acoust. Speech, 
Signal Process., vol. 29, pp. 534539, June 1981. 
[23] M. O’Donnell and S. W. Flax, “Phase-aberration correction using signals 
from point reflectors and diffuse scatterers: Experimental results,” IEEE 
Trans. Ultrason. Ferroelec. Freq. Contr., vol. 35, pp. 768-774, Nov. 
1988. 
[24] T. H. Song and S. B. Park, “A new digital phased array system for 
dynamic focusing and steering with reduced sampling rate,” Ultrasonic 
Imaging, vol. 12, pp. 1-16, 1990. 
[25] M. Karaman, E. Kolagasioglu, and A. Atalar, “A VLSI receive beam- 
former for digital ultrasound imaging,” in Proc. ICASSP’92, pp. V-657- 
660,1992. 
[26] J. A. Abraham and W. K. Fuchs, “Fault and error models for VLSI,” in 
Proc. IEEE, vol. 74, pp. 639-654, May 1986. 
[27] H. T. Kung, “Why systolic architectures?” IEEE Computer, pp. 3746, 
Jan. 1982. 
1181-1187, 1991. 
1189-1193, 1991. 
