Compact FPGA-based beamformer using oversampled 1-bit A/D converters by Tomov, Borislav Gueorguiev & Jensen, Jørgen Arendt
Published in:
IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control
Link to article, DOI:
10.1109/TUFFC.2005.1503973
Publication date:
2005
Document Version
Publisher's PDF, also known as Version of record
Link back to DTU Orbit
Citation (APA):
Tomov, B. G., & Jensen, J. A. (2005). Compact FPGA-based beamformer using oversampled 1-bit A/D
converters. IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control, 52(5), 870-880. DOI:
10.1109/TUFFC.2005.1503973
870 ieee transactions on ultrasonics, ferroelectrics, and frequency control, vol. 52, no. 5, may 2005
Compact FPGA-Based Beamformer Using
Oversampled 1-bit A/D Converters
Borislav Gueorguiev Tomov and Jørgen Arendt Jensen, Senior Member, IEEE
Abstract—A compact medical ultrasound beamformer
architecture that uses oversampled 1-bit analog-to-digital
(A/D) converters is presented. Sparse sample processing is
used, as the echo signal for the image lines is reconstructed
in 512 equidistant focal points along the line through its
in-phase and quadrature components. That information is
suﬃcient for presenting a B-mode image and creating a
color ﬂow map. The high sampling rate provides the nec-
essary delay resolution for the focusing. The low channel
data width (1-bit) makes it possible to construct a compact
beamformer logic. The signal reconstruction is done using
finite impulse response (FIR) filters, applied on selected bit
sequences of the delta-sigma modulator output stream. The
approach allows for a multichannel beamformer to ﬁt in
a single ﬁeld programmable gate array (FPGA) device. A
32-channel beamformer is estimated to occupy 50% of the
available logic resources in a commercially available mid-
range FPGA, and to be able to operate at 129 MHz. Simu-
lation of the architecture at 140 MHz provides images with
a dynamic range approaching 60 dB for an excitation fre-
quency of 3 MHz.
I. Introduction
Medical ultrasound has gained popularity in clinical practice
as a quick, compact, and affordable diagnostic tool. It has
the advantage over computed tomography and magnetic
resonance imaging methods in that
the preparation for a scan is minimal, and no health haz-
ards are involved. Recently, portable and lightweight ultra-
sound scanners have been developed [1], [2], which greatly
expand the range of situations and sites for which medical
ultrasound can be used.
The evolution of ultrasound scanners is directly inﬂu-
enced by developments in analog and digital electronics.
The number of functions and image quality increases, and
the implementation price for any given function decreases
with time. One powerful approach for increasing the ﬂexi-
bility and compactness of an ultrasound scanner is to move
processing functions from analog to digital electronics [3].
Delta-sigma modulation (DSM) [4] is one of the tech-
niques that make it possible to decrease the complexity
of the analog interface electronics by using digital logic. It
oﬀers analog-to-digital (A/D) and digital-to-analog (D/A)
Manuscript received February 5, 2003; accepted October 1, 2004.
This work was supported by grant 9700883 and 9700563 from the
Danish Science Foundation, by B-K Medical A/S, Gentofte, Den-
mark, by the Thomas B. Thrige Center for Microinstruments, and
by the Danish Research Academy.
The authors are with the Center for Fast Ultrasound Imaging,
Ørsted•DTU, Technical University of Denmark, DK-2800 Kongens
Lyngby, Denmark (e-mail: bt@oersted.dtu.dk).
conversion using little chip area, provides robust perfor-
mance, and is compatible with the digital CMOS fabri-
cation process. The dynamic range of the conversion de-
pends to a great extent on the selected oversampling ra-
tio. Presently, converters based on the DSM principle are
widely used in audio applications, and their extensive use
in video and high-frequency applications is a matter of
time, depending to a large extent on the progress in inte-
grated circuit technology.
In this paper, a novel extendable beamformer archi-
tecture for use with oversampled 1-bit A/D converters
will be presented. It allows a complete 32-channel beam-
former to be implemented using a single, standard ﬁeld
programmable gate array (FPGA) chip.
In Section II the memory requirements and the necessary
processing power are assessed for a conventional digital
beamformer architecture. The principles behind the new
architecture will be described in Section III. The perfor-
mance of the architecture is compared to the conventional
beamformer performance in Section IV by processing syn-
thetic and real ultrasound echo data. The implementation
choices are described in Section V. The potential beneﬁts
and limitations of the architecture are discussed in Sec-
tion VI.
II. Conventional Beamformer Architecture
In the commonly used ultrasound scanners, images are
created line by line. A focused ultrasound pulse with cen-
tral frequency f0 of 3 to 12 MHz (for general applications)
is transmitted into the tissue along a particular beam line.
An image line then is created by continuously focusing
along that beam line in receive.
A typical architecture of a modern digital receive beam-
former is shown in Fig. 1. The received echoes are dig-
itized at a frequency of 20 to 60 MHz (usually at four
times f0) and stored in a delay buﬀer. At each clock cy-
cle, appropriately delayed samples from each channel are
chosen and combined, using a weighted sum, into a fo-
cused sample. The delay applied to each channel is cal-
culated as the diﬀerence in the times of ﬂight from the
current focal point to the receive element for that channel
and to the phase center of the aperture. The delay reso-
lution, the quantizer precision, and the apodization of the
aperture determine the quality of the beamforming [5]–[7].
For achieving suﬃcient delay resolution, interpolation be-
tween samples is used. After summing, the samples pass
0885–3010/$20.00 c© 2005 IEEE
Authorized licensed use limited to: Danmarks Tekniske Informationscenter. Downloaded on December 2, 2009 at 03:54 from IEEE Xplore.  Restrictions apply. 
tomov and jensen: compact extendable beamformer utilizing oversampled a/d converters 871
Fig. 1. Beamformer architecture for dynamic receive focusing.
through a matched ﬁlter whose function is to maximize
the signal-to-noise ratio (SNR) of the signal. The envelope
of the signal is calculated as the square root of the sum of
the squares of the in-phase and the quadrature (90-degree
phase-shifted) components. The most accurate way of ob-
taining the quadrature component is to pass the echo sig-
nal through a Hilbert transform ﬁlter, because it provides
90 degree phase shift at all frequencies. After that stage,
decimation may be applied so that less data have to be
processed in the subsequent stages. The envelope then is
compressed logarithmically and put into an image buﬀer as
an image line. An image typically consists of 100 or more
lines. Scan conversion is applied to map the data to the
rectangular image display on a screen. The in-phase (the
original) and the quadrature components are used further
for ﬂow estimation.
To perform dynamic receive focusing, a digital beam-
former needs one sample index and one inter-sample preci-
sion parameter per produced sample for every contributing
channel. These two parameters can merge naturally into
one index with subsample precision that will be decoded
by the focusing logic. Another parameter is the weight-
ing (apodization) coeﬃcient for each channel. To main-
tain a constant F-number and minimize sidelobe levels,
the apodization function changes with depth.
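The delay rule described above (the difference between the time of flight from the focal point to a receive element and to the phase center of the aperture) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: a linear array lying on the x-axis and centered at the origin, a focal point given by depth and steering angle, and hypothetical function names `receive_delays` and `to_sample_index`. The 64 elements, 0.26 mm pitch, and 7 cm focus are borrowed from the simulation parameters given later in the paper.

```python
import math

def receive_delays(num_elements, pitch, focus_depth, angle_deg=0.0, c=1540.0):
    """Per-element receive delays in seconds, relative to the phase center.

    Assumed geometry (not from the paper): linear array on the x-axis,
    centered at the origin, focal point at (depth, angle) from the center.
    """
    theta = math.radians(angle_deg)
    focus_x = focus_depth * math.sin(theta)
    focus_z = focus_depth * math.cos(theta)
    t_center = math.hypot(focus_x, focus_z) / c   # time of flight to phase center
    mid = (num_elements - 1) / 2.0
    delays = []
    for n in range(num_elements):
        element_x = (n - mid) * pitch
        t_element = math.hypot(focus_x - element_x, focus_z) / c
        delays.append(t_element - t_center)
    return delays

def to_sample_index(delay_s, fs):
    """Split a delay into an integer sample index and an inter-sample fraction."""
    position = delay_s * fs
    index = math.floor(position)
    return index, position - index

delays = receive_delays(num_elements=64, pitch=0.26e-3, focus_depth=0.07)
indices = [to_sample_index(d, fs=20e6) for d in delays]
```

The integer part and the fraction returned by `to_sample_index` correspond to the sample index and the inter-sample precision parameter mentioned in the text; a real beamformer would merge them into a single fixed-point index.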
A beamformer that reconstructs all samples along the
beam axis has to produce:
$$P = \frac{2 d f_s}{c}, \tag{1}$$
samples, where fs is the sampling frequency of the analog-
to-digital converters (ADC), c is the speed of sound, and
d is the image depth. If no optimizations are used with
respect to memory, an N-channel digital beamformer that
produces L lines per image, with image depth d, has to
store PNL index values and PNL weight coeﬃcients.
The necessary calculations per channel are as follows:
two multiplications per channel per sample (which means
per clock cycle) in the case of linear interpolation, one ad-
dition for producing the contribution from that channel.
If better interpolation is desired, an interpolation ﬁlter is
used, and more multiplications and additions are neces-
sary. The apodization can be implemented either by using
one additional multiplication or by including the apodiza-
tion coeﬃcient in the interpolation coeﬃcients. The sum
of all channels is obtained using an inverted binary tree of
pipelined adders with
$\frac{N}{2} + \frac{N}{4} + \frac{N}{8} + \cdots + 1 = N - 1$ adders.
The beamforming therefore requires 2 PNL multiplications
and PL (2N − 1) additions per image.
The matched ﬁltering is performed using a FIR ﬁl-
ter with K coeﬃcients, so K multiplications and K − 1
additions are needed per reconstructed in-phase sample.
If the quadrature component is created using a Hilbert-
transformed matched ﬁlter, the same number of operations
are needed for that too. Because the in-phase and quadra-
ture signals can be used directly in an autocorrelation
blood velocity estimation scheme, the further processing
for generating ﬂow estimation data will not be considered.
The reconstruction of the in-phase and quadrature com-
ponent requires 2 KPNL multiplications and (K −1) PNL
additions per image.
In a typical imaging situation, the image depth could
be up to 20 cm. For a sampling frequency of 20 MHz
and speed of sound c = 1540 m/s, the number of sam-
ples to beamform is P = 5195. The corresponding amount
of memory for a 64-channel system—making 100 lines per
image, using 8-bit coeﬃcients and 16-bit index—is approx-
imately 126 MB. In practice, algorithmic approaches are
sought and applied [8]–[10], which result in reductions by
several orders of magnitude in the memory requirements.
If the transmission follows immediately after reception
from the 20 cm depth, the pulse repetition frequency (im-
age line rate) is 3850 Hz, and the frame rate is 38.5 Hz.
The matched ﬁlter for the received echo signal in a con-
ventional imaging situation will be the emitted signal con-
volved twice with the impulse response of the transducer.
If the excitation is two cycles of a sinusoid at 5 MHz, and
the transducer has a 60% bandwidth of about 5 MHz, the
length of the matched ﬁlter K is 37. With the given pa-
rameters, the beamforming requires ≈ 2.57·109 multiplica-
tions per second and ≈ 2.52 ·109 additions per second. The
matched ﬁltration requires 2(K +1)PNL ≈ 97 · 109 multi-
plications per second and KPNL ≈ 47.8 ·109 additions per
second. Real-time processing, therefore, is possible only
with dedicated integrated circuits today.
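The arithmetic behind these figures can be retraced with a few lines of Python. The sketch below simply evaluates Eq. (1) and the operation counts quoted in this section (2PNL and PL(2N − 1) for beamforming, 2(K + 1)PNL and KPNL for matched filtering) at the resulting frame rate; the function name and dictionary keys are hypothetical. The results agree with the numbers in the text to within about 1%, the remaining difference presumably being due to rounding.

```python
def conventional_load(depth_m=0.20, fs_hz=20e6, c=1540.0,
                      channels=64, lines=100, filter_len=37):
    """Evaluate the per-second processing load of the conventional beamformer."""
    samples = 2.0 * depth_m * fs_hz / c       # Eq. (1): samples per image line
    prf = c / (2.0 * depth_m)                 # next pulse right after deepest echo
    frame_rate = prf / lines
    pnl = samples * channels * lines          # P * N * L
    pl = samples * lines                      # P * L
    return {
        "samples_per_line": samples,
        "prf_hz": prf,
        "frame_rate_hz": frame_rate,
        "beamforming_mults_per_s": 2 * pnl * frame_rate,
        "beamforming_adds_per_s": pl * (2 * channels - 1) * frame_rate,
        "filter_mults_per_s": 2 * (filter_len + 1) * pnl * frame_rate,
        "filter_adds_per_s": filter_len * pnl * frame_rate,
    }

load = conventional_load()
for name, value in load.items():
    print(f"{name}: {value:.4g}")
```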
III. Techniques for Compact and Efficient
Beamforming
Digital beamformers oﬀer high image quality and ﬂex-
ibility at the expense of using a lot of computational re-
sources. Optimization of the signal processing can lead to
signiﬁcant savings in power, chip area, and cost. In this
section, the principles behind a new, eﬃcient architecture
will be described.
A. Sparse Sample Processing
In modern scanners, much more information is pro-
cessed than what is actually displayed on screen. The ul-
trasound images are shown on raster displays, which rarely
have a vertical resolution beyond that of a television (525
lines for NTSC) or VGA (640 × 480 pixels). Therefore,
on such displays an image line is represented by no more
than several hundred pixels. Sparse sample processing in
the form of pixel-based focusing was proposed by Kara-
man et al. [11]. According to that approach, samples are
produced for focal points that correspond to raster dis-
play pixels. These focal points generally do not lie on the
straight line representing the beam axis, except for lin-
ear array imaging; therefore, the information for focusing
is hard to derive in a recursive fashion. The present ap-
proach processes samples that correspond to equidistant
focal points lying on the beam axis. In this way, it is pos-
sible to calculate the focusing information in a recursive
fashion.
The achievable image depth with an ultrasound scanner
is determined by its transmit power, the level of the noise
introduced by the analog front end, the number of channels
and the ﬁlters used. These determine the SNR budget of
the sampling system. For an imaging situation in which the
frequency-dependent propagation attenuation in tissue is
b dB/(cm·MHz) and the SNR budget is A decibels, the
image depth at which the SNR becomes 0 dB is:
$$d = \frac{A}{2 b f_0}, \tag{2}$$
where f0 is the excitation frequency. The achievable im-
age depth can be expressed in wavelengths (λ = c/f0) as
follows:
$$d_\lambda = \frac{d}{\lambda} = \frac{A}{2 b c}. \tag{3}$$
The axial resolution of an imaging system can be eval-
uated by creating the image (point spread function) of
a point reﬂector. The expected echo signal in that situa-
tion is the excitation waveform convolved twice with the
transducer impulse response. The matched ﬁlter applied
on the received radio frequency (RF) data in this calcula-
tion is the time reversal of the expected echo signal. The
image data is produced by ﬁltering the echo signal with the
matched ﬁlter and calculating the envelope of the result.
For a case in which the excitation waveform is one period
of a sinusoid at a central frequency f0 and the transducer
has 60% bandwidth measured at −6 dB around the same
frequency, the imaging system axial resolution at −3 dB is
approximately 1.87λ and approximately 2.67λ at −6 dB.
For avoiding loss in signal information, the distance be-
tween the reconstructed samples has to be less than the
calculated axial resolution. From the calculations for the
achievable image depth and for the axial resolution, the
necessary number of samples per line can be calculated.
For a sampling system with an SNR budget of 150 dB, operating
at 3 MHz in a medium with b = 1 dB/(cm·MHz) and
c = 1540 m/s, dλ is approximately 487. Sampling that distance
at each λ requires 487 samples to be reconstructed
along the image line.
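As a quick check of Eqs. (2) and (3), the following Python sketch evaluates both expressions for the example above. The function names are hypothetical; the only subtlety is the unit handling, since the attenuation is given in dB/(cm·MHz) and the speed of sound therefore has to be expressed in cm·MHz (that is, cm per microsecond).

```python
def achievable_depth_cm(snr_budget_db, attenuation_db_per_cm_mhz, f0_mhz):
    """Eq. (2): depth at which the SNR has dropped to 0 dB."""
    return snr_budget_db / (2.0 * attenuation_db_per_cm_mhz * f0_mhz)

def achievable_depth_wavelengths(snr_budget_db, attenuation_db_per_cm_mhz,
                                 c_m_per_s=1540.0):
    """Eq. (3): the same depth expressed in wavelengths."""
    c_cm_mhz = c_m_per_s * 100.0 / 1e6    # 1540 m/s -> 0.154 cm*MHz
    return snr_budget_db / (2.0 * attenuation_db_per_cm_mhz * c_cm_mhz)

depth_cm = achievable_depth_cm(150.0, 1.0, 3.0)
depth_lambda = achievable_depth_wavelengths(150.0, 1.0)
print(depth_cm, round(depth_lambda))
```

For the parameters in the text this gives a depth of 25 cm, or about 487 wavelengths.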
As is the case for the conventional beamforming, the
reconstruction of the envelope of the signal requires its
in-phase and quadrature components. In the sparse sam-
ple processing approach, the quadrature signal cannot be
produced by ﬁltering of the in-phase signal because the
latter is an undersampled representation of the echo sig-
nal. Therefore, both components have to be created at the
same stage. This is achieved by using in-phase and quadra-
ture reconstruction ﬁlters, as explained below.
B. Beamforming Using Oversampled Signals
A delta-sigma modulator approximates the input signal
by feeding back the error into the decision loop, and shap-
ing the quantization noise spectrum away from the band
of interest. Appropriate ﬁltering applied on the modulator
output bit stream suppresses the noise, and valid samples
can be reconstructed at the same or lower sampling rate.
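The error-feedback principle can be illustrated with a deliberately simplified model. The Python sketch below implements a first-order modulator with a 1-bit quantizer; this is an assumption made for brevity (the design considered later in the paper uses a second-order modulator), and the function name is hypothetical. The running average of the output bits tracks the input, which is the property that a reconstruction filter exploits.

```python
import math

def delta_sigma_first_order(samples):
    """Simplified first-order delta-sigma modulator with outputs in {-1, +1}.

    Inputs are expected to lie within [-1, 1].
    """
    integrator = 0.0
    bits = []
    for x in samples:
        y = 1 if integrator >= 0.0 else -1   # 1-bit quantizer on the loop state
        integrator += x - y                  # accumulate the fed-back error
        bits.append(y)
    return bits

# Example input: a 3 MHz sinusoid sampled at 140 MHz, half of full scale.
fs, f0 = 140e6, 3e6
signal = [0.5 * math.sin(2.0 * math.pi * f0 * n / fs) for n in range(1400)]
stream = delta_sigma_first_order(signal)

# A constant input is reproduced by the mean of the bit stream.
dc_stream = delta_sigma_first_order([0.5] * 1000)
dc_estimate = sum(dc_stream) / len(dc_stream)
```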
Because performing a large number of multiplications
and summation at a high clock frequency is not economi-
cal and the target data rate is much lower than the DSM
sampling frequency due to the necessary oversampling, the
output samples usually are produced by passing the mod-
ulator output through consecutive stages of comb ﬁltering
and decimation.
The oversampling conversion oﬀers several advantages
for ultrasound beamforming over the use of multibit ADC.
First, the delta-sigma modulators can be integrated in
large numbers on a chip, with the requirement of one input
and one-bit output per modulator. Second, the intersam-
ple interpolation that is used with multibit ﬂash ADC can
be avoided because the delay resolution of a DSM beam-
former is determined by the sampling rate of the modula-
tors, which is inherently high. Third, the time-gain com-
pensation and/or channel weighting can be incorporated
to a certain degree (25 dB of gain range has been demon-
strated [12]) in the modulator by varying the amplitude of
the feedback voltage.
The reconstruction process in DSM beamformers can
take place after summation of the aligned echo signals from
the channels. Although the reconstruction in this case is
applied on a multibit stream, the implementation is still
more compact than in the case in which separate sample
reconstruction is performed on each channel.
In dynamic receive focusing, only one channel (cor-
responding to the transducer element from which the
beam/line originates) can have a linear delay development
in time. All other channels have nonlinear delay develop-
ment; therefore, samples from the DSM output streams
have to be skipped or repeated. This introduces errors in
the reconstructed values from the beamformer.
1. Previous Approaches for Oversampled Beamforming:
Freeman et al. [13] developed a modiﬁed modulator archi-
tecture in which the amount of feedback of the modulator
is controlled by the delay logic of the beamformer in order
Fig. 2. Signal processing in the proposed beamformer, illustrated with four channels. The analog echo signals are modulated into one-bit
streams. Corresponding bit-stream sequences from the diﬀerent channels are added, and the result is ﬁltered to produce in-phase and
quadrature components.
to compensate for the skipped/repeated samples due to
focusing. Such an architecture requires specially designed
modulators and, therefore, cannot be easily upgraded with
improved generic modulators.
Kozak and Karaman [14] proposed sampling with
nonuniform sampling clock, speciﬁc for each channel, so
that the delays are incorporated and all channels produce
the same number of samples per image line. That solution
requires a large memory for controlling the sampling clock.
Also, it does not attempt to compensate for the introduced
discontinuities in the DSM bit streams.
Both of these approaches come close to using the per-
formance potential of the oversampled converters, at the
expense of more complex beamformer structure, and by
disrupting the modulation process.
2. Approach with Preserved Modulation Process: The
new oversampled beamformer architecture diﬀers from
the previously developed ones in that only the number of
samples necessary for display is reconstructed, using FIR
ﬁlters that yield in-phase and quadrature signal compo-
nents.
The signal processing is illustrated in Fig. 2. The ana-
log input signals sk(t) (k being the channel index) from
diﬀerent channels are modulated into bit streams qk[n] in
the DSM. In order to sum the echoes coming from a cer-
tain focal point (indicated by arrows in the plots of sk(t)),
sequences of bits (shown in black) are selected from the
streams qk[n], at positions that correspond to the appro-
priate channel delays. The length of the sequences is equal
to the length of the reconstruction ﬁlters that will be used.
The selected sequences are summed into sequence r[n],
which then is ﬁltered by the in-phase hI [n] and quadrature
hQ[n] ﬁlters to yield in-phase sˆI [n] and quadrature sˆQ[n]
components of the signal from the chosen focal point. In
Fig. 2, all possible reconstructed in-phase and quadrature
samples are shown in gray. In accordance with the sparse
sample reconstruction approach, only one out of several
tens of possible samples is reconstructed. The in-phase and
quadrature components of the signal convey information
about its phase. Subsequent sample reconstructions for the
same position reveal the presence and the amount of phase
Fig. 3. DSM output spectrum (gray) for simulated echo signal for
random scatterers and frequency response of the matched ﬁlter for
the given situation (black) for an oversampling ratio of 20.
change in the echo signal from that position and can be
used for velocity estimation.
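The processing chain of Fig. 2 for a single focal point can be sketched in Python as follows. This is an illustration under assumptions, not the hardware implementation: the function name `reconstruct_iq` is hypothetical, the bit streams are taken to hold values of −1 and +1, and each output is computed as the FIR filter output at the last sample of the selected window.

```python
def reconstruct_iq(bit_streams, start_indices, h_i, h_q):
    """Reconstruct one in-phase/quadrature sample pair for one focal point.

    bit_streams   -- one list of +/-1 values per channel (q_k[n] in Fig. 2)
    start_indices -- per-channel window start, set by the focusing delays
    h_i, h_q      -- in-phase and quadrature FIR coefficients (equal length)
    """
    length = len(h_i)
    # Sum the delay-aligned bit sequences from all channels into r[n].
    r = [0] * length
    for stream, start in zip(bit_streams, start_indices):
        for j in range(length):
            r[j] += stream[start + j]
    # FIR output at the end of the window: sum over k of h[k] * r[length-1-k].
    s_i = sum(h_i[k] * r[length - 1 - k] for k in range(length))
    s_q = sum(h_q[k] * r[length - 1 - k] for k in range(length))
    return s_i, s_q

# Tiny example with two channels and three-tap filters.
streams = [[1, -1, 1, 1, -1], [-1, 1, 1, -1, 1]]
s_i, s_q = reconstruct_iq(streams, start_indices=[1, 2],
                          h_i=[1, 0, 0], h_q=[0, 0, 1])
```

Because only one window per focal point is processed, the cost per output sample is two inner products of the filter length, regardless of the number of channels.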
C. Reconstruction Filters
In general, the DSM reconstruction ﬁlter has to be in-
versely matched to the noise transfer function (NTF) of
the modulator, e.g., if the NTF is band-rejecting (pushing
noise away from a given center frequency), the ﬁlter should
be band-pass with the same center frequency.
The best ﬁlter for a known signal in the presence of
white noise is the matched ﬁlter, which is a time-reversed
and delayed version of the expected signal [15]. In an ul-
trasound beamformer, the expected signal from a point re-
ﬂector is the transmitted excitation convolved twice with
the impulse response of the transducer. Because the ampli-
ﬁers in transmit and receive have much greater bandwidth
than the transducer, their impulse response is not a limit-
ing factor and is not taken into account.
The matched ﬁlter for the expected echo signal should
be able to ﬁlter out the quantization noise because it has
a band-pass transfer function centered around the central
frequency of the useful signal, as shown in Fig. 3. The
TABLE I
Target Beamformer Parameters.

Parameter                                         Value
Transducer center frequency f0                    3 MHz
Target image SNR                                  60 dB
Target delay-resolution-induced sidelobe level    60 dB
Number of channels                                64
transfer function of the matched ﬁlter drops below −60 dB
for frequencies above twice the central frequency. There-
fore, the matched ﬁlter, sampled at the DSM sampling
frequency, is used as an in-phase reconstruction ﬁlter, and
a Hilbert transformation of it is used as a quadrature re-
construction ﬁlter.
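A possible way of deriving such a filter pair is sketched below in Python, using only the standard library. The expected echo is formed by convolving the excitation twice with a transducer impulse response, the in-phase filter is its time reversal, and the quadrature filter is obtained through a DFT-based Hilbert transform. The Hann-weighted two-cycle impulse response is an assumption made for illustration (the paper does not give the impulse response in this section), and all function names are hypothetical.

```python
import cmath
import math

def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def hilbert_transform(x):
    """90-degree phase shift of all frequency components via a direct DFT."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if k == 0 or 2 * k == n:
            spectrum[k] = 0.0            # DC and Nyquist bins are removed
        elif 2 * k < n:
            spectrum[k] *= -1j           # positive frequencies: -90 degrees
        else:
            spectrum[k] *= 1j            # negative frequencies: +90 degrees
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

fs, f0 = 140e6, 3e6
period = round(fs / f0)                  # samples per period of f0
# Excitation: one period of a sinusoid at f0.
excitation = [math.sin(2.0 * math.pi * f0 * t / fs) for t in range(period)]
# Assumed transducer impulse response: two cycles at f0 with a Hann envelope.
n_imp = 2 * period
impulse = [math.sin(2.0 * math.pi * f0 * t / fs)
           * 0.5 * (1.0 - math.cos(2.0 * math.pi * t / (n_imp - 1)))
           for t in range(n_imp)]

echo = convolve(convolve(excitation, impulse), impulse)
h_i = echo[::-1]                         # matched filter = time-reversed echo
h_q = hilbert_transform(h_i)             # quadrature reconstruction filter
```

The filter length produced by this sketch depends on the assumed impulse response and should not be read as the length used in the implementation.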
IV. Image Quality
The image quality of the proposed beamformer was
compared to that of a conventional digital beamformer.
First, the necessary oversampling ratio (OSR) was calcu-
lated. Second, echo signals were processed using oversam-
pled beamforming and using conventional beamforming.
A. Calculating the Necessary Sampling Frequency
The target image quality parameters and the number
of channels are shown in Table I.
The delay resolution of a beamformer has a high im-
pact on its ability to focus in a given direction, while
rejecting signals from other directions. According to the
most restrictive of the published formulae, given in [6],
the worst-case discrete quantization sidelobe level (due to
periodic phase errors over the array) in a beamformer is
described as:
$$SL_{focus} \approx \frac{2}{m L \cos\varphi} \, \frac{1}{IPG} \left( \frac{r \lambda}{m} \right)^{1/2}, \tag{4}$$
where:
$$IPG = \frac{1}{N} \sum_{n=0}^{N-1} w_n^2 \tag{5}$$
is the incoherent power gain of an N-element array with
apodization coefficients wn, n = 1 . . . N, ϕ is the beam
angle from the normal, λ is the wavelength, m = fs/f0 is the
ratio of the sampling frequency and the central frequency,
L is the aperture size, and r is the distance along the beam.
The maximum random quantization sidelobe level (due
to random phase errors over the array) is:
$$SL_{peak} \approx \frac{\pi}{m} \left( \frac{4.6 \, ENBW}{3 N} \right)^{1/2}, \tag{6}$$
where:
$$ENBW = \frac{IPG}{CPG}, \qquad CPG = \left[ \frac{1}{N} \sum_{n=0}^{N-1} w_n \right]^2 \tag{7}$$
is the equivalent noise bandwidth, (CPG being the coher-
ent power gain of an N -element array). The maximum
sidelobe level is SLmax = max(SLfocus, SLpeak).
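The quantities in Eqs. (5) to (7) and the random quantization sidelobe level of Eq. (6) are straightforward to evaluate numerically. The Python sketch below does so for an arbitrary apodization; the function names are hypothetical, and the Hamming window definition (0.54 − 0.46 cos) is the common textbook form, assumed here because the paper does not spell it out. The sketch returns the level of Eq. (6) as a plain ratio and makes no assumption about how it is converted to decibels or split between transmit and receive.

```python
import math

def hamming(n):
    """Textbook Hamming window of length n (assumed form)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]

def array_gains(weights):
    """Return (IPG, CPG, ENBW) following Eqs. (5) and (7)."""
    n = len(weights)
    ipg = sum(w * w for w in weights) / n
    cpg = (sum(weights) / n) ** 2
    return ipg, cpg, ipg / cpg

def peak_random_sidelobe(m, weights):
    """Eq. (6): maximum random quantization sidelobe level (as a ratio)."""
    _, _, enbw = array_gains(weights)
    return (math.pi / m) * math.sqrt(4.6 * enbw / (3.0 * len(weights)))

uniform_gains = array_gains([1.0] * 64)
hamming_enbw = array_gains(hamming(64))[2]
level = peak_random_sidelobe(25.6, hamming(64))
```

For a uniform apodization all three gains equal 1, and Eq. (6) shows that the level scales inversely with m, so doubling the sampling frequency halves it.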
Calculating these values for the particular case, the
following results are obtained: In the near ﬁeld the ran-
dom quantization sidelobes are prevalent, and for achiev-
ing sidelobe level of −60 dB, the necessary delay resolu-
tion should be 25.6 times smaller than the period of the
ultrasound pulse. The calculated fs = 76.8 MHz provides
−30 dB sidelobe level in transmit and −30 dB sidelobe
level in receive.
Apart from the sidelobe level, the sampling frequency
also determines the level of the quantization noise. The
quantization noise power of a multibit ADC with quanti-
zation step δ, assuming white noise, is:
$$P_{qe} = \frac{\delta^2}{12}. \tag{8}$$
The coherent sum of the signals across the array would
sum up the signal amplitudes and the channel noise pow-
ers. Therefore, the SNR improvement in the summed sig-
nal will be determined by the apodization proﬁle as fol-
lows:
$$G_{SNR} = \frac{\sum_{n=1}^{N} w_n}{\left[ \sum_{n=1}^{N} w_n^2 \right]^{1/2}}. \tag{9}$$
A 64-channel array with Hamming apodization can pro-
vide GSNR ≈ 16.7 dB, while uniform apodization yields
18 dB. That gain in SNR relieves the requirements toward
the sampling frequency.
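The two gain figures can be checked by evaluating Eq. (9) directly, as in the Python sketch below. The function names are hypothetical, the Hamming window is assumed to have the common 0.54 − 0.46 cos form, and the gain is converted to decibels as an amplitude ratio (20 log10), which is the convention that reproduces the quoted values.

```python
import math

def hamming(n):
    """Textbook Hamming window of length n (assumed form)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]

def snr_gain_db(weights):
    """Eq. (9), expressed in dB: coherent signal sum over incoherent noise sum."""
    gain = sum(weights) / math.sqrt(sum(w * w for w in weights))
    return 20.0 * math.log10(gain)

gain_hamming = snr_gain_db(hamming(64))
gain_uniform = snr_gain_db([1.0] * 64)
print(f"Hamming: {gain_hamming:.1f} dB, uniform: {gain_uniform:.1f} dB")
```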
In the following calculations, formulae for the SNR of
a DSM modulator from Johns and Martin [16] and Nor-
sworthy et al. [17] are used.
Given the requirement of 60 dB SNR after summation and an
array contribution of 16.7 dB, the channel SNR has to be
60 − 16.7 = 43.3 dB. Using a second-order modulator, the
necessary oversampling ratio, defined as
$OSR = \frac{f_s}{2 f_{high}}$, is calculated to be (using (14.32)
from [16]): $OSR = 10^{\frac{SNR - 6.02 - 1.76 + 12.9}{50}} \approx 9.3$.
Using Figure 4.13 from Norsworthy et al. [17], the necessary OSR
is estimated to be about 19. For the application
considered in this paper, f0 = 3 MHz and the upper limit of
the bandwidth of interest is fhigh = 1.3 × f0 = 3.9 MHz.
Therefore, the necessary sampling frequency according to the
stricter requirement is fs ≈ 148.2 MHz.
Because an expanding aperture will be used, combining
only several channels should provide suﬃcient SNR. The
chosen initial number of channels is four and, by summing
their signals, the noise is suppressed by 6 dB. The remain-
ing 54 dB of SNR can be obtained with an oversampling
ratio of 32 (using Figure 4.13 in [17]). That translates to
a sampling frequency of 249.6 MHz.
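The numbers in the preceding paragraphs follow from two short formulas, which the Python sketch below evaluates. The first function inverts the second-order SNR expression as it is used in the text (with a 1-bit quantizer, hence the single 6.02 dB term); the second converts an oversampling ratio into a sampling frequency through OSR = fs/(2 fhigh). The function names are hypothetical, and the OSR values of 19 and 32 are the ones read from Figure 4.13 of [17] in the text, not computed here.

```python
def required_osr_second_order(snr_db, quantizer_bits=1):
    """Invert SNR = 6.02*bits + 1.76 - 12.9 + 50*log10(OSR) for the OSR."""
    return 10.0 ** ((snr_db - 6.02 * quantizer_bits - 1.76 + 12.9) / 50.0)

def sampling_frequency_mhz(osr, f_high_mhz):
    """Sampling frequency implied by OSR = fs / (2 * f_high)."""
    return 2.0 * osr * f_high_mhz

f_high = 1.3 * 3.0                                   # upper band limit, MHz
osr_formula = required_osr_second_order(60.0 - 16.7)
fs_all_channels = sampling_frequency_mhz(19, f_high)
fs_four_channels = sampling_frequency_mhz(32, f_high)
print(round(osr_formula, 1), round(fs_all_channels, 1), round(fs_four_channels, 1))
```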
The chosen target sampling frequency for the simu-
lations and implementation was 140 MHz. For that fre-
TABLE II
Simulation Parameters.

Parameter                                  Value
Speed of sound                             1540 m/s
Transducer center frequency f0             3 MHz
Sampling frequency fs                      140 MHz
Excitation                                 1 period of a sinusoid at f0
Number of channels                         64
Transducer pitch                           0.26 mm
Transducer elevation focus and Tx focus    7 cm
Image depth                                15 cm
Transmit apodization                       uniform
Receive apodization                        Hamming window
Receive focus                              dynamic
Image type                                 phased array image
Image sector                               90 degrees
Number of lines                            135
quency, the image SNR was expected to be close to 60 dB
when all channels are in use.
B. Simulation Results
The ultrasound ﬁeld simulation program Field II [18]
was used for generating echo data from scatterers at dif-
ferent depths. The simulation parameters are given in
Table II. The echo signals then were beamformed using
ﬂoating-point beamforming and using a DSM beamformer.
The apodization was applied before DSM (i.e., in the ana-
log domain), and was not quantized. It did not vary with
depth.
1. Point Spread Function: The point spread functions
(PSF) obtained by conventional and oversampled beamforming
are shown in Figs. 4 and 5. As can be seen, the
resolution is approximately the same, and the noise level
in the DSM beamformation lies at about −60 dB due to
quantization noise.
2. Blood Flow Simulation: Due to the sparse sample
processing, ﬂow estimation on DSM beamformed data can
be performed only using an autocorrelation approach. The
suitability of the DSM beamformation for ﬂow estimation
was evaluated by simulating parabolic ﬂow below a trans-
ducer and creating the velocity proﬁle along the normal to
the transducer. The parameters of the imaging setup, in-
cluding excitation and matched ﬁlters, are the same as in
the PSF simulation. The characteristics of the simulated
ﬂow phantom and the pulse repetition frequency are given
in Table III. The phantom did not contain any stationary
scatterers. The conventional beamformation was preceded
by quantizing equivalent to that of a 12-bit ADC.
The echo signals were scaled to −30 dB relative to the
maximum possible input signal amplitude for the corre-
sponding A/D converters. No stationary echo canceling
was applied as there were no stationary scatterers. The
results from ﬂow estimation using conventionally beam-
Fig. 4. Simulated PSF: conventional beamforming (gray), DSM
beamforming (black) for depths of (top to bottom) 1, 3, 5, and 7 cm.
TABLE III
Simulated Flow Parameters.

Parameter                     Value
Tube radius                   0.01 m
Tube length                   0.04 m
Tube center depth             0.04 m
Tube slope                    45 degrees
Flow profile                  parabolic
Maximum blood velocity        0.5 m/s
Number of scatterers          3900
Pulse repetition frequency    5000 Hz
formed data and DSM beamformed data are shown in
Fig. 6.
It can be seen that the shapes of the velocity proﬁles ob-
tained through oversampled and conventional beamform-
ing for a given number of ﬁrings are similar, which shows
that the DSM beamforming with sparse sample processing
can replace conventional beamforming successfully.
C. Phantom Image Comparison
A set of echo RF data, sampled at 40 MHz, was ob-
tained using the experimental sampling system RASMUS
[19]. The target was a tissue mimicking phantom model
525 (Danish Phantom Design, Jyllinge, Denmark) with at-
Fig. 5. Simulated PSF: conventional beamforming (gray), DSM
beamforming (black) for depths of (top to bottom) 9, 11, 13, and
15 cm.
Fig. 6. Velocity proﬁles obtained using diﬀerent numbers of ﬁrings.
The real velocity proﬁle is drawn with a dotted line. The ﬂow esti-
mates for conventional beamformation are drawn in gray. The ﬂow
estimates for DSM beamformation are drawn in black.
Fig. 7. Images created using conventional beamforming and oversam-
pled beamforming. Dynamic range, 60 dB.
Fig. 8. Beamformer structure.
tenuation coeﬃcient of 0.5 dB/(MHz·cm). The phantom
consisted of randomly distributed background scatterers
(backscattering material) and wire targets. The transducer
was Vermon PA35/3D (Vermon, Tours, France). It is a ro-
tating phased array, here used without rotation. An aper-
ture of 40 adjacent elements was used. That data was re-
sampled at 140 MHz and was beamformed according to
the suggested architecture. The result, along with a con-
ventionally beamformed image, is shown in Fig. 7.
V. Implementation
In order to obtain performance and logic utilization ﬁg-
ures for the suggested architecture, it was implemented in
the hardware description language VHDL and synthesized
with target FPGA device XCV2000E-7 (Xilinx, Inc., San
Jose, CA). The functional blocks were tested only sepa-
rately for correct operation. In this section, the implemen-
tation parameters, choices, and results will be described.
The structure of the beamformer is illustrated in Fig. 8.
The functional blocks of a channel are a sample buffer,
an apodization multiplier, and a delay/weight generator. The
channel outputs are connected to a pipelined adder, followed
by in-phase and quadrature filters.
TABLE IV
Target Implementation Parameters.
Parameter                      Value
Target sampling rate, f_s      140 MHz
Length of the matched filter   120 coefficients
Number of lines per image      80
Image depth, d_max             15 cm
Samples per line               512
The target beamformer parameters are shown in Ta-
ble IV. The excitation was chosen to be the same as in the
simulations.
The length of the ﬁlters for the in-phase and quadra-
ture components is constrained by the number of 140 MHz
clock cycles that are available for producing a sample. That
number (denoted from here on N_r) is inversely proportional
to the density of the beamformed samples. For illustration
purposes, its maximum value for a given imaging
setup can be calculated as:
$$N_r = \frac{2 f_s d_{\max}}{N_s c}, \tag{10}$$
where dmax is the image depth, Ns is the number of recon-
structed samples, and c is the speed of sound. The minimal
available number of clock cycles is observed for the outer
channels, between the ﬁrst and the second read operation
they have to perform. That is the number that is used in
the calculation for the size of the reconstruction ﬁlters.
With the given image geometry and sampling rate, the
minimum number of available clock cycles between two
consecutive reconstructed samples is 33, when using ex-
panding aperture (maintaining F-number of 1 until all el-
ements are used).
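As a rough numerical check of (10), the following sketch evaluates the expression for the parameters of Table IV. The speed of sound of 1540 m/s is an assumption (a common soft-tissue value, not stated in this section), and the function name is hypothetical. The sketch only gives the upper bound for uniformly spaced samples; it does not model the expanding aperture that leads to the minimum of 33 cycles quoted above.

```python
def max_cycles_per_sample(fs_hz, d_max_m, n_samples, c_m_per_s=1540.0):
    """Evaluate Eq. (10): N_r = 2 * f_s * d_max / (N_s * c).

    c_m_per_s defaults to 1540 m/s, an assumed speed of sound.
    """
    return 2.0 * fs_hz * d_max_m / (n_samples * c_m_per_s)


# Table IV values: f_s = 140 MHz, d_max = 15 cm, 512 samples per line.
n_r = max_cycles_per_sample(140e6, 0.15, 512)
print(int(n_r))  # prints 53 (whole clock cycles available per sample)
```

Under these assumptions the bound is about 53 clock cycles per beamformed sample, which is consistent with the smaller worst-case figure of 33 cycles for the outer channels.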
The desired length of the FIR filters is 168 coefficients
if they are to represent the matched filter for the chosen
excitation and transducer impulse response. That length
was obtained by truncating the tails of the matched filter
40 dB below its maximum amplitude. With one coefficient
processed per clock cycle, a single processing path can
handle at most 33 coefficients in the 33 available clock
cycles. Therefore, the processing path is parallelized by
four, which allows shorter, approximately matched FIR
filters of length up to 4 × 33 = 132 coefficients to be used.
The options for the parallelization factor are discussed
further in Section VI.
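The relation between the parallelization factor and the usable filter length can be illustrated with a short sketch. It assumes that each parallel path processes one coefficient per clock cycle and that only power-of-two factors are considered; both function names are hypothetical.

```python
def max_filter_taps(parallel_factor, cycles_per_sample=33):
    """Longest FIR filter that fits, assuming one tap per path per clock cycle."""
    return parallel_factor * cycles_per_sample


def smallest_power_of_two_factor(required_taps, cycles_per_sample=33):
    """Smallest power-of-two parallelization factor that fits required_taps."""
    factor = 1
    while max_filter_taps(factor, cycles_per_sample) < required_taps:
        factor *= 2
    return factor


print(max_filter_taps(4))                 # prints 132
print(smallest_power_of_two_factor(120))  # prints 4
print(smallest_power_of_two_factor(168))  # prints 8
```

With 33 cycles per sample, a factor of four accommodates the 120-coefficient filter of Table IV, while the full 168-coefficient matched filter would need a factor of eight under these assumptions.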
A. Delay Buﬀer
For the FPGA implementation of the sparse sample pro-
cessing beamformer, using a Xilinx FPGA device is bene-
ﬁcial because it incorporates quite a large number of dual-
ported memory blocks called Block SelectRAM+ that pro-
vide simultaneous read and write capability with diﬀerent
word sizes. In the 4× parallelized case, the single bit sam-
ples are written one at a time but are read four samples
at a time.
Fig. 9. Writing and reading from the sample buffer provides 4× parallelized
data to the subsequent processing stages.
Because the requested start address (from the delay
generator) for the read operation is specified with one-sample
precision, an alignment unit has to be used so that
the first produced four-sample word from the buffer memory
contains samples 1 to 4, starting with the specified
address; the second, samples 5 to 8, and so on. Such an
alignment unit is created using a set of eight, two-stage
latches. The structure of the sample buﬀer and the align-
ment unit is shown in Fig. 9. The two least signiﬁcant bits
of the start address determine the multiplexer positions in
the alignment unit during the present read sequence. The
more signiﬁcant address bits are used as a read address for
the four-sample words and are increased by one in each
clock cycle. In the ﬁrst clock cycle after a valid address is
selected, an initial 4-bit word is read into the alignment
register. On every following clock cycle, the four bits that
are read from the sample buﬀer are shifted by four posi-
tions. That register provides a valid, aligned 4-bit word
after the second clock cycle. Thus, in 33 clock cycles, up
to 128 samples can be read.
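The behavior of the alignment unit can be mimicked with a small functional model. This is only a sketch under stated assumptions: the buffer is a Python list whose length is a multiple of four, the eight latches are modeled as the previous and the current four-sample word, and the function name is hypothetical. It does not describe the actual VHDL implementation.

```python
def read_aligned(buffer_samples, start, n_cycles, width=4):
    """Model of reading aligned four-sample words from a circular buffer.

    Each clock cycle one `width`-sample memory word is read. The previous
    and current words form a 2*width window, and the low bits of `start`
    select which `width` consecutive samples are output.
    """
    assert len(buffer_samples) % width == 0
    n_words = len(buffer_samples) // width
    offset = start % width        # two least significant address bits
    word_addr = start // width    # more significant address bits
    previous = None
    aligned = []
    for _ in range(n_cycles):
        base = (word_addr % n_words) * width   # circular word addressing
        current = buffer_samples[base:base + width]
        if previous is not None:               # valid from the second cycle on
            window = previous + current
            aligned.append(window[offset:offset + width])
        previous = current
        word_addr += 1                         # read address increases each cycle
    return aligned


# Sample indices stand in for the 1-bit samples so the alignment is visible.
buffer_samples = list(range(64))
print(read_aligned(buffer_samples, start=5, n_cycles=3))
# prints [[5, 6, 7, 8], [9, 10, 11, 12]]
```

In this model the first cycle only fills the register, so 33 cycles yield 32 aligned words, i.e., 128 samples, in agreement with the figure given above.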
B. Delay Generation Logic
The authors presented several delay generation tech-
niques with reduced memory requirements in [10], and an
analytical recursive delay generation algorithm developed
by Feldkämper et al. [9] was adopted. Efficient approximate
recursive algorithms are also known [20].
The delay generator logic generates independent sam-
ple indexes for each channel. These sample indexes are
used as start addresses for the reading from the sample
buﬀer. Because the sample buﬀers are organized as circu-
lar buﬀers, care should be taken to avoid overwriting data
that is about to be used at a later time instant. This is done
by either using suﬃciently large sample buﬀers or limiting
the maximum delay (index diﬀerence) between channels.
Using an expanding aperture in receive eﬀectively accom-
plishes the latter.
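The overwrite condition for a circular sample buffer can be expressed as a simple check. The sketch below is a simplified snapshot model and not part of the described design: it assumes absolute (unwrapped) sample indexes, ignores the samples written while a read sequence is in progress, and uses hypothetical names throughout.

```python
def read_is_safe(write_index, read_start, read_len, buffer_size):
    """Check that samples [read_start, read_start + read_len) are still valid.

    write_index is the absolute index of the next sample to be written;
    the circular buffer then holds the last `buffer_size` samples.
    """
    oldest_kept = write_index - buffer_size           # older samples are overwritten
    already_written = read_start + read_len <= write_index
    not_overwritten = read_start >= oldest_kept
    return already_written and not_overwritten


print(read_is_safe(1000, 800, 128, 256))  # prints True
print(read_is_safe(1000, 700, 128, 256))  # prints False (data already overwritten)
```

Limiting the index difference between channels, or enlarging `buffer_size`, keeps every channel's read window inside the valid range in this model.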
The computation logic for the delay generator consists
of four adders and one comparator plus control circuitry.
The number of parameters per line per channel is four
(12-bit words).
Fig. 10. Filter block structure.
C. Channel Apodization
The apodization (aperture smoothing, tapering) can be
applied either on the digital data or in the analog domain
before DSM. Varying the gain of the preampliﬁers or the
DSM on each channel requires one additional digital-to-
analog converter (oversampled or using pulse-width mod-
ulation) and output line per channel, which complicates
the beamformer structure.
Applying the weight coeﬃcient in the digital domain
means that, after the apodization block, the channel data
bit-width is equal to that of the weight coeﬃcient (the
DSM stream had width of one bit before that). Thus, the
sum operation across the channels is performed on multibit
numbers rather than on 1-bit numbers. To maintain a
high operating speed, the adders are pipelined, which
increases their latency.
The channel weighting block consists of two 5-bit reg-
isters containing representations of the current weighting
coefficient and its 2's complement. The value of the modulated
signal (1 or 0) determines which register content will
be used in the summation across channels that follows. The
weighting coeﬃcients are generated in a recursive fashion,
using the same calculation scheme and entry parameters
as the delay generation logic [21].
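The register selection described above amounts to a multiplication by +1 or -1 without a multiplier. The following sketch illustrates this; it assumes that a modulator output of 1 selects the weight and 0 selects its 2's complement, and that weights are non-negative values that fit a 5-bit signed register (0 to 15). The function names are hypothetical.

```python
def apodized_sample(bit, weight):
    """Select +weight or its two's complement (-weight) from a 1-bit sample."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    if not 0 <= weight <= 15:
        raise ValueError("weight must be in 0..15 for a 5-bit signed register")
    return weight if bit else -weight


def weighted_sum(bits, weights):
    """Sum of the apodized 1-bit samples across channels."""
    return sum(apodized_sample(b, w) for b, w in zip(bits, weights))


print(weighted_sum([1, 0, 1, 1], [3, 5, 7, 2]))  # prints 7
```

The result equals the sum of (2b - 1)·w over the channels, so the selection is equivalent to weighting a bipolar (±1) signal.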
D. Sum Across the Channels
The sum operation across all channels is pipelined in
order to incorporate numerous inputs and to process them
at a high clock frequency. The first stage in the pipeline contains
5-bit adders that sum the weighted outputs from the
channels. The adder pipeline is five levels deep and the
output is 10 bits wide.
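The stated depth and output width follow from a pairwise adder tree in which every level adds one bit of word length. The sketch below checks this for 32 channels; it assumes signed 5-bit inputs and a binary tree, and the function name is hypothetical.

```python
def adder_tree(values, in_bits=5):
    """Pairwise reduction; returns (sum, number of levels, output bit width)."""
    level = list(values)
    bits = in_bits
    depth = 0
    while len(level) > 1:
        if len(level) % 2:            # pad an odd level with a zero input
            level.append(0)
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        bits += 1                     # one extra bit per adder level
        depth += 1
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        assert all(lo <= v <= hi for v in level), "overflow in signed range"
    return level[0], depth, bits


print(adder_tree([15] * 32))   # prints (480, 5, 10)
print(adder_tree([-16] * 32))  # prints (-512, 5, 10)
```

Both extreme cases stay within the signed 10-bit range after five levels, which matches the pipeline depth and output width given in the text.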
E. In-Phase and Quadrature Filters
The 120-tap ﬁlter structure is illustrated in Fig. 10.
The 10-bit coeﬃcients are stored in random access mem-
ory (RAM) blocks of the FPGA and can be reloaded from
an external source, for example a computer. The quadra-
ture ﬁlters use the same ﬁlter structure and are applied
simultaneously to the sum data.
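The cycle budget of a 120-tap filter on a 4× parallelized data path can be checked with a small functional model. This is not the structure of Fig. 10; it is a sketch that assumes four multiply-accumulate lanes, each handling one coefficient per clock cycle, with hypothetical names and arbitrary test data.

```python
def fir_output_parallel(samples, coeffs, lanes=4):
    """Compute one FIR output sample, `lanes` taps per modeled clock cycle.

    Returns (output value, number of clock cycles used).
    """
    assert len(samples) == len(coeffs)
    acc = 0
    cycles = 0
    for i in range(0, len(coeffs), lanes):
        # One clock cycle: `lanes` products are formed and accumulated.
        acc += sum(s * c for s, c in zip(samples[i:i + lanes], coeffs[i:i + lanes]))
        cycles += 1
    return acc, cycles


# Arbitrary deterministic integer data for 120 taps.
samples = [(7 * i) % 11 - 5 for i in range(120)]
coeffs = [(3 * i) % 13 - 6 for i in range(120)]
value, cycles = fir_output_parallel(samples, coeffs)
print(cycles)  # prints 30
```

Under these assumptions one output sample needs 30 clock cycles, which fits within the 33 cycles available between consecutive reconstructed samples. The same routine would be run with a second coefficient set for the quadrature component.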
F. Implementation Results
The software package Xilinx ISE Series 4.2i was used
in combination with Synopsys FPGA Express (Synopsys,
Inc., Mountain View, CA) for compiling the VHDL code.
After compilation, the estimated gate count for the 32-
channel beamformer is 1,274,116.
The estimated maximum operation frequency of a
32-channel beamformer for target device Virtex E
XCV2000E-7BG560C by Xilinx, Inc., is 129 MHz. That
estimate takes into account only the logic switching de-
lays. After taking into account the signal routing delays,
the estimated maximum operation frequency is 71.6 MHz.
The estimated power consumption of the beamformer logic
for a clock frequency of 140 MHz is 1.4 W.
The highest number of beamformer channels that can
fit in the XCV2000E device is 57, at which point the beamformer
suffers a severe performance drop due to complex
and suboptimal routing of signals.
Several approaches exist for achieving higher operating
frequency. One of these is to use a faster FPGA device. An-
other is to exert more control over the placement process
in order to keep the routing lengths (and signal delays)
low. If close placement of logic blocks is not possible but
increased latency is acceptable, registers can be inserted
manually at appropriate places.
VI. Discussion
From the simulation plots, it can be seen that the quantization
noise of the DSM limits the image contrast. Improvement
can be achieved by using a more sophisticated
modulator architecture or by increasing the OSR. With
the suggested beamformer architecture, it is easy to con-
nect higher order delta-sigma modulators with the same
output data size without changes in the beamformer. The
data ﬂow principle allows straightforward expansion (reim-
plementation) for accommodating modulator data widths
of two or more bits. Beamforming multiple beams in one
ﬁring cycle can be done by connecting several beamformers
in parallel. In such a setup, each 1-bit modulator output
should be connected to the corresponding inputs in diﬀer-
ent beamformers. The fact that single bit digital signals
are propagated is very convenient for this kind of expan-
sion.
Use of longer excitations or a full-length matched filter
would necessitate wider parallelization, i.e., the matched
filtering would have to be implemented with more multiplication
blocks working in parallel. In the selected target
FPGA, the next suitable parallelization factor after four
is eight, because the available output word widths for the
Block SelectRAM+ can only be powers of 2. Other parallelization
factors can be implemented if the memory is
read at a higher rate and more complex alignment logic is
used. With increased ﬁlter length, the architecture allows
beamforming with coded excitation signals, e.g., chirps.
Because of the higher noise level in the image beamformed
using oversampling, the corresponding velocity estimates
contain a higher error than those obtained with conventional
imaging. Improvement in this area can be achieved by using
samples with a lower level of quantization noise.
The ways to achieve that have been outlined above.
The operation speed (and the OSR) of the architec-
ture can be increased by using a faster FPGA device. The
large diﬀerence in the operation frequency estimates shows
that the size of the design has a negative influence on the
achievable performance, unless manual placement is used
to minimize the longest paths.
VII. Conclusions
A novel, ﬂexible beamformer architecture using over-
sampling has been presented. A 32-channel beamformer
can be implemented in one standard FPGA, which can
be programmed easily and upgraded. Such a beamformer
oﬀers signiﬁcant space reductions compared to a conven-
tional multibit beamformer and can be used for building
an eﬃcient and compact ultrasound scanner.
Acknowledgments
The authors would like to thank all colleagues at the
biomedical engineering group and the reviewers for their
valuable comments.
References
[1] A. M. Chiang, P. P. Chang, and S. R. Broadstone, “PC-based
ultrasound imaging system in a probe,” in Proc. IEEE Ultrason.
Symp., 2000, pp. 1255–1260.
[2] J.-J. Hwang, J. Quistgaard, J. Souquet, and L. A. Crum,
“Portable ultrasound device for battleﬁeld trauma,” in Proc.
IEEE Ultrason. Symp., 1998, pp. 1663–1667.
[3] K. E. Thomenius, “Evolution of ultrasound beamformers,” in
Proc. IEEE Ultrason. Symp., 1996, pp. 1615–1621.
[4] J. C. Candy and G. C. Temes, Oversampling Delta-Sigma Data
Converters—Theory, Design and Simulation. New York: IEEE
Press, 1992.
[5] D. K. Peterson and G. S. Kino, “Real-time digital image recon-
struction: A description of imaging hardware and an analysis of
quantization errors,” IEEE Trans. Sonics Ultrason., vol. 31, pp.
337–351, 1984.
[6] S. Holm and K. Kristoﬀersen, “Analysis of worst-case phase
quantization sidelobes in focused beamforming,” IEEE Trans.
Ultrason., Ferroelect., Freq. Contr., vol. 39, pp. 593–599, 1992.
[7] B. D. Steinberg, “Digital beamforming in ultrasound,” IEEE
Trans. Ultrason., Ferroelect., Freq. Contr., vol. 39, pp. 716–721,
1992.
[8] W. E. Engeler, M. O’Donnell, J. T. Pedicone, and J. J.
Bloomer, “Dynamic phase focus for coherent imaging beam for-
mation,” U.S. Patent 5,111,695, Priority date: 1990, Granted
1992.
[9] H. T. Feldkämper, R. Schwann, V. Gierenz, and T. G. Noll,
“Low power delay calculation for digital beamforming in hand-
held ultrasound systems,” in Proc. IEEE Ultrason. Symp., 2000,
pp. 1763–1766.
[10] B. G. Tomov and J. A. Jensen, “Delay generation methods
with reduced memory requirements,” in Proc. SPIE Med. Imag.,
2003, pp. 491–500.
[11] M. Karaman, A. Atalar, and H. Köymen, “VLSI circuits for
adaptive digital beamforming in ultrasound imaging,” IEEE
Trans. Med. Imag., vol. 12, pp. 711–720, 1993.
[12] O. Norman, “A band-pass delta-sigma modulator for ultrasound
imaging at 160 MHz clock rate,” IEEE J. Solid-State Circuits,
vol. 31, pp. 2036–2041, 1996.
[13] S. R. Freeman, M. K. Quick, M. A. Morin, R. C. Anderson,
C. S. Desilets, T. E. Linnenbrink, and M. O’Donnell, “Delta-
sigma oversampled ultrasound beamformer with dynamic de-
lays,” IEEE Trans. Ultrason., Ferroelect., Freq. Contr., vol. 46,
pp. 320–332, 1999.
[14] M. Kozak and M. Karaman, “Digital phased array beamforming
using single-bit delta-sigma conversion with non-uniform over-
sampling,” IEEE Trans. Ultrason., Ferroelect., Freq. Contr.,
vol. 48, pp. 922–931, 2001.
[15] S. Haykin, Communication Systems. New York: Wiley, 2001.
[16] D. Johns and K. Martin, Analog Integrated Circuit Design. New
York: Wiley, 1997.
[17] S. R. Norsworthy, R. Schreier, and G. C. Temes, Delta-Sigma
Data Converters: Theory, Design and Simulation. Hoboken, NJ:
Wiley, 1996.
[18] J. A. Jensen, “Field: A program for simulating ultrasound sys-
tems,” Med. Biol. Eng. Comp., vol. 4, no. Suppl. 1, pt. 1, pp.
351–353, 1996.
[19] J. A. Jensen, O. Holm, L. J. Jensen, H. Bendsen, H. M. Ped-
ersen, K. Salomonsen, J. Hansen, and S. Nikolov, “Experimen-
tal ultrasound system for real-time synthetic imaging,” in Proc.
IEEE Ultrason. Symp., 1999, pp. 1595–1599.
[20] S. R. Freeman, M. O’Donnell, T. E. Linnenbrink, M. A. Morin,
M. K. Quick, and C. S. Desilets, “Beamformed ultrasonic im-
ager with delta-sigma feedback control,” U.S. Patent 5,964,708,
priority date, 1997, granted 1999.
[21] B. G. Tomov and J. A. Jensen, “Compact implementation of
dynamic receive apodization in ultrasound scanners,” in Proc.
SPIE Med. Imag., 2004, pp. 260–271.
Borislav Gueorguiev Tomov was born
on Nov. 28, 1973 in Montana, Bulgaria. He
earned an M.Sc. degree in electronics from
the Technical University of Sofia, Bulgaria, in
1996, and a Ph.D. degree from the Danish Technical
University, Denmark, in 2003. He is currently
an Assistant Professor at the latter. His
research interests include ultrasound imaging
and digital signal processing.
Jørgen Arendt Jensen (M’93–SM’02)
earned his Master of Science in electrical en-
gineering in 1985 and the Ph.D. degree in
1989, both from the Technical University of
Denmark. He received the Dr.Techn. degree
from the university in 1996. He has published
a number of papers on signal processing and
medical ultrasound and the book Estimation
of Blood Velocities Using Ultrasound, Cam-
bridge University Press, in 1996.
He is also the developer of the Field II simulation
program. He has been a visiting scientist
at Duke University, Stanford University, and the University of Illi-
nois at Urbana-Champaign. He is currently full professor of Biomed-
ical Signal Processing at the Technical University of Denmark at
Ørsted•DTU and head of Center for Fast Ultrasound Imaging. He
has given courses on blood velocity estimation at both Duke Univer-
sity and University of Illinois and teaches biomedical signal process-
ing and medical imaging at the Technical University of Denmark.
He has given several short courses on simulation, synthetic aperture
imaging, and ﬂow estimation at international scientiﬁc conferences.
He is also the co-organizer of a new biomedical engineering educa-
tion program oﬀered by the Technical University of Denmark and
the University of Copenhagen. His research is centered around simu-
lation of ultrasound imaging, synthetic aperture imaging and blood
ﬂow estimation, and constructing systems for such imaging.
