FPGA applications in signal and image processing by Appiah, Kofi E
University of Lincoln
DSP-Based Design
Digital signal processing (DSP) is the branch of electronics
concerned with the representation and manipulation of signals
in digital form. It comprises:
•Image Processing (including medical imaging)
•Non-linear signal processing applications like Artificial 
Neural Networks
[Figure: DSP signal chain. Analog input signal → A/D → digital input samples → DSP → modified output samples → D/A → analog output signal. Analog domain before the A/D and after the D/A; digital domain in between.]
DSP algorithms typically require huge numbers of
multiplications and additions. For example, node j in a
feed-forward ANN has a basic node function of the type

\[ x_j = g\left( f\left( \sum_{i=1}^{N} \omega_{ji} x_i + \varphi_j \right) \right) \]

where g() is the output function and f() is the
activation/squashing function, for which the
sigmoid/logistic function is the most common choice.
Consider the following implementations of the
expression:
Y = (A * B) + (C * D) + (E * F) + (G * H)
[Figure: alternative hardware implementations of Y; diagrams not reproduced.]
Bearing in mind that multipliers are relatively large and
complex compared with registers and multiplexers, the choice of
implementation becomes clear from the above.
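The trade-off can be made concrete with a small software model. The sketch below is only an illustration: it assumes two hypothetical architectures for Y, a fully parallel one and a serial one that shares a single multiplier through multiplexers and an accumulator register. The function names and the resource counts are assumptions for illustration, not the implementations shown on the original slide.

```python
# Software model of two hypothetical hardware implementations of
# Y = (A * B) + (C * D) + (E * F) + (G * H).
# Resource and cycle counts are illustrative assumptions.

def parallel_y(a, b, c, d, e, f, g, h):
    """Fully parallel: four multipliers feed an adder tree in one pass."""
    products = [a * b, c * d, e * f, g * h]          # 4 multipliers at once
    y = (products[0] + products[1]) + (products[2] + products[3])  # 3 adders
    return y, {"multipliers": 4, "adders": 3, "cycles": 1}

def serial_y(a, b, c, d, e, f, g, h):
    """Serial: one multiplier and one adder, reused over four cycles."""
    acc = 0                                          # accumulator register
    cycles = 0
    for x, w in ((a, b), (c, d), (e, f), (g, h)):    # muxes pick one pair per cycle
        acc += x * w                                 # one multiply-accumulate
        cycles += 1
    return acc, {"multipliers": 1, "adders": 1, "cycles": cycles}

operands = (1, 2, 3, 4, 5, 6, 7, 8)
y_par, cost_par = parallel_y(*operands)
y_ser, cost_ser = serial_y(*operands)
print(y_par, cost_par)
print(y_ser, cost_ser)
```

Both models return the same Y; they differ only in how many multipliers they occupy and how many cycles they need, which is the area-versus-speed choice the slide refers to.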
“Everybody knows that DSP is the technology driver for 
the semiconductor industry,” says Will Strauss, an analyst 
with Forward Concepts Co., Tempe, AZ. 
Artificial Neural Networks (ANN)
A massively parallel computing model used to find smart
algorithms for building computing devices for technical use.
Systems built with ANNs have the following useful features:
•Able to generalize.
•Massively parallel, suitable for hardware implementation.
•Contains a degree of redundancy, allowing “graceful
degradation”.
•Can handle incomplete data.
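The node function given earlier for node j of a feed-forward ANN can be sketched in a few lines. This is a minimal sketch under stated assumptions: f() is taken to be the sigmoid, g() is assumed to be the identity, and the names `node_output`, `weights` and `bias` are hypothetical. Each input costs one multiply-accumulate, which is why such nodes map naturally onto parallel hardware.

```python
import math

def sigmoid(net):
    """Logistic squashing function f()."""
    return 1.0 / (1.0 + math.exp(-net))

def identity(a):
    """Output function g(); assumed here to pass the activation through."""
    return a

def node_output(inputs, weights, bias, f=sigmoid, g=identity):
    """Compute g(f(sum_i w_ji * x_i + bias)) for a single node j."""
    net = bias
    for x_i, w_ji in zip(inputs, weights):
        net += w_ji * x_i        # one multiply-accumulate per input
    return g(f(net))

x = [0.5, -1.0, 2.0]             # example inputs x_i (made up)
w = [0.4, 0.3, 0.1]              # example weights w_ji (made up)
print(node_output(x, w, bias=-0.1))
```

With these example values the weighted sum is approximately zero, so the output is approximately 0.5, the midpoint of the sigmoid.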
Image Processing
The processing of 2D spatial information (a matrix of pixel values),
involving large amounts of data. This requires a level of
processing capacity that only parallel computing can offer.
Algorithms in this area can be broken down into pixel-operation 
functions, segmentation/motion estimation and interpretation.
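To give a feel for the data volume, the sketch below applies a 3x3 neighbourhood filter to a tiny image and counts the multiply-accumulate operations it needs. It is an illustration only: the function names, the box kernel, and the 640x480 at 25 frames per second figures are assumptions, not values from the original slides.

```python
def filter3x3(image, kernel):
    """Apply a 3x3 kernel (correlation form) to interior pixels; count MACs."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]   # border pixels are left at 0
    macs = 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
                    macs += 1
            out[r][c] = acc
    return out, macs

def macs_per_second(width, height, fps, taps=9):
    """Rough MAC rate for one filter pass per frame, ignoring borders."""
    return width * height * fps * taps

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
filtered, macs = filter3x3(image, box)
print(filtered[1][1], macs)
print(macs_per_second(640, 480, 25))
```

Every output pixel depends only on its own neighbourhood, so all of these multiply-accumulates could in principle run side by side in hardware.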
The Visual System as a Neural Network
Real-world applications
•Industrial Inspection
•Identification & Authentication
•Medical Diagnosis
•Defence
Most of these systems have
been implemented in software
on conventional sequential
computers for lack of
appropriate hardware, which
undermines the true potential
of the inherently parallel ANN.
Segmentation always poses a problem. The example
illustrates the ambiguity one faces when
segmenting an object from the
background: the image can be seen either as black with a
white oval or as white with a black hollow block.
A typical demonstration of the use
of an ANN in image segmentation,
based on MIN/MAX nodes.
Given the above image sequence, the algorithm is expected to produce the
foreground image when training converges.
Hardware Options
Research is not complete if an algorithm cannot feasibly be
implemented or no suitable hardware architecture is available.
A good hardware platform should provide good performance,
including high computational throughput, low power consumption and
small design area.




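MIN/MAX node functions for the ANN-based image segmentation:

\[ F(x) = \begin{cases} 1 & x \in \langle MIN, MAX \rangle \\ 0 & x \in \langle 0, MIN) \cup (MAX, 2^n - 1 \rangle \end{cases} \]

\[ G(x) = \begin{cases} 1 & x \geq T \\ 0 & x < T \end{cases} \]

\[ R = G\left( \sum_{i=1}^{U} F(I_i) \right) \]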
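The MIN/MAX node functions F, G and R can be sketched directly in software. This is a minimal sketch under assumptions: each input I_i is given its own (MIN, MAX) pair, the bounds and threshold in the example are made up, and training (how MIN, MAX and T are obtained) is not modelled. The names `f_node`, `g_threshold` and `response` are hypothetical.

```python
def f_node(x, lo, hi):
    """F(x): 1 if x lies in the closed interval [MIN, MAX], else 0."""
    return 1 if lo <= x <= hi else 0

def g_threshold(x, t):
    """G(x): 1 if x reaches the threshold T, else 0."""
    return 1 if x >= t else 0

def response(inputs, bounds, t):
    """R = G(sum_i F(I_i)), with one assumed (MIN, MAX) pair per input."""
    total = sum(f_node(x, lo, hi) for x, (lo, hi) in zip(inputs, bounds))
    return g_threshold(total, t)

pixels = [100, 120, 30]                      # example 8-bit inputs I_i
bounds = [(90, 110), (100, 130), (50, 60)]   # example MIN/MAX pairs
print(response(pixels, bounds, t=2))
print(response(pixels, bounds, t=3))
```

Only comparisons and a small count are needed, with no multipliers at all, which keeps this kind of node cheap to build in programmable logic.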
Outlook
•ANN algorithms with the appropriate hardware can readily be used to solve
computationally intensive signal processing problems.
•A new implementation platform calls for a new design process, for efficiency and
accuracy.
•With 0.1 µm technology, an ANN can be implemented on an FPGA by storing the
weights in on-chip RAM and updating them during training.
•ANNs can be used in image processing for motion estimation and
interpretation.
•The integration of storage and computation within a single FPGA unit is
key to the potential of reconfigurable computing systems for image
processing.
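[Figure: FPGA logic cell. Labels: 4-input LUT / 16x1 RAM / 16-bit SR; mux; flip-flop; signals a, b, c, d, e, y, q; clock, clock enable, set/reset.]
[Figure: multiply-accumulate (MAC) unit. Inputs A[n:0] and B[n:0] feed a multiplier (x), followed by an adder (+) and an accumulator; output Y[(2n - 1):0].]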
[Figure: bar chart of number of transistors (millions, axis 0 to 200) by technology/processor. Bars, in the order listed: Intel® Pentium® 4 (0.13µm), Intel® Pentium® M (90nm), Intel® Pentium® M (0.13µm), Intel® Pentium® III (0.18µm), Intel® Pentium® II (0.25µm), Intel® Pentium® Pro (0.35µm).]
Many Thanks…
•Nectar Electronics Ltd.
•Clive “Max” Maxfield, The Design Warrior’s Guide to FPGAs, 2004.
•Radek Holota, Neural Network with MIN/MAX Nodes for Image Recognition and its 
Implementation in Programmable Logic Devices, 2002.
•Dag Stranneby & William Walker, Digital Signal Processing & Applications, 2004.
•Forsyth & Ponce, Computer Vision: a modern approach, 2003.
•Paul Churchland, of the University of California at San Diego.
•Jing Ma, Signal and Image Processing Via Reconfigurable Computing, 2003.
•J. Batlle et al, A new FPGA/DSP-Based Parallel Architecture for Real-Time Image Processing, 
2002.
•Dan Ganousis, Top-Down DSP Design Flow to Silicon Implementation, March 2004
Source: Xilinx
FPGAs allow the DSP designer to “fit the architecture to the algorithm” – that is, the designer can 
implement as many parallel resources inside the FPGA as necessary to realize the performance 
required of the system. In general-purpose processors, the resources are fixed as each processor 
contains a finite number of basic computing functions such as multiply accumulators (MAC). 
Thus, in a general-purpose DSP processor, the designer must “fit the algorithm to the architecture”,
and the required performance is not as readily obtainable as in an FPGA.
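A rough back-of-envelope model shows why the number of MAC units matters. The sketch below is an illustration under simplifying assumptions: one MAC per unit per clock cycle, no memory or pipeline stalls, and a hypothetical workload of 64 MACs per output sample at a hypothetical 100 MHz clock. None of these figures come from the original slides.

```python
import math

def cycles_per_output(macs_needed, mac_units):
    """Clock cycles to finish one output when work is spread over the MAC units."""
    return math.ceil(macs_needed / mac_units)

def outputs_per_second(macs_needed, mac_units, clock_hz):
    """Sustained output rate under the one-MAC-per-unit-per-cycle assumption."""
    return clock_hz / cycles_per_output(macs_needed, mac_units)

MACS_PER_OUTPUT = 64          # hypothetical workload
CLOCK_HZ = 100_000_000        # hypothetical 100 MHz clock

for units in (1, 16, 64):
    print(units,
          cycles_per_output(MACS_PER_OUTPUT, units),
          outputs_per_second(MACS_PER_OUTPUT, units, CLOCK_HZ))
```

A processor with a fixed, small number of MAC units is stuck near the first row; an FPGA design can instantiate as many units as the algorithm needs and move toward the last row.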
The BIG Question???
What makes the Field Programmable Gate Array (FPGA) so special, and so suitable for
implementing these complex, processor- and memory-hungry systems?
The total number of transistors found in a general-purpose processor is directly proportional to
the DSP performance gap. This calls for a new and better way of implementing such applications or
systems.
