FPGA-based implementation of a fuzzy motion adaptive de-interlacing algorithm by Brox, Piedad et al.
P. Brox, S. Sánchez-Solano and I. Baturone
Instituto de Microelectrónica de Sevilla, IMSE-CNM (CSIC)
Edificio CICA, Av. Reina Mercedes s/n, 41012 Sevilla, SPAIN 
Phone: +34955056666, Fax: +34955056686, E-mail: {brox|santiago|lumi}@imse.cnm.es
Abstract- This paper presents the hardware implementation of a de-interlacing algorithm on Field-Programmable Technology for real-time processing. The algorithm evaluates the level of motion at each pixel and interpolates between a spatial and a temporal method according to the presence of motion. To achieve this, the algorithm employs a hierarchical structure with three simple fuzzy systems. The first one applies a set of fuzzy rules to detect motion; the second one selects the most suitable direction for an edge-dependent line average method; and the third one chooses the most adequate temporal method.
The hardware implementation of this algorithm combines a pipeline architecture with parallel processing of the fuzzy rules to accelerate the computation. As a result, the implementation is efficient in terms of computational time and hardware cost.
I. INTRODUCTION
Current FPGA devices include look-up tables, registers, multiplexers, and distributed and block memory, as well as dedicated circuitry for fast adders, multipliers, and I/O processing. These features, together with unlimited reprogrammability, have made FPGAs key components for implementing high-performance DSP systems in recent years, especially in digital communications, networking, video and imaging. Several tools have been developed to facilitate FPGA-based DSP design. The design flow presented here uses one of them, System Generator (XSG), a system-level tool developed by Xilinx [1].
This paper describes the design and implementation of an algorithm for video de-interlacing. Algorithms of this type are currently in demand for a wide range of devices, such as HDTVs, DVD players and projectors, that require a progressive scanning format. Interlacing was introduced by the TV industry as an efficient way to reduce the amount of transmitted information. It halves the video bandwidth by eliminating lines according to the order in which the fields are sent: odd-numbered fields contain only the odd lines of the image, whereas even-numbered fields contain only the even lines. A complete frame containing both odd and even lines can be calculated at the receiver using the interpolation techniques provided by de-interlacing methods [2].
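The line elimination described above can be illustrated with a minimal Python sketch. It assumes that frames are lists of rows and that both fields and lines are counted from 1; the function name `interlace` is hypothetical and is not part of the presented design.

```python
def interlace(frames):
    """Turn progressive frames (lists of rows) into interlaced fields.

    Field n (counted from 1) keeps the odd-numbered lines when n is odd and
    the even-numbered lines when n is even; lines are also counted from 1.
    """
    fields = []
    for n, frame in enumerate(frames, start=1):
        first_row = 0 if n % 2 == 1 else 1   # row index 0 is line number 1
        fields.append(frame[first_row::2])
    return fields


# Two tiny 4-line frames; each row is labelled with its line number.
frames = [[[1], [2], [3], [4]], [[1], [2], [3], [4]]]
fields = interlace(frames)
```

Each field holds half of the lines of its frame, which is the information a de-interlacing method has to restore.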
The paper is organized as follows. Section II briefly describes the algorithm. Section III explains the implementation strategy. Section IV presents the implementation results. Finally, Section V outlines the main conclusions.
II. ALGORITHM STUDY
Among de-interlacing algorithms, motion adaptive algorithms offer a good trade-off between cost and quality [3]. Algorithms of this kind combine a spatial method, $I_S$, and a temporal method, $I_T$, according to the presence of motion. They are based on the idea that temporal interpolation is well suited to static areas, whereas spatial interpolation is more adequate when the level of motion is high.
The algorithm implemented herein uses fuzzy logic to interpolate between the $I_S$ and $I_T$ de-interlacing methods. The input of the fuzzy system is a measure of motion, calculated as the two-dimensional convolution of a matrix $H$ of luminance differences between consecutive fields of the sequence:

$$\mathrm{motion} = \frac{\sum_{i,j} C_{ij} H_{ij}}{\sum_{i,j} C_{ij}} = \frac{\begin{bmatrix} 2 & 4 & 2 \end{bmatrix} \begin{bmatrix} H_{11} & H_{12} & H_{13} \end{bmatrix}^{T}}{8} \qquad (1)$$

where $C_{ij}$ are the convolution weights and $H_{ij}$ take the following values (see Fig. 1):

$$H_{11} = \frac{|B - B_0|}{2}, \qquad H_{12} = \frac{|X_n - X_0|}{2}, \qquad H_{13} = \frac{|E - E_0|}{2} \qquad (2)$$

Fig. 1: Pixels involved in the calculation of motion: B0, X0 and E0 in field (t-1); B, E and the current pixel X in field (t); Xn in field (t+1). X lies on an interpolated line, B and E on transmitted lines.
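Expressions (1) and (2) can be checked numerically with a short Python sketch. The function and argument names are hypothetical, and the use of absolute differences is an assumption consistent with a motion measure that cannot be negative.

```python
C = (2, 4, 2)  # convolution weights of expression (1)


def motion(B, B0, Xn, X0, E, E0):
    """Motion level at the current pixel, following expressions (1) and (2)."""
    # Halved absolute luminance differences between fields (assumed absolute).
    H = (abs(B - B0) / 2, abs(Xn - X0) / 2, abs(E - E0) / 2)
    return sum(c * h for c, h in zip(C, H)) / sum(C)


static = motion(B=100, B0=100, Xn=50, X0=50, E=80, E0=80)
moving = motion(B=100, B0=60, Xn=50, X0=90, E=80, E0=40)
```

With identical luminance values in consecutive fields the measure is zero; in the second call every difference is 40, so each entry of H is 20 and the weighted mean is also 20.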
Different sizes of the matrices $H$ and $C$ have been studied to achieve a good trade-off between the resources required and the quality of the motion measurement [4]-[5]. The selected matrices, as can be seen in expression (1), include only neighbors in the vertical direction, since simulations with a wide range of sequences showed that horizontal neighbors have no decisive influence.
The values of matrix $H$ in expression (2) show that interpolated values calculated in the previous field ($B_0$ and $E_0$ in Fig. 1) are required. To calculate the first progressive frame, the spatial method $I_S$ is applied.
The influence of motion on the contribution of each interpolator is evaluated with the rules in Table 1. The fuzzy concepts small (S), medium (M) and large (L) used in the rulebase are modeled by the membership functions shown in Fig. 2. Using the Fuzzy Mean as defuzzification method, the new pixel value $X$ is calculated as follows:

$$X = \alpha_1 I_T + \alpha_2 I_S + \alpha_3 (\lambda I_T + \delta I_S) \qquad (3)$$

where $\alpha_i$ is the activation degree of rule $i$, and $\lambda$ and $\delta$ are linear coefficients whose sum is equal to one ($\lambda + \delta = 1$).

TABLE 1. Rulebase for interpolation selection
    if               then
1)  motion is S      I_T
2)  motion is L      I_S
3)  motion is M      λ I_T + δ I_S

Fig. 2: Membership functions S (small), M (medium) and L (large) used in the rulebase for interpolation selection. Vertical axis: µ_motion, from 0 to 1; horizontal axis: motion, with marks at 0.5, 8.5 and 72.5.

Our proposal introduces two main novelties over conventional motion adaptive methods. The first one is the use of fuzzy instead of crisp values to define the different motion levels. This provides more robust motion detection, since crisp thresholds tend to produce wrong decisions in areas where the level of motion is unclear. The second one is the inclusion of a third rule, which increases the interpolation capability of the fuzzy system. Rulebases with up to five rules have been analyzed in [6]; the rulebase with three rules provides the most attractive solution in terms of hardware resources and quality of the interpolated image.
Moreover, the proposed algorithm also uses two simple fuzzy systems to calculate the $I_S$ and $I_T$ interpolators. The spatial interpolator performs an edge-adaptive interpolation by analyzing the five predetermined directions (a1, a, b, c, c1) shown in Fig. 3(a) [7]. The rules in the table of Fig. 3(b) select the most adequate direction in which to average the luminance values. The fuzzy concepts very large (VL), large (L), small (S) and very small (VS) used in the rulebase are defined by the membership functions shown in Fig. 3(c). The final result $I_S$ is given by:

$$I_S = \beta_1 \left(\frac{A+F}{2}\right) + \beta_2 \left(\frac{C+D}{2}\right) + \beta_3 \left(\frac{A+C+D+F}{4}\right) + \beta_4 \left(\frac{A_1+F_1}{2}\right) + \beta_5 \left(\frac{C_1+D_1}{2}\right) + \beta_6 \left(\frac{B+E}{2}\right) \qquad (4)$$

where $\beta_i$ is the activation degree of rule $i$ in the table of Fig. 3(b).

Fig. 3: (a) Pixels involved in the calculation of the spatial interpolator. (b) Rulebase to select the spatial interpolation according to the presence of edges. (c) Membership functions of the fuzzy concepts used in the rulebase.
(a) Field (t): upper line A1, A, B, C, C1; lower line D1, D, E, F, F1; X is the pixel to interpolate. The inputs are a = |A-F|, a1 = |A1-F1|, b = |B-E|, c = |C-D| and c1 = |C1-D1|.
(b) Rulebase:
    if                                                           then
1)  a is S and b is L and c is L                                 I_S = (A+F)/2
2)  a is L and b is L and c is S                                 I_S = (C+D)/2
3)  a is VS and b is L and c is VS                               I_S = (A+F+C+D)/4
4)  a1 is S and a is L and b is L and c is VL and c1 is VL       I_S = (A1+F1)/2
5)  a1 is VL and a is VL and b is L and c is L and c1 is S       I_S = (C1+D1)/2
6)  otherwise                                                    I_S = (B+E)/2
(c) Membership functions VS (very small), S, L and VL (very large). Vertical axis: µ_a, from 0 to 1; horizontal axis: a, with marks at 4, 20, 52 and 68.
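The rulebase of Fig. 3(b) and expression (4) can be sketched in Python as follows. This is only an illustration under stated assumptions: the piecewise-linear shapes assigned to VS, S, L and VL are hypothetical (only the marks 4, 20, 52 and 68 come from Fig. 3(c)), the 'otherwise' rule is assumed to take the activation left over by rules 1 to 5, and all function names are invented for the example. The 'and' connective is implemented with the minimum operator.

```python
def falling(lo, hi):
    """Membership equal to 1 below lo and 0 above hi, linear in between."""
    return lambda v: min(1.0, max(0.0, (hi - v) / (hi - lo)))


def rising(lo, hi):
    """Membership equal to 0 below lo and 1 above hi, linear in between."""
    return lambda v: min(1.0, max(0.0, (v - lo) / (hi - lo)))


# ASSUMED shapes: only the axis marks 4, 20, 52 and 68 come from Fig. 3(c).
MF = {"VS": falling(4, 20), "S": falling(20, 52),
      "L": rising(20, 52), "VL": rising(52, 68)}


def spatial_interpolator(up, down, mf=MF):
    """Expression (4). up = (A1, A, B, C, C1), down = (D1, D, E, F, F1)."""
    A1, A, B, C, C1 = up
    D1, D, E, F, F1 = down
    a, a1 = abs(A - F), abs(A1 - F1)
    b = abs(B - E)
    c, c1 = abs(C - D), abs(C1 - D1)
    VS, S, L, VL = mf["VS"], mf["S"], mf["L"], mf["VL"]
    # Rules 1 to 5 of Fig. 3(b); 'and' is the minimum operator.
    beta = [
        min(S(a), L(b), L(c)),
        min(L(a), L(b), S(c)),
        min(VS(a), L(b), VS(c)),
        min(S(a1), L(a), L(b), VL(c), VL(c1)),
        min(VL(a1), VL(a), L(b), L(c), S(c1)),
    ]
    # ASSUMPTION: the 'otherwise' rule takes the remaining activation.
    beta.append(max(0.0, 1.0 - sum(beta)))
    outputs = [(A + F) / 2, (C + D) / 2, (A + C + D + F) / 4,
               (A1 + F1) / 2, (C1 + D1) / 2, (B + E) / 2]
    return sum(w * o for w, o in zip(beta, outputs))


flat = spatial_interpolator((100,) * 5, (100,) * 5)
edge = spatial_interpolator((200, 200, 200, 200, 200), (0, 0, 0, 200, 200))
```

In a flat area no edge rule fires and the result is the plain line average (B+E)/2. In the second call the luminance is constant along direction a (A equals F) while b and c are large, so rule 1 is fully active and the result is (A+F)/2.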
Another fuzzy system selects the best choice for temporal interpolation. It makes its decision depending on the similarity between two consecutive fields, given by the following expression:

$$\mathrm{similarity}(x,y,t) = \frac{|B - B_0| + |E - E_0|}{2} \qquad (5)$$

The pixels used in expression (5) are shown in Fig. 4(a). The rulebase uses a fuzzy transition to decide which pixel is the most adequate: the pixel in the previous field, $X_0$, or the pixel in the next field, $X_n$ (see the table of Fig. 4(b)). The fuzzy definitions used in the rulebase are shown in Fig. 4(c), and the result $I_T$ is calculated as follows:

$$I_T = \gamma_1 X_0 + \gamma_2 X_n \qquad (6)$$

where $\gamma_i$ is the activation degree of rule $i$ in the table of Fig. 4(b).
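The temporal interpolator of expressions (5) and (6) and the final blend of expression (3) can be sketched in Python as follows. Only the axis marks (1.5 and 9.5 in Fig. 4(c); 0.5, 8.5 and 72.5 in Fig. 2) are taken from the paper. The piecewise-linear shapes, the complementary labels, the value of λ and all names are assumptions made for illustration. Following expression (3), small motion weights $I_T$ and large motion weights $I_S$.

```python
def falling(lo, hi):
    """Membership equal to 1 below lo and 0 above hi, linear in between."""
    return lambda v: min(1.0, max(0.0, (hi - v) / (hi - lo)))


def rising(lo, hi):
    """Membership equal to 0 below lo and 1 above hi, linear in between."""
    return lambda v: min(1.0, max(0.0, (v - lo) / (hi - lo)))


# ASSUMED piecewise-linear shapes; only the axis marks come from the figures.
SIM_SMALL = falling(1.5, 9.5)       # Fig. 4(c)
MOTION_SMALL = falling(0.5, 8.5)    # Fig. 2
MOTION_LARGE = rising(8.5, 72.5)    # Fig. 2


def temporal_interpolator(B, B0, E, E0, X0, Xn):
    """Expressions (5) and (6): blend the previous and next field pixels."""
    similarity = (abs(B - B0) + abs(E - E0)) / 2
    gamma1 = SIM_SMALL(similarity)  # rule 1: similarity is S -> X0
    gamma2 = 1.0 - gamma1           # rule 2: L assumed complementary to S
    return gamma1 * X0 + gamma2 * Xn


def blend(motion, I_S, I_T, lam=0.5):
    """Expression (3); lam is a hypothetical lambda, delta = 1 - lam."""
    alpha1 = MOTION_SMALL(motion)   # small motion -> temporal
    alpha2 = MOTION_LARGE(motion)   # large motion -> spatial
    alpha3 = 1.0 - alpha1 - alpha2  # medium, assumed complementary
    delta = 1.0 - lam
    return alpha1 * I_T + alpha2 * I_S + alpha3 * (lam * I_T + delta * I_S)


static_pixel = blend(motion=0.0, I_S=10, I_T=90)    # only I_T contributes
moving_pixel = blend(motion=100.0, I_S=10, I_T=90)  # only I_S contributes
```

With these assumed shapes the three activation degrees always add up to one, which matches the absence of a normalizing denominator in expression (3).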
III. ALGORITHM IMPLEMENTATION WITH XSG
Advances in VLSI technologies have encouraged a rapid growth in the capacity and performance of FPGAs. In addition, the reconfigurability of FPGAs allows their hardware resources to be adapted to a specific processing system. This ability, together with the development of powerful design tools such as XSG, which considerably reduce the overall system development time, has made FPGAs one of the most attractive solutions for rapid prototyping of digital signal processing (DSP) applications. The following subsections describe the implementation of the de-interlacing algorithm proposed in this paper.
A. Design specifications
The experimental set-up is built around an XUP Virtex-II Pro Development board [8]. It is an advanced hardware platform that contains a Virtex-II Pro FPGA surrounded by peripheral components that can be used to create a complex system. The board incorporates expansion connectors that can be used to connect a video capture board, which acts as an interface between a video source (camcorder, VCR, CCD camera, etc.) and the board. The video decoder board is centered on the ADV7183B video decoder chip from Analog Devices, which detects standard analog baseband television signals (NTSC, PAL and SECAM) and provides a digital video output. The conversion follows the ITU-R BT.656 recommendation from the International Telecommunication Union (ITU) and is independent of the input standard (NTSC, PAL or SECAM).
This recommendation describes an interface in which the video signal is transmitted as 8-bit code words at 27 MHz, with luminance and chrominance samples multiplexed so that luminance samples arrive at 13.5 MHz. The system must therefore compute a new interpolated pixel value at 13.5 MHz.
B. System design with XSG
Current FPGAs incorporate a large number of block RAM resources. In particular, the design is developed on the Virtex-II Pro XC2VP30, which contains 136 block RAMs (BRAMs) with a total capacity of 2,448 Kb [9]. Each block RAM built into the FPGA can be used with a configurable data depth and width. To meet the system requirements, the design implements BRAMs with an 8-bit word width and a parametric memory depth that adapts to the format of the video sequence.
The fuzzy system that evaluates motion and selects the interpolation methods requires three field memories: one for the previous field (t-1), one for the current field (t), and one to store the interpolated values calculated for the previous field (see Fig. 1).
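A rough lower bound on the number of BRAMs needed by the three field memories can be estimated with the following Python sketch. It assumes that each 18-Kb block RAM stores 2,048 eight-bit words (its 2K x 9 aspect ratio), that a field holds half the lines of a frame, and it uses the QCIF (176x144) and CIF (352x288) formats considered in Section IV. The function names are hypothetical, and the mapping chosen by the design tools may use more blocks than this bound.

```python
import math

BRAM_WORDS_8BIT = 2048  # words per 18-Kb block RAM in its 2K x 9 aspect ratio


def field_samples(width, height):
    """Luminance samples in one field (half the lines of a frame)."""
    return width * (height // 2)


def min_brams(width, height, memories=3):
    """Lower bound on block RAMs for the field memories, 8 bits per sample."""
    per_memory = math.ceil(field_samples(width, height) / BRAM_WORDS_8BIT)
    return memories * per_memory


qcif_bound = min_brams(176, 144)  # 3 memories of 7 blocks each
cif_bound = min_brams(352, 288)   # 3 memories of 25 blocks each
```

The bound grows roughly fourfold from QCIF to CIF, which illustrates why the number of BRAMs is the resource most sensitive to the video format.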
Fig. 4: (a) Pixels involved in the calculation of the temporal interpolator: B0, X0 and E0 in field (t-1); B, X and E in field (t). (b) Rulebase to select the temporal interpolator: 1) if similarity is S then I_T = X0; 2) if similarity is L then I_T = Xn. (c) Membership functions S and L of the fuzzy concepts used in the rulebase, plotted as µ_similarity versus similarity = (|B-B0| + |E-E0|)/2, with marks at 1.5 and 9.5.
Fig. 5: Block diagram of the field memories. Each field memory (inputs CLK, DATA, ADDR, WE and EN) provides alternately the previous (white box) or current (grey box) field; a NOT gate complements the write enables of the two memories.
The block memories that implement field memories (t-1) and (t) are configured in 'read after write' mode. Writing is enabled in one of the field memories and disabled in the other. The two modes are therefore complementary, so that the current field alternates between the outputs of the two field memories, as shown in Fig. 5. A control signal is used to identify the current field.
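The alternation between the two field memories can be modeled behaviorally with a short Python sketch. It is not a description of the XSG blocks: the class and method names are hypothetical, and the 'read after write' mode of the BRAMs is not modeled. The attribute `current` plays the role of the control signal that identifies the current field.

```python
class PingPongFieldMemories:
    """Behavioral model of two field memories with complementary write enables."""

    def __init__(self, depth):
        self.mem = [[0] * depth, [0] * depth]
        self.current = 0  # control signal: memory that receives field (t)

    def write_current(self, addr, value):
        # Writing is enabled only in the memory selected by the control signal.
        self.mem[self.current][addr] = value

    def read_previous(self, addr):
        # The other memory still holds field (t-1).
        return self.mem[1 - self.current][addr]

    def next_field(self):
        # At the end of a field the roles of the two memories are swapped.
        self.current = 1 - self.current


memories = PingPongFieldMemories(depth=4)
for addr, value in enumerate([10, 11, 12, 13]):  # first field
    memories.write_current(addr, value)
memories.next_field()
for addr, value in enumerate([20, 21, 22, 23]):  # second field
    memories.write_current(addr, value)
previous = [memories.read_previous(addr) for addr in range(4)]
```

After the swap, the memory written during the first field is the one read as the previous field, without copying any data.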
The field memory that stores the interpolated pixels has been implemented following two different strategies: BRAMs, and distributed memory, which uses slices of the FPGA. Both alternatives are evaluated in Section IV. For the first two de-interlaced fields, this field memory stores the values calculated by the $I_S$ interpolator. For the remaining fields, the previously calculated field is used.
The XSG tool is integrated into the Simulink environment. It consists of a Simulink library, called the Xilinx blockset, and software that translates a Simulink model into a hardware realization described in VHDL. Fig. 6 shows the XSG design that implements the membership function L (see Fig. 2). The membership functions used in the other rulebases are implemented in a similar way. Note from the rules in the table of Fig. 3(b) that the antecedents are connected with 'and' connectives. The minimum operator is selected to implement these connectives.
Modern FPGA devices also incorporate embedded multiplier blocks [9]. The inputs of these blocks can be up to 18 bits wide, and the output up to 36 bits. They are optimized for high-speed operation and consume less power than a multiplier implemented in slices. In addition, using the embedded multipliers frees slices in the FPGA for other functions. Our design uses twelve of these multipliers to evaluate expressions (3), (4) and (6).
The inputs of the block that implements the fuzzy system to calculate $I_S$ are taken from the output of the current field memory (t). Ten luminance values are necessary to compute the five inputs (a1, a, b, c, c1) of the system (see Fig. 3(a)). A line buffer and eight registers are used to obtain the required luminance values, as shown in Fig. 7. XSG provides two ways to implement a line buffer: delay blocks or specific Virtex-II line buffers [1]. The delay block is a shift register of configurable length: data presented at the input appear at the output after a user-specified number of sample periods. The Virtex-II line buffer block delays a sequential stream of pixels by the specified buffer depth. It is optimized for the Virtex-II family, since it uses the 'read before write' option of the underlying Single Port RAM block. Both options have been used in the design implementation, as shown in Section IV.
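The role of the line buffer and the eight registers can be illustrated with a behavioral Python sketch. A single delay chain of `line_length + 5` samples stands in for the line buffer plus the two groups of four registers; the function name is hypothetical, the stream is assumed to arrive in raster order, and line borders are ignored, so some windows straddle two lines.

```python
from collections import deque


def ten_pixel_windows(pixels, line_length):
    """Yield (up, down) 5-pixel windows from a raster stream of field lines.

    up   = (A1, A, B, C, C1) on the line above the pixel to interpolate,
    down = (D1, D, E, F, F1) at the same columns on the line below.
    """
    delay = deque(maxlen=line_length + 5)  # line buffer plus register taps
    for p in pixels:
        delay.append(p)
        if len(delay) == delay.maxlen:
            samples = list(delay)
            up = tuple(samples[:5])     # delayed by one full line
            down = tuple(samples[-5:])  # five most recent samples
            yield up, down


# Two lines of six pixels each, numbered 0..11 in raster order.
windows = list(ten_pixel_windows(range(12), line_length=6))
```

The first window pairs pixels 0 to 4 of the first line with pixels 6 to 10 of the second line, that is, the same five columns on two consecutive field lines.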
IV. IMPLEMENTATION RESULTS
Fig. 6: XSG design to implement the fuzzy concept L.
Fig. 7: Block diagram to obtain the ten pixel values shown in Fig. 3(a): a line buffer and eight registers (R-1) at the data output of the field memory provide A1, A, B, C, C1 and D1, D, E, F, F1.
The performance of the proposed algorithm has been analyzed by de-interlacing standard video sequences that have been widely used as benchmarks in video processing applications. The interlaced video data have been obtained from these progressive sequences by eliminating lines. The peak signal-to-noise ratio (PSNR) has been employed as the figure of merit to compare the interpolated frames with the original ones. It is defined as follows:

$$\mathrm{PSNR} = 20 \log_{10}\left(\frac{255}{\sqrt{\mathrm{MSE}}}\right) \qquad (7)$$

where MSE is the mean squared error between the original and the reconstructed image.
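The PSNR of expression (7) can be computed with a few lines of Python. The sketch assumes 8-bit luminance images stored as lists of rows, takes the square root of the MSE as in the standard definition of PSNR, and uses a hypothetical function name.

```python
import math


def psnr(original, reconstructed):
    """PSNR in dB between two equally sized 8-bit images (lists of rows)."""
    squared_errors = [(o - r) ** 2
                      for row_o, row_r in zip(original, reconstructed)
                      for o, r in zip(row_o, row_r)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 20 * math.log10(255 / math.sqrt(mse))


reference = [[100, 100], [100, 100]]
estimate = [[110, 110], [110, 110]]
value = psnr(reference, estimate)  # MSE = 100
```

An error of 10 grey levels in every pixel gives an MSE of 100 and a PSNR of about 28.1 dB.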
The proposed algorithm has also been compared with other de-interlacing algorithms of lower or similar computational cost: four spatial methods (line doubling, line average, and conventional ELA using 3+3 and 5+5 taps); the simplest temporal de-interlacing algorithm, field insertion; two vertico-temporal filters with two and three fields [2]; and, finally, the fuzzy motion adaptive algorithms reported in [10] and [11]. Table 2 shows the average PSNR obtained when de-interlacing fifty fields of seven video sequences. As can be seen, the proposed algorithm achieves the best results. All the algorithms in Table 2 have been coded in Matlab, and the results correspond to their execution in double precision.
This section also reports implementation results in terms of device utilization, that is, the FPGA hardware resources used by the implementation, and the maximum frequency at which the design computes a new pixel value.
The XSG blocks that compose the design have been defined using parametric values, that is, their dimensions are not fixed but configured with variables from the input workspace. This makes the design highly reconfigurable, so that it can work with different video sequence formats. For instance, Table 3 summarizes the FPGA utilization for the QCIF (176x144) and CIF (352x288) formats. The designs use Virtex-II line buffer blocks to implement the line buffers and BRAMs to implement the three field memories. As can be seen in Table 3, processing a larger format mainly implies a large increase in the number of BRAMs, whereas the other resources rise moderately. The number of embedded multipliers used to implement expressions (3), (4) and (6) is the same for both formats.
Table 4 shows the implementation results when the Virtex-II line buffer blocks are replaced by delay blocks. This reduces the number of BRAMs, since each line buffer requires one BRAM, at the expense of a slight increase in the number of slices: 1.23 percentage points (QCIF) and 2.58 percentage points (CIF) of the device.
The field memory that stores the interpolated pixel values has also been implemented using distributed memory instead of BRAMs. The results show that this option is not efficient, since it implies a large increase in the number of slices used in the FPGA. The design for the QCIF format requires almost 90% of the slices, whereas there are not enough slices in the XC2VP30 FPGA to implement the algorithm for the CIF format. Finally, implementation results for the fuzzy systems that calculate the spatial and temporal interpolators are shown in Table 5.
The algorithm implementation can be evaluated from results obtained in the Simulink environment or by simulating the VHDL description generated by XSG with the ModelSim tool from Mentor Graphics. The average PSNR value of the hardware implementation is 37.71 dB for the 'Salesman' sequence and 40.35 dB for the 'Mother' sequence. Compared with the results in Table 2, errors are higher for the hardware implementation because the algorithm described in Matlab works with double-precision (64-bit) numbers.

TABLE 2. PSNR values in dB, with PSNR = 20 log10(255/√MSE), for different de-interlacing methods
Sequence             Missa   Paris   Trevor  Salesman  News    Mother  Carphone
Format               CIF     CIF     CIF     CIF       QCIF    QCIF    QCIF
Line Doubling        36.44   23.61   31.05   29.75     25.18   31.81   28.25
Line Average         40.47   26.67   35.04   33.53     29.25   35.94   32.61
ELA 3+3              39.49   25.53   34.11   32.11     26.63   35.39   32.65
ELA 5+5              38.56   24.64   33.31   30.17     25.92   34.20   31.51
Field Insertion      38.36   29.86   34.36   36.17     33.13   36.14   30.34
VT 2 fields          40.25   30.73   36.61   36.54     35.46   39.61   34.08
VT 3 fields          40.52   31.37   37.16   36.95     35.67   40.89   34.54
Technique in [10]    40.01   33.12   35.38   37.62     34.73   39.49   32.27
Technique in [11]    40.18   35.28   36.69   38.29     37.51   41.87   34.78
Proposal             40.81   35.87   37.63   38.35     38.78   42.11   35.09

TABLE 3. Implementation results in terms of hardware resources for the complete proposed algorithm. The design uses Virtex-II line buffers and BRAMs to implement field memories
Sequence format   Slices           Slice flip-flops   4-input LUTs     BRAMs          Embedded multipliers
QCIF              1357 (9.91%)     1505 (5.49%)       1306 (4.76%)     35 (25.73%)    12 (8.82%)
CIF               1490 (10.87%)    1550 (5.65%)       1515 (5.53%)     92 (67.64%)    12 (8.82%)
The design that implements the algorithm combines a pipeline architecture with parallel processing of the fuzzy rules to accelerate the computation. As a result, a new pixel value is interpolated every 9.61 ns (104.04 MHz). The design therefore meets the timing constraints for real-time processing.
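The timing margin can be verified with simple arithmetic, shown here as a Python sketch. It compares the reported maximum frequency with the 13.5 MHz and 27 MHz rates quoted for the ITU-R BT.656 interface in Section III.A; the variable and function names are hypothetical.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1e3 / freq_mhz


achieved_mhz = 104.04                 # maximum frequency reported above
margin_vs_27 = achieved_mhz / 27.0    # about 3.9 times faster
margin_vs_13_5 = achieved_mhz / 13.5  # about 7.7 times faster
```

Both ratios are well above one, so the pipeline can produce one interpolated pixel per clock cycle faster than either rate requires.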
V. CONCLUSIONS
This paper presents the hardware implementation of a de-interlacing algorithm on a Virtex-II Pro FPGA. The algorithm uses three simple fuzzy systems to interpolate the non-transmitted lines of video signals. One fuzzy system decides the contribution of the spatial ($I_S$) and temporal ($I_T$) interpolators according to the presence of motion. Another fuzzy system calculates the $I_S$ interpolator, which adapts to the existence of edges in the image. Finally, the third fuzzy system calculates the most adequate $I_T$ interpolator. The three fuzzy systems use simple fuzzy rulebases, which are implemented in parallel to accelerate the computation. The implementation strategy employs a pipeline architecture and provides a new interpolated pixel in every clock period. As a result, an efficient implementation of the algorithm in terms of processing time and hardware cost is achieved.
REFERENCES
[1] Xilinx Inc., “Xilinx System Generator for DSP (v9.1.01) User’s Guide”, March 2007. Web address to download: http://www.xilinx.com/support/sw_manuals/sysgen_ug.pdf.
[2] G. de Haan and E. B. Bellers, “De-interlacing: an overview,” in Proc. of the IEEE, Sep. 1998, pp. 1839–57.
[3] A. M. Bock, “Motion adaptive standards conversion between formats of similar field rates,” Signal Processing: Image Communication, vol. 6, no. 3, pp. 275–80, Jan. 1994.
[4] P. Brox, I. Baturone, S. Sánchez-Solano, J. Gutiérrez-Ríos and F. Fernández-Hernández, “A fuzzy edge-dependent motion adaptive algorithm for de-interlacing,” Fuzzy Sets and Systems. Special Issue: Image Processing, vol. 158, no. 3, pp. 337–347, Feb. 2007.
[5] P. Brox, I. Baturone and S. Sánchez-Solano, “A fuzzy motion adaptive algorithm for interlaced-to-progressive conversion,” in Proc. of Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU), Jul. 2006.
[6] P. Brox, I. Baturone and S. Sánchez-Solano, “Fuzzy motion adaptive algorithm for video de-interlacing,” in Proc. of International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES), Oct. 2006.
[7] P. Brox, I. Baturone and S. Sánchez-Solano, “A fuzzy edge-dependent interpolation algorithm,” in Soft Computing in Image Processing: Recent Advances. Heidelberg, Germany: Springer, 2007.
[8] Xilinx Inc., “Xilinx University Program Virtex-II Pro Development System UG069 (v1.0)”, March 2005. http://www.xilinx.com/univ/xup2vp.html
[9] Xilinx Inc., “Virtex-II Pro and Virtex-II Pro X FPGA User Guide UG012 (v4.1)”, March 2007. Web address to download: http://www.xilinx.com/bvdocs/userguides/ug012.pdf
[10] D. Van de Ville, W. Philips and I. Lemahieu, “Fuzzy-based motion detection and its application to de-interlacing,” in Fuzzy techniques in image processing. Book Series of Studies in Fuzziness and Soft Computing, 2000.
[11] J. Gutiérrez-Ríos, F. Fernández-Hernández, J. C. Crespo and G. Treviño, “Motion adaptive fuzzy video de-interlacing method based on convolution techniques,” in Proc. of Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU), Jul. 2004.
TABLE 4. Implementation results in terms of hardware resources for the complete proposed algorithm. The design uses delay blocks to implement line buffers and BRAMs to implement field memories
Sequence format   Slices           Slice flip-flops   4-input LUTs     BRAMs          Embedded multipliers
QCIF              1526 (11.14%)    1855 (6.77%)       1294 (4.72%)     32 (23.52%)    12 (8.82%)
CIF               1843 (13.45%)    2271 (8.27%)       1506 (5.49%)     89 (65.44%)    12 (8.82%)
TABLE 5. Implementation results in terms of hardware resources for the fuzzy systems to calculate the spatial and temporal interpolator
Interpolator   Sequence format   Slices          Slice flip-flops   4-input LUTs     Embedded multipliers
Spatial        QCIF              579 (4.22%)     659 (2.41%)        639 (2.33%)      6 (4.41%)
Spatial        CIF               721 (5.27%)     1073 (3.92%)       1016 (3.71%)     6 (4.41%)
Temporal       QCIF              72 (0.52%)      64 (0.23%)         58 (0.21%)       2 (1.47%)
Temporal       CIF               72 (0.52%)      64 (0.23%)         58 (0.21%)       2 (1.47%)
