General purpose VLSI median filter and its applications for image processing by Karaman Mustafa, Onural Levent, Atalar Abdullah
A GENERAL PURPOSE VLSI MEDIAN FILTER 
AND ITS APPLICATIONS FOR 
IMAGE PROCESSING 
Mustafa Karaman, Levent Onural, and Abdullah Atalar 
ABSTRACT 
A general purpose median filter configuration consisting
of two single-chip median filters is proposed. One of the
chips is designed for applications requiring variable
word-length and variable window size, whereas the other
is for real-time applications. The architectures of the
chips are based on odd/even transposition sorting. The
chips are implemented in 3-μm M2CMOS by using full-custom
VLSI design techniques. The chips, together with
reasonable external hardware, can be used for the
realization of many median filtering techniques. In this
paper, the VLSI design procedure of the chips and their
applications to different median filtering techniques for
image processing are presented.
1. INTRODUCTION 
The median of an odd number of elements is defined as the
middle element when the elements are sorted. The output of
a median filter is the median of its input data, and the
resulting nonlinear smoothing filter can remove impulsive
noise from signals and images while preserving edge
information [1]. Such filters are frequently used in many
signal and image processing applications. In terms of
impulsive noise suppression, edge preservation, and ease of
design, median filters perform better than other smoothing
filters such as linear filters [2] and generalized mean
filters [3].
In 1-D and 2-D standard median filtering applications, a
window of odd size w moves over the sampled values of the
signal or image, and the median of the samples within the
window is computed and written as the output element at the
location of the center of the window. Theoretical analysis
and applications of median filters can be found in the
literature [4,5]. Median filters are mostly implemented on
general purpose computers [6,7]. However, there are also
hardware implementations for faster filtering [8]. Because
of the low VLSI cost of sorting structures, most hardware
median filtering algorithms are based on sorting [9].
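The sliding-window operation described above can be sketched in software as follows. This is a minimal illustration, not the hardware method of this paper: the function name `median_filter_2d`, the square k x k window (so w = k*k), and the policy of copying border pixels unchanged are all assumptions of the sketch.

```python
from statistics import median


def median_filter_2d(image, k=3):
    """Standard 2-D median filter with a k x k window (k odd, so w = k*k is odd).

    Border pixels, where the window does not fit, are copied unchanged;
    that border policy is an assumption of this sketch.
    """
    if k % 2 == 0:
        raise ValueError("window side must be odd")
    rows, cols = len(image), len(image[0])
    r = k // 2
    out = [row[:] for row in image]
    for i in range(r, rows - r):
        for j in range(r, cols - r):
            # Collect the w samples under the window centred at (i, j).
            window = [image[i + di][j + dj]
                      for di in range(-r, r + 1)
                      for dj in range(-r, r + 1)]
            out[i][j] = median(window)
    return out


# A vertical edge (10 | 50) with one impulse (255) on the dark side.
noisy = [
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
    [10, 255, 10, 50, 50],
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
]
filtered = median_filter_2d(noisy)
```

In this small example the impulse is replaced by its neighbourhood value while the column where the edge sits keeps its original values, which is the edge-preserving behaviour the text refers to.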
In order to increase the performance of median filters for
particular applications, various techniques such as
weighted [10], separable [11], recursive [12],
adaptive-length [13], generalized [14], selective [15], and
hybrid [16] median filtering have been developed. The
computation of the median of a group of elements is the
fundamental operation in all of these techniques. Thus,
standard median filters are used as the basic components
for the realization of the other techniques.
The window size of the median filter and the word-length
of the elements are not the same in different applications.
Also, the required speed of the filtering operation varies
depending on the application. In order to meet these
changing demands, a general purpose VLSI median filter unit
which consists of two single-chip median filters, one
extensible and one real-time, is designed. The extensible
median filter chip is designed for applications requiring
variable word-lengths and variable window sizes, whereas
the real-time median filter chip is for real-time median
filtering applications. The architectures of the chips are
bit-level pipelined systolic structures based on odd/even
transposition sorting. The chips are implemented in 3-μm
M2CMOS by using full-custom VLSI design techniques. In the
following sections, the architectures, VLSI
implementations, and some possible applications of the
chips are presented.
2. ARCHITECTURES 
2.1 Extensible Median Filter Architecture 
The extensible median filter is an odd/even transposition
sorting network, which is a pipelined regular structure
consisting of 9 compare-and-swap stages (Fig. 1.a). Each
stage consists of 5 bitwise compare-and-swap units. Each of
these units compares the two one-bit numbers at its inputs
and interchanges them if necessary so that the larger one
is at the “top”. At the output of the last stage, the data
are sorted such that the largest is at the top and the
median is in the middle. At each clock, one bit from each
word (a total of 9 bits) enters the network and one bit of
the median is obtained at the output. The flow is from the
most significant bits toward the least significant bits,
both at the input and at the output. Because of the bitwise
serial data flow, this structure allows arbitrary
word-length, L.
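At the word level, the sorting scheme described above can be sketched as below. This is only a software illustration under assumptions: whole words are compared instead of single bits, the extension units are left out, and the function names are hypothetical. It relies on the standard property that n stages of odd/even transposition sort n inputs.

```python
def odd_even_transposition_sort(values):
    """Sort so that the largest value ends up at index 0 (the "top")."""
    data = list(values)
    n = len(data)
    for stage in range(n):              # n stages suffice for n inputs
        start = stage % 2               # alternate between the two pairings
        for i in range(start, n - 1, 2):
            if data[i] < data[i + 1]:   # larger element moves toward the top
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


def median_by_sorting(window):
    """Median of an odd-sized window: the middle line after the last stage."""
    ordered = odd_even_transposition_sort(window)
    return ordered[len(ordered) // 2]


window = [12, 200, 7, 45, 45, 3, 99, 180, 60]
result = median_by_sorting(window)
```

Every stage touches only neighbouring lines, which is what makes the structure regular and easy to pipeline in hardware.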
The bitwise compare-and-swap unit (CSU1) is a finite state
machine which has three legal operation states: equal,
pass, and swap. CSU1 is set to the equal state at the end
of each data word by a reset signal. Thus, the reset signal
flows through the stages of the network at a rate of one
stage per clock cycle by means of the pipelined delay
units. CSU1 stays in the equal state as long as its inputs
are equal. However, it locks itself into one of the pass or
swap states depending on its inputs and stays in that state
until it is reset. The state diagram and the operations in
the different states are given in Fig. 1.b.
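The state machine behaviour of CSU1 can be illustrated with the following sketch. It is a functional model under assumptions, not the chip's circuit: pipeline delays are ignored (all stages are evaluated for the same bit position at once), the extension units are omitted, and the names `CSU1.step` and `bit_serial_medians` are hypothetical.

```python
EQUAL, PASS, SWAP = "equal", "pass", "swap"


class CSU1:
    """Bitwise compare-and-swap unit with the states equal, pass and swap."""

    def __init__(self):
        self.state = EQUAL

    def reset(self):
        self.state = EQUAL

    def step(self, a, b):
        """a is the top input bit, b the bottom one; returns (top, bottom)."""
        if self.state == EQUAL and a != b:
            # First differing bit (MSB first) decides the order for the word.
            self.state = PASS if a > b else SWAP
        return (b, a) if self.state == SWAP else (a, b)


def bit_serial_medians(windows, word_length):
    """Stream each window of words MSB first through an n-stage network."""
    n = len(windows[0])
    stages = [{i: CSU1() for i in range(s % 2, n - 1, 2)} for s in range(n)]
    medians = []
    for words in windows:
        for stage in stages:            # reset at the end of each data word
            for unit in stage.values():
                unit.reset()
        median = 0
        for bit in range(word_length - 1, -1, -1):
            lines = [(w >> bit) & 1 for w in words]
            for stage in stages:
                for i, unit in stage.items():
                    lines[i], lines[i + 1] = unit.step(lines[i], lines[i + 1])
            median = (median << 1) | lines[n // 2]   # middle line carries the median
        medians.append(median)
    return medians


medians = bit_serial_medians(
    [[12, 200, 7, 45, 45, 3, 99, 180, 60], [1, 2, 3, 4, 5, 6, 7, 8, 9]], 8)
```

Because each unit only ever sees one bit per step, the same network works for any word length; only the number of steps per word changes.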
In the extensible median filter structure given in
Fig. 1.a, the upper and lower extension I/O's (x_{i/o}'s and
y_{i/o}'s) are used to extend the filter to larger window
sizes. For w = 9, the upper and lower extension inputs are
connected to logic 1's and logic 0's so that the
corresponding compare-and-swap units act as delay units. On
the other hand, the design allows the interconnection of
many of these chips to form median filters for w > 9.
Figure 1: The extensible median filter: a) architecture,
b) compare-and-swap unit (CSU1). Legend: A_i, B_i: one-bit
input data.

Figure 2: The real-time median filter: a) architecture,
b) compare-and-swap unit (CSU2). Legend: x, y, z: bits of
the inputs corresponding to the new elements in a 3 x 3
sliding window; m: bit of the median; the legend also marks
the bitwise delay units and the test inputs used for
testing the blocks individually. CSU2 operations:
A_o = if S_o then B_i else A_i; B_o = if S_o then A_i else
B_i; S_o = S_i + E_i (A_i < B_i); E_o = E_i (A_i = B_i).
The extensible median filter generates its outputs with a
delay of w + L clocks, and after the network is full, it
finds one L-bit median per L clocks. Although the resulting
speed may be sufficient for the real-time median filtering
of 512 x 512 frames with L < 3, it is not enough for the
real-time filtering of 1024 x 1024 frames with L > 1.
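The throughput argument above can be checked with a small calculation. The 30 MHz clock is the simulated maximum reported for the extensible chip in Section 3; the frame rate of 30 frames/s is an assumption of this sketch, since the paper does not state what rate it takes as "real-time", and the function names are hypothetical.

```python
def extensible_throughput(clock_hz, word_length):
    """Medians per second once the pipeline is full: one L-bit median per L clocks."""
    return clock_hz / word_length


def required_rate(width, height, frames_per_second):
    """Output pixels per second needed to keep up with the video stream."""
    return width * height * frames_per_second


CLOCK_HZ = 30e6   # simulated maximum clock of the extensible chip (Section 3)
FPS = 30          # assumed frame rate, not given in the paper

for L in (1, 2, 3, 4, 8):
    rate = extensible_throughput(CLOCK_HZ, L)
    ok_512 = rate >= required_rate(512, 512, FPS)
    ok_1024 = rate >= required_rate(1024, 1024, FPS)
    print(f"L={L}: {rate / 1e6:.2f} M medians/s, "
          f"512x512 ok={ok_512}, 1024x1024 ok={ok_1024}")
```

Under these assumptions the bit-serial chip keeps up with 512 x 512 frames only for short words and cannot keep up with 1024 x 1024 frames, which is the motivation for the bit-parallel real-time architecture of the next section.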
2.2 Real-Time Median Filter Architecture 
The real-time median filter is designed by interconnecting
8 odd/even transposition sorter blocks in parallel [9]
(Fig. 2.a). In this network, the data enter in such a way
that the most significant bits go to the first block, the
second most significant bits to the second block, and so
on. The bitwise compare-and-swap unit used in this network
is slightly different from that of the extensible one,
because the “swap” or “pass” information flows from the
upper to the lower block, so that each compare-and-swap
unit takes this information, uses it, updates it, and sends
it out (Fig. 2.b). For proper timing, delay units are
included at the input and output of the network.
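The way the swap/pass information travels from the upper to the lower blocks can be sketched as follows. The logic follows a reading of the equations in the Fig. 2.b legend (S for "swap decided", E for "all higher bits equal"); that reading, the initial values S = 0 and E = 1 at the most significant block, and the names `csu2` and `compare_and_swap_word` are assumptions of this sketch. The loop models only the data dependence between blocks, not the pipeline timing.

```python
def csu2(a, b, s_in, e_in):
    """One bitwise compare-and-swap unit of the bit-parallel network.

    a, b : top and bottom input bits for this bit position.
    s_in : 1 if a more significant bit already decided "swap".
    e_in : 1 if all more significant bits were equal.
    """
    s_out = s_in | (e_in & int(a < b))
    e_out = e_in & int(a == b)
    a_out, b_out = (b, a) if s_out else (a, b)
    return a_out, b_out, s_out, e_out


def compare_and_swap_word(x, y, word_length=8):
    """Compare two words by chaining one csu2 per bit, MSB block first."""
    s, e = 0, 1                     # nothing decided yet, all higher bits "equal"
    top = bottom = 0
    for bit in range(word_length - 1, -1, -1):
        a, b = (x >> bit) & 1, (y >> bit) & 1
        a_o, b_o, s, e = csu2(a, b, s, e)
        top = (top << 1) | a_o
        bottom = (bottom << 1) | b_o
    return top, bottom


pair = compare_and_swap_word(5, 6)
```

Unlike CSU1, this unit keeps no internal state between words; the decision is carried by the S and E signals passed from block to block.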
The real-time median filter has nine 8-bit data inputs and
generates one 8-bit median per clock. At every clock, three
new elements enter the chip, corresponding to the new
elements of a sliding 3 x 3 window. Since the clock period
is determined by the delay of one compare-and-swap unit
(CSU2), recent VLSI technology allows the implementation of
CSU2 at a speed higher than the real-time operation rate
for 1024 x 1024 frames with L = 8.
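The data flow of the sliding 3 x 3 window, with three new elements entering per step, can be mimicked in software as below. This is a behavioural sketch only: the generator name `running_medians_3x3` and the use of a library median in place of the sorting network are assumptions.

```python
from collections import deque
from statistics import median


def running_medians_3x3(image, row):
    """Yield medians of 3 x 3 windows centred on `row`, sliding left to right.

    Each step shifts in one new column of three pixels, mirroring the three
    new elements that enter the chip per clock; the other six are reused.
    """
    columns = deque(maxlen=3)       # the three most recent columns
    for col in range(len(image[0])):
        columns.append((image[row - 1][col], image[row][col], image[row + 1][col]))
        if len(columns) == 3:
            window = [pixel for column in columns for pixel in column]
            yield median(window)


frame = [
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
    [10, 255, 10, 50, 50],
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
]
row_medians = list(running_medians_3x3(frame, 2))
```

After the first two columns have been loaded, one median is produced for every new column, which corresponds to the one-median-per-clock behaviour described above.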
3. CHIPS 
Both the extensible and the real-time median filter
architectures are regular arrays of bitwise
compare-and-swap units. Also, their internal communication
schemes are simple and regular. This makes the VLSI
implementations easy and straightforward [17,18]. The
architectures are mapped to hardware by using the standard
CMOS logic style [19] in a 3-μm double-metal n-well
process. For the generation of the chip layouts and their
simulation, full-custom VLSI CAD tools [20,21] are used:
Magic for layout editing, and Spice, Rnl, and Esim for
simulations. The overall layouts of the chips are shown in
Fig. 3.
Figure 3: The layouts of the median filter chips:
a) real-time, b) extensible.

According to the simulation results, the extensible median
filter chip can run at a clock frequency of up to 30 MHz
with a power dissipation of less than 250 mW at this
frequency. The throughput of the chip is about 30/L mega
medians/s. The chip consists of about 5000 transistors and
has an area of 11.7 mm² (3 mm x 3.9 mm) and 28 pins. On the
other hand, the real-time median filter chip can run at a
clock frequency of up to 40 MHz with a power dissipation of
less than 800 mW at this frequency. It generates one median
per clock, so its throughput is 40 mega medians/s. It
consists of about 22000 transistors and has an area of
45 mm² (6.8 mm x 6.6 mm) and 40 pins.

The testing of the chips is easily accomplished by
functional test techniques [22], since the operations of
the cells can be selectively probed by using proper test
vectors. The test vectors and the expected outputs are
generated by using software tools written for these
purposes. There are 500 test vectors for the extensible
median filter chip, and 12,000 for the other one.

4. APPLICATIONS
In image processing applications, median filters are used
mainly for noise suppression and for edge detection. For
impulsive noise suppression, the standard median filtering
technique is a good choice. However, for the suppression of
nonimpulsive noise, other techniques such as
adaptive-length, separable, recursive, and weighted median
filtering may be more convenient. For edge detection,
generalized, hybrid, and selective median filtering
techniques are frequently used. In addition, weighted
median filtering can also be used for edge detection by
choosing the weight coefficients properly.

The designed median filter chips can be selectively used in
a processor environment by means of the chip enable signal
that each chip has. Furthermore, one can realize any median
filtering technique mentioned above by using the extensible
and/or the real-time median filter chips, with or without
reasonable external hardware:

• For the standard median filtering technique, the exact
medians of the elements in a window of size w = 9 with
arbitrary word length L can be found by using only one
extensible median filter chip. For w > 9 with arbitrary L,
at most ⌈w/9⌉² chips (⌈·⌉ indicates the smallest greater
integer) are required to find the exact medians. On the
other hand, the real-time median filter chip can find the
exact running medians of the elements in a window of fixed
size w = 9 with fixed word length L = 8 at the real-time
rate.
• The extensible median filter is a favorable choice for
realizing adaptive-length median filters [13], since one
can change the window size from 3 to indefinitely large
values by using the extensible median filter chip(s) and
applying logic 0's or 1's to the unused inputs of the
chip(s) appropriately.
• For the realization of weighted median filters [10], the
extensible median filter can be used with a pipelined
multiplier to multiply the input data by the weight
coefficients. Since all input data are entered into the
chip directly at each move of the window, one can realize
an adaptive weighted median filter by changing the weight
coefficients at each position of the window on the frame.
• A pair of the extensible or the real-time median filter
chips can be used as a selective median filter [15]
together with external control logic consisting of two
full-word subtracters and a full-word comparator.
• Either the extensible or the real-time median filter chip
can be used as a line-recursive median filter [13] by
loading the window elements from the frame appropriately.
• The chips can be used for the realization of separable
median filters [11] without any external hardware.
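The idea of tying unused inputs to logic 0's or 1's, mentioned above for adaptive-length filtering, can be illustrated as follows. The sketch assumes that the unused inputs are split evenly between all-ones and all-zeros words so that the middle output still carries the median of the real elements; the paper only says the constants are applied "appropriately", so this balanced choice, the function name `median_with_padding`, and the use of `sorted` in place of the sorting network are assumptions.

```python
def median_with_padding(elements, sorter_inputs=9, word_length=8):
    """Median of an odd-sized window smaller than the sorter, using padding."""
    n = len(elements)
    if n % 2 == 0 or n > sorter_inputs:
        raise ValueError("window size must be odd and at most sorter_inputs")
    pad = (sorter_inputs - n) // 2           # same count of 1-words and 0-words
    all_ones = (1 << word_length) - 1        # word of logic 1's: largest value
    padded = [all_ones] * pad + list(elements) + [0] * pad
    ordered = sorted(padded, reverse=True)   # stands in for the sorting network
    return ordered[sorter_inputs // 2]       # middle output line


small = median_with_padding([7, 200, 45])        # window size 3 on a 9-input sorter
medium = median_with_padding([5, 1, 9, 3, 7])    # window size 5
```

Since equally many padding words sit above and below the real data after sorting, the middle line is unaffected by them, and the window size can be changed without altering the hardware.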
5. CONCLUDING REMARKS 
A general purpose VLSI median filter unit consisting of two
single-chip median filters and its applications are
presented. The architectures of the chips are modular and
have regular communication schemes, which makes the VLSI
implementations rather easy and straightforward. Neither
architecture is well suited to implementation at larger
window sizes, since the area is proportional to w². We have
chosen w = 9 because this is the most commonly used window
size in two-dimensional median filtering applications.

The main contributions of this study are the architecture
of the extensible median filter and its VLSI
implementation. Another achievement of this study is the
implementation of the real-time median filter, which can
operate at the real-time rate for 1024 x 1024 resolution
frames.
ACKNOWLEDGMENT 
This research was sponsored by NATO’s Scientific 
Affairs Division in the framework of the Science for 
Stability Programme. 
References 
[1] J. W. Tukey, “Nonlinear (nonsuperposable) methods for
smoothing data,” in Conf. Rec., p. 673, EASCON 1974.
[2] A. V. Oppenheim and R. W. Schafer, Digital Signal
Processing, Englewood Cliffs, N.J.: Prentice-Hall, 1975.
[3] A. Kundu, S. K. Mitra, and P. P. Vaidyanathan,
“Application of two-dimensional generalized mean filtering
for removal of impulse noises from images,” IEEE Trans.
Acoustics, Speech, and Signal Processing, vol. ASSP-32,
no. 3, pp. 600-609, Jun. 1984.
[4] N. C. Gallagher, Jr., and G. L. Wise, “A theoretical
analysis of the properties of median filters,” IEEE Trans.
Acoustics, Speech, and Signal Processing, vol. ASSP-29,
pp. 1136-1141, Dec. 1981.
[5] E. Ataman and E. Alparslan, “Application of median
filtering algorithm to images,” Electronics Division,
Marmara Research Institute, Gebze, Turkey, Tech. Rep.
U1 78/10, Sep. 1978.
[6] E. Ataman, V. K. Aatre, and K. M. Wong, “A fast method
for real-time median filtering,” IEEE Trans. Acoustics,
Speech, and Signal Processing, vol. ASSP-28, pp. 415-421,
Aug. 1980.
[7] V. V. B. Rao and K. S. Rao, “A new algorithm for
real-time median filtering,” IEEE Trans. Acoustics, Speech,
and Signal Processing, vol. ASSP-34, pp. 1674-1675,
Dec. 1986.
[8] D. E. Knuth, The Art of Computer Programming:
Searching and Sorting, vol. 3. Reading, MA: Addison-Wesley,
1973.
[9] K. Oflazer, “Design and implementation of a single-chip
1-D median filter,” IEEE Trans. Acoustics, Speech, and
Signal Processing, vol. ASSP-31, pp. 1164-1168, Oct. 1983.
[10] T. Loupas, W. N. McDicken, and P. L. Allan, “Noise
reduction in ultrasonic images by digital filtering,” The
British Journal of Radiology, vol. 60, pp. 389-392,
Apr. 1987.
[11] T. A. Nodes and N. C. Gallagher, Jr., “Two-dimensional
root structures and convergence properties of the separable
median filter,” IEEE Trans. Acoustics, Speech, and Signal
Processing, vol. ASSP-31, pp. 1350-1365, Dec. 1983.
[12] C. G. Boncelet, Jr., “Recursive algorithms and VLSI
implementations for median filtering,” Proc. of IEEE
ISCAS'88, pp. 1745-1747.
[13] H. M. Lin and A. N. Willson, Jr., “Adaptive-length
median filters for image processing,” Proc. of IEEE
ISCAS'88, pp. 2557-2560.
[14] Y. H. Lee and S. A. Kassam, “Generalized median
filtering and related nonlinear filtering techniques,” IEEE
Trans. Acoustics, Speech, and Signal Processing,
vol. ASSP-33, pp. 672-683, Jun. 1985.
[15] S. J. Ko, Y. H. Lee, and A. T. Fam, “Selective median
filters,” Proc. of IEEE ISCAS'88, pp. 1495-1498.
[16] Y. Neuvo, P. Heinonen, and I. Defee, “Linear-median
hybrid edge detectors,” IEEE Trans. Circuits and Systems,
vol. CAS-34, pp. 1337-1343, Nov. 1987.
[17] M. J. Foster and H. T. Kung, “The design of special
purpose VLSI chips,” IEEE Computer, pp. 26-40, Jan. 1980.
[18] H. T. Kung, “Why systolic architectures?,” IEEE
Computer, pp. 37-46, Jan. 1982.
[19] N. Weste and K. Eshraghian, Principles of CMOS VLSI
Design, Reading, MA: Addison-Wesley, 1985.
[20] Berkeley CAD Tools User's Manual, EECS Dep.,
University of California at Berkeley, 1986.
[21] VLSI Tools Reference Manual, TR#87-02-01, Release 3.1,
NW Lab. Int. Sys., Dep. Computer Sci., University of
Washington, Feb. 1987.
[22] J. A. Abraham and W. K. Fuchs, “Fault and error models
for VLSI,” IEEE Proc., vol. 74, pp. 639-654, May 1986.
