In order to meet the changing demands of different median filtering applications, a VLSI median filter unit is designed and implemented in 3-μm M²CMOS by employing full-custom VLSI design techniques. The unit consists of two single-chip median filters, one extensible and one real-time. The architectures of the chips are bit-level pipelined systolic structures based on odd/even transposition sorting. The extensible chip is designed for applications requiring variable window sizes and variable word-lengths, whereas the other one is for real-time applications. Various median filtering techniques are easily realized by using the designed chips together with a reasonable amount of external hardware.
INTRODUCTION
Median filtering is a nonlinear smoothing technique that has been frequently used in many signal and image processing applications to filter out impulsive noise while preserving edge information [1]. In standard median filtering, a window of size w, where w is odd, moves over the sampled values of the signal or image, and the median of the samples within the window is computed and written as the output element at the location of the center of the window [2]. In terms of impulsive noise suppression, edge preservation, and ease of design, the performance of median filters is better than that of other smoothing filters such as linear filters and generalized mean filters [3].
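The sliding-window operation described above can be sketched in a few lines of Python. This is only an illustration for a one-dimensional signal; the function name `median_filter_1d` and the border handling (repeating the edge samples) are assumptions, since the text does not specify how borders are treated.

```python
from statistics import median


def median_filter_1d(samples, w):
    """Standard running median with an odd window size w.

    Border handling (repeating the edge samples) is an assumption made
    for this sketch; the text does not specify it.
    """
    if w % 2 == 0:
        raise ValueError("window size w must be odd")
    half = w // 2
    padded = [samples[0]] * half + list(samples) + [samples[-1]] * half
    # The output at position i is the median of the w samples centered on i.
    return [median(padded[i:i + w]) for i in range(len(samples))]


# Two impulses (255 and 0) on a step edge from 10 to 80.
noisy = [10, 10, 10, 255, 10, 10, 80, 80, 80, 0, 80]
print(median_filter_1d(noisy, 3))
# -> [10, 10, 10, 10, 10, 10, 80, 80, 80, 80, 80]
```

Both impulses are removed while the step from 10 to 80 stays at its original position, which is the edge-preserving behavior referred to above.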
In order to increase the performance of median filters for particular applications, various median filtering techniques have been developed. For impulsive noise suppression, the standard median filtering technique is a good choice. However, for suppression of nonimpulsive noise, other techniques such as adaptive-length [4], separable [5], recursive [6], and weighted [7] median filtering may be more convenient. For edge detection, generalized [8], hybrid [9], and selective [10] median filtering techniques are frequently used. In addition, weighted median filtering can also be used for edge detection by choosing the weight coefficients properly.
Since the computation of the median of a group of elements is the fundamental operation in all of the techniques cited above, they can be realized using the standard median filter as the basic component. Thus, there have been many efforts to develop high-performance software and hardware standard median filters [11, 12, 13, 14].
However, the window size of the median filter and the word-length of the elements are not the same in different applications. Also, the required speed of the filtering operation varies depending on the application. A median filter which is intended as a component of a general purpose signal or image processor must meet these changing demands. We propose a solution to that problem in the form of two single-chip median filters, one extensible and one real-time, which are implemented in 3-μm M²CMOS by employing full-custom VLSI design techniques. The extensible median filter chip is designed for applications requiring variable word-lengths and variable window sizes, whereas the real-time median filter chip is for real-time median filtering applications. The architectures of the chips are bit-level pipelined systolic structures based on odd/even transposition sorting. In the following sections, the architectures, VLSI implementations, and some possible applications of the chips are presented.
ARCHITECTURES

Extensible Median Filter Architecture
The extensible median filter is an odd/even transposition sorting network, a bit-level pipelined regular structure consisting of 9 compare-and-swap stages (Fig. 1). Each stage consists of 5 bitwise compare-and-swap units. Each of these units serially compares the two numbers at its inputs and interchanges them if necessary so that the larger one is at the "top". At the output of the last stage, the data are sorted such that the largest is at the top and the median is in the middle. At each clock, one bit from each word (a total of 9 bits) enters the network and one bit of the median is obtained at the output. The flow is from the most significant bits toward the least significant bits at both the input and the output. Because of the bitwise serial data flow, this structure allows an arbitrary word-length, L.
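As a word-level illustration of the sorting network, the following Python sketch runs w compare-and-swap stages over w values and reads the median from the middle position. It is not the bit-serial hardware: the function name is hypothetical, whole words are compared at once, and the extra units tied to the extension I/Os are left out.

```python
import random


def odd_even_transposition_median(window):
    """Sort w values with w compare-and-swap stages; return (sorted, median).

    Word-level sketch only: the chip compares bit-serially and has
    additional units for the extension I/Os, both omitted here.
    """
    data = list(window)
    w = len(data)
    for stage in range(w):
        start = stage % 2  # alternate between the two neighbor pairings
        for i in range(start, w - 1, 2):
            # Keep the larger value at the "top" (lower index).
            if data[i] < data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, data[w // 2]


ordered, med = odd_even_transposition_median([7, 3, 9, 1, 5, 8, 2, 6, 4])
print(ordered, med)  # -> [9, 8, 7, 6, 5, 4, 3, 2, 1] 5

# Quick check against Python's built-in sort on random 9-element windows.
for _ in range(100):
    sample = [random.randrange(256) for _ in range(9)]
    assert odd_even_transposition_median(sample)[0] == sorted(sample, reverse=True)
```

Nine stages are enough for nine inputs because an odd/even transposition sort of n elements completes in n stages.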
The bitwise compare-and-swap unit (CSU1) is a finite state machine which has three legal operation states: equal (SE = 01), pass (SE = 00), and swap (SE = 10) (Fig. 2).
CSU1 is set to the equal state at the end of each data word by a reset signal. Thus the reset signal flows through the stages of the network at a rate of one stage per clock cycle by means of the pipelined delay units. During the computation, CSU1 stays in the equal state and passes the input data unaltered as long as the two input bits are equal as they flow in. However, it locks itself into the pass state when it first finds that A_i > B_i, and passes the inputs unaltered. On the other hand, it locks itself into the swap state when it first finds that A_i < B_i, and swaps the inputs.
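The state machine can be mimicked in software as follows. This is a behavioral sketch under stated assumptions: the helper names `to_bits`, `from_bits`, and `csu1` are hypothetical, the reset is modeled by starting each word in the equal state, and the pipeline delays are ignored.

```python
def to_bits(value, length):
    """Return the bits of value, most significant bit first."""
    return [(value >> (length - 1 - i)) & 1 for i in range(length)]


def from_bits(bits):
    """Rebuild an integer from a most-significant-bit-first bit list."""
    result = 0
    for bit in bits:
        result = (result << 1) | bit
    return result


def csu1(a_bits, b_bits):
    """Bit-serial compare-and-swap; returns (top_bits, bottom_bits)."""
    state = "equal"  # models the reset applied at the end of the previous word
    top, bottom = [], []
    for a, b in zip(a_bits, b_bits):
        if state == "equal":
            if a > b:
                state = "pass"  # A is larger: lock and pass unaltered
            elif a < b:
                state = "swap"  # B is larger: lock and swap from here on
        if state == "swap":
            top.append(b)
            bottom.append(a)
        else:
            top.append(a)
            bottom.append(b)
    return top, bottom


top, bottom = csu1(to_bits(5, 4), to_bits(9, 4))
print(from_bits(top), from_bits(bottom))  # -> 9 5
```

Because the bits arrive most significant first, the first position where the two words differ decides the ordering for the rest of the word, which is why the unit can lock its state.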
In the extensible median filter structure given in Fig. 1, the upper and lower extension I/Os are used to extend the filter to larger window sizes. For w = 9, the upper and lower extension inputs are connected to logic 1's and logic 0's so that the corresponding compare-and-swap units act as delay units. On the other hand, the design allows many of these chips to be interconnected to form median filters for w > 9 (Fig. 3).
The extensible median filter generates its outputs with a delay of w + L clocks; after the network is full, it produces one L-bit median per L clocks. Although the resulting speed may be sufficient for the real-time median filtering of 512 x 512 frames with L < 5, it is not enough for the real-time filtering of 1024 x 1024 frames with L > 1.
Real-Time Median Filter Architecture
The real-time median filter is designed by interconnecting 8 odd/even transposition sorter blocks in parallel [13] (Fig. 4). In this network, the data enter in such a way that the most significant bits go to the first block, the second most significant bits to the second block, and so on. The bitwise compare-and-swap unit used in this network is slightly different from that of the extensible one, because the "swap" or "pass" information flows from the upper to the lower blocks: each compare-and-swap unit receives this information, uses it, updates it, and sends it on (Fig. 5). For proper timing, pipelined delay units are included at the input and output of the network.
The real-time median filter has nine 8-bit data inputs and generates one 8-bit median per clock. At every clock, three new elements enter the chip, corresponding to the new elements of a sliding 3 x 3 window. Since the clock period is determined by the delay of one compare-and-swap unit (CSU2), recent VLSI technology allows the implementation of CSU2 at a speed higher than the real-time operation rate for 1024 x 1024 frames with L = 8.
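The data flow of a sliding 3 x 3 window, in which each step brings in one new column of three elements, can be sketched as below. The function name and the use of a three-column buffer are assumptions made for illustration; how the window elements are actually buffered around the chip is not described in this section, and the median itself is taken with a library call rather than the sorter blocks.

```python
from collections import deque
from statistics import median


def running_median_3x3(image):
    """Return the medians of all interior 3x3 windows, row by row."""
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(1, rows - 1):
        columns = deque(maxlen=3)  # assumed buffer: the three most recent columns
        row_out = []
        for c in range(cols):
            # Three new elements enter per step: one new window column.
            columns.append((image[r - 1][c], image[r][c], image[r + 1][c]))
            if len(columns) == 3:
                window = [value for column in columns for value in column]
                row_out.append(median(window))
        output.append(row_out)
    return output


frame = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
print(running_median_3x3(frame))  # -> [[6, 7]]
```

Once the buffer holds three columns, every further step yields one median, which mirrors the one-median-per-clock behavior described above.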
[Table 1]
[Figure 6: The layouts of the median filter chips: a) extensible, b) real-time.]
The testing of the chips is easily accomplished by functional test techniques [20], since the operations of the cells can be selectively probed by using proper test vectors. The test vectors and the expected outputs are generated by using software tools written for these purposes. There are 500 test vectors for the extensible median filter chip, and 12,000 for the other one.
APPLICATIONS
The extensible and the real-time median filter chips can be selectively used in a processor environment by means of the chip enable signal that each chip has. Furthermore, one can realize any median filtering technique mentioned in the introduction by using the extensible and/or the real-time median filter chips, with or without a reasonable amount of external hardware:
For the standard median filtering technique, the exact medians of the elements in a window of size w = 9 with arbitrary word length L can be found by using only one extensible median filter chip. For w > 9 with arbitrary L, at most ⌈w/9⌉² chips (⌈·⌉ indicates the smallest greater integer) are required to find the exact medians (Fig. 3). On the other hand, the real-time median filter chip can find the exact running medians of the elements in a window of fixed size w = 9 with fixed word length L = 8 at the real-time rate.
The extensible median filter is a favorable choice for realizing adaptive-length median filters [4], since the window size can be changed from 3 to indefinitely large values by using one or more extensible median filter chips and applying logic 0's or 1's to the unused inputs of the chip(s) appropriately.
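One way to see why tying unused inputs to constants can shrink the window is sketched below. The even split between all-zero and all-one words is an assumption of this sketch (the text only says the logic levels are applied "appropriately"), and the function name and default parameters are hypothetical.

```python
from statistics import median


def median_with_padding(window, chip_inputs=9, word_length=8):
    """Median of a smaller odd window computed on a fixed-size sorter.

    Assumption: half of the unused inputs are tied to all 0's and half
    to all 1's, so equally many padding words fall below and above the
    data and the middle output is still the median of the real window.
    """
    w = len(window)
    if w % 2 == 0 or w > chip_inputs:
        raise ValueError("window size must be odd and fit the sorter")
    unused = chip_inputs - w  # even, because both sizes are odd
    lowest = 0
    highest = (1 << word_length) - 1
    padded = list(window) + [lowest] * (unused // 2) + [highest] * (unused // 2)
    return median(padded)


print(median_with_padding([12, 200, 7]))     # -> 12
print(median_with_padding([5, 1, 9, 3, 7]))  # -> 5
```

An adaptive-length filter could then vary the number of real samples from one window position to the next without changing the hardware.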
For the realization of weighted median filters [7], the extensible median filter can be used together with a pipelined multiplier which multiplies the input data by the weight coefficients. Since all input data are entered into the chip directly at each move of the window, one can realize an adaptive weighted median filter by changing the weight coefficients at each position of the window on the frame.
A pair of extensible or real-time median filter chips can be used as a selective median filter [10] together with external control logic consisting of two full-word subtractors and a full-word comparator.
Either the extensible or the real-time median filter chip can be used as a line-recursive median filter [4] by loading the window elements from the frame appropriately.
The chips can be used for the realizations of the separable median filters [5] without any external hardware.
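For readers unfamiliar with the separable technique, the sketch below applies a 3-point median along the rows and then a 3-point median down the columns of the intermediate result. It illustrates the technique in software only; the function name is hypothetical, border pixels are dropped for brevity, and the mapping of each pass onto the chips is not shown.

```python
from statistics import median


def separable_median_3x3(image):
    """Row-wise 3-point medians followed by column-wise 3-point medians."""
    rows, cols = len(image), len(image[0])
    # Pass 1: 3-point median along each row (interior columns only).
    row_pass = [[median(row[c - 1:c + 2]) for c in range(1, cols - 1)]
                for row in image]
    # Pass 2: 3-point median down each column of the row-pass result.
    return [[median([row_pass[r - 1][c], row_pass[r][c], row_pass[r + 1][c]])
             for c in range(cols - 2)]
            for r in range(1, rows - 1)]


block = [[9, 1, 2],
         [3, 8, 4],
         [5, 6, 7]]
print(separable_median_3x3(block))  # -> [[4]]
```

The row medians here are 2, 4, and 6, so the separable result is 4, whereas the exact median of the nine values is 5. The separable filter is therefore an approximation of the two-dimensional median, not an identical operation.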
CONCLUDING REMARKS
A VLSI median filter unit consisting of two single-chip median filters, and its applications, are presented. The architectures of the chips are modular and have regular communication schemes, which make the VLSI implementations rather easy and straightforward. Neither architecture is well suited to implementation at larger window sizes, since the area is proportional to w². We have chosen w = 9 because this is the most commonly used window size in two-dimensional median filtering applications.
The main contributions of this study are the design and implementation of two single-chip versatile components for signal and image processing: an extensible median filter chip for adaptive-word-length and adaptive-window filtering applications, and a real-time median filter chip for real-time filtering of images with sizes up to 1024 x 1024 pixels. Furthermore, it is concluded that a general purpose median filter unit can be formed by selectively using the chips in a full-scale general purpose digital signal or image processor environment.
