Abstract--A new technique for the implementation of a single hardware structure capable of computing any rank order filter is presented in this paper. The proposed technique, which is based on the majority gate, achieves faster extraction of setting flag signals and, therefore, shorter processing times are attained. A pipelined systolic array, suitable for performing rank order filtering, is also presented. Applications of rank order filters include digital image processing, speech processing and coding, and digital TV applications. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.
1. INTRODUCTION
Rank order filters are a class of nonlinear filters. The input of such a filter is a window of data with an odd number of elements. These elements are sorted in ascending order, and the output of the rank order filter with rank r is the rth element (the rth order statistic).(1) Special cases of rank order filters are the median, minimum and maximum filters, whose outputs are the median, the minimum and the maximum values of the input data window, respectively. Rank order filters exhibit excellent robustness properties and provide solutions in many cases where linear filters are inappropriate. They can suppress high-frequency and impulse noise in an image while avoiding extensive blurring, since they have good edge preservation properties. They have found numerous applications, such as in digital image analysis, speech processing and coding, and digital TV.(1,2)

Median and rank order filters are strongly related to morphological filters, another class of nonlinear filters.(3,4) It has been shown that erosions and dilations are special cases of rank order filters, and that any rank order filter can be expressed either as a maximum of erosions or as a minimum of dilations. Therefore, algorithms originally devised for rank order and median filters can be used for the realization of morphological operators.(5,6)

Several algorithms have been proposed for the realization of rank order filters, such as tree sorts, shell sorts and quick sorts.(1) Although these algorithms are suitable for software implementation, they result in inefficient hardware structures, since they handle the numbers at word level. Rank order filters can be implemented in VLSI using the threshold decomposition technique.(7-9) However, in this case hardware complexity increases exponentially with both the resolution of the numbers and the size of the data window. Therefore, implementation of filters capable of handling high-resolution numbers is not practical. Bit-sliced algorithms suitable for hardware implementation have been proposed.(10,11) In these implementations, the numbers are handled at bit level in order to obtain local minima and maxima, from which the rth order statistic is obtained. However, several building blocks are required to implement the local minima and maxima functions and, thus, the hardware complexity increases.

* Author to whom correspondence should be addressed. Email: ioannis@orfeas.ee.duth.gr.
Different hardware structures of an efficient algorithm for rank order filters have been presented.(12-14) These are based on the selection of intermediate signals through a device whose output is "1" if the number of its inputs which are "1" is greater than or equal to the rank of the filter, and "0" otherwise. This device has been implemented using three techniques: comparison after summation, the PBF technique, and a CMOS programmable device. The device for the median computation (majority gate) is shown in Fig. 1. It is a nonlinear voltage divider, built from output-wired inverters and an inverting buffer. The latter approach is the most advantageous of the three in terms of silicon area. It replaces either an N 1-bit binary tree adder and a [log2(N)+1]-bit comparator (comparison after summation technique) or N+1 gates of N 1 inputs (PBF technique) with 2N+2 transistors. However, once the device has been designed, the rank order of the filter is fixed (PBF and CMOS programmable device approaches). In the comparison after summation approach, it is possible to implement any rank order filter by controlling the second input of the comparator. However, this technique is inferior in terms of both silicon area and operation speed. In this paper, the majority gate technique, based on the CMOS programmable device, has been modified so that the proposed hardware structure is capable of computing any rank order filter. A pipelined systolic array suitable for performing rank order filtering is also proposed. Furthermore, efficient extraction of the setting flag signals results in a faster hardware structure.
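As an illustration only, the behaviour of the selection device described above can be modelled in software. The following Python sketch is a behavioural model in the spirit of the comparison after summation approach (count the "1" inputs, then compare with the rank); it does not describe any of the cited circuits, and the names `rank_threshold` and `majority` are hypothetical.

```python
def rank_threshold(bits, rank):
    """Return 1 if at least `rank` of the input bits are 1, otherwise 0."""
    if not 1 <= rank <= len(bits):
        raise ValueError("rank must lie between 1 and the number of inputs")
    # Summation followed by comparison against the rank.
    return 1 if sum(bits) >= rank else 0


def majority(bits):
    """Majority gate: the rank threshold used for the median of an odd number of inputs."""
    if len(bits) % 2 == 0:
        raise ValueError("a majority gate needs an odd number of inputs")
    # For 2N + 1 inputs the threshold is N + 1, i.e. more than half of the inputs.
    return rank_threshold(bits, (len(bits) + 1) // 2)
```

In this model the rank is a run-time argument, which mirrors the remark that the comparison after summation approach can realize any rank by controlling the second input of the comparator.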
MEDIAN VALUE COMPUTATION ALGORITHM
In bit-sliced algorithms, bits of different significance are handled by dedicated Processing Elements (PEs) in different stages of the process. The process starts with the Most Significant Bits (MSBs). Flag signals derived from previous stages are used for further processing in the successive stages. The process is based on the majority selection of intermediate signals. The selection is achieved using a majority gate, which operates as follows: its output is "1" if over half of its inputs are "1", otherwise its output is "0".
Definitions and notations
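Suppose that the total number of data window elements $x_i$ is $W = 2N + 1$, where $0 < i \le W$ and $N$ is a positive integer. The median of the numbers $x_i$ is

$$m = \mathrm{med}(x_1, x_2, \ldots, x_W). \tag{1}$$

The numbers $x_i$ are represented in binary form with $k$-bit resolution. Suppose that $b_{i,j}$ is the $j$th bit of the binary representation of $x_i$, with $j = 1$ denoting the MSB; then

$$x_i = \sum_{j=1}^{k} b_{i,j}\, 2^{k-j}. \tag{2}$$

Also suppose that $o_j$ is the $j$th bit of the binary representation of $m$; then

$$m = \sum_{j=1}^{k} o_j\, 2^{k-j}. \tag{3}$$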
The following flags and intermediate signals are defined:
• $r_{i,j}$ is the rejecting flag signal, which indicates whether the number $x_i$ remains within the subset of candidates for the median in the $j$th step of the algorithm. When $r_{i,j}$ is "1", $x_i$ remains within the subset, whereas when $r_{i,j}$ is "0", $x_i$ is rejected from the subset. Once $r_{i,j}$ is set to "0", it remains "0" in the successive stages and the remaining bits $b_{i,j}$ of $x_i$ are not taken into account.
• $t_{i,j}$ is the setting flag signal, which replaces the $b_{i,j}$ bits of the rejected numbers in the majority selection process. When a number is rejected, its setting flag is set to the complement of the previous output bit $o_{j-1}$. In this way, the rejected number is pushed away from the median value. As long as the rejecting flag $r_{i,j}$ has not changed to "0", the setting flag is in a "don't care" state.
• $i_{i,j}$ is an intermediate signal, which is either $b_{i,j}$ if $x_i$ has not been rejected or $t_{i,j}$ if $x_i$ has been rejected. The output bit $o_j$ is "1" if the majority of the $i_{i,j}$ are "1", otherwise it is "0".
Algorithm description
The median value computation procedure is as follows.(12-14) The MSBs of the numbers within the data window are processed first. The other bits are then processed sequentially until the Least Significant Bits (LSBs) are reached. Initially, the rejecting flag signals $r_{i,1}$ are set to "1", since all the numbers are candidates for the median value. The setting flags $t_{i,1}$ are in a "don't care" state. If the majority of the MSBs $b_{i,1}$ are found to be "1", then the MSB of the output is $o_1$ = "1", otherwise $o_1$ = "0". In the following stage, the bits $b_{i,2}$ of the numbers whose MSBs are the complement of $o_1$ are rejected and are not taken into account for the majority selection.
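The procedure can be followed step by step in software. The sketch below is a minimal Python model of the bit-serial median computation, written from the flag definitions given above; the function name `bit_serial_median` and the list-based representation of the flags are assumptions made for illustration, and a "don't care" setting flag is simply stored as 0.

```python
def bit_serial_median(window, k):
    """Median of an odd number of k-bit unsigned integers, one bit plane at a time."""
    W = len(window)
    if W % 2 == 0:
        raise ValueError("the data window must hold an odd number of elements")
    r = [1] * W   # rejecting flags: 1 = still a candidate for the median
    t = [0] * W   # setting flags: only meaningful once the number is rejected
    m = 0
    for j in range(1, k + 1):                      # j = 1 is the MSB
        b = [(x >> (k - j)) & 1 for x in window]   # bit plane j
        # Intermediate signals: own bit for candidates, setting flag for rejected numbers.
        i_sig = [b[n] if r[n] else t[n] for n in range(W)]
        o = 1 if 2 * sum(i_sig) > W else 0         # majority selection
        for n in range(W):
            if r[n] and b[n] != o:
                r[n] = 0          # reject: this bit differs from the output bit
                t[n] = 1 - o      # push the number away from the median
        m = (m << 1) | o
    return m
```

For example, `bit_serial_median([5, 3, 9, 1, 7], 4)` rejects 9 in the first stage, 3 and 1 in the second, and 7 in the third, and returns 5.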
ALGORITHM FOR ANY RANK FILTER IMPLEMENTATION
The previously described algorithm can be easily implemented in hardware. However, it should be noted that the rank order of this hardware structure is fixed once the majority gate has been designed.(14) In this section, a new hardware implementation technique based on the majority gate is presented. This technique implements a single hardware structure capable of computing any rank order filter.

Suppose that there are $W = 2N + 1$ numbers $x_i$, the $r$th order statistic of which is required. A hardware module similar to the one shown in Fig. 1, having $W' = 4N + 1$ inputs and implementing the median computation, is used. Of these inputs, $W$ are the numbers $x_i$ and the remaining $2N$ are dummy inputs $d_l$. When the $W'$ numbers are ordered in ascending sequence, the $d_l$ are placed at the extremes of this sequence. The key concept is that, given a method to compute the median of $4N + 1$ numbers and control over $2N$ of them, any $r$th order statistic of the remaining $2N + 1$ numbers can be determined. Figure 2 illustrates this concept for a data window of five numbers $x_{(1)}, \ldots, x_{(5)}$ in ascending order (the subscript in parentheses denotes the rank). The larger window contains nine numbers, also in ascending order. By controlling the number of dummy inputs which are pushed to the top and to the bottom, any order statistic $r$ of the numbers $x_i$ can be obtained. More specifically, $r$ exceeds by one the number of dummy inputs which are pushed to the top.

Table 1 illustrates the proposed technique for the computation of the second-order statistic of nine 4-bit numbers (a "don't care" state is denoted by X). Since the second-order statistic is sought, seven of the dummy inputs have been set to 0000, whereas the remaining one has been set to 1111. In the first stage of the process the rejecting flags are "1", since all inputs are candidates, and the setting flags are in a "don't care" state. The majority of the MSBs of the $W'$ numbers is "0" and, therefore, the MSB of the output is "0". In the second stage, the numbers whose MSBs were "1" (i.e. $x_4$, $x_7$, $x_8$, $x_9$ and $d_8$) are rejected by setting their rejecting flags to "0". Then the corresponding setting flags are set to the complement of the MSB of the output (i.e. "1") and are used for the majority selection. Also, in this stage, the majority of the bits $b_{i,2}$ of the numbers still taken into account, together with the setting flags of the rejected numbers, is "0" and, thus, the output of this stage is "0". The process continues similarly in the two remaining stages, and the output ($x_3 = 3$), which is both the median of the $W' = 4N + 1 = 17$ numbers and the second-order statistic of the $W = 2N + 1 = 9$ numbers, is obtained.
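The dummy-input idea can be checked with a short software model. The Python sketch below pads a window of $2N + 1$ numbers with $2N$ dummy values and takes the median of the resulting $4N + 1$ numbers; it assumes that "pushed to the top" means set to all ones and "pushed to the bottom" means set to all zeros, as in the example above. The names `bit_serial_median` and `rank_order` are hypothetical, and the window used in the usage note is an arbitrary example, not the data of Table 1.

```python
def bit_serial_median(values, k):
    """Bit-serial median of an odd number of k-bit unsigned integers."""
    n_in = len(values)
    r = [1] * n_in            # rejecting flags
    t = [0] * n_in            # setting flags ("don't care" stored as 0)
    m = 0
    for j in range(1, k + 1):
        b = [(x >> (k - j)) & 1 for x in values]
        i_sig = [bn if rn else tn for bn, rn, tn in zip(b, r, t)]
        o = 1 if 2 * sum(i_sig) > n_in else 0
        t = [(1 - o) if rn else tn for rn, tn in zip(r, t)]
        r = [rn & (1 if bn == o else 0) for rn, bn in zip(r, b)]
        m = (m << 1) | o
    return m


def rank_order(window, rank, k):
    """rank-th smallest element (rank = 1 is the minimum) via a median of 4N + 1 inputs."""
    W = len(window)
    if W % 2 == 0 or not 1 <= rank <= W:
        raise ValueError("need an odd window and 1 <= rank <= window size")
    top = rank - 1                 # dummies set to all ones
    bottom = (W - 1) - top         # remaining dummies set to all zeros
    dummies = [0] * bottom + [(1 << k) - 1] * top
    return bit_serial_median(list(window) + dummies, k)
```

With the nine 4-bit numbers `[12, 3, 7, 0, 9, 15, 1, 6, 4]`, `rank_order(window, 2, 4)` uses seven dummies at 0000 and one at 1111 and returns 1, the second smallest value; `rank=1`, `rank=5` and `rank=9` give the minimum, median and maximum filters.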
A SYSTOLIC ARRAY IMPLEMENTATION FOR RANK ORDER FILTERING
A pipelined systolic array suitable for implementing rank order filters is presented in this section. Its architecture is scalable and its hardware complexity grows linearly with both the size of the data window and the resolution of the numbers. The proposed architecture operates faster than other existing ones, since intermediate signals are derived faster. Using the definitions of Section 2.1, the truth table for $r_{i,j+1}$, $i_{i,j}$ and $t_{i,j+1}$ is constructed as shown in Table 2. From this table the Karnaugh maps shown in Fig. 3 are derived. From Fig. 3(a),

$$r_{i,j+1} = r_{i,j} \cdot (b_{i,j} \odot o_j), \tag{4}$$

where $\cdot$, $+$, an overbar and $\odot$ stand for logical AND, OR, NOT and XNOR, respectively. Also, from Fig. 3(b),

$$i_{i,j} = r_{i,j} \cdot b_{i,j} + \bar{r}_{i,j} \cdot t_{i,j}. \tag{5}$$

Finally, from Fig. 3(c), $t_{i,j+1}$ can be written as

$$t_{i,j+1} = (\bar{r}_{i,j} \cdot t_{i,j}) + (r_{i,j} \cdot \bar{o}_j). \tag{6}$$

1576 A. GASTERATOS et al.
Realization of equation (7) leads to a faster hardware structure, since the propagation delay of the $\bar{o}_j$ signal is omitted. The realization of the proposed Processing Element (PE) is based on equations (4), (5) and (7). The circuit diagram of this PE is shown in Fig. 4. Owing to its simplicity (there are only three stages of gates, including the inverters), it can attain very short processing times, independent of the data window size. Also, the hardware complexity of the PE grows linearly with the number of its inputs.
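The flag logic of a single PE cell can be written out as Boolean functions. In the Python sketch below, `next_flags` follows the flag definitions of Section 2.1: a number stays a candidate only if it was a candidate and its bit equals the output bit, and a candidate's next setting flag is the complement of the output bit. Since the exact form of equation (7) is not reproduced in this text, `next_flags_fast` is only an assumed variant: it exploits the "don't care" states to derive the next setting flag without the output bit (it reuses the intermediate signal). All function names are hypothetical.

```python
from itertools import product


def xnor(a, b):
    return 1 - (a ^ b)


def intermediate(r, b, t):
    """Intermediate signal: own bit while a candidate, setting flag once rejected."""
    return (r & b) | ((1 - r) & t)


def next_flags(r, b, t, o):
    """Next rejecting and setting flags, with the setting flag depending on o."""
    r_next = r & xnor(b, o)
    t_next = ((1 - r) & t) | (r & (1 - o))
    return r_next, t_next


def next_flags_fast(r, b, t, o):
    """Assumed variant: the next setting flag no longer waits for the output bit o."""
    r_next = r & xnor(b, o)
    t_next = intermediate(r, b, t)
    return r_next, t_next


def agree_where_flag_is_used():
    """Compare both variants on all inputs, ignoring "don't care" setting flags."""
    for r, b, t, o in product((0, 1), repeat=4):
        r1, t1 = next_flags(r, b, t, o)
        r2, t2 = next_flags_fast(r, b, t, o)
        if r1 != r2:
            return False
        if r1 == 0 and t1 != t2:   # the setting flag matters only for rejected numbers
            return False
    return True
```

`agree_where_flag_is_used()` returns `True`: the two variants differ only when the number is still a candidate, where the setting flag is a "don't care". Whether this is the simplification used in the paper's equation (7) cannot be confirmed from the text above.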
A pipelined systolic array capable of computing rank order values is shown in Fig. 5. The inputs to this array are $W' = 4N + 1$ numbers (of which $W = 2N + 1$ form the data window and $2N$ are the dummy inputs). The systolic array consists of PEs separated by registers (R). The resolution of the registers which hold the data window numbers is reduced by one bit in each successive stage, since there is no need to carry the $b_{i,j}$ bits which have already been processed. On the other hand, the resolution of the registers which hold the result is increased by one bit in each successive stage.
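The data flow through such an array can be imitated with a cycle-level software model. The following Python sketch is an assumption-laden illustration, not a description of Fig. 5: each of the $k$ stages is modelled by a hypothetical function `pe_stage`, the registers between stages are entries of a list, and the caller is expected to have already appended any dummy inputs to each window. The data values are narrowed by one bit per stage and the partial result grows by one bit per stage, as described above.

```python
def pe_stage(state, j, k):
    """Stage j of the array (j = 1 handles the MSB)."""
    rest, r, t, result = state
    width = k - j + 1                               # bits still held for every number
    b = [(x >> (width - 1)) & 1 for x in rest]      # current most significant stored bit
    i_sig = [bn if rn else tn for bn, rn, tn in zip(b, r, t)]
    o = 1 if 2 * sum(i_sig) > len(rest) else 0      # majority selection
    t_next = [(1 - o) if rn else tn for rn, tn in zip(r, t)]
    r_next = [rn & (1 if bn == o else 0) for rn, bn in zip(r, b)]
    narrower = [x & ((1 << (width - 1)) - 1) for x in rest]  # drop the processed bit
    return narrower, r_next, t_next, (result << 1) | o


def run_pipeline(windows, k):
    """Feed one window per clock cycle; return the results in input order."""
    regs = [None] * (k + 1)      # regs[0]: input register, regs[j]: output of stage j
    results = []
    for item in list(windows) + [None] * k:         # k extra cycles flush the pipeline
        # Update back to front so that each stage reads the value latched last cycle.
        for j in range(k, 0, -1):
            regs[j] = None if regs[j - 1] is None else pe_stage(regs[j - 1], j, k)
        regs[0] = None if item is None else (list(item), [1] * len(item), [0] * len(item), 0)
        if regs[k] is not None:
            results.append(regs[k][3])
    return results
```

In this model the first result appears $k$ cycles after its window is latched, and one further result is delivered on every subsequent cycle, which is the usual benefit of pipelining; for instance, `run_pipeline([[5, 3, 9, 1, 7], [0, 15, 8, 8, 2], [4, 4, 4, 1, 12]], 4)` returns `[5, 8, 4]`.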
CONCLUSIONS
A new technique for realization of rank order filters based on the modification of the majority gate has been presented in this paper. A new PE has been designed for more efficient extraction of setting flag signals and, therefore, shorter processing times, independent of the data window size, have been achieved. Also, the hardware complexity of the PE grows linearly with the number of its inputs. A pipelined systolic array architecture suitable for performing rank order filtering has also been proposed. Typical applications of such an array include digital image processing, speech processing and coding, as well as digital TV applications, where rank order filters are employed.
