In this paper, a new low-power design method for FIR filters used in image processing is proposed. Because the correlation between adjacent pixels in image data is very high, clock gating is a good candidate for a low-power strategy. However, the conventional clock gating strategy, which is applied independently to every flip-flop of the filter, gives rise to too much additional area overhead and does not achieve a good power reduction. In our method, each tap register, which is used to delay the input data in the filter, is partitioned into two sub-registers according to the correlation characteristics of its input space. For the sub-register that receives highly correlated data, the dynamic power consumption is reduced by diminishing the switching activity of the clock signal. We also reduce the additional hardware overhead by propagating the clock gating control signal of the first tap register to the other tap registers.
INTRODUCTION
As the portable electronics market, including mobile communication equipment, has been very successful, interest in low-power design is increasing sharply [1, 2]. In particular, the digital filter is one of the most popular devices in DSP applications that process image and speech signals. Because the power consumption of filters is a large part of the total power of such DSP products, research has concentrated on developing algorithms for low-power digital filter architectures [3] [4] [5]. There are two methods for implementing low-power filters: optimization of the filter coefficients and transformation of the filter structure.
In [3] and [4], using differential coefficients instead of the original coefficients, or reordering the sequence in which the filter coefficients are processed, reduces the total power consumption.
*Corresponding author. Tel.: +82-2-2290-0562, Fax: +82-2-2299-2129, e-mail: jmjung@unitel.co.kr e-mail: jchong@email.hanyang.ac.kr
In [5], the author shows that transforming the filter structure reduces the switching activity at internal nodes of the filter.
In this paper, we [6, 7] . Figure 4 shows the clock gating circuitry applied to the two sub-registers that make up the first tap register.
From the figure, the tap register is partitioned into sub-register sub1, which receives the lower 4 bits of the input, and sub-register sub2, which receives the upper 4 bits. Clock gating is then applied to the partitioned sub-registers. As shown in Figure 2, sub1 (sub2) has low (high) correlation. The clock gating probability is defined as the probability that the clock signal is inactive. With our method, the clock gating probability of sub1 is almost the same as that of each flip-flop of sub1, because sub1 has low correlation and its input is likely to change at every clock. On the other hand, our method is area-efficient, which results in a reduction of the total power consumption.
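The definition of clock gating probability can be illustrated with a small simulation. This is a minimal sketch, not the paper's experiment. It assumes that a sub-register's clock can be gated whenever its new input equals the value it already holds. The pixel stream is a hypothetical random walk standing in for correlated image data, and the names `make_correlated_pixels` and `gating_probability` are illustrative only.

```python
import random


def make_correlated_pixels(n, max_step=3, seed=0):
    """Hypothetical 8-bit pixel stream: a clamped random walk with small steps,
    used here only to mimic the high correlation between adjacent pixels."""
    rng = random.Random(seed)
    value = 128
    pixels = []
    for _ in range(n):
        value = min(255, max(0, value + rng.randint(-max_step, max_step)))
        pixels.append(value)
    return pixels


def gating_probability(samples, mask):
    """Fraction of clock cycles in which the masked bits of the new sample equal
    those of the previous sample, i.e. cycles where the clock could stay inactive
    (assumption: gating is decided by comparing the input with the stored value)."""
    unchanged = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev & mask) == (cur & mask)
    )
    return unchanged / (len(samples) - 1)


pixels = make_correlated_pixels(10_000)

# Sub-register level gating: sub1 holds the lower 4 bits, sub2 the upper 4 bits.
p_sub1 = gating_probability(pixels, 0x0F)
p_sub2 = gating_probability(pixels, 0xF0)

# Conventional gating applied to every flip-flop (one probability per bit).
p_bits = [gating_probability(pixels, 1 << b) for b in range(8)]

print(f"sub1 (low nibble):  {p_sub1:.3f}")
print(f"sub2 (high nibble): {p_sub2:.3f}")
print("per flip-flop:", [round(p, 3) for p in p_bits])
```

On a stream like this the upper nibble changes far less often than the lower one, so one shared gating circuit for sub2 captures most of the benefit that per-flip-flop gating would give for those bits. The exact numbers depend entirely on the synthetic input and should not be read as results.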
As the input of sub2 is highly correlated, the clock where N is the number of tap registers.
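Propagating the control signal of the first tap register to the other tap registers can be checked with a behavioral model. The sketch below makes several assumptions: only the sub2 (upper 4-bit) parts are modeled, the comparison logic exists only at the first tap, and its enable signal is delayed through a chain of 1-bit flip-flops, one per additional tap. The function name `simulate_sub2_delay_line` and the sample values are hypothetical.

```python
def simulate_sub2_delay_line(pixels, num_taps):
    """Behavioral model of the sub2 parts of a tap delay line.

    Only the first tap compares its input with its stored value. The resulting
    enable bit is shifted through 1-bit flip-flops and reused by later taps.
    Returns (history, clock_events), where history holds one
    (gated_taps, reference_taps) pair per clock cycle.
    """
    gated = [0] * num_taps                   # sub2 registers with propagated gating
    reference = [0] * num_taps               # ungated delay line, for comparison
    enable_chain = [False] * (num_taps - 1)  # delayed copies of the first enable
    clock_events = 0
    history = []
    for pixel in pixels:
        hi = pixel >> 4  # upper 4 bits go to sub2
        # Gating decision is made once, at the first tap only.
        enable_first = hi != gated[0]
        enables = [enable_first] + enable_chain
        # All registers sample their inputs on the same clock edge.
        inputs = [hi] + gated[:-1]
        gated = [new if en else old for new, en, old in zip(inputs, enables, gated)]
        clock_events += sum(enables)
        reference = [hi] + reference[:-1]
        # Shift the enable bit one stage down the chain for the next cycle.
        enable_chain = ([enable_first] + enable_chain)[: num_taps - 1]
        history.append((list(gated), list(reference)))
    return history, clock_events


samples = [120, 121, 123, 126, 130, 131, 129, 127, 140, 160, 161, 161]
history, clock_events = simulate_sub2_delay_line(samples, num_taps=4)

# The gated delay line must hold the same data as the ungated one in every cycle.
assert all(g == r for g, r in history)
print("clocked register updates:", clock_events, "of", len(samples) * 4)
```

The model works because tap k sees, k-1 cycles later, the same pair of consecutive values that the first tap compared, so the delayed enable bit is exactly the decision tap k would have made with its own comparator.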
Clock Gating for Pipeline Register in Filter Operation Part
The filter operation part consists of multipliers and adders for the convolution operation. Because the delay of the operation part must be less than one clock period, pipeline registers can be inserted for high-speed operation. In this case, efficient clock gating for the pipeline registers is shown in Figure 6. The output of the gating logic in Figure 6 is the output signal A of the D flip-flop in Figure 5.
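The reason a registered (one cycle delayed) gating signal suits a pipeline register can be sketched in a few lines. This is an illustrative model under stated assumptions, not the circuit of Figure 6: a pipeline register stores the sum of products of a group of tap registers, and it is clocked only if at least one of those taps changed on the previous clock edge. The function name, the coefficients, and the signal polarity are hypothetical.

```python
def run_pipelined_partial_sum(samples, coeffs):
    """Model one pipeline register fed by len(coeffs) tap registers.

    Returns (trace, pipe_clock_events), where trace holds one
    (gated_register, ungated_register) pair per clock cycle.
    """
    taps = [0] * len(coeffs)
    pipe_gated = 0        # pipeline register with clock gating
    pipe_ref = 0          # ungated pipeline register, for comparison
    changed_prev = False  # registered flag: did any tap change on the previous edge?
    pipe_clock_events = 0
    trace = []
    for x in samples:
        # Combinational multiply-add result, settled before the clock edge.
        partial = sum(c * t for c, t in zip(coeffs, taps))
        new_taps = [x] + taps[:-1]
        changed_now = new_taps != taps
        # Clock edge: the ungated register always samples, the gated one
        # samples only if its inputs could have changed since the last edge.
        pipe_ref = partial
        if changed_prev:
            pipe_gated = partial
            pipe_clock_events += 1
        taps = new_taps
        changed_prev = changed_now
        trace.append((pipe_gated, pipe_ref))
    return trace, pipe_clock_events


samples = [10, 10, 10, 10, 12, 12, 12, 12, 12, 12]
trace, pipe_clock_events = run_pipelined_partial_sum(samples, coeffs=[1, 2, 1])

# Gating must never change the value held by the pipeline register.
assert all(g == r for g, r in trace)
print("pipeline register clocked", pipe_clock_events, "times in", len(samples), "cycles")
```

The one-cycle delay matters because the pipeline register samples the sum computed from the tap values held before the clock edge, so its input changes one cycle after the taps do.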
For convenience, although every tap register has its own gating logic, some of the gating logic blocks are omitted in Figure 6. REG1 and REG2 are the pipeline registers. The inputs of REG1 (REG2) are the results of the convolution operation on three (two) tap registers and their coefficients. When all outputs of the gating logics are 0, the clock of the corresponding 
