ABSTRACT
INTRODUCTION
A finite impulse response (FIR) digital filter has no feedback and is used in DSP applications ranging from wireless mobile communications to video and image processing [2]. The area-efficient 2-parallel FIR filter presented here is built from a Booth multiplier, a carry look-ahead adder, and a carry look-ahead subtractor. A Booth multiplier multiplies two signed binary numbers in two's complement representation; it offers high performance and low power consumption, and it does not suffer from weak regularity. For example, an 8-bit binary number may be positive or negative and is represented in two's complement format, so its value ranges from -128 to +127 [4]. Traditional hardware multiplication works the way multiplication is done by hand: (a) partial products are computed, (b) shifted appropriately, and (c) summed.
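The compute, shift, and sum steps can be sketched in Python. This is only a minimal illustration of radix-2 Booth recoding for two's complement operands, not the VHDL design used in this paper; the function name `booth_multiply` and its parameters are hypothetical.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Radix-2 Booth multiplication of two signed integers of width `bits`."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (lo <= multiplicand <= hi and lo <= multiplier <= hi):
        raise ValueError(f"operands must lie in [{lo}, {hi}]")

    # Two's complement bit pattern of the multiplier.
    pattern = multiplier & ((1 << bits) - 1)

    product = 0
    prev_bit = 0  # implicit bit to the right of the least significant bit
    for i in range(bits):
        bit = (pattern >> i) & 1
        if bit == 1 and prev_bit == 0:
            # Start of a run of ones: subtract the shifted multiplicand.
            product -= multiplicand << i
        elif bit == 0 and prev_bit == 1:
            # End of a run of ones: add the shifted multiplicand.
            product += multiplicand << i
        # Equal bits (00 or 11) contribute no partial product.
        prev_bit = bit
    return product


if __name__ == "__main__":
    print(booth_multiply(-7, 3))
```

Bit pairs that are equal contribute nothing, which is why long runs of identical bits in the multiplier lead to fewer partial products to sum.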
The speed of the Booth algorithm can be increased by reducing the number of partial products, because the output then has to wait for fewer additions to complete [7]. The carry look-ahead adder is a fast adder used in digital logic circuits. Its advantage is that it reduces the time required to determine the carry bits, which speeds up the addition [3]. Because adders occupy less silicon area than multipliers, multipliers are replaced with adders to reduce the area and improve the speed of the filter [5]. The carry look-ahead subtractor is a fast subtractor designed to reduce delay. It exploits the fact that, at each bit position, a carry is either generated at that bit or propagated through it from a lower bit.
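The generate and propagate idea can be illustrated with a short Python sketch. It is a behavioural model under stated assumptions, not the circuit of this paper: the names `cla_add` and `cla_sub` are hypothetical, and the subtractor is assumed to compute a - b as a + (NOT b) + 1 in two's complement, which the text does not spell out.

```python
def cla_add(a: int, b: int, carry_in: int = 0, bits: int = 16):
    """Add two `bits`-wide words using expanded look-ahead carry equations.

    Returns (sum modulo 2**bits, carry_out).
    """
    mask = (1 << bits) - 1
    a, b = a & mask, b & mask
    # g[i] = 1: bit i generates a carry; p[i] = 1: bit i propagates one.
    g = [(a >> i) & (b >> i) & 1 for i in range(bits)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(bits)]

    def carry_into(i: int) -> int:
        # Carry into bit i expressed only in terms of g, p, and carry_in,
        # so no carry depends on a previously computed carry.
        terms = [g[j] & int(all(p[j + 1:i])) for j in range(i)]
        terms.append(carry_in & int(all(p[:i])))
        return int(any(terms))

    carries = [carry_into(i) for i in range(bits + 1)]
    total = 0
    for i in range(bits):
        total |= (p[i] ^ carries[i]) << i
    return total, carries[bits]


def cla_sub(a: int, b: int, bits: int = 16):
    """Subtract by adding the bitwise complement of b with carry_in = 1."""
    mask = (1 << bits) - 1
    return cla_add(a, ~b & mask, carry_in=1, bits=bits)


if __name__ == "__main__":
    print(cla_add(5, 7, bits=8))
    print(cla_sub(5, 7, bits=8))
```

In hardware, every carry equation is evaluated at the same time from the generate and propagate signals, which is where the speed advantage over a ripple-carry chain comes from; the Python model only reproduces the logic, not the timing.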
In this paper, we implement an area-efficient 2-parallel FIR digital filter in VHDL. Area and speed are limiting factors for integrated circuits (ICs) designed in VLSI. Our work addresses these limitations: the area is reduced, that is, fewer storage resources are needed, and the arithmetic operators become faster [6].
PARALLEL PROCESSING
Parallel processing and pipelining are closely related techniques. In a pipelined system, independent sets of computations are computed in an interleaved manner, whereas in parallel processing they are computed at the same time using duplicated hardware [1].
To obtain a parallel processing system, the single-input single-output (SISO) system must be converted into a multiple-input multiple-output (MIMO) system. For example, the expression below describes a parallel system that processes three inputs per clock cycle (i.e., a level of parallel processing L = 3) [1].
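As an illustration, assume a 3-tap FIR filter y(n) = a x(n) + b x(n-1) + c x(n-2); the coefficient names a, b, and c are placeholders chosen here, not values taken from this paper. Substituting n = 3k, 3k+1, and 3k+2 gives a 3-parallel form:

```latex
\begin{aligned}
y(3k)   &= a\,x(3k)   + b\,x(3k-1) + c\,x(3k-2),\\
y(3k+1) &= a\,x(3k+1) + b\,x(3k)   + c\,x(3k-1),\\
y(3k+2) &= a\,x(3k+2) + b\,x(3k+1) + c\,x(3k).
\end{aligned}
```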
Here, k denotes the clock cycle. At the k-th clock cycle, the three inputs x(3k), x(3k+1), and x(3k+2) are processed and three samples are generated at the output. Because the system is MIMO, a latch placed on any line acts as a block delay (or L-slow): it delays the signal by L samples rather than by one. For example, delaying the signal x(3k) by one clock cycle yields x(3k-3) rather than x(3k-1), which arrives on a different input line. The block architecture of a 3-parallel FIR filter is shown in Figure 2 [1].
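The effect of the block delay can be checked with a small Python model. It assumes a 3-tap filter with placeholder coefficients a, b, and c (not taken from this paper); the names `fir_sequential` and `fir_3_parallel` are hypothetical. The two latches hold x(3k-2) and x(3k-1), which are simply x(3k+1) and x(3k+2) from the previous clock cycle.

```python
def fir_sequential(x, a, b, c):
    """Reference 3-tap FIR: y(n) = a*x(n) + b*x(n-1) + c*x(n-2)."""
    def sample(n):
        return x[n] if n >= 0 else 0
    return [a * sample(n) + b * sample(n - 1) + c * sample(n - 2)
            for n in range(len(x))]


def fir_3_parallel(x, a, b, c):
    """Same filter, processing three samples per clock cycle (L = 3)."""
    if len(x) % 3 != 0:
        raise ValueError("input length must be a multiple of 3")
    d1 = 0  # latch holding x(3k-2)
    d2 = 0  # latch holding x(3k-1)
    y = []
    for k in range(len(x) // 3):
        x0, x1, x2 = x[3 * k: 3 * k + 3]  # x(3k), x(3k+1), x(3k+2)
        y.append(a * x0 + b * d2 + c * d1)  # y(3k)
        y.append(a * x1 + b * x0 + c * d2)  # y(3k+1)
        y.append(a * x2 + b * x1 + c * x0)  # y(3k+2)
        # One clock of delay on each line is a delay of three samples:
        # x(3k+1) becomes x(3(k+1)-2) and x(3k+2) becomes x(3(k+1)-1).
        d1, d2 = x1, x2
    return y


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print(fir_3_parallel(data, 1, 2, 3) == fir_sequential(data, 1, 2, 3))
```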
Parallel processing FIR filters for High Speed or Low power
Consider the 2-parallel FIR digital filter shown in Figure 3 [1]. It consists of exactly two copies of the original 4-tap FIR filter and is built from 16-bit binary adders and 16-bit binary multipliers. The inputs x(2k) and x(2k+1) are the even and odd samples, respectively, and h0, h1, h2, and h3 are the filter coefficients. D denotes a delay of one clock cycle, meaning that the value is stored for one clock cycle. The dashed line in Figure 3 indicates the critical path.
The area-efficient 2-parallel FIR filter is shown in Figure 4 [1]. It uses the same notation as Figure 3: x(2k) and x(2k+1) are the even and odd inputs, D is a delay of one clock cycle, h0, h1, h2, and h3 are the filter coefficients, and the dashed line marks the critical path. This filter is more efficient in terms of area and speed than the basic 2-parallel FIR filter of Figure 3.
The synthesis report gives a memory usage of 301308 kb for the area-efficient 2-parallel FIR filter and 306920 kb for the basic 2-parallel FIR filter, with delays of 10.880 ns and 11.905 ns, respectively, i.e., about 1.8% less memory and 8.6% less delay. The area-efficient 2-parallel FIR filter therefore occupies less area and runs faster than the existing 2-parallel FIR filter. The number of slices, flip-flops, and input LUTs is also improved for the area-efficient design.
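The source does not spell out the internal structure of Figure 4. The sketch below assumes it follows the 2-parallel fast FIR algorithm described in [1], which replaces the four sub-filters of the basic structure with three at the cost of extra additions and subtractions. All function and variable names are hypothetical, and the code models the arithmetic only, not the VHDL implementation.

```python
def conv(h, x):
    """Causal FIR: y[k] = sum_j h[j] * x[k - j], same length as x."""
    return [sum(h[j] * x[k - j] for j in range(len(h)) if k - j >= 0)
            for k in range(len(x))]


def fir_direct(x, h):
    """Reference filter operating one sample at a time."""
    return conv(h, x)


def fir_2_parallel_fast(x, h):
    """2-parallel FIR using three half-length sub-filters instead of four."""
    if len(x) % 2 or len(h) % 2:
        raise ValueError("x and h must have even length")
    x_even, x_odd = x[0::2], x[1::2]   # x(2k), x(2k+1)
    h_even, h_odd = h[0::2], h[1::2]   # (h0, h2), (h1, h3) for 4 taps

    u = conv(h_even, x_even)                       # H0 * X0
    v = conv(h_odd, x_odd)                         # H1 * X1
    w = conv([p + q for p, q in zip(h_even, h_odd)],
             [p + q for p, q in zip(x_even, x_odd)])  # (H0+H1) * (X0+X1)

    y = []
    for k in range(len(x_even)):
        v_delayed = v[k - 1] if k >= 1 else 0  # one block delay on H1 * X1
        y.append(u[k] + v_delayed)             # y(2k)
        y.append(w[k] - u[k] - v[k])           # y(2k+1)
    return y


if __name__ == "__main__":
    taps = [1, 2, 3, 4]
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    print(fir_2_parallel_fast(data, taps) == fir_direct(data, taps))
```

For a 4-tap filter, this arrangement needs three 2-tap sub-filters (6 multiplications per block) instead of four (8 multiplications per block), which is consistent with the trade of multipliers for adders and subtractors described earlier.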
CONCLUSIONS
In this paper, an area-efficient 2-parallel FIR filter is designed and compared with a basic 2-parallel FIR filter. The area-efficient filter improves both area and speed. It uses a carry look-ahead adder and a carry look-ahead subtractor for addition and subtraction, and a Booth multiplier for multiplication. All the simulated waveforms are discussed.
