Abstract-Gabor Filters are widely used in many kinds of image processing. A digital Gabor Filter includes a memory, a controller and an arithmetic logic unit. The Gabor Filter designed in this project has a RAM type Memory, but a few changes were made in the Controller and the Arithmetic Logic Unit (ALU). The ALU uses a new type of multiplier called a Vedic Multiplier. Building a Gabor Filter with Vedic Multipliers is the contribution introduced in this paper. Using Vedic Multipliers, our filter was made faster without affecting the functionality of the filter. The project included two phases: simulation of the Verilog code and synthesis of the whole Gabor Filter. For coding and simulating the modules in Verilog HDL, we used ModelSim-Altera 6.5b (Quartus II 9.1) Starter Edition. For synthesizing the units and examining RTL schematic diagrams, we used Xilinx ISE 9.2i.
I. INTRODUCTION
Image processing is any form of signal processing for which the input is an image, such as a photograph or video frame; the output of image processing may be either an image or a set of characteristics or parameters related to the image. Most image processing techniques involve treating the image as a two-dimensional signal and applying standard signal processing techniques to it [1].
In image processing, a Gabor Filter, named after Dennis Gabor, is a linear filter used for edge detection. Frequency and orientation representations of Gabor Filters are similar to those of the human visual system, and they have been found to be particularly appropriate for texture representation and discrimination. In the spatial domain, a 2D Gabor Filter is a Gaussian kernel function modulated by a sinusoidal plane wave [2].
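As an illustration of the spatial-domain definition, the following minimal Python sketch evaluates the real part of a 2D Gabor function on a 3 by 3 grid. The function name and all parameter values (sigma, theta, lambd, gamma, psi) are assumptions chosen for illustration; they are not the kernel coefficients stored in the filter described in this paper.

```python
import math

def gabor_kernel(size=3, sigma=1.0, theta=0.0, lambd=2.0, gamma=1.0, psi=0.0):
    """Real part of a 2D Gabor function sampled on a size x size grid.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the orientation angle theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a sinusoidal carrier.
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

kernel = gabor_kernel()  # nine coefficients for a 3 x 3 window
```

A 3 by 3 grid is used here only because it yields nine coefficients, the same count the filter convolves per output.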
Applications of Gabor Filters include fingerprint recognition, palm print recognition and facial recognition. They are also used in biomedical applications such as processing kidney images and ultrasound images.
exchanged with series multipliers in the MAC to make it more area efficient, whereas Vedic Multipliers have not been used before.
A. Gabor Filter
Our version of the Gabor Filter consists of three blocks inside the Top Level block: the Control Logic Unit (CLU), the Arithmetic Logic Unit (ALU), and the Memory. The Memory is a RAM type memory, the CLU is simply a controller, and the ALU contains the new addition to our design, the Vedic Multiplier.
The "Convolution" signal indicates the operation of the filter. If the signal is high, the convolution process takes place. If it is low, the filter receives image input and stores it in the memory at the given input location. The data enters the filter pixel by pixel. The "Pixel_X" and "Pixel_Y" signals give the address of the memory location [3].
B. Arithmetic Logic Unit (ALU)
The ALU has three blocks: the MAC block, the ROM block, and the Convolution Signal Buffer. Inside the MAC block reside three more blocks: the Data Control Buffer, the Vedic Multiplier, and the Accumulator.
Vedic Mathematics is applied in this work through a Vedic Multiplier. The speed of the MAC depends greatly on the multiplier. The Vedic Multiplier enables parallel generation of intermediate products, eliminates unwanted multiplication steps with zeros, and can be scaled to higher bit levels using the Karatsuba algorithm, with compatibility with different data types. The MAC is an extensible block built on the Vedic Multiplier module, and it plays an important role in computing, especially in digital signal processing.
C. Vedic Multiplication
Vedic mathematics is mainly based on 16 Sutras (formulae) which deal with various branches of mathematics such as arithmetic, algebra and geometry [4], [5]. Atharva Veda Swamiji constructed 16 sutras (formulae) and 16 Upa sutras (sub formulae). Vedic mathematics deals with several basic as well as complex mathematical operations. The methods of Vedic mathematics are extremely simple and very powerful. One of these methods is Urdhva Triyagbhyam.
The work has shown the efficiency of the Urdhva Triyagbhyam Vedic method for multiplication, which differs in the actual process of multiplication itself. It enables parallel generation of intermediate products, eliminates unwanted multiplication steps with zeros, and can be scaled to higher bit levels using the Karatsuba algorithm, with compatibility with different data types. These multipliers have an important effect in designing arithmetic, signal and image processors. Many mandatory functions in such processors make use of multipliers (for example, multipliers are the basic building blocks in Fast Fourier Transforms (FFTs) and multiply accumulate (MAC) units) [6]. The Urdhva Triyagbhyam Sutra is a general multiplication formula applicable to all cases of multiplication. It literally means "Vertically and Crosswise". To illustrate this multiplication scheme, let us consider the multiplication of two decimal numbers (5498×2314). The conventional methods already known to us will require 16 multiplications and 15 additions. An alternative method of multiplication using the Urdhva Triyagbhyam Sutra is shown in Fig. 2.
The numbers to be multiplied are written on two consecutive sides of the square as shown in the figure. The square is divided into rows and columns where each row/column corresponds to one of the digits of either the multiplier or the multiplicand. Thus, each digit of the multiplier has a small box in common with a digit of the multiplicand. These small boxes are partitioned into two halves by the crosswise lines. Each digit of the multiplier is then independently multiplied with every digit of the multiplicand and the two-digit product is written in the common box. All the digits lying on a crosswise dotted line are added to the previous carry. The least significant digit of the obtained number acts as the result digit and the rest as the carry for the next step. The carry for the first step (i.e., the dotted line on the extreme right side) is taken to be zero [6].
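The "Vertically and Crosswise" idea can be sketched in a few lines of Python. This is a minimal illustration rather than the hardware design: the function name is hypothetical, and instead of splitting each two-digit product across a diagonal as in the figure, it sums whole digit products per column, which produces the same result digits and carries.

```python
def urdhva_multiply(a, b):
    """Multiply two non-negative integers digit by digit, column by column."""
    x = [int(d) for d in str(a)][::-1]  # least significant digit first
    y = [int(d) for d in str(b)][::-1]
    result_digits = []
    carry = 0  # the carry into the first (rightmost) column is zero
    # Column k collects every digit product x[i] * y[j] with i + j == k.
    # The products within a column are independent, so hardware can form
    # them in parallel.
    for k in range(len(x) + len(y) - 1):
        column = carry
        for i in range(len(x)):
            j = k - i
            if 0 <= j < len(y):
                column += x[i] * y[j]
        result_digits.append(column % 10)  # result digit for this column
        carry = column // 10               # carried into the next column
    while carry:
        result_digits.append(carry % 10)
        carry //= 10
    return int("".join(str(d) for d in reversed(result_digits)))

product = urdhva_multiply(5498, 2314)  # 12722372
```

For 5498×2314 the rightmost column is 8×4 = 32, giving result digit 2 and carry 3; the remaining columns proceed in the same way.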
II. PREVIOUS WORKS
The Gabor Filter has previously been designed to be area efficient. No previous record was found of a redesign of the MAC that made the digital Gabor Filter faster. Vedic Multipliers have previously been used in MAC units that reside inside the Arithmetic Logic Unit.
III. METHODOLOGY
The focus of this work is to improve the design of a digital Gabor Filter where the maximization of speed will be the main priority.
A. Design Flow
The design flow of the Gabor Filter is shown in Fig. 3. When data is entered, it enters pixel by pixel into the Memory block. The memory location where it is stored is given by Pixel_X and Pixel_Y. The "Convolution" signal indicates the operation of the filter. If this signal is high (1), convolution has started. If this signal is low (0), the data entered is written to the memory location. When the "Convolution" signal is high, the Controller reads the image that is stored in the memory and sends the data to the MAC unit of the Arithmetic Logic Unit. The Controller fetches the data from the determined memory location. The Arithmetic Logic Unit contains a ROM which holds the Kernel Coefficient values. The address of the Kernel Coefficient is also generated by the Controller. The Kernel Coefficient is sent to the MAC unit from the ROM. Both the image data and the Kernel Coefficient enter the MAC, and the multiplication and accumulation process starts. Only one series of data is convolved at a time [7]. Since there are 9 Kernel Coefficients, there are 9 multiply-accumulate operations: 9 image data values are multiplied by the 9 Kernel Coefficients and the products are accumulated. The result is the filtered output.
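The multiply-accumulate step described above can be sketched as follows. This is a behavioural illustration in Python, not the Verilog used in the project; the function name and the sample pixel and coefficient values are assumptions.

```python
def mac_convolve(pixels, coefficients):
    """Multiply each pixel by its kernel coefficient and accumulate the products."""
    if len(pixels) != len(coefficients):
        raise ValueError("pixels and coefficients must have the same length")
    accumulator = 0.0
    # One multiply-accumulate step per coefficient, processed serially,
    # mirroring the one-series-at-a-time behaviour of the MAC unit.
    for pixel, coefficient in zip(pixels, coefficients):
        accumulator += pixel * coefficient
    return accumulator

# Hypothetical 3 x 3 window of image data and nine kernel coefficients.
window = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
kernel = [0.5] * 9
filtered = mac_convolve(window, kernel)  # 22.5
```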
The contents of the three blocks Control Logic Unit, Arithmetic Logic Unit and Memory are discussed below.
IV. EXPERIMENTAL RESULTS AND DISCUSSION
International Journal of Information and Electronics Engineering, Vol. 5, No. 5, September 2015
After designing the Gabor Filter using Verilog HDL in ModelSim-Altera 6.5b (Quartus II 9.1) Starter Edition, the code was synthesized in Xilinx ISE 9.2i. Fig. 1 shows the schematic view of the top level filter. There are six input pins and one output pin on the top level. "Image_data" stands for unfiltered 32-bit image data. "Pixel_X" and "Pixel_Y" contain the location of the memory where the "Image_data" is to be written. The Filter_clock and Filter_reset pins carry the generated clock and the reset for the filter. The "Convolution" signal indicates the operation of the filter. If the signal is high (1), the convolution process takes place. If the signal is low (0), the memory in the filter receives image input and stores it at the location specified by "Pixel_X" and "Pixel_Y". Fig. 4 shows the detailed block diagram of the inner structure of the Gabor Filter. From Fig. 5, the output result of the filter is 0.006764772 (3BDDAB06), whereas the expected result in Fig. 6 was 0.006764705 (3BDDAA75). The difference was about 0.000000067, a relative error of only about 0.001%. This verifies that even though a Vedic Multiplier was implemented in the MAC unit of the filter, the functionality of the Gabor Filter has not been jeopardized.
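The reported values can be cross-checked with a short Python sketch. It assumes that the 32-bit hexadecimal words are IEEE 754 single-precision numbers, which the paper does not state explicitly but which is consistent with the decimal values quoted; the helper name is hypothetical.

```python
import struct

def hex_to_float32(hex_word):
    """Interpret an 8-digit hex string as a big-endian IEEE 754 single (assumed format)."""
    return struct.unpack(">f", bytes.fromhex(hex_word))[0]

obtained = hex_to_float32("3BDDAB06")  # filter output
expected = hex_to_float32("3BDDAA75")  # expected result
difference = abs(obtained - expected)
relative_error_percent = 100.0 * difference / expected
```

Under this assumption the two words decode to the decimal values quoted in the text, and the relative error comes out at roughly 0.001%.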
A. Top Level

B. Controller (CLU)
The Control Logic Unit (CLU) is the controller which controls the data flow in the filter. It instructs the other blocks to perform their jobs. It provides the memory address from which data is to be read in the Memory and also provides the address from which the Kernel Coefficient is to be read from the ROM in the ALU. Fig. 7 shows the block diagram of the Controller. The "RDY" and "SET" signals are inputs to the Controller and outputs of the ALU. These feedback signals from the arithmetic unit are used to control the operation signal "OP" of the arithmetic unit. When the "OP" signal is high, the convolution process at the ALU starts. When the "OP" signal is low, the convolution process stops.
The "OP" signal was designed this way so that the series data is sent to the arithmetic unit accurately and there won't be any mismatch of data [7]. The "SET" signal turns off the "OP" signal and the "RDY" signal turns on the "OP" signal. It is a continuous process. Initially, the data is entered and written to the Memory. Only when the "START" signal in the Controller goes high does the Controller generate "X", "Y", and "Z". "X" and "Y" are address locations in the Memory from which data is to be read, and "Z" carries the address from which the Kernel Coefficient is to be read.
C. Memory
The Memory block is where the image pixels are stored. Fig. 9 shows the Memory block diagram. Initially, 32-bit "Image_data" is entered along with "Pixel_X" and "Pixel_Y", which are 4-bit values providing the location of the memory where this new data is going to be stored. The "Convolution" signal indicates whether data is to be read from or written to the memory. The "X" and "Y" signals are sent by the Controller and carry the location of the memory from which data is to be read. The block also has a "RESET" input. The output of the Memory block is the image data sent to the Arithmetic Logic Unit for convolution.
The Memory is a 16 by 16 block and therefore has 256 locations. Each location is able to store 32-bit data. The memory is RAM type. When the "Convolution" signal is low, data is written to the memory at the location provided by "Pixel_X" and "Pixel_Y". When the "Convolution" signal is high, data is read from the memory at the location provided by the 4-bit signals X and Y. Fig. 10 shows the verification of the Memory block.
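The read and write behaviour of the Memory can be modelled with a small Python class. This is only a behavioural sketch: the class and method names are hypothetical, clocking is not modelled, and the mapping of the X and Y addresses onto rows and columns is an assumption.

```python
class PixelMemory:
    """Behavioural model of a 16 x 16 RAM holding 32-bit words."""

    SIZE = 16               # 4-bit addresses on each axis
    WORD_MASK = 0xFFFFFFFF  # each location stores 32 bits

    def __init__(self):
        self.reset()

    def reset(self):
        """Clear all 256 locations, mimicking the RESET input."""
        self.cells = [[0] * self.SIZE for _ in range(self.SIZE)]

    def access(self, convolution, image_data=0, pixel_x=0, pixel_y=0, x=0, y=0):
        """Write when convolution is low (0), read when it is high (1)."""
        if not convolution:
            # Write path: address comes from Pixel_X and Pixel_Y.
            self.cells[pixel_y & 0xF][pixel_x & 0xF] = image_data & self.WORD_MASK
            return None
        # Read path: address comes from the Controller's X and Y.
        return self.cells[y & 0xF][x & 0xF]

memory = PixelMemory()
memory.access(convolution=0, image_data=0x3BDDAB06, pixel_x=3, pixel_y=7)
stored_word = memory.access(convolution=1, x=3, y=7)
```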
D. Arithmetic Logic Unit (ALU)
The Arithmetic Logic Unit is the most important part of this design. The 32-bit image data enters the Arithmetic Logic Unit from the Memory. The 4-bit Kernel Coefficient address (Coeff_add) enters the ROM in the ALU from the Controller. The OP signal enters the ALU from the Controller. The CLK and RESET signals are also inputs to the ALU.
When all the inputs enter the ALU, the image data enters the MAC unit straight away. The coefficient address sent to the ALU by the Controller enters the ROM to select the Kernel Coefficient. When the OP signal goes high, the Kernel Coefficient is read out and sent to the MAC. The OP signal enters the ROM and the Convolution Signal Buffer simultaneously. The Buffer delays the OP signal by one clock cycle so that it enters the MAC at the same time as the Kernel Coefficient. Then convolution takes place inside the MAC. The convolved data is the final output. This block also generates the SET and RDY signals, which are the driving force of the whole filter. These two signals operate the OP signal for further convolutions. The OP signal entering the Vedic Multiplier controls the SET signal. When the OP signal goes high, the SET signal waits one clock cycle before going high. The SET signal then goes to the Controller to turn off the OP signal.
The 64-bit multiplied output (mul_out) from the Vedic Multiplier then enters the Accumulator. The "mul_rdy" signal (which is also the SET signal) enters the Accumulator and controls the accumulation. When the "mul_rdy" signal goes high in the Accumulator, accumulation takes place and eventually we get a 64-bit filtered output. When the "mul_rdy" signal goes high, the RDY signal waits four clock cycles before going high. This RDY signal enters the Controller from the ALU and makes the OP signal high. As one signal influences the other continuously, the whole process continues until there is no data left for convolution. The verification of this is shown in Fig. 14.
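The OP, SET and RDY handshake described above can be traced with a simple event model in Python. The one-cycle and four-cycle delays are taken from the text; the function name, the starting cycle, and the assumption that OP rises in the same cycle in which RDY rises are illustrative choices that the paper does not confirm.

```python
def simulate_handshake(num_products=9, set_delay=1, rdy_delay=4):
    """Return (cycle, event) tuples for the OP -> SET -> RDY -> OP loop."""
    events = []
    cycle = 0
    for _ in range(num_products):
        events.append((cycle, "OP high"))           # controller starts one multiply
        cycle += set_delay
        events.append((cycle, "SET high, OP low"))  # SET (mul_rdy) turns OP off
        cycle += rdy_delay
        events.append((cycle, "RDY high"))          # RDY re-enables OP (assumed same cycle)
    return events

trace = simulate_handshake()
```

Under these assumptions each multiply-accumulate step occupies five clock cycles, so nine coefficients finish at cycle 45 in this model.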
V. CONCLUSION
Our project proposed the design of a Multiplier and Accumulator (MAC) unit in the Gabor Filter using techniques of ancient Indian Vedic Mathematics that have been modified to improve performance. The speed of the MAC depends greatly on the multiplier. This work has shown the efficiency of the Urdhva Triyagbhyam Vedic method for multiplication, which differs in the actual process of multiplication itself. Our project shows that a MAC unit designed with Vedic multiplication is efficient in terms of speed compared to conventional multiplication. This modified Gabor Filter has considerable future scope, as it can be used for cancer detection, brain tumor detection, video processing and even extracting satellite images.
We have successfully designed a Gabor Filter with Vedic Multipliers using Verilog HDL. First, we designed the main module (top module), and then we designed the modules of the CLU, ALU, and Memory. After designing these modules, we had to further design the submodules needed to run the CLU, ALU and Memory. The most important part of our modified Gabor Filter was the Vedic Multiplier, which was hard to design because we only had access to a 2-bit by 2-bit Vedic Multiplier, which we had to instantiate several times to obtain the 32-bit by 32-bit one.
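One common way to build a wide Vedic Multiplier from a 2-bit by 2-bit block is to split each operand into halves and combine four half-width products. The Python sketch below illustrates that idea only; the function names are hypothetical, and the exact hierarchy and adder structure used in our Verilog design may differ.

```python
def vedic_2x2(a, b):
    """2-bit by 2-bit multiplier from AND gates and two half adders."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    p0 = a0 & b0                       # vertical: LSB x LSB
    cross1, cross2 = a1 & b0, a0 & b1  # crosswise products
    p1 = cross1 ^ cross2               # half adder sum
    c1 = cross1 & cross2               # half adder carry
    vert = a1 & b1                     # vertical: MSB x MSB
    p2 = vert ^ c1
    p3 = vert & c1
    return (p3 << 3) | (p2 << 2) | (p1 << 1) | p0

def vedic_multiply(a, b, width=32):
    """Multiply two width-bit operands by recursive halving (assumed hierarchy)."""
    if width == 2:
        return vedic_2x2(a, b)
    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    # Four independent half-width products, which hardware can form in parallel.
    ll = vedic_multiply(a_lo, b_lo, half)
    lh = vedic_multiply(a_lo, b_hi, half)
    hl = vedic_multiply(a_hi, b_lo, half)
    hh = vedic_multiply(a_hi, b_hi, half)
    return ll + ((lh + hl) << half) + (hh << width)

product_64 = vedic_multiply(0xFFFFFFFF, 0xFFFFFFFF)  # fits in 64 bits
```

Under this assumed decomposition, a 32-bit multiplication reaches the 2-bit block 256 times (four levels of halving, four products per level).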
