Abstract
Introduction
Digital image processing has become an important subject in many areas, such as medicine and technology. This paper deals with a digital watermarking algorithm for digital images and its hardware implementation on an FPGA. Application Specific Integrated Circuits (ASICs) and programmable DSP processors are the other implementation choices for many digital applications. The FPGA is chosen as the implementation platform on the basis of development cost, development time, flexibility, software programmability and functional efficiency. Reconfigurable computing is also a primary concern. FPGAs are therefore an attractive choice due to their low power dissipation per unit computation, high performance and reconfigurability [1]. The parallel computing capability of the FPGA is extremely useful for modern applications in areas such as DSP and image and video processing. System Generator [1-2] is a high-level design tool well suited to creating custom DSP data paths in an FPGA. This led to the use of the Xilinx System Generator tool, whose high-level, block-based graphical interface (Simulink) makes it easier to handle than other software for hardware design and implementation.
Handling digital data on the internet and in multimedia applications is now a critical issue. Digital watermarking is applied for copyright protection, content authentication, detection of illegal duplication and alteration, feature tagging and secret communication. Digital watermarking is the hiding of a secret message within an ordinary message and its extraction at the destination. The secret message embedded as the watermark can be anything, such as plain text or an image. In general, a digital watermarking algorithm involves two major operations: 1. Watermark embedding, and 2. Watermark extraction.
A secret key is needed for both operations to secure the watermark [3]. This paper concentrates on developing algorithmic models of the watermarking algorithm in Simulink using the Xilinx block set, followed by hardware implementation on an FPGA. It is organized as follows. Section 1 explains the importance of hardware implementation for digital images. Section 2 deals with XSG, Section 3 focuses on the hardware implementation, and the results are discussed in Section 4.
Xilinx System Generator
Xilinx System Generator (XSG) [5] is an Integrated Design Environment (IDE) for FPGAs within the ISE 13.4 development suite. It uses Simulink [5] as its development environment and is presented in the form of model-based design. Its integrated design flow generates, from the Simulink design environment, the bitstream file (*.bit) needed to program the FPGA.
XSG operates at an abstraction level that uses fixed-point arithmetic, including quantization and overflow characteristics. In contrast, Simulink works with double-precision floating-point numbers. The gateway blocks form the connection between XSG blocks and Simulink blocks. XSG automatically generates simulation results, RTL synthesis, VHDL/Verilog code, the User Constraint File (UCF) and the mapping to hardware. It was created primarily for complex Digital Signal Processing (DSP) applications, but it is now used intensively for implementing many image processing applications.
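The conversion from a double-precision value to a fixed-point value can be illustrated in software. The following is a minimal sketch, not XSG's implementation: the function name `to_fixed`, the 8-bit word length and the 4-bit binary point are assumptions chosen for illustration, and rounding is simplified to round-half-up.

```python
import math

def to_fixed(x, n_bits=8, bin_pt=4, quantization="truncate", overflow="wrap"):
    """Convert a float to a signed fixed-point value (hypothetical helper).

    n_bits : total word length in bits (assumed value)
    bin_pt : number of fractional bits (assumed value)
    """
    scale = 1 << bin_pt
    scaled = x * scale
    # Quantization: how bits below the binary point are dropped.
    if quantization == "round":
        raw = math.floor(scaled + 0.5)   # simplified round-half-up
    else:
        raw = math.floor(scaled)         # truncate toward minus infinity
    lo = -(1 << (n_bits - 1))
    hi = (1 << (n_bits - 1)) - 1
    # Overflow: what happens when the value does not fit in n_bits.
    if overflow == "saturate":
        raw = max(lo, min(hi, raw))
    else:
        raw = (raw - lo) % (1 << n_bits) + lo   # two's-complement wrap
    return raw / scale

print(to_fixed(1.30))                          # truncated to a multiple of 1/16
print(to_fixed(1.30, quantization="round"))    # rounded to a multiple of 1/16
print(to_fixed(9.0, overflow="saturate"))      # clipped to the largest value
print(to_fixed(9.0))                           # wraps around to a negative value
```

The choice between wrapping and saturating matters for image data, since a wrapped pixel value appears as a large, visible error while a saturated one is merely clipped.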
Watermarking
The watermarking algorithm consists of two phases, embedding and extraction. Embedding is like an encoder in hardware applications, whereas the extraction algorithm resembles a decoder.
Embedding/Extraction Algorithm
An original (cover) image and a key image are considered. The key image is embedded into the functional part of the cover data, which results in the watermarked image, as shown in Figure 1.
Figure 1. Embedding of Watermark
The extraction process is performed on the watermarked image to extract the original and key images, as shown in Figure 2.
Figure 2. Extraction of Cover Image and Watermark

Implementation in Simulink
Watermarking is implemented using Simulink blocks and the Xilinx block set. It is considered in 3 steps.
1. Pre-processing 2. Embedding 3. Extraction. Embedding and extraction are shown in Figure 3 and Figure 4 respectively.
Figure 3. Watermark Embedding
Embedding Algorithm
The two signals, host and key, are fed into the logical adder block as inputs. The resulting sum and carry out from adder1 are fed to a shift block, which performs division by 2. The gray values of the image pixels are scaled in the multiplier, which multiplies the host image with the high-order bits. The high-order bits of the multiplier output are fed to the adder/subtractor unit to perform the embedding. Only the high-order bits are kept; the low-order bits are discarded, as shown in Figure 3.
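The adder and shift stage described above can be sketched in software. This is only an illustration under stated assumptions: pixels are taken to be 8-bit unsigned gray values, the function names `add_and_halve` and `embed_stage` are hypothetical, and the multiplier scaling and adder/subtractor stages are not modeled because the text does not give their constants.

```python
def add_and_halve(host_px, key_px):
    """Add two 8-bit pixels, keep the carry out, then shift right by one.

    Assumes 8-bit unsigned inputs (0..255); not the authors' exact datapath.
    """
    total = host_px + key_px
    carry = total >> 8            # carry out of the 8-bit adder
    sum8 = total & 0xFF           # 8-bit sum
    # Re-attach the carry as the ninth bit and divide by 2 with a right shift.
    # The low-order bit is discarded by the shift.
    return ((carry << 8) | sum8) >> 1

def embed_stage(host, key):
    """Apply the adder and shift stage pixel by pixel to two equal-size images."""
    return [[add_and_halve(h, k) for h, k in zip(host_row, key_row)]
            for host_row, key_row in zip(host, key)]

host = [[200, 3], [255, 0]]
key = [[100, 4], [255, 0]]
print(embed_stage(host, key))
```

Keeping the carry before the shift means the result always fits back into 8 bits, which is why the shift is applied to the sum and carry together rather than to the 8-bit sum alone.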
Extraction Algorithm
The watermarked image (the output of the embedding stage in Figure 3) is used as the input to the extraction subsystem, which extracts the original host and key images. The output of the XOR operation is thresholded and filtered to remove unwanted noise. The subsystem for the extraction process is shown in Figure 4.
Figure 4. Subsystem Block for Extraction
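The XOR, thresholding and filtering sequence used in extraction can be sketched as follows. The text does not state the XOR operands, the threshold level or the filter type, so everything specific here is an assumption: the watermarked image is XORed with a reference image, the threshold is 128, the filter is a 3x3 median, and the names `xor_images`, `threshold`, `median3x3` and `extract_sketch` are hypothetical.

```python
from statistics import median

def xor_images(a, b):
    """Bitwise XOR of two equal-size 8-bit images."""
    return [[pa ^ pb for pa, pb in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def threshold(img, level=128):
    """Map each pixel to 0 or 255 (level is an assumed value)."""
    return [[255 if p >= level else 0 for p in row] for row in img]

def median3x3(img):
    """3x3 median filter; border pixels are left unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out

def extract_sketch(watermarked, reference, level=128):
    """XOR, then threshold, then filter out isolated noise pixels."""
    return median3x3(threshold(xor_images(watermarked, reference), level))

# An isolated bright pixel is treated as noise and removed by the filter.
watermarked = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
reference = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(extract_sketch(watermarked, reference))
```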
International Journal of Security and Its Applications Vol.9, No.1 (2015)
H/W S/W Co-Simulation
The embedding/extraction design is modeled and the hardware is generated through System Generator, with a Virtex-6 LX240T target board and a JTAG cable. The design can then be verified in hardware as well as in software. A model can be co-simulated provided it meets the requirements of the underlying hardware board. System Generator produces the JTAG co-simulation block shown in Figure 5. Using this block, the output can be verified on the PC or on separate hardware VGS. After co-simulation, the design is synthesized and implemented in Xilinx ISE 13.4. The resulting RTL schematics for the embedding and extraction phases of the watermarking algorithm are shown in Figure 7 and Figure 8 respectively.
Results
The watermarking algorithm is implemented in 2 phases, and the outputs of both the embedding and extraction phases are obtained. After hardware-software co-simulation, the resulting outputs are as shown in Figure 9 and Figure 10 respectively. The distortion in the watermarked and extracted images is observed to be very low. The resource utilization and power-delay performance of these algorithms are shown in Tables 1 and 2.
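One common way to quantify such distortion is the peak signal-to-noise ratio (PSNR) between a reference image and a processed image. The sketch below is an illustration only: the paper does not state which distortion measure was used, the function name `psnr` is hypothetical, and 8-bit images with a peak value of 255 are assumed.

```python
import math

def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-size images.

    Assumes 8-bit gray values; a higher result means lower distortion.
    """
    squared_errors = [(r - t) ** 2
                      for ref_row, test_row in zip(reference, test)
                      for r, t in zip(ref_row, test_row)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf           # identical images, no distortion
    return 10 * math.log10(peak ** 2 / mse)

cover = [[10, 20], [30, 40]]
marked = [[11, 20], [30, 39]]
print(psnr(cover, marked))
```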
Conclusion
The watermarking algorithm is implemented on a Virtex-6 target board with the help of Xilinx System Generator and the Simulink IDE. The proposed algorithm is observed to take 1311 ps and to consume 6.7 mW of power. It uses 156 flip-flops, 46 LUTs and 50 IOBs. The development time is short, and the design is less complex and highly flexible for prototyping and modification. It is implemented in 40 nm copper CMOS process technology. It is implemented with MATLAB version 13.2 and Xilinx vi14. It occupies only 2% of the device area. The performance and accuracy of the implemented watermarking algorithm are very high, as shown in Table 1. This approach makes implementing and prototyping many applications easier than with DSP hardware or a PC with a co-processor.
