I. Introduction
The Discrete Wavelet Transform (DWT) has become increasingly popular in many fields and is widely applied in digital image compression [1, 2, 3] . Many DWT filter masks have been introduced for compression, and one important application is the JPEG2000 standard [4, 5, 6] . The DWT is performed by applying a low pass filter (LPF) and a high pass filter (HPF) to the original data [7, 8, 9] . It has proven useful in image compression, where it replaces the Discrete Cosine Transform (DCT) in the newer JPEG2000 and MPEG4 image and video compression standards [10, 11, 12] . The DWT is an efficient tool for data compression, and one of its special hardware applications is the neural interface [13, 14, 15] . DWT cores based on convolution require two passes per row/column of pixels for each 1-D DWT that is performed [16, 17] . DWT cores based on the lifting scheme require two or more passes [18, 19] . A "line-based" DWT core requires only one pass per row/column, with each lifting step requiring separate multiplication units [20, 21] . Scalable architectures based on the line-based lifting scheme have also been provided to reduce the memory requirements of the 2-D DWT [22, 23] .
The role of compression is to reduce the bandwidth required for transmission and the memory required for storage of all forms of data; it would not be practical to put images, audio or video on websites without compression [24, 25] . Hardware implementation of an algorithm speeds up its processing [26, 27] . Compact VLSI and FPGA realizations provide an efficient architecture for the implemented hardware [28, 29, 30] .
A hardware filter processor has been implemented via the convolution operation with software support [31, 32] . The hardware architecture of an image processing system can be based on time sharing to synchronize the processing time [33, 34] . The convolution processor is a linear operation that uses low pass and high pass filters to implement the 2D-DWT [35, 36] . Programmable Logic Devices (PLD and EPLD) are used in real time image processing to minimize the processing time as far as possible; combined Programmable Logic Device and Digital Signal Processor architectures (PLD+DSP) are also applied [37, 38] .
Image compression has been investigated extensively, and many researchers have contributed methods and techniques whose value cannot be underestimated. The field has moved from traditional technologies to modern, highly efficient ones. An important point is how efficiently the tasks in a program are managed, since this strongly influences the effectiveness of the results.
structures implementation. With the proposed approach the system output can be improved by a factor of four, while the cost of the equipment increases by a factor of three. A parallel and efficient finite impulse response filter structure is improved to accelerate the discrete wavelet transform while controlling the cost of the required equipment [31] .
R. Lavanya and Saranya B. (2010) designed and implemented folded wavelet filters with a high speed, low complexity architecture. The approach aimed to improve the speed of the reconfigurable architecture. It analyzes the communication between tasks and the dependencies between them in order to reduce the processing time and the overall communication required [32] .
Takkiti Rajashekar Reddy and Rangu Srikanth (2011) developed a basic discrete wavelet transform image processing system on a Xilinx Spartan3 Field Programmable Gate Array device using Xilinx's integrated development tools. Two different discrete wavelet transform hardware architectures were implemented in this embedded system. The first is a direct implementation of the two dimensional discrete wavelet transform as a cascade of two one dimensional discrete wavelet transforms. The second implements the two dimensional discrete wavelet transform with optimized control and architecture [33] .
Husain K. Bhaldar et al. (2012) presented a high speed discrete wavelet transform for hardware implementation using the 5/3 wavelet for image compression applications. Voice compression is another application of wavelets, which reduces the transmission time in mobile applications. The main objective of this work was to demonstrate that a large reduction in complexity with excellent efficiency can be obtained by implementing the multiplications of the discrete wavelet transform in an FPGA using these wavelet filters. The DWT generates a multiresolution analysis that allows a scale-invariant interpretation of the image [34] .
Khamees Khalaf Hasan et al. (2014) proposed a flexible hardware architecture for multilevel discrete wavelet decomposition that applies directly to image compression. A VHDL methodology is used to analyze and synthesize the discrete wavelet transform decomposition architecture, which can be implemented on a Field Programmable Gate Array device. An image of any size can thus be decomposed to the required level. A simple Haar wavelet mask is used in this approach to avoid computational complexity. The approach can be applied to wired and wireless applications [35] .
Hemantkumar H. Nikhare and Ashish Singhadia (2015) explained very large scale integration architectures for the two dimensional discrete wavelet transform. VHDL is used to implement the required multiplier. Two types of multiplier are designed and implemented, and a comparison is then carried out to measure their energy consumption. The experimental results showed that the radix-4 multiplier gives a power reduction of 22.9% compared with the conventional radix-2 multiplier, and a power reduction of almost 50% is also reported [36] .
Dhrisya and V. Lakshmipathi (2016) explained that the discrete wavelet transform plays an important part in many fields, such as signal compression, signal analysis and computer vision. The discrete wavelet transform requires more memory for storing intermediate computational results, especially when it is applied to image processing. This work analyzed the complexity and computation time of the lifting based two dimensional discrete wavelet transform. The approach develops a new algorithm for a pipeline architecture that can accept multiple data streams, suitable for image and video processing applications that require real time processing [37] .
III. Methodology
Hardware System Architecture
The implemented hardware system is built around the Raspberry Pi, which is able to operate in real time. Several generations of Raspberry Pi have been manufactured, from the first generation (1G) released in 2012 to the latest, third generation (3G) released in 2016 [38] .
The Raspberry Pi is a standalone microcomputer system that was designed in the United Kingdom at the University of Cambridge [38] . The main purpose of these devices is teaching computer science and information technology in schools [39] . The Raspberry Pi integrates a 700 MHz ARM11 processor [40] , 512 MB of RAM on model B+ (256 MB on model A) and a Broadcom VideoCore IV [34] . The Raspberry Pi 3 is a microcomputer capable of real time operation that can control all jobs and operations of the implemented system. The architectural design of the Raspberry Pi is shown in Fig.1 [41] . Fig.2 shows the implemented hardware system using the Raspberry Pi, which is able to monitor and control all jobs and functions in real time. In this case all image operations (including image compression) can be implemented and run in real time without any noticeable delay.
Fast 2D-DWT Structure
The Discrete Wavelet Transform (DWT) is a powerful signal processing tool that has recently gained widespread acceptance in the field of digital image processing. The multiresolution analysis provided by the DWT addresses the shortcomings of the Fourier Transform and its derivatives. A direct application of the 2D-DWT is image compression, so this structure implements image compression via the 2D-DWT. The simplest implementation of the 2D-DWT applies a low pass filter (LPF) and a high pass filter (HPF) to the rows and columns respectively. This yields four bands: the low-low band (LL-band), obtained by applying the LPF to both rows and columns; the low-high band (LH-band), obtained by applying the LPF to rows and the HPF to columns; the high-low band (HL-band), obtained by applying the HPF to rows and the LPF to columns; and the high-high band (HH-band), obtained by applying the HPF to both rows and columns. Fig.3 indicates the three levels of 2D-DWT applied to an image, in which the final image will be (1/4 * 1/4 * 1/4) i.e. (1/64) of the original image.
In other words, for a square image of size (1024*1024), which gives good image resolution, applying the third level 2D-DWT reduces each dimension by a factor of 8, i.e. the third level of 2D-DWT will be of size (128*128).
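The band structure and the size reduction described above can be illustrated with a minimal sketch. It assumes Haar-style filters (pairwise average as LPF, pairwise half-difference as HPF), which this section does not prescribe, and the function names `haar_rows`, `transpose` and `haar_dwt2` are hypothetical. A small 8*8 image stands in for the 1024*1024 example.

```python
def haar_rows(image):
    """Filter every row and downsample by two.

    Assumed filters: LPF = pairwise average, HPF = pairwise half-difference.
    """
    low = [[(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
           for row in image]
    high = [[(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
            for row in image]
    return low, high


def transpose(image):
    """Swap rows and columns so that columns can be filtered as rows."""
    return [list(column) for column in zip(*image)]


def haar_dwt2(image):
    """One 2D-DWT level; returns the LL, LH, HL and HH bands."""
    row_low, row_high = haar_rows(image)
    # Column filtering is done as row filtering on the transposed data.
    ll, lh = (transpose(band) for band in haar_rows(transpose(row_low)))
    hl, hh = (transpose(band) for band in haar_rows(transpose(row_high)))
    return ll, lh, hl, hh


# Three levels: the LL-band of each level is the input of the next one.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
band = image
for level in range(1, 4):
    band, lh, hl, hh = haar_dwt2(band)
    print(f"level {level}: LL-band is {len(band)}*{len(band[0])}")
```

Each level halves both dimensions, so three levels divide each side by 8 and the pixel count by 64, in line with the 1024*1024 to 128*128 example.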
Efficient Image Encryption Approach Based on Chaos Technique
Figure 3 Three levels of 2D-DWT of an image
Implemented 2D-DWT
Before describing the 2D-DWT architecture, the mathematical model of the 2D-DWT is explained. Suppose the size of the input square image is N*N, so the size after applying the first level 2D-DWT is N/2*N/2, the size after applying the second level 2D-DWT is N/4*N/4 and the size after applying the third level 2D-DWT is N/8*N/8. Let n1 represent the row index and n2 the column index of the input image. Let k1 represent the row index and k2 the column index of the output image. Let x(n1,n2) represent the input image and X(k1,k2) the output image. Let the LPF impulse response be hLPF and the HPF impulse response be hHPF. The LL-band of 2D-DWT is given below:
The LH-band of 2D-DWT is given below:
The HL-band of 2D-DWT is given below:
The HH-band of 2D-DWT is given below:
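The four band equations are not reproduced in this text. A standard form consistent with the definitions above is sketched here under two assumptions that the source does not confirm: separable filtering followed by downsampling by two in each direction, and "on rows" meaning filtering along the column index n2 (with "on columns" meaning filtering along the row index n1). The numbering (1) to (4) is likewise assumed from the reference to equation (1).

```latex
\begin{align}
X_{LL}(k_1,k_2) &= \sum_{n_1}\sum_{n_2} h_{LPF}(2k_1-n_1)\, h_{LPF}(2k_2-n_2)\, x(n_1,n_2) \tag{1}\\
X_{LH}(k_1,k_2) &= \sum_{n_1}\sum_{n_2} h_{HPF}(2k_1-n_1)\, h_{LPF}(2k_2-n_2)\, x(n_1,n_2) \tag{2}\\
X_{HL}(k_1,k_2) &= \sum_{n_1}\sum_{n_2} h_{LPF}(2k_1-n_1)\, h_{HPF}(2k_2-n_2)\, x(n_1,n_2) \tag{3}\\
X_{HH}(k_1,k_2) &= \sum_{n_1}\sum_{n_2} h_{HPF}(2k_1-n_1)\, h_{HPF}(2k_2-n_2)\, x(n_1,n_2) \tag{4}
\end{align}
```

In this form each output band has half the rows and half the columns of its input, which matches the N/2*N/2, N/4*N/4 and N/8*N/8 sizes stated above.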
To obtain the second level 2D-DWT, X_LL(k1,k2) becomes the input of equation (1), and so on for the third level 2D-DWT.
Each level of 2D-DWT requires two stages: the first stage handles system and flow control and consists of two memory FIFOs separated by a flip-flop delay, and the second stage consists of the structural implementation of the 2D-DWT. To implement the above equations in a real three level 2D-DWT architecture, it is convenient to follow these steps (Fig.4):
Step1: the flow of pixels is controlled and divided into two parts: even pixels and odd pixels.
Step2: the even pixels are generated via memory FIFO1.
Step3: the odd pixels are generated via the combination of memory FIFO2 with flip-flop FF1.
Step4: the even pixels flow via the first low pass filter LPF1 to generate the low band.
Step5: the output of LPF1 flows via the second low pass filter LPF2 to generate the low-low band (LL-band).
Step6: the output of LPF1 flows via the first high pass filter HPF1 to generate the low-high band (LH-band).
Step7: the odd pixels flow via the second high pass filter HPF2 to generate the high band.
Step8: the output of HPF2 flows via the third low pass filter LPF3 to generate the high-low band (HL-band).
Step9: the output of HPF2 flows via the third high pass filter HPF3 to generate the high-high band (HH-band).
Step10: the output of LPF2 (LL-band) passes to the second level 2D-DWT.
Step11: the output of LPF5 (LL-LL-band) passes to the third level 2D-DWT.
Step12: the output of LPF8 (LL-LL-LL-band) is the third level 2D-DWT.
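The flow-control stage of Steps 1 to 3 can be modelled in software as a rough sketch. It assumes that FF1 acts as a one-sample delay register and that both FIFOs are written on even clock ticks; the source does not give these timing details. The name `split_even_odd` and the sentinel `_EMPTY` are hypothetical.

```python
from collections import deque

_EMPTY = object()  # marks an empty register and the final flush tick


def split_even_odd(pixels):
    """Route a pixel stream into an even FIFO and an odd FIFO.

    FIFO1 takes the current pixel on every even tick. FIFO2 takes the
    content of the delay register FF1 on the same tick, i.e. the pixel
    that arrived one tick earlier (an odd pixel).
    """
    fifo1 = deque()  # even pixels (Step 2)
    fifo2 = deque()  # odd pixels (Step 3)
    ff1 = _EMPTY     # one-sample delay register standing in for FF1
    stream = list(pixels) + [_EMPTY]  # one extra tick flushes FF1
    for tick, pixel in enumerate(stream):
        if tick % 2 == 0:
            if pixel is not _EMPTY:
                fifo1.append(pixel)
            if ff1 is not _EMPTY:
                fifo2.append(ff1)
        ff1 = pixel  # the register always holds the previous pixel
    return fifo1, fifo2


even, odd = split_even_odd(range(8))
print(list(even), list(odd))
```

The two queues would then feed the filter stage of Steps 4 to 9; the filters themselves are left out of this sketch because their coefficients are not given here.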
IV. Results and discussions
As shown in Fig.4, each level of 2D-DWT has two memory FIFOs, one flip-flop, three low pass filters and three high pass filters. In practice, however, only two low pass filters are needed in each level when passing from one level to the next, so the overall number of active filters needed to generate the third level 2D-DWT is six low pass filters, as shown in Fig.5. Fig.5 also shows that each process requires half the number of operations of the one before it, and at the end we obtain 1/64 of the original image, meaning the compression ratio is 1/64. The total number of operations of the third level 2D-DWT is 1,032,192. Since the Raspberry Pi runs at 700 MHz, each cycle takes 1/(700 MHz) = 1.42857 ns. Assuming one operation per clock cycle, the third level 2D-DWT requires 0.00147456 s or 1474.56 µs, which is sufficient for real time image processing. The error measures indicate that as the compression ratio or the level of 2D-DWT increases, the MSE values increase and the PSNR values decrease, as shown in Fig.6.
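The timing arithmetic and the error measures in this paragraph can be checked with a short sketch. It assumes one operation per clock cycle and, for PSNR, 8-bit images with a peak value of 255; neither assumption is stated in the text. The function names `mse` and `psnr` are hypothetical, and the note on N^2 - (N/8)^2 is only an arithmetic observation, not a derivation given by the source.

```python
import math

CLOCK_HZ = 700e6        # clock rate stated in the text
TOTAL_OPS = 1_032_192   # operation count stated in the text

cycle_ns = 1e9 / CLOCK_HZ              # duration of one clock cycle in ns
total_us = TOTAL_OPS / CLOCK_HZ * 1e6  # assumes one operation per cycle

# Observation only: the stated count equals N^2 - (N/8)^2 for N = 1024.
N = 1024
matches_observation = (N * N - (N // 8) ** 2 == TOTAL_OPS)


def mse(original, reconstructed):
    """Mean squared error between two equally sized 2-D pixel lists."""
    squared = [(a - b) ** 2
               for row_a, row_b in zip(original, reconstructed)
               for a, b in zip(row_a, row_b)]
    return sum(squared) / len(squared)


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit pixels."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / error)


print(f"{cycle_ns:.5f} ns per cycle, {total_us:.2f} us in total")
```

Because PSNR is a decreasing function of MSE, any increase in MSE at a deeper decomposition level necessarily appears as a drop in PSNR.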
V. Conclusion
Image and video compression have a wide range of applications, and many techniques are applied to perform compression; one effective technique is the discrete wavelet transform. In this work, image compression via the two dimensional discrete wavelet transform is implemented on a Raspberry Pi hardware device (microcomputer) to accelerate processing. The Raspberry Pi is a microcomputer manufactured in the UK in 2012 for teaching computer and information technology subjects. The implemented algorithm concentrates on the third level 2D-DWT, which requires six active sequential filters (LPF and HPF), two filters in each level. The required number of operations is 1,032,192 and the required processing time is 1474.56 µs, i.e. it is suitable for real time processing.
