Efficient architectures and power modelling of multiresolution analysis algorithms on FPGA
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. In the past two decades, there has been a huge amount of interest in Multiresolution Analysis Algorithms (MAAs) and their applications. Some of these applications, such as medical imaging, are computationally intensive, power hungry and require large amounts of memory, which creates a high demand for efficient algorithm implementation, low-power architectures and acceleration. Recently, some MAAs such as the Finite Ridgelet Transform (FRIT) and the Haar Wavelet Transform (HWT) have become very popular; they are suitable for a number of image processing applications such as detection of line singularities and contiguous edges, edge detection (useful for compression and feature detection), and medical image denoising and segmentation. Efficient hardware implementation and acceleration of these algorithms, particularly when addressing large problems, is becoming very challenging and consumes a lot of power, which leads to a number of issues including mobility and reliability concerns. To overcome the computation problems, Field Programmable Gate Arrays (FPGAs) are the technology of choice for accelerating computationally intensive applications due to their high performance. Addressing the power issue requires optimisation and awareness at all levels of abstraction in the design flow.
The most important achievements of the work presented in this thesis are summarised here.
Two factorisation methodologies for the HWT, called HWT Factorisation Method 1 (HWTFM1) and HWT Factorisation Method 2 (HWTFM2), have been explored to increase the number of zeros and reduce hardware resources. In addition, two novel, efficient and optimised architectures for the proposed methodologies, based on Distributed Arithmetic (DA) principles, have been proposed. The evaluation of the architectures has shown that they reduce the number of arithmetic operations (additions/subtractions) by 33% and 25% respectively compared to a direct implementation of the HWT, and that they outperform existing results. The proposed HWTFM2 is implemented on advanced and low-power FPGA devices using the Handel-C language. The FPGA implementation results have outperformed other existing results in terms of area and maximum frequency. In addition, a novel efficient architecture for the Finite Radon Transform (FRAT) has also been proposed. The proposed architecture is integrated with the developed HWT architecture to build an optimised architecture for the FRIT. Strategies such as parallelism and pipelining have been deployed at the architectural level for efficient implementation on different FPGA devices. The performance of the proposed FRIT architecture has been evaluated and the results outperformed some other existing architectures. Both the FRAT and FRIT architectures have been implemented on FPGAs using the Handel-C language. The evaluation of both architectures has shown that the obtained results outperformed existing results by almost 10% in terms of frequency and area. The proposed architectures are also applied to image data (256 × 256) and their Peak Signal to Noise Ratio (PSNR) is evaluated for quality purposes.
Two architectures for cyclic convolution based on systolic arrays, using parallelism and pipelining, have been proposed; they can be used as the main building block for the proposed FRIT architecture. The first proposed architecture is a linear systolic array with a pipelined process and the second is a systolic array with a parallel process. The second architecture reduces the number of registers by 42% compared to the first, and both architectures outperformed other existing results. The proposed pipelined architecture has been implemented on different FPGA devices with vector sizes N = 4, 8, 16, 32 and word length W = 8. The implementation results have shown a significant improvement and outperformed other existing results.
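As a point of reference for the cyclic convolution that these systolic arrays compute, the following is a minimal software sketch. It is only the mathematical definition written in Python, not the thesis's hardware architecture; the function name `cyclic_convolution` is a hypothetical choice made for this illustration.

```python
def cyclic_convolution(x, h):
    """Cyclic (circular) convolution of two length-N sequences.

    Computes y[i] = sum over k of x[k] * h[(i - k) mod N].
    This is a direct O(N^2) reference, not a systolic implementation.
    """
    n = len(x)
    if len(h) != n:
        raise ValueError("both sequences must have the same length N")
    return [sum(x[k] * h[(i - k) % n] for k in range(n)) for i in range(n)]


# Example with vector size N = 4, one of the sizes listed in the abstract.
y = cyclic_convolution([1, 2, 3, 4], [1, 0, 0, 1])
print(y)  # [3, 5, 7, 5]
```

Each output sample is a sum of N products whose indices wrap around modulo N. That regular structure is what makes the operation a natural fit for an array of identical multiply-accumulate cells.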
Finally, an in-depth evaluation of a high-level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called the functional level power modelling approach, has been presented. The mathematical techniques that form the basis of the proposed power modelling have been validated with a range of custom IP cores. The proposed power modelling is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating functional level power modelling with commercially available design tools for systematic optimisation of IP cores has also been developed. The in-depth evaluation of this tool makes it possible to observe the behaviour of different custom IP cores in terms of power consumption and accuracy using different design methodologies and arithmetic techniques on various FPGA platforms. Based on the results achieved, the accuracy of the proposed model is almost 99% for the Dynamic Power (DP) components of all IP cores.
Thomas Gerald Gray Charitable Trust
Lossless and low-cost integer-based lifting wavelet transform
The discrete wavelet transform (DWT) is a powerful tool for analyzing real-time signals, including aperiodic, irregular, noisy, and transient data, because of its capability to explore signals in both the frequency and time domains at different resolutions. For this reason, it is used extensively in a wide range of applications in image and signal processing. Despite this wide usage, the implementation of the wavelet transform is usually lossy or computationally complex, and it requires expensive hardware. However, in many applications, such as medical diagnosis, reversible data hiding, and critical satellite data, a lossless implementation of the wavelet transform is desirable. It is also important to have more hardware-friendly implementations due to its recent inclusion in signal processing modules in systems-on-chip (SoCs).
To address this need, this research work provides a generalized implementation of a wavelet transform using an integer-based lifting method to produce a lossless and low-cost architecture while maintaining performance close to that of the original wavelets. In order to achieve a general implementation method for all orthogonal and biorthogonal wavelets, the Daubechies wavelet family has been used first, since it is one of the most widely used wavelet families and is based on a systematic method of constructing compactly supported orthogonal wavelets. Though the first two phases of this work are for Daubechies wavelets, they can be generalized to apply to other wavelets as well. Subsequently, some techniques used in the earlier phases have been adopted and the critical issues for achieving a general lossless implementation have been solved in order to propose a general lossless method.
The research work presented here can be divided into several phases. In the first phase, low-cost architectures of the Daubechies-4 (D4) and Daubechies-6 (D6) wavelets have been derived by applying integer-polynomial mapping (IPM). A lifting architecture has been used, which reduces the cost by half compared to the conventional convolution-based approach. Applying IPM to the floating-point polynomial filter coefficients further decreases the complexity and reduces the loss in signal reconstruction. Also, “resource sharing” between lifting steps results in a further reduction in implementation costs and near-lossless data reconstruction.
In the second phase, a completely lossless, error-free architecture has been proposed for the Daubechies-8 (D8) wavelet. Several lifting variants have been derived for the same wavelet, the integer mapping has been applied, and the best variant has been determined in terms of performance, using entropy and transform coding gain. A theory has then been derived regarding the impact of scaling steps on the transform coding gain (GT). The approach results in the lowest-cost lossless architecture of the D8 in the literature, to the best of our knowledge. The proposed approach may be applied to other orthogonal wavelets, as well as biorthogonal ones, to achieve higher performance.
In the final phase, a general algorithm has been proposed to implement the original filter coefficients expressed by a polyphase matrix into a more efficient lifting structure. This is done by using modified factorization, so that the factorized polyphase matrix does not include the lossy scaling step like the conventional lifting method. This general technique has been applied on some widely used orthogonal and biorthogonal wavelets and its advantages have been discussed.
Since the discrete wavelet transform is used in a vast number of applications, the proposed algorithms can be utilized in those cases to achieve lossless, low-cost, and hardware-friendly architectures.
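To illustrate why an integer-based lifting structure can be exactly reversible, here is a minimal sketch using the integer Haar transform (often called the S-transform). This is a deliberately simpler stand-in, not the Daubechies factorizations of the work described above; the function names are hypothetical. Note that it contains only predict and update steps and no scaling step.

```python
def forward_int_haar(signal):
    """One level of the integer Haar (S-transform) lifting scheme.

    Returns (approx, detail), both lists of integers.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    even, odd = signal[0::2], signal[1::2]
    # Predict step: detail is the difference between the odd and even samples.
    detail = [o - e for e, o in zip(even, odd)]
    # Update step: add floor(detail / 2); >> 1 floors for negative ints too.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def inverse_int_haar(approx, detail):
    """Undo forward_int_haar by running the lifting steps in reverse order."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal


samples = [5, 2, 7, 7, 0, 9]
approx, detail = forward_int_haar(samples)
print(approx, detail)                     # [3, 7, 4] [-3, 0, 9]
print(inverse_int_haar(approx, detail))   # [5, 2, 7, 7, 0, 9]
```

The rounding inside each lifting step is computed from values that the inverse also has available, so the inverse subtracts exactly what the forward pass added. This is why integer lifting reconstructs the input without error even though the individual steps are not linear.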
Modified Distributive Arithmetic based 2D-DWT for Hybrid (Neural Network-DWT) Image Compression
Artificial Neural Networks (ANNs) are widely used in signal and image processing for pattern recognition and template matching. The Discrete Wavelet Transform (DWT) is combined with a neural network to achieve higher compression of 2D data such as images. Image compression using a neural network and DWT has shown superior results over classical techniques, with 70% higher compression and 20% improvement in Mean Square Error (MSE). Hardware complexity and power dissipation are the major challenges that have been addressed in this work for VLSI implementation. In this work, modified distributive arithmetic DWT and multiplexer-based DWT architectures are designed to reduce the computational complexity of the hybrid architecture for image compression. A 2D DWT architecture is designed with 1D DWT architecture and is implemented on FPGA that operates at 268 MHz consuming power less than 1
Efficient Algorithms/Techniques on Discrete Wavelet Transformation for Video Compression: A Review
Visualization is the most effective and informative form for delivering any information. There are various techniques for video compression, such as Motion Estimation and Compensation, the Discrete Cosine Transform, and the Discrete Wavelet Transform. Wavelet transforms have been successful in achieving high compression rates while maintaining good video/image quality. In this paper, the implementation of different algorithms for three-dimensional wavelet transforms for video compression is presented. Keywords: Video compression, Temporal decomposition, Discrete Wavelet Transform (DWT), 3D Wavelet Transform
Real-Time DSP-Based License Plate Character Segmentation Algorithm Using 2D Haar Wavelet Transform
Non peer reviewed
VLSI Implementation of Reversible Watermarking Algorithm
This paper presents a VLSI design approach and implementation of a lifting-based reversible watermarking algorithm. An image watermarking algorithm based on the 5/3 lifting-based Discrete Wavelet Transform is proposed. It is an attractive algorithm because it is easy to understand and implement. The main feature of the lifting-based scheme is that all constructions are derived in the spatial domain; it therefore does not require the complex mathematical calculations needed by the traditional method. The algorithm is mainly applicable to military and medical applications, where the original image and the watermarking data (or image) must be reconstructed from the watermarked image after it has served its intended purpose. In this algorithm, the image is decomposed into four sub-bands, LL, LH, HL, and HH, using the lifting-based DWT algorithm. The watermarking data (or image) is then embedded into any of the three high-frequency sub-bands. The interesting point of this algorithm is that the original image can be exactly restored from the watermarked image. The architecture of the lifting-based DWT algorithm has been coded in Verilog HDL on the Xilinx platform, and the target FPGA device belongs to the Virtex-IV family.
DOI: 10.17762/ijritcc2321-8169.15058
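The exact restoration claimed for the 5/3 lifting-based DWT can be illustrated with a short one-dimensional sketch. The version below uses the widely known integer 5/3 lifting steps with mirrored boundaries. It is an assumption that this matches the variant used in the paper, and the function names are hypothetical. Applying such a 1D transform along the rows and then the columns of an image gives the LL, LH, HL, and HH sub-bands.

```python
def forward_53(x):
    """One level of the integer 5/3 lifting transform on an even-length list.

    Returns (s, d): low-pass (approximation) and high-pass (detail) parts.
    Boundaries are handled by mirroring the nearest available sample.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("signal length must be even and at least 2")
    even, odd = x[0::2], x[1::2]
    m = len(even)
    # Predict: subtract the floored average of the two even neighbours.
    d = [odd[i] - ((even[i] + even[min(i + 1, m - 1)]) >> 1) for i in range(m)]
    # Update: add the rounded quarter-sum of the two neighbouring details.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(m)]
    return s, d


def inverse_53(s, d):
    """Exactly invert forward_53 by undoing update, then predict."""
    m = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(m)]
    odd = [d[i] + ((even[i] + even[min(i + 1, m - 1)]) >> 1) for i in range(m)]
    x = []
    for e, o in zip(even, odd):
        x.extend([e, o])
    return x


row = [12, 14, 200, 3, 7, 7, 90, 91]
low, high = forward_53(row)
assert inverse_53(low, high) == row  # reconstruction is bit-exact
```

Because every lifting step adds or subtracts an integer that the inverse can recompute, no information is lost to rounding. This is the property that allows the original image to be recovered once the watermark has been removed.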
Fast Implementation of Lifting Based DWT Architecture For Image Compression
Technological growth in the semiconductor industry has led to unprecedented demand for faster, area-efficient and low-power VLSI circuits for complex image processing applications. DWT-IDWT is one of the most popular IP cores used for image transformation. In this work, a high-speed, low-power DWT-IDWT architecture is designed and implemented on ASIC using 130 nm technology. A 2D DWT architecture based on the lifting scheme uses multipliers and adders and thus consumes power. This paper addresses power reduction in the multiplier by proposing a modified algorithm for the BZFAD multiplier. The proposed BZFAD multiplier is 65% faster and occupies 44% less area compared with generic multipliers. The DWT architecture designed with the modified BZFAD multiplier achieves a 35% power reduction and operates at a frequency of 200 MHz with a latency of 1536 clock cycles for a 512x512 image. The developed DWT can be used as an IP for VLSI implementation
Efficient Hardware Implementation Of Haar Wavelet Transform With Line-Based And Dual-Scan Image Memory Accesses
Image compression is of great importance in multimedia systems and applications because it drastically reduces bandwidth requirements for transmission and memory requirements for storage. The JPEG2000 image compression algorithm is based on the Discrete Wavelet Transform. In the hardware implementation of the Discrete Wavelet Transform (DWT) and inverse Discrete Wavelet Transform (IDWT), the main problems are storage memory, the internal processing buffer, and the limited FPGA resources. For a non-separable 2-D DWT, the method used to access the image memory has a direct impact on the internal buffer size, the power consumption, and the transformation speed. Using an internal buffer reduces the image memory access time. The main objectives of this thesis are as follows: to implement a 2-D Haar wavelet transform for large gray-scale images; to reduce the number of image memory accesses by implementing the 2-D Haar wavelet transform with a suitable combination of external and internal memory; and to target a low-power, high-speed architecture based on a multi-level non-separable discrete Haar wavelet transform. In this work, the two proposed architectures reduce the number of image memory accesses. The line-based architecture reduces the internal buffer by 2 × 0.5 × N, where N represents the image size; this applies to both the low-pass and the high-pass coefficients. The dual-scan architecture does not use the internal memory. Overall, both architectures work well on the Altera FPGA board at a frequency of 100 MHz.
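The idea of a non-separable 2-D Haar transform can be sketched in software as follows: each 2 × 2 block of pixels is read once and turned directly into one coefficient per sub-band, with no intermediate row-transformed image. This is only an illustration under stated assumptions. The unnormalised integer sums, the sub-band naming, and the function name `haar2d_level` are choices made here, not details taken from the thesis, and the sketch does not model the line-based or dual-scan memory schemes.

```python
def haar2d_level(image):
    """One level of a non-separable 2-D Haar transform (unnormalised).

    `image` is a list of equal-length rows with even height and width.
    Returns (ll, hl, lh, hh), each with half the height and half the width.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    ll, hl, lh, hh = [], [], [], []
    for r in range(0, rows, 2):          # two image lines are consumed at a time
        ll_row, hl_row, lh_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top-left, top-right
            p, q = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            ll_row.append(a + b + p + q)   # block sum (low-pass in both directions)
            hl_row.append(a - b + p - q)   # left column minus right column
            lh_row.append(a + b - p - q)   # top row minus bottom row
            hh_row.append(a - b - p + q)   # diagonal difference
        ll.append(ll_row)
        hl.append(hl_row)
        lh.append(lh_row)
        hh.append(hh_row)
    return ll, hl, lh, hh


ll, hl, lh, hh = haar2d_level([[1, 2], [3, 4]])
print(ll, hl, lh, hh)  # [[10]] [[-2]] [[-4]] [[0]]
```

In this sketch every pixel is read exactly once per level. A further level would apply the same function to `ll`.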
The DLMT hardware implementation. A comparative study with the DCT and the DWT
In recent years, with the popularity of image compression techniques, many architectures have been proposed. These have generally been based on the Forward and Inverse Discrete Cosine Transform (FDCT, IDCT). Alternatively, compression schemes based on the discrete wavelet transform (DWT), used in both the JPEG2000 coding standard and the H264-SVC (Scalable Video Coding) standard, do not need to divide the image into non-overlapping blocks or macroblocks. This paper discusses the hardware implementation of the DLMT (Discrete Lopez-Moreno Transform). It proposes a new scheme intermediate between the DCT and the DWT, and compares its results with those of the most relevant proposed architectures for benchmarking. The DLMT can also be applied over a whole image, and doing so does not increase the computational complexity. FPGA implementation results show that the proposed DLMT has significant performance benefits and improvements compared with the DCT and the DWT, and consequently it is very suitable for implementation in WSN (Wireless Sensor Network) applications.