2,407 research outputs found
Simple Signal Extension Method for Discrete Wavelet Transform
Discrete wavelet transform of finite-length signals must necessarily handle
the signal boundaries. The state-of-the-art approaches treat such boundaries in
a complicated and inflexible way, using special prolog or epilog phases. This
holds true in particular for images decomposed into a number of scales,
as exemplified by the JPEG 2000 coding system. In this paper, the state-of-the-art
approaches are extended to perform the treatment using a compact streaming
core, possibly in multi-scale fashion. We present a core built around the CDF 5/3
wavelet and the symmetric border extension method, both employed in JPEG
2000. As a result of our work, every input sample is visited only once, while
the results are produced immediately, i.e. without buffering.
Comment: preprint; presented at ICSIP 201
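As a point of reference for the two ingredients named in this abstract, the following is a minimal sketch of one level of the reversible (integer) CDF 5/3 lifting transform with whole-sample symmetric border extension. It is a plain batch version, not the streaming core the paper describes; the function names `mirror`, `cdf53_forward` and `cdf53_inverse` are hypothetical and chosen only for illustration.

```python
def mirror(i, n):
    """Reflect index i into [0, n) using whole-sample symmetric extension."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def cdf53_forward(x):
    """One level of integer CDF 5/3 lifting; returns (lowpass s, highpass d)."""
    n = len(x)

    def xm(i):
        # Input sample at a possibly out-of-range index, mirrored at the borders.
        return x[mirror(i, n)]

    def detail(k):
        # Predict step: odd sample minus the floor-average of its even neighbours.
        return xm(2 * k + 1) - ((xm(2 * k) + xm(2 * k + 2)) >> 1)

    # Update step: even sample plus a rounded quarter of the neighbouring details.
    s = [x[2 * k] + ((detail(k - 1) + detail(k) + 2) >> 2) for k in range((n + 1) // 2)]
    d = [detail(k) for k in range(n // 2)]
    return s, d


def cdf53_inverse(s, d):
    """Undo cdf53_forward exactly (integer-to-integer reconstruction)."""
    n = len(s) + len(d)

    def dm(k):
        # Mirroring the input makes the out-of-range details equal the edge details.
        if not d:
            return 0
        return d[min(max(k, 0), len(d) - 1)]

    x = [0] * n
    for k in range(len(s)):
        x[2 * k] = s[k] - ((dm(k - 1) + dm(k) + 2) >> 2)
    for k in range(len(d)):
        x[2 * k + 1] = d[k] + ((x[2 * k] + x[mirror(2 * k + 2, n)]) >> 1)
    return x
```

Because the same floor operations are applied in both directions, `cdf53_inverse(*cdf53_forward(x))` returns `x` unchanged for integer input of any length, which is the property the symmetric extension is meant to preserve at the borders.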
Fast Implementation of Lifting Based DWT Architecture For Image Compression
Technological growth in the semiconductor industry has led to unprecedented demand for faster, area-efficient, low-power VLSI circuits for complex image processing applications. DWT-IDWT is one of the most popular IPs used for image transformation. In this work, a high-speed, low-power DWT-IDWT architecture is designed and implemented as an ASIC using 130 nm technology. The 2D DWT architecture based on the lifting scheme uses multipliers and adders, and thus consumes power. This paper addresses power reduction in the multiplier by proposing a modified algorithm for the BZFAD multiplier. The proposed BZFAD multiplier is 65% faster and occupies 44% less area than generic multipliers. The DWT architecture based on the modified BZFAD multiplier achieves a 35% power reduction and operates at a frequency of 200 MHz with a latency of 1536 clock cycles for a 512x512 image. The developed DWT can be used as an IP for VLSI implementatio
Development of Lifting-based VLSI Architectures for Two-Dimensional Discrete Wavelet Transform
Two-dimensional discrete wavelet transform (2-D DWT) has evolved as an essential
part of a modern compression system. It offers superior compression with good image
quality and overcomes the disadvantage of the discrete cosine transform, which suffers
from blocking artifacts that reduce the quality of the image. The amount of
computation involved in 2-D DWT is enormous and cannot be handled by general-purpose
processors when real-time processing is required. Therefore, a high-speed and
low-power VLSI architecture that computes 2-D DWT effectively is needed. In this
research, several VLSI architectures have been developed that meet real-time
requirements for 2-D DWT applications. This research initially started off by
implementing a software simulation program that decorrelates the original image and
reconstructs the original image from the decorrelated image. Then, based on the
information gained from implementing the simulation program, a new approach for
designing lifting-based VLSI architectures for 2-D forward DWT is introduced. As a
result, two high-performance VLSI architectures that perform 2-D DWT for 5/3 and
9/7 filters are developed based on overlapped and nonoverlapped scan methods. Then,
the intermediate architecture is developed, which aims at reducing the power
consumption of the overlapped areas without using the expensive line buffer. In order
to best meet real-time applications of 2-D DWT with demanding requirements in
terms of speed and throughput, parallelism is explored. The single pipelined
intermediate and overlapped architectures are extended to 2-, 3-, and 4-parallel
architectures to achieve speed factors of 2, 3, and 4, respectively. To further
demonstrate the effectiveness of the approach, single and parallel VLSI architectures
for 2-D inverse discrete wavelet transform (2-D IDWT) are developed. Furthermore,
2-D DWT memory architectures, which have been overlooked in the literature, are
also developed. Finally, to show that the architectural models developed for 2-D DWT are
simple to control, the control algorithm for the 4-parallel architecture based on the first
scan method is developed. To validate the architectures developed in this work, five
architectures are implemented and simulated on an Altera FPGA.
In compliance with the terms of the Copyright Act 1987 and the IP Policy of the
university, the copyright of this thesis has been reassigned by the author to the legal
entity of the university,
Institute of Technology PETRONAS Sdn bhd.
Due acknowledgement shall always be made of the use of any material contained
in, or derived from, this thesis
Advancement of Computing on Large Datasets via Parallel Computing and Cyberinfrastructure
Large datasets require efficient processing, storage and management to extract useful information for innovation and decision-making. This dissertation demonstrates novel approaches and algorithms using a virtual memory approach, parallel computing and cyberinfrastructure. First, we introduce a tailored user-level virtual memory system for parallel algorithms that can process large raster data files in a desktop computer environment with limited memory. The application area for this portion of the study is the development of parallel terrain analysis algorithms that use multi-threading to take advantage of common multi-core processors for greater efficiency. Second, we present two novel parallel WaveCluster algorithms that perform cluster analysis by using the discrete wavelet transform to reduce large data to coarser representations, so that the data is smaller and more easily managed than the original in size and complexity. Finally, this dissertation demonstrates an HPC gateway service that abstracts away many details and complexities involved in the use of HPC systems including authentication, authorization, and data and job management
DCT Implementation on GPU
There has been great progress in the field of graphics processors. Since the speed of conventional CPUs is no longer rising, designers are turning to multi-core, parallel processors. Because of their popularity in parallel processing, GPUs are becoming more and more attractive for many applications. With the increasing demand for GPUs, there is a great need to develop operating systems that use the GPU to full capacity. GPUs offer a very efficient environment for many image processing applications. This thesis explores the processing power of GPUs for digital image compression using the discrete cosine transform
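For orientation, the transform at the heart of this thesis can be sketched on the CPU as follows. This is a naive, unoptimized orthonormal 2-D DCT-II applied row by row and then column by column; it is not the GPU implementation the thesis describes, and the function names `dct_1d` and `dct_2d` are hypothetical.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a sequence of numbers (direct O(n^2) evaluation)."""
    n = len(v)
    out = []
    for k in range(n):
        # The k = 0 term uses a different scale so that the transform is orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        out.append(scale * acc)
    return out


def dct_2d(block):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols[c][r] holds coefficient (r, c); transpose back to row-major order.
    return [list(r) for r in zip(*cols)]
```

Each output coefficient depends only on the input block, so the per-coefficient sums are independent of one another; that independence is what makes the transform a natural fit for data-parallel hardware such as GPUs.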
Parallel 3D Fast Wavelet Transform comparison on CPUs and GPUs
We present in this paper several implementations of the 3D Fast Wavelet Transform (3D-FWT) on multicore CPUs and manycore GPUs. On the GPU side, we focus on CUDA and OpenCL programming to develop methods for an efficient mapping on manycores. On multicore CPUs, OpenMP and Pthreads are used as counterparts to maximize parallelism, and well-known techniques like tiling and blocking are exploited to optimize the use of memory. We evaluate these proposals and compare a new Fermi Tesla C2050 with an Intel Core 2 Quad Q6700. The CUDA version gives the best results, with speedups over the CPU ranging from 5.3x to 7.4x for different image sizes, and up to 81x when communications are neglected. Meanwhile, OpenCL obtains solid gains, ranging from 2x on small frame sizes to 3x on larger ones