1,218 research outputs found

Simple Signal Extension Method for Discrete Wavelet Transform

Author: Barina David
Kula Michal
Zemcik Pavel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/09/2017

The discrete wavelet transform of finite-length signals must necessarily handle the signal boundaries. The state-of-the-art approaches treat such boundaries in a complicated and inflexible way, using special prolog or epilog phases. This holds true in particular for images decomposed into a number of scales, as exemplified by the JPEG 2000 coding system. In this paper, the state-of-the-art approaches are extended to perform the treatment using a compact streaming core, possibly in a multi-scale fashion. We present a core focused on the CDF 5/3 wavelet and the symmetric border extension method, both employed in JPEG 2000. As a result of our work, every input sample is visited only once, while the results are produced immediately, i.e. without buffering. Comment: preprint; presented on ICSIP 201

arXiv.org e-Print Archive

Crossref
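
The entry above refers to the CDF 5/3 wavelet with symmetric border extension. As a rough illustration of those two ingredients only, the following is a minimal whole-signal sketch of one level of the reversible (integer) 5/3 lifting scheme. It is not the streaming single-pass core the paper describes: it indexes the whole input freely and recomputes values for clarity. The function names `mirror`, `cdf53_forward` and `cdf53_inverse` are hypothetical and chosen for this example.

```python
def mirror(i, n):
    """Reflect index i into [0, n) by whole-sample symmetric extension."""
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def cdf53_forward(x):
    """One level of the integer CDF 5/3 transform: returns (low, high)."""
    n = len(x)
    if n < 2:
        raise ValueError("this sketch expects at least two samples")

    def at(i):
        return x[mirror(i, n)]

    def detail(p):
        # Predict step at odd position p (p may lie outside [0, n)).
        return at(p) - ((at(p - 1) + at(p + 1)) >> 1)

    # Update step at even positions, then collect the in-range details.
    low = [x[p] + ((detail(p - 1) + detail(p + 1) + 2) >> 2)
           for p in range(0, n, 2)]
    high = [detail(p) for p in range(1, n, 2)]
    return low, high


def cdf53_inverse(low, high):
    """Undo cdf53_forward exactly (integer arithmetic, no rounding loss)."""
    n = len(low) + len(high)
    if n < 2 or len(low) - len(high) not in (0, 1):
        raise ValueError("band lengths do not match a one-level split")
    y = [0] * n
    y[0::2] = low   # even positions hold the low-pass band
    y[1::2] = high  # odd positions hold the high-pass band
    x = list(y)
    # Undo the update step: recover the even samples.
    for p in range(0, n, 2):
        x[p] = y[p] - ((y[mirror(p - 1, n)] + y[mirror(p + 1, n)] + 2) >> 2)
    # Undo the predict step: recover the odd samples from the even ones.
    for p in range(1, n, 2):
        x[p] = y[p] + ((x[mirror(p - 1, n)] + x[mirror(p + 1, n)]) >> 1)
    return x
```

For example, `cdf53_inverse(*cdf53_forward([3, 7, 1, 8, 2, 9, 4]))` returns the original list, and a constant signal produces all-zero detail coefficients.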

Parallel 3D Fast Wavelet Transform comparison on CPUs and GPUs

Author: Bernabé Gregorio
Publication venue: University of Granada-University of Cadiz
Publication date: 01/01/2015

We present in this paper several implementations of the 3D Fast Wavelet Transform (3D-FWT) on multicore CPUs and manycore GPUs. On the GPU side, we focus on CUDA and OpenCL programming to develop methods for an efficient mapping on manycores. On multicore CPUs, OpenMP and Pthreads are used as counterparts to maximize parallelism, and renowned techniques like tiling and blocking are exploited to optimize the use of memory. We evaluate these proposals and make a comparison between a new Fermi Tesla C2050 and an Intel Core 2 Quad Q6700. The CUDA version achieves the best results, improving on the CPU execution times by 5.3x to 7.4x for different image sizes, and by up to 81x when communications are neglected. Meanwhile, OpenCL obtains solid gains, which range from 2x on small frame sizes to 3x on larger ones.

Portal de revistas de la Universidad de Granada

DIALNET

Efficient architectures and power modelling of multiresolution analysis algorithms on FPGA

Author: Sazish Abdul Naser
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2011

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. In the past two decades, there has been a huge amount of interest in Multiresolution Analysis Algorithms (MAAs) and their applications. Some of these applications, such as medical imaging, are computationally intensive, power hungry and require a large amount of memory, which creates a high demand for efficient algorithm implementation, low-power architecture and acceleration. Recently, some MAAs such as the Finite Ridgelet Transform (FRIT) and the Haar Wavelet Transform (HWT) have become very popular, and they are suitable for a number of image processing applications such as detection of line singularities and contiguous edges, edge detection (useful for compression and feature detection), medical image denoising and segmentation. Efficient hardware implementation and acceleration of these algorithms, particularly when addressing large problems, are becoming very challenging and consume a lot of power, which leads to a number of issues including mobility and reliability concerns. To overcome the computation problems, Field Programmable Gate Arrays (FPGAs) are the technology of choice for accelerating computationally intensive applications due to their high performance. Addressing the power issue requires optimisation and awareness at all levels of abstraction in the design flow. The most important achievements of the work presented in this thesis are summarised here. Two factorisation methodologies for HWT, called HWT Factorisation Method1 (HWTFM1) and HWT Factorisation Method2 (HWTFM2), have been explored to increase the number of zeros and reduce hardware resources. In addition, two novel, efficient and optimised architectures for the proposed methodologies, based on Distributed Arithmetic (DA) principles, have been proposed.
The evaluation of the architectural results has shown that the proposed architectures reduce the arithmetic calculations (additions/subtractions) by 33% and 25% respectively compared to a direct implementation of HWT and outperform existing results. The proposed HWTFM2 is implemented on advanced and low-power FPGA devices using the Handel-C language. The FPGA implementation results outperform other existing results in terms of area and maximum frequency. In addition, a novel efficient architecture for the Finite Radon Transform (FRAT) has been proposed. The proposed architecture is integrated with the developed HWT architecture to build an optimised architecture for FRIT. Strategies such as parallelism and pipelining have been deployed at the architectural level for efficient implementation on different FPGA devices. The performance of the proposed FRIT architecture has been evaluated, and the results outperform some other existing architectures. Both FRAT and FRIT architectures have been implemented on FPGAs using the Handel-C language. The evaluation of both architectures has shown that the obtained results outperform existing results by almost 10% in terms of frequency and area. The proposed architectures are also applied to image data (256 × 256), and their Peak Signal to Noise Ratio (PSNR) is evaluated for quality purposes. Two architectures for cyclic convolution based on systolic arrays using parallelism and pipelining, which can be used as the main building block for the proposed FRIT architecture, have been proposed. The first proposed architecture is a linear systolic array with a pipelined process, and the second is a systolic array with a parallel process. The second architecture reduces the number of registers by 42% compared to the first, and both architectures outperform other existing results.
The proposed pipelined architecture has been implemented on different FPGA devices with vector sizes (N) of 4, 8, 16 and 32 and a word length (W) of 8. The implementation results show a significant improvement and outperform other existing results. Ultimately, an in-depth evaluation of a high-level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called the functional level power modelling approach, has been presented. The mathematical techniques that form the basis of the proposed power modelling have been validated by a range of custom IP cores. The proposed power modelling is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating functional level power modelling with commercially available design tools for systematic optimisation of IP cores has also been developed. The in-depth evaluation of this tool enables us to observe the behaviour of different custom IP cores in terms of power consumption and accuracy using different design methodologies and arithmetic techniques on various FPGA platforms. Based on the results achieved, the accuracy of the proposed model is almost 99% for the Dynamic Power (DP) components of all IP cores. Thomas Gerald Gray Charitable Trus

Brunel University Research Archive
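
The thesis abstract above centres on the Haar Wavelet Transform (HWT). For orientation, here is a minimal software sketch of the plain multi-level Haar transform using pairwise averages and differences. It is a textbook formulation under an averaging normalisation chosen for this example, not the HWTFM1/HWTFM2 factorisations or the Distributed Arithmetic architectures the thesis proposes. The names `haar_forward` and `haar_inverse` are hypothetical.

```python
def haar_forward(x):
    """Multi-level Haar transform of a power-of-two-length sequence.

    Returns [overall average, coarsest detail, ..., finest details].
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in x]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # One addition and one subtraction per pair of inputs.
        level_detail = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
        # Coarser levels go in front of finer ones.
        details = level_detail + details
    return approx + details


def haar_inverse(c):
    """Invert haar_forward."""
    approx = list(c[:1])
    pos = 1
    while pos < len(c):
        level_detail = c[pos:pos + len(approx)]
        pos += len(approx)
        rebuilt = []
        for a, d in zip(approx, level_detail):
            rebuilt += [a + d, a - d]
        approx = rebuilt
    return approx
```

With this normalisation, `haar_forward([4, 6, 10, 12])` gives `[8.0, -3.0, -1.0, -1.0]`, and `haar_inverse` maps that back to the original values.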

Building an Application-specific Memory Hierarchy on FPGA

Author: Devos Harald
Stroobandt Dirk
Van Campenhout Jan
Publication date: 01/01/2008

The high potential performance of FPGAs cannot be exploited if a design suffers a memory bottleneck. Therefore, a memory hierarchy is needed to reuse data in on-chip memories and minimize the number of accesses to off-chip memory

Ghent University Academic Bibliography

Archivsystem Ask23

Multi-coefficient Parallel Adaptive Wavelet Rendering

Author: Somers Robin
Publication venue: Lunds universitet/Institutionen för datavetenskap
Publication date: 01/01/2015

Adaptive Wavelet Rendering is a sampling method used in ray tracing to render photorealistic images. The concept of wavelets and the so-called discrete wavelet transform is used to create a multi-scale view of the image when sampling. This allows the method to identify image variance on different levels and therefore to differentiate and appropriately handle variance resulting from sharp edges or blurred regions, thus creating visually appealing images with minimal work even for complex scenes. This thesis investigates the algorithm and specifically how it can be improved through multi-core concurrency. To this end, an alternative version is proposed which works on multiple regions simultaneously. Parallelism is considered for both the original and the alternative version. Furthermore, they are compared based on both the qualitative difference between their results and their respective performance gains through concurrency. It is shown that although the structure of the algorithm limits the potential for concurrency, some improvements can be made, especially for the alternative multi-coefficient version, whose results maintain high quality, thus making it better suited to today's highly parallel compute systems. Finally, some future directions are considered based on the detailed analysis of how concurrency affects the major components of the algorithm.

Quantum Image Processing and Its Application to Edge Detection: Theory and Experiment

Author: Chen Ming-Cheng
Li Jianzhong
Li Jun
Liao Zeyang
Lin Xingcheng
Luo Zhihuang
Pan Jian
Peng Xinhua
Suter Dieter
Wang Hengyan
Wang Zhehui
Yao Xi-Wei
Zhang Kechao
Zhao Meisheng
Zheng Wenqiang
Publication venue: American Physical Society (APS)
Publication date: 01/01/2017

Processing of digital images is continuously gaining in volume and relevance, with concomitant demands on data storage, transmission and processing power. Encoding the image information in quantum-mechanical systems instead of classical ones and replacing classical with quantum information processing may alleviate some of these challenges. By encoding and processing the image information in quantum-mechanical systems, we here demonstrate the framework of quantum image processing, where a pure quantum state encodes the image information: we encode the pixel values in the probability amplitudes and the pixel positions in the computational basis states. Our quantum image representation reduces the required number of qubits compared to existing implementations, and we present image processing algorithms that provide exponential speed-up over their classical counterparts. For the commonly used task of detecting the edges of an image, we propose and implement a quantum algorithm that completes the task with only one single-qubit operation, independent of the size of the image. This demonstrates the potential of quantum image processing for highly efficient image and video processing in the big data era. Comment: 13 pages, including 9 figures and 5 appendixe

arXiv.org e-Print Archive

Directory of Open Access Journals
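
The entry above describes encoding pixel values in probability amplitudes and detecting edges with a single single-qubit operation. The sketch below is a classical state-vector simulation of that idea for one row of pixels. It assumes the single-qubit operation is a Hadamard gate on the least significant qubit, which the abstract does not state, and it adds a second pass on a cyclically shifted row to cover the pixel pairs the first pass misses. Being a classical simulation, it shows none of the claimed speed-up. All function names are hypothetical.

```python
import math


def encode(pixels):
    """Amplitude-encode pixel values as a normalised real state vector."""
    n = len(pixels)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two, at least 2")
    norm = math.sqrt(sum(p * p for p in pixels))
    if norm == 0:
        raise ValueError("an all-zero image cannot be normalised")
    return [p / norm for p in pixels]


def hadamard_on_last_qubit(state):
    """Apply I (x) H: mix each amplitude pair (2k, 2k+1)."""
    s = 1 / math.sqrt(2)
    out = []
    for k in range(0, len(state), 2):
        a, b = state[k], state[k + 1]
        out += [s * (a + b), s * (a - b)]  # sum and difference of neighbours
    return out


def edge_positions(pixels, threshold=1e-9):
    """Return indices i where pixel i and pixel i+1 differ."""
    n = len(pixels)
    found = set()
    for shift in (0, 1):  # shift 1 covers the pairs (2k+1, 2k+2)
        shifted = pixels[shift:] + pixels[:shift]
        state = hadamard_on_last_qubit(encode(shifted))
        for k in range(0, n, 2):
            left = k + shift  # index of the left pixel in the original row
            # Skip the wrap-around pair (last pixel, first pixel).
            if left + 1 < n and abs(state[k + 1]) > threshold:
                found.add(left)
    return sorted(found)
```

For the row `[1, 1, 1, 1, 5, 5, 5, 5]`, `edge_positions` returns `[3]`, marking the boundary between pixels 3 and 4.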

Wavelet/shearlet hybridized neural networks for biomedical image restoration

Author: Burger
Gaunt
Goossens
He
Ioffe
Jacobsen
Kingma
Labate
Labate
Mao
Shapiro
Tian
Tomasi
Publication venue: SPIE-Intl Soc Optical Eng
Publication date: 01/01/2019

Recently, new programming paradigms have emerged that combine parallelism and numerical computations with algorithmic differentiation. This approach allows for the hybridization of neural network techniques for inverse imaging problems with more traditional methods such as wavelet-based sparsity modelling techniques. The benefits are twofold: on the one hand, traditional methods with well-known properties can be integrated in neural networks, either as separate layers or tightly integrated in the network; on the other hand, parameters in traditional methods can be trained end-to-end from datasets in a neural network "fashion" (e.g., using Adagrad or Adam optimizers). In this paper, we explore these hybrid neural networks in the context of shearlet-based regularization for the purpose of biomedical image restoration. Due to the reduced number of parameters, this approach seems a promising strategy, especially when dealing with small training data sets.

Crossref

Ghent University Academic Bibliography