
    GPU-oriented architecture for an end-to-end image/video codec based on JPEG2000

    Modern image and video compression standards employ computationally intensive algorithms that provide advanced features to the coding system. Current standards often need to be implemented in hardware or with expensive solutions to meet the real-time requirements of some environments. Contrary to this trend, this paper proposes an end-to-end codec architecture running on inexpensive Graphics Processing Units (GPUs) that is based on, though not compatible with, the JPEG2000 international standard for image and video compression. When executed on a commodity Nvidia GPU, it achieves real-time processing of 12K video. The proposed S/W architecture uses four CUDA kernels that minimize memory transfers, use registers instead of shared memory, and employ a double-buffer strategy to optimize the streaming of data. The throughput analysis indicates that the proposed codec yields results on average at least 10× superior to those achieved with JPEG2000 implementations devised for CPUs, and approximately 4× superior to those achieved with hardwired solutions of the HEVC/H.265 video compression standard.

    Analysis and Performance Optimization of a GPGPU Implementation of Image Quality Assessment (IQA) Algorithm VSNR

    Image processing has changed the way we store, view and share images. One important component of sharing images over networks is image compression. Lossy image compression techniques compromise the quality of images to reduce their size. To ensure that the distortion introduced by compression is not highly detectable by humans, the perceived quality of an image needs to be kept above a certain threshold. Determining this threshold is best done using human subjects, but that is impractical in real-world scenarios. As a solution, image quality assessment (IQA) algorithms are used to automatically compute a fidelity score for an image. However, IQA algorithms have been observed to perform poorly because of the complex statistical computations involved. General Purpose Graphics Processing Unit (GPGPU) programming is one of the solutions proposed to improve the performance of these algorithms. This thesis presents an optimized Compute Unified Device Architecture (CUDA) implementation of the full-reference IQA algorithm Visual Signal to Noise Ratio (VSNR), which uses an M-level 2D Discrete Wavelet Transform (DWT) with 9/7 biorthogonal filters among other statistical computations. The implementation is tested on four different image quality databases containing images with multiple distortions and sizes ranging from 512 x 512 to 1600 x 1280. The CUDA implementation of VSNR shows a speedup of over 32x for 1600 x 1280 images, and the speedup is observed to scale with image size. The results show that the implementation is fast enough to use VSNR on high definition videos with a frame rate of 60 fps. This work presents the optimizations obtained from using the GPU’s constant memory and reusing allocated memory on the GPU. It also shows the performance improvement gained from profiler-driven GPGPU development in CUDA. The presented implementation can be deployed in production combined with existing applications.
    Dissertation/Thesis, Masters Thesis, Computer Science 201
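As a rough illustration of the multi-level DWT with 9/7 biorthogonal filters mentioned in this abstract, the following is a minimal 1-D sketch in pure Python, not the thesis's CUDA code. It assumes the lifting factorization of the 9/7 filters with the JPEG2000-style scaling constant `K`, whole-sample symmetric boundary extension, and an input length divisible by 2 to the power of `levels`; all function names are hypothetical. A 2-D transform would apply the same step along rows and then columns.

```python
# Lifting coefficients of the CDF 9/7 wavelet (JPEG2000-style scaling assumed).
ALPHA, BETA = -1.586134342, -0.05298011854
GAMMA, DELTA = 0.8829110762, 0.4435068522
K = 1.230174105


def _predict(even, odd, c):
    """odd[i] += c * (even[i] + even[i+1]), mirroring at the right edge."""
    h = len(even)
    return [odd[i] + c * (even[i] + (even[i + 1] if i + 1 < h else even[i]))
            for i in range(h)]


def _update(even, odd, c):
    """even[i] += c * (odd[i-1] + odd[i]), mirroring at the left edge."""
    return [even[i] + c * ((odd[i - 1] if i > 0 else odd[0]) + odd[i])
            for i in range(len(even))]


def fwd97(x):
    """One level of the forward 9/7 transform: returns (low, high) bands."""
    s = [float(v) for v in x[0::2]]
    d = [float(v) for v in x[1::2]]
    d = _predict(s, d, ALPHA)
    s = _update(s, d, BETA)
    d = _predict(s, d, GAMMA)
    s = _update(s, d, DELTA)
    return [v / K for v in s], [v * K for v in d]


def inv97(s, d):
    """Inverse of fwd97: undo the lifting steps in reverse order."""
    s = [v * K for v in s]
    d = [v / K for v in d]
    s = _update(s, d, -DELTA)
    d = _predict(s, d, -GAMMA)
    s = _update(s, d, -BETA)
    d = _predict(s, d, -ALPHA)
    x = []
    for a, b in zip(s, d):
        x += [a, b]
    return x


def dwt97_multilevel(x, levels):
    """Apply fwd97 repeatedly to the low band (M-level decomposition)."""
    low, highs = list(x), []
    for _ in range(levels):
        low, high = fwd97(low)
        highs.append(high)
    return low, highs


def idwt97_multilevel(low, highs):
    """Rebuild the signal from the coarsest level upward."""
    rec = list(low)
    for high in reversed(highs):
        rec = inv97(rec, high)
    return rec
```

Because every lifting step is undone exactly in reverse, `idwt97_multilevel(*dwt97_multilevel(x, 2))` recovers `x` up to floating-point rounding, whatever the signal content.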

    Information Systems: Secure Access and Storage in the Age of Cloud Computing

    Given that cloud computing is a remotely accessed service, the connection between provider and customer needs to be adequately protected against all known security risks. To ensure this, an open and clear specification of all standards, algorithms and security protocols adopted by the cloud provider is required. In this paper, we review current security threats to cloud computing and present a solution based on our unique patented compression-encryption method. The method provides highly efficient data compression in which a unique symmetric key is generated as part of the compression process and depends on the characteristics of the data. Without the key, the data cannot be decompressed. We focus on threat prevention by cryptography that, if properly implemented, is virtually impossible to break directly. Our security by design is based on two principles. The first is defence in depth: the proposed design is such that more than one subsystem needs to be violated to obtain both the data and their key. The second is the principle of least privilege, under which an attacker may gain access to only part of a system. The paper highlights the benefits of the solution, which include high compression ratios, lower bandwidth requirements, faster data transmission and response times, less storage space, and lower energy consumption, among others.

    Discrete Wavelet Transforms

    Discrete wavelet transform (DWT) algorithms have a firm position in the processing of signals in several areas of research and industry. As the DWT provides both octave-scale frequency and spatial timing of the analyzed signal, it is constantly used to solve and treat more and more advanced problems. The present book, Discrete Wavelet Transforms: Algorithms and Applications, reviews the recent progress in discrete wavelet transform algorithms and applications. The book covers a wide range of methods (e.g. lifting, shift invariance, multi-scale analysis) for constructing DWTs. The book chapters are organized into four major parts. Part I describes the progress in hardware implementations of the DWT algorithms. Applications include multitone modulation for ADSL and equalization techniques, a scalable architecture for FPGA implementation, a lifting-based algorithm for VLSI implementation, a comparison between DWT- and FFT-based OFDM, and a modified SPIHT codec. Part II addresses image processing algorithms such as a multiresolution approach for edge detection, low bit rate image compression, low complexity implementation of CQF wavelets and compression of multi-component images. Part III focuses on watermarking DWT algorithms. Finally, Part IV describes shift invariant DWTs, the DC lossless property, DWT-based analysis and estimation of colored noise and an application of the wavelet Galerkin method. The chapters of the present book consist of both tutorial and highly advanced material. Therefore, the book is intended to be a reference text for graduate students and researchers to obtain state-of-the-art knowledge on specific applications.

    Efficient JPEG 2000 Image Compression Scheme for Multihop Wireless Networks

    When using wireless sensor networks for real-time data transmission, some critical points should be considered. Restricted computational power, reduced memory, narrow bandwidth and limited energy supply place strong limits on sensor nodes. Therefore, maximizing network lifetime and minimizing energy consumption are always optimization goals. To overcome the computation and energy limitations of individual sensor nodes during image transmission, an energy-efficient image transport scheme is proposed, taking advantage of the JPEG2000 still image compression standard using MATLAB and C from Jasper. JPEG2000 provides a practical set of features not necessarily available in previous standards. These features were achieved using two techniques: the discrete wavelet transform (DWT) and embedded block coding with optimized truncation (EBCOT). The performance of the proposed image transport scheme is investigated with respect to image quality and energy consumption. Simulation results, obtained by analyzing the functional influence of each parameter of this distributed image compression algorithm, show that the proposed scheme optimizes network lifetime and significantly reduces the amount of required memory.

    A flexible hardware architecture for 2-D discrete wavelet transform: design and FPGA implementation

    The Discrete Wavelet Transform (DWT) is a powerful signal processing tool that has recently gained widespread acceptance in the field of digital image processing. The multiresolution analysis provided by the DWT addresses the shortcomings of the Fourier Transform and its derivatives. The DWT has proven useful in the area of image compression, where it replaces the Discrete Cosine Transform (DCT) in the newer JPEG2000 and MPEG4 image and video compression standards. The Cohen-Daubechies-Feauveau (CDF) 5/3 and CDF 9/7 DWTs are used in the JPEG2000 standard for reversible lossless and irreversible lossy compression encoders, respectively. The design and implementation of a flexible hardware architecture for the 2-D DWT is presented in this thesis. This architecture can be configured to perform both the forward and inverse DWT for any DWT family, using fixed-point arithmetic and no auxiliary memory. The Lifting Scheme method is used to perform the DWT instead of the less efficient convolution-based methods. The DWT core is modeled using MATLAB and highly parameterized VHDL. The VHDL model is synthesized to a Xilinx FPGA to prove hardware functionality. The CDF 5/3 and CDF 9/7 versions of the DWT are both modeled and used as comparisons throughout this thesis. The DWT core is used in conjunction with a very simple image denoising module to demonstrate the potential of the DWT core to perform image processing techniques. The CDF 5/3 hardware produces identical results to its theoretical MATLAB model. The fixed-point CDF 9/7 deviates very slightly from its floating-point MATLAB model, with a ~59 dB PSNR deviation for nine levels of DWT decomposition. The execution time for performing both DWTs is nearly identical at ~14 clock cycles per image pixel for one level of DWT decomposition. The hardware area generated for the CDF 5/3 is ~16,000 gates using only 5% of the Xilinx FPGA hardware area, 2.185 MHz maximum clock speed and 24 mW power consumption. The simple wavelet image denoising techniques resulted in cleaned images up to ~27 PSNR
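To make the lifting scheme and the reversibility of the CDF 5/3 transform concrete, here is a minimal 1-D sketch in Python rather than the thesis's VHDL. It assumes the integer lifting steps of the reversible 5/3 filter as used in JPEG2000, an even-length input, and whole-sample symmetric extension at the borders; the function names `fwd53` and `inv53` are hypothetical.

```python
def fwd53(x):
    """One level of the integer CDF 5/3 forward transform via lifting.

    Returns (s, d): the low-pass and high-pass coefficients.
    Assumes len(x) is even and at least 2.
    """
    assert len(x) % 2 == 0 and len(x) >= 2
    even, odd = x[0::2], x[1::2]
    half = len(even)

    # Predict step: each odd sample minus the floor-average of its even neighbours.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # mirror at right edge
        d.append(odd[i] - ((even[i] + right) >> 1))

    # Update step: each even sample plus a rounded quarter of neighbouring details.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]  # mirror at left edge
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d


def inv53(s, d):
    """Exact inverse of fwd53: undo the update step, then the predict step."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))

    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + ((even[i] + right) >> 1))
    return x
```

Since both steps use only integer additions and shifts, `inv53(*fwd53(x))` returns `x` exactly. This exact invertibility is what makes the 5/3 variant reversible, and it is consistent with the abstract's report of hardware results identical to the MATLAB model.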

    Development Of Efficient Multi-Level Discrete Wavelet Transform Hardware Architecture For Image Compression

    Focusing on the intensive computations involved in the discrete wavelet transform (DWT), the design of efficient hardware architectures for a fast computation of the transform has become imperative, especially for real-time applications.