
    Scalable video compression with optimized visual performance and random accessibility

    This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved. The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling. The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field. The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. 
This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate. For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video.

    3D Wavelet Transformation for Visual Data Coding With Spatio and Temporal Scalability as Quality Artifacts: Current State Of The Art

    Several techniques based on the three-dimensional (3-D) discrete cosine transform (DCT) have been proposed for visual data coding. These techniques fail to provide coding coupled with quality and resolution scalability, which is a significant drawback for contextual domains such as disease diagnosis and satellite image analysis. This paper gives an overview of several state-of-the-art 3-D wavelet coders that do meet these requirements. It mainly investigates the various types of compression techniques that exist and draws them together into a conclusion on the scope for further research.

    Real-time scalable video coding for surveillance applications on embedded architectures


    Spread spectrum-based video watermarking algorithms for copyright protection

    Merged with duplicate record 10026.1/2263 on 14.03.2017 by CS (TIS). Digital technologies have seen unprecedented expansion in recent years. Consumers can now benefit from hardware and software that was considered state-of-the-art only several years ago. The advantages offered by digital technologies are major, but the same technology opens the door to unlimited piracy. Copying an analogue VCR tape was certainly possible and relatively easy, in spite of various forms of protection, but because of the analogue environment, subsequent copies suffered an inherent loss in quality. This was a natural way of limiting multiple copying of video material. With digital technology this barrier disappears: it is possible to make as many copies as desired without any loss in quality whatsoever. Digital watermarking is one of the best available tools for fighting this threat. The aim of the present work was to develop a digital watermarking system compliant with the recommendations drawn up by the EBU for video broadcast monitoring. Since the watermark can be inserted in either the spatial domain or a transform domain, this aspect was investigated and led to the conclusion that the wavelet transform is one of the best solutions available. Since watermarking is not an easy task, especially considering robustness under various attacks, several techniques were employed in order to increase the capacity and robustness of the system: spread-spectrum and modulation techniques to cast the watermark, powerful error correction to protect the mark, and human visual models to insert a robust mark and ensure its invisibility. The combination of these methods led to a major improvement, yet the system was still not robust to several important geometrical attacks. In order to achieve this last milestone, the system uses two distinct watermarks: a spatial domain reference watermark and the main watermark embedded in the wavelet domain.
By using this reference watermark and techniques specific to image registration, the system is able to determine the parameters of the attack and revert it. Once the attack is reverted, the main watermark is recovered. The final result is a high capacity, blind DWT-based video watermarking system, robust to a wide range of attacks. BBC Research & Developmen

    A DWT based perceptual video coding framework: concepts, issues and techniques

    The work in this thesis explores DWT-based video coding by introducing a novel DWT (Discrete Wavelet Transform) / MC (Motion Compensation) / DPCM (Differential Pulse Code Modulation) video coding framework, which adopts EBCOT as the coding engine for both the intra- and the inter-frame coder. An adaptive switching mechanism between the frame and field coding modes is investigated for this coding framework. The Low-Band-Shift (LBS) method is employed for MC in the DWT domain. LBS-based MC is proven to provide a consistent improvement in the Peak Signal-to-Noise Ratio (PSNR) of the coded video over simple Wavelet Tree (WT) based MC. Adaptive Arithmetic Coding (AAC) is adopted to code the motion information. The context set of the Adaptive Binary Arithmetic Coding (ABAC) for the inter-frame data is redesigned based on statistical analysis. To further improve the perceived picture quality, a Perceptual Distortion Measure (PDM) based on a human vision model is used for the EBCOT of the intra-frame coder. A visibility assessment of the quantization error of various subbands in the DWT domain is performed through subjective tests. In summary, these findings have solved the issues originating from the proposed perceptual video coding framework. They include: a working DWT/MC/DPCM video coding framework with superior coding efficiency on sequences with translational or head-shoulder motion; an adaptive switching mechanism between frame and field coding modes; an effective LBS-based MC scheme in the DWT domain; a methodology for the context design for entropy coding of the inter-frame data; a PDM which replaces the MSE inside the EBCOT coding engine for the intra-frame coder and improves the perceived quality of intra-frames; and a visibility assessment of the quantization errors in the DWT domain.

    High ratio wavelet video compression through real-time rate-distortion estimation.

    Thesis (M.Sc.Eng.)-University of Natal, Durban, 2003. The success of the wavelet transform in the compression of still images has prompted an expanding effort to exercise this transform in the compression of video. Most existing video compression methods incorporate techniques from still image compression, such techniques being abundant, well defined and successful. This dissertation commences with a thorough review and comparison of wavelet still image compression techniques. Thereafter an examination of wavelet video compression techniques is presented. Currently, the most effective video compression system is the DCT-based framework, so a comparison between it and the wavelet techniques is also given. Based on this review, this dissertation then presents a new, low-complexity, wavelet video compression scheme. Noting from a complexity study that the generation of temporally decorrelated residual frames represents a significant computational burden, this scheme uses the simplest such technique: difference frames. In the case of local motion, these difference frames exhibit strong spatial clustering of significant coefficients. A simple spatial syntax is created by splitting the difference frame into tiles. Advantage may then be taken of the spatial clustering by adaptive bit allocation between the tiles. This is the central idea of the method. In order to minimize the total distortion of the frame, the scheme uses the new ρ-domain rate-distortion estimation scheme with global numerical optimization to predict the optimal distribution of bits between tiles. Thereafter each tile is independently wavelet transformed and compressed using the SPIHT technique. Throughout the design process computational efficiency was the design imperative, thus leading to a real-time, software-only video compression scheme.
The scheme is finally compared to both the current video compression standards and the leading wavelet schemes from the literature in terms of computational complexity and visual quality. It is found that for local motion scenes the proposed algorithm executes approximately an order of magnitude faster than these methods, and presents output of similar quality. This algorithm is found to be suitable for implementation in mobile and embedded devices due to its moderate memory and computational requirements.
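
The tiling idea in this abstract (difference frame, tiles, adaptive bit allocation between tiles) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, frames are plain lists of integer rows, and the bit split is a simple energy-proportional heuristic swapped in for the rate-distortion estimation with global numerical optimization that the dissertation describes. The wavelet transform and SPIHT stages are omitted.

```python
def difference_frame(prev, curr):
    """Pixel-wise difference between two equally sized frames (lists of rows)."""
    return [[c - p for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def tile_energies(frame, tile):
    """Sum of squared samples in each tile x tile block, keyed by (row, col)."""
    height, width = len(frame), len(frame[0])
    energies = {}
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            energies[(ty // tile, tx // tile)] = sum(
                frame[y][x] ** 2
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            )
    return energies


def allocate_bits(energies, budget):
    """Assumed heuristic: split the bit budget in proportion to tile energy."""
    total = sum(energies.values())
    if total == 0:
        return {key: 0 for key in energies}
    return {key: budget * e // total for key, e in energies.items()}


# Local motion: only a 2x2 patch in the top-left corner changes.
prev = [[0] * 8 for _ in range(8)]
curr = [row[:] for row in prev]
for y in range(2):
    for x in range(2):
        curr[y][x] = 10

diff = difference_frame(prev, curr)
bits = allocate_bits(tile_energies(diff, tile=4), budget=1000)
```

With local motion confined to one tile, the whole budget goes to that tile, which mirrors the spatial clustering the abstract relies on.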

    Image Information Distance Analysis and Applications

    Image similarity or distortion assessment is fundamental to a broad range of applications throughout the field of image processing and machine vision. These include image restoration, denoising, coding, communication, interpolation, registration, fusion, classification and retrieval, as well as object detection, recognition, and tracking. Many existing image similarity measures have been proposed to work with specific types of image distortions (e.g., JPEG compression). There are also methods such as the structural similarity (SSIM) index that are applicable to a wider range of applications. However, even these "general-purpose" methods have limited scope of application. For example, SSIM does not apply or work properly when significant geometric changes exist between the two images being compared. The theory of Kolmogorov complexity provides solid groundwork for a generic information distance metric between any objects that minorizes all metrics in the class. The Normalized Information Distance (NID) metric provides a more useful framework. While appealing, the challenge lies in the implementation, mainly due to the non-computable nature of Kolmogorov complexity. To overcome this, a Normalized Compression Distance (NCD) measure was proposed, which is an effective approximation of NID and has found successful applications in the fields of bioinformatics, pattern recognition, and natural language processing. Nevertheless, the application of NID for image similarity and distortion analysis is still in its early stage. Several authors have applied the NID framework and the NCD algorithm to image clustering, image distinguishability, content-based image retrieval and video classification problems, but most report only moderate success. Moreover, due to their focus on specific applications, the generic property of NID was not fully exploited.
In this work, we aim to develop practical solutions for image distortion analysis based on the information distance framework. In particular, we propose two practical approaches to approximate NID for image similarity and distortion analysis. In the first approach, the shortest program that converts one image to another is found from a list of available transformations, and a generic image similarity measure is built on computing the length of this shortest program as an approximation of the conditional Kolmogorov complexity in NID. In the second method, the complexity of the objects is approximated using Shannon entropy. Specifically, we transform the reference and distorted images into the wavelet domain and assume local independence among image subbands. Inspired by the Visual Information Fidelity (VIF) approach, the Gaussian Scale Mixture (GSM) model is adopted for Natural Scene Statistics (NSS) of the images to simplify the entropy computation. When applying the image information distance framework in real-world applications, we find that information distance measures often lead to useful features in many image processing applications. In particular, we develop a photo retouching distortion measure based on training a Gaussian kernel Support Vector Regression (SVR) model using information theoretic features extracted from a database of original and edited images. It is shown that the proposed measure is well correlated with subjective ranking of the images. Moreover, we propose a tone mapping operator parameter selection scheme for High Dynamic Range (HDR) images. The scheme attempts to find tone mapping parameters that minimize the NID of the HDR image and the resulting Low Dynamic Range (LDR) image, and thereby minimize the information loss in HDR to LDR tone mapping. The resulting images created by minimizing NID exhibit enhanced image quality.
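
As a rough illustration of the Normalized Compression Distance mentioned in this abstract, the Python sketch below uses the commonly cited form NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the length of the compressed input. The use of zlib as the compressor and the helper names are assumptions made for illustration; the abstract does not name a compressor, and the two approaches proposed in the thesis differ from this baseline.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for C)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance with zlib as the assumed compressor."""
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# A highly regular byte string and a pseudo-random one of the same length.
rng = random.Random(0)
regular = bytes(range(256)) * 16
noise = bytes(rng.getrandbits(8) for _ in range(4096))

same = ncd(regular, regular)    # small: the second copy adds little information
different = ncd(regular, noise)  # large: the inputs share almost nothing
```

Values near 0 suggest the inputs share most of their information and values near 1 suggest they share little. A real compressor only approximates Kolmogorov complexity, so results can fall slightly outside the range [0, 1].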

    Discrete Wavelet Transforms

    Discrete wavelet transform (DWT) algorithms have a firm position in the processing of signals in several areas of research and industry. As the DWT provides both octave-scale frequency and spatial timing of the analyzed signal, it is constantly used to solve and treat more and more advanced problems. The present book, Discrete Wavelet Transforms: Algorithms and Applications, reviews the recent progress in discrete wavelet transform algorithms and applications. The book covers a wide range of methods (e.g. lifting, shift invariance, multi-scale analysis) for constructing DWTs. The book chapters are organized into four major parts. Part I describes the progress in hardware implementations of the DWT algorithms. Applications include multitone modulation for ADSL and equalization techniques, a scalable architecture for FPGA implementation, a lifting-based algorithm for VLSI implementation, a comparison between DWT- and FFT-based OFDM, and a modified SPIHT codec. Part II addresses image processing algorithms such as a multiresolution approach for edge detection, low bit rate image compression, low complexity implementation of CQF wavelets and compression of multi-component images. Part III focuses on watermarking DWT algorithms. Finally, Part IV describes shift invariant DWTs, the DC lossless property, DWT-based analysis and estimation of colored noise, and an application of the wavelet Galerkin method. The chapters of the present book consist of both tutorial and highly advanced material. Therefore, the book is intended to be a reference text for graduate students and researchers to obtain state-of-the-art knowledge on specific applications.
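
The lifting construction named in this description can be illustrated with a one-level integer Haar transform (often called the S-transform). This is a minimal sketch and is not taken from the book; the function names are hypothetical. It shows the usual split, predict and update steps and the exact integer inverse obtained by undoing them in reverse order.

```python
def haar_lift_forward(signal):
    """One-level integer Haar DWT by lifting: returns (approx, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    even, odd = signal[0::2], signal[1::2]              # split
    detail = [o - e for e, o in zip(even, odd)]          # predict odd from even
    approx = [e + (d >> 1) for e, d in zip(even, detail)]  # update even samples
    return approx, detail


def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order to recover the signal exactly."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]  # undo update
    odd = [d + e for d, e in zip(detail, even)]             # undo predict
    signal = []
    for e, o in zip(even, odd):                             # merge
        signal.extend((e, o))
    return signal


samples = [5, 7, 2, 9, -4, 0, 3, 3]
approx, detail = haar_lift_forward(samples)
restored = haar_lift_inverse(approx, detail)
```

Because each lifting step is inverted by subtracting exactly what was added, reconstruction is lossless even with the integer rounding in the update step.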