10 research outputs found

    Video coding by lifting scheme with occlusion handling (Codage vidéo par schéma lifting avec gestion des occlusions)

    The motion-compensated lifting scheme is used in most wavelet-based video coders. However, block-based motion estimation and compensation produces visible artifacts around moving objects and at image borders. In this paper, we propose a new temporal filtering method that relies on joint segmentation and motion estimation. The principle is to assign a motion to regions of adaptable shape instead of using blocks. On the one hand, we present the "Puzzle" filtering algorithm and study the conditions for its invertibility. On the other hand, we propose a method for extracting occlusion regions from the segmentation and motion information; these regions are then used to handle occlusions. The first experimental results confirm the reduction in blocking artifacts; proper handling of occlusions yields a significant decrease in the entropy of the temporal subbands.
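To make the lifting structure and its invertibility concrete, here is a minimal sketch of a temporal Haar lifting step on a pair of frames. It deliberately omits the motion compensation, segmentation and occlusion handling described in the abstract, and it is not the "Puzzle" algorithm; the function names and the flat-list frame representation are assumptions made for illustration.

```python
def haar_lift_forward(even, odd):
    """Temporal Haar lifting on two frames given as flat lists of pixel values."""
    # Predict step: the high-pass band is the odd frame minus its prediction
    # from the even frame (here the prediction is the co-located pixel).
    high = [o - e for e, o in zip(even, odd)]
    # Update step: the low-pass band is the even frame plus half the high band.
    low = [e + h / 2 for e, h in zip(even, high)]
    return low, high


def haar_lift_inverse(low, high):
    """Undo the lifting steps in reverse order with opposite signs."""
    even = [l - h / 2 for l, h in zip(low, high)]
    odd = [h + e for e, h in zip(even, high)]
    return even, odd


frame_a = [10, 12, 14, 200]
frame_b = [11, 12, 15, 90]
low, high = haar_lift_forward(frame_a, frame_b)
rec_a, rec_b = haar_lift_inverse(low, high)
```

Because each lifting step only adds a function of the other band, the inverse simply subtracts the same quantity. A motion-compensated variant has to preserve that property, which is why the invertibility conditions matter.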

    Combined Hierarchical Wavelet-Coefficient Structures For Grayscale Image Compression

    A wavelet compression algorithm for grayscale images is proposed, based on combined one- and two-dimensional hierarchical structures in the sub-bands generated by several types of wavelet functions. It is shown that using combined hierarchical structures reduces the computational complexity of compression and decompression at constant values of the compression coefficients.

    Enabling error-resilient Internet broadcasting using motion compensated spatial partitioning and packet FEC for the Dirac video codec

    Video transmission over wireless or wired networks requires protection from channel errors, since compressed video bitstreams are very sensitive to transmission errors because of the use of predictive coding and variable-length coding. In this paper, a simple, low-complexity and patent-free error-resilient coding scheme is proposed. It is based on the idea of applying spatial partitioning to the motion-compensated residual frame without employing the transform coefficient coding. The proposed scheme is intended for the open-source Dirac video codec, to enable the codec to be used for Internet broadcasting. By partitioning the wavelet transform coefficients of the motion-compensated residual frame into groups and independently processing each group using arithmetic coding and Forward Error Correction (FEC), robustness to transmission errors over a packet-erasure wired network can be achieved. Using Rate-Compatible Punctured Convolutional (RCPC) codes and Turbo Codes (TC) as the FEC, the proposed technique provides gracefully decreasing perceptual quality over packet loss rates up to 30%. The PSNR performance is much better than that of conventional data-partitioning-only methods. Simulation results show that the use of multiple partitions of wavelet coefficients in Dirac can achieve up to 8 dB PSNR gain over its existing un-partitioned method.

    Error-resilient performance of Dirac video codec over packet-erasure channel

    Video transmission over wireless or wired networks requires an error-resilient mechanism, since compressed video bitstreams are sensitive to transmission errors because of the use of predictive coding and variable-length coding. This paper investigates the performance of a simple, low-complexity error-resilient coding scheme that combines source and channel coding to protect the compressed bitstream of the wavelet-based Dirac video codec over a packet-erasure channel. By partitioning the wavelet transform coefficients of the motion-compensated residual frame into groups and independently processing each group using arithmetic coding and Forward Error Correction (FEC) coding, Dirac achieves robustness to transmission errors: video quality degrades gracefully over packet loss rates up to 30%, compared with conventional FEC-only methods. Simulation results also show that the proposed scheme using multiple partitions can achieve up to 10 dB PSNR gain over its existing un-partitioned format. This paper also compares the error-resilient performance of the proposed scheme with that of H.264 over a packet-erasure channel.
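The idea of coding groups independently and protecting them against packet erasures can be sketched with a much simpler erasure code than the ones named in these abstracts. The example below swaps in a single XOR parity packet in place of RCPC or turbo codes, purely for illustration; the function names and the byte-string "groups" are hypothetical and do not come from Dirac or from the papers.

```python
def add_parity(packets):
    """Append one XOR parity packet to a list of equal-length byte packets."""
    parity = bytes(len(packets[0]))  # all-zero packet of the same length
    for p in packets:
        parity = bytes(a ^ b for a, b in zip(parity, p))
    return packets + [parity]


def recover(received):
    """Rebuild a single missing packet (marked None) by XOR-ing the others."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        return received  # nothing lost, or too many losses for one parity packet
    present = [p for p in received if p is not None]
    rebuilt = bytes(len(present[0]))
    for p in present:
        rebuilt = bytes(a ^ b for a, b in zip(rebuilt, p))
    out = list(received)
    out[missing[0]] = rebuilt
    return out


# Three independently coded coefficient groups, stood in for by byte strings.
groups = [b"abcd", b"efgh", b"ijkl"]
sent = add_parity(groups)
sent[1] = None                      # simulate one erased packet
restored = recover(sent)
```

One parity packet repairs only a single erasure per block. Stronger codes such as those used in the papers tolerate higher loss rates at the cost of more redundancy.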

    A DWT based perceptual video coding framework: concepts, issues and techniques

    This thesis explores DWT-based video coding by introducing a novel DWT (Discrete Wavelet Transform) / MC (Motion Compensation) / DPCM (Differential Pulse Code Modulation) video coding framework, which adopts EBCOT as the coding engine for both the intra- and inter-frame coders. An adaptive switching mechanism between frame and field coding modes is investigated for this framework. Low-Band-Shift (LBS) is employed for MC in the DWT domain. LBS-based MC is proven to provide a consistent improvement in the Peak Signal-to-Noise Ratio (PSNR) of the coded video over simple Wavelet Tree (WT) based MC. Adaptive Arithmetic Coding (AAC) is adopted to code the motion information. The context set of the Adaptive Binary Arithmetic Coding (ABAC) for the inter-frame data is redesigned based on statistical analysis. To further improve perceived picture quality, a Perceptual Distortion Measure (PDM) based on a human vision model is used in the EBCOT of the intra-frame coder. A visibility assessment of the quantization error of various subbands in the DWT domain is performed through subjective tests. In summary, these findings address the issues arising from the proposed perceptual video coding framework. They include: a working DWT/MC/DPCM video coding framework with superior coding efficiency on sequences with translational or head-and-shoulders motion; an adaptive switching mechanism between frame and field coding modes; an effective LBS-based MC scheme in the DWT domain; a methodology for context design for entropy coding of the inter-frame data; a PDM that replaces the MSE inside the EBCOT coding engine of the intra-frame coder and improves the perceived quality of intra-frames; and a visibility assessment of the quantization errors in the DWT domain.

    Research and developments of distributed video coding

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The recently developed Distributed Video Coding (DVC) is well suited to applications such as wireless/wired video sensor networks and mobile cameras, where traditional video coding standards are not feasible because of constrained computation at the encoder. With DVC, the computational burden is moved from the encoder to the decoder, and compression efficiency is achieved via joint decoding at the decoder. The practical realisation of DVC is referred to as Wyner-Ziv (WZ) video coding, where side information is available at the decoder to perform joint decoding. This joint decoding inevitably results in a very complex decoder. Much current work on WZ video coding emphasises improving coding performance but neglects the large complexity incurred at the decoder, which directly influences the system output. The first stage of this research aims to optimise the decoder in pixel-domain WZ video coding (PDWZ) while still achieving similar compression performance. More specifically, four issues are addressed: optimising the input block size, the side information generation, the side information refinement process and the feedback channel. Transform-domain WZ video coding (TDWZ) performs distinctly better than plain PDWZ because it exploits redundancy in the spatial direction during encoding. However, since there is no motion estimation at the encoder in WZ video coding, temporal correlation is not exploited at all at the encoder in current WZ video coding schemes. In the middle stage of this research, the 3D DCT is adopted in TDWZ to remove redundancy in both the spatial and temporal directions and so provide even higher coding performance. In the next step, the performance of transform-domain Distributed Multiview Video Coding (DMVC) is also investigated. In particular, three types of transform-domain DMVC framework are investigated: transform-domain DMVC using TDWZ based on 2D DCT, transform-domain DMVC using TDWZ based on 3D DCT, and transform-domain residual DMVC using TDWZ based on 3D DCT. One important application of the WZ coding principle is error resilience. There have been several attempts to apply WZ error-resilient coding to current video coding standards, e.g. H.264/AVC or MPEG-2. The final stage of this research is the design of a WZ error-resilient scheme for a wavelet-based video codec. To balance the trade-off between error resilience and bandwidth consumption, the proposed scheme emphasises protection of the Region of Interest (ROI). Efficient bandwidth utilisation is achieved by combining WZ coding with sacrificing the quality of unimportant areas. In summary, this research contributes several advances in WZ video coding. First, it aims to build an efficient PDWZ with an optimised decoder. Second, it aims to build an advanced TDWZ based on 3D DCT, which is then applied to multiview video coding to realise an advanced transform-domain DMVC. Finally, it aims to design an efficient error-resilient scheme for wavelet video codecs, with which the trade-off between bandwidth consumption and error resilience can be better balanced.

    SPIHT image coding: analysis, improvements and applications

    Image compression plays an important role in image storage and transmission. In popular Internet applications and mobile communications, image coding is required to be not only efficient but also scalable. Recent wavelet techniques provide a way to achieve efficient and scalable image coding. SPIHT (set partitioning in hierarchical trees) is one such algorithm based on the wavelet transform. This thesis analyses and improves the SPIHT algorithm. The preliminary part of the thesis investigates two-dimensional multi-resolution decomposition for image coding using the wavelet transform, which is reviewed and analysed systematically. The wavelet transform is implemented using filter banks, and z-domain proofs are given for the key implementation steps. A wavelet transform scheme for arbitrarily sized images is proposed. The statistical properties of the wavelet coefficients (the output of the wavelet transform) are explored for natural images. The energy in the transform domain is localised and highly concentrated in the low-resolution subband. The wavelet coefficients are DC-biased, and the centre of gravity of most octave-segmented value sections (which correspond to the binary bit-planes) is offset by approximately one eighth of the section range from the geometrical centre. The intra-subband correlation coefficients are the largest, followed by the inter-level correlation coefficients, and then the negligible inter-subband correlation coefficients on the same resolution level. These statistical properties explain the success of the SPIHT algorithm and lead to further improvements. The subsequent parts of the thesis examine the SPIHT algorithm. The concepts of successive approximation quantisation and ordered bit-plane coding are highlighted. The procedure of SPIHT image coding is demonstrated with a simple example. A solution for arbitrarily sized images is proposed. Seven measures are proposed to improve the SPIHT algorithm. Three DC-level shifting schemes are discussed, and the one that subtracts the geometrical centre in the image domain is selected in the thesis. Virtual trees are introduced to hold more wavelet coefficients in each of the initial sets. A scheme is proposed to reduce redundancy in the coded bit-stream by omitting predictable symbols. The quantisation of wavelet coefficients is offset by one eighth from the geometrical centre. A pre-processing technique is proposed to speed up the significance test for trees, and smoothing is applied to the magnitude of the wavelet coefficients during pre-processing for lossy image coding. The optimisation of arithmetic coding is also discussed. Experimental results show that these improvements to SPIHT yield a significant performance gain. The running time is reduced by up to a half. The PSNR (peak signal-to-noise ratio) is improved substantially at very low bit rates, by up to 12 dB in the extreme case. Moderate improvements are also obtained at high bit rates. The SPIHT algorithm is applied to lossless image coding. Various wavelet transforms are evaluated for lossless SPIHT image coding. Experimental results show that the interpolating transform (4, 4) and the S+P transform (2+2, 2) are the best for natural images among the transforms used, the interpolating transform (4, 2) is the best for CT images, and the bi-orthogonal transform (9, 7) is always the worst. Content-based lossless coding of a CT head image is presented in the thesis, using segmentation and SPIHT. Although the performance gain is limited in the experiments, it shows the potential advantage of content-based image coding.
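Successive approximation quantisation, mentioned in the abstract above, can be illustrated without the set-partitioning trees that make up SPIHT proper. The sketch below, under the assumption of integer coefficients, shows how the approximation refines as more bit-planes are kept, from the most significant downwards. It uses plain truncation, not the one-eighth offset reconstruction the thesis proposes, and the function name is hypothetical.

```python
def successive_approximation(coeffs, num_planes):
    """Approximate integer coefficients using only their top `num_planes` bit-planes."""
    # Index of the highest non-zero bit-plane across all magnitudes.
    top = max(abs(c) for c in coeffs).bit_length() - 1
    # Lowest bit-plane that is still transmitted.
    lowest = max(top - num_planes + 1, 0)
    approx = []
    for c in coeffs:
        # Drop every bit-plane below `lowest` (truncation towards zero).
        magnitude = (abs(c) >> lowest) << lowest
        approx.append(-magnitude if c < 0 else magnitude)
    return approx


coeffs = [34, -20, 9, 3]
one_plane = successive_approximation(coeffs, 1)
two_planes = successive_approximation(coeffs, 2)
three_planes = successive_approximation(coeffs, 3)
all_planes = successive_approximation(coeffs, 6)
```

Each additional plane halves the threshold, so large coefficients are described first and a truncated bit-stream still decodes to a coarser version of the same image. This is the scalability property the abstract refers to.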

    Toward sparse and geometry adapted video approximations

    Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on the rate-distortion performance of wavelet and oracle-based coding schemes, one can better analyze the coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require adaptive signal decompositions able to capture the structure and redundancy appearing in video signals. Adaptivity must allow signals to be modeled properly so that they can be represented at the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion information) as well as spatial geometry. Clearly, most past and present strategies used to represent video signals do not properly exploit its spatial geometry. As in the case of images, a very interesting approach seems to be the decomposition of video using large over-complete libraries of basis functions able to represent salient geometric features of the signal. In the framework of video, these features should model 2D geometric video components as well as their temporal evolution, forming spatio-temporal 3D geometric primitives. In this PhD dissertation, different aspects of the use of adaptivity in video representation are studied, with a view to exploiting both aspects of video: its piecewise nature and its geometry. The first part of this work studies the use of localized temporal adaptivity in subband video coding. This is done by considering two transformation schemes used for video coding: 3D wavelet representations and motion-compensated temporal filtering. A theoretical R-D analysis as well as empirical results demonstrate how temporal adaptivity improves the coding performance for moving edges in 3D transform-based video coding (without motion compensation). At the same time, adaptivity allows redundancy in non-moving video areas to be exploited equally well. The analogy between motion-compensated video and 1D piecewise-smooth signals is studied as well. This motivates the introduction of local length adaptivity within frame-adaptive motion-compensated lifted wavelet decompositions. This allows optimal rate-distortion performance when video motion trajectories are shorter than the transformation "Group Of Pictures", or when efficient motion compensation cannot be ensured. After studying temporal adaptivity, the second part of this thesis is dedicated to understanding the fundamentals of how temporal and spatial geometry can be jointly exploited. This work builds on previous results that considered the representation of spatial geometry in video (but not temporal geometry, i.e., without motion). In order to obtain flexible and efficient (sparse) signal representations using redundant dictionaries, highly non-linear decomposition algorithms such as Matching Pursuit are required. General signal representation using these techniques is still largely unexplored. For this reason, prior to the study of video representation, some aspects of non-linear decomposition algorithms and the efficient decomposition of images using Matching Pursuit and a geometric dictionary are investigated. Part of this investigation concerns the influence of using a priori models within non-linear approximation algorithms. Dictionaries with high internal coherence have difficulty producing optimally sparse signal representations when used with Matching Pursuit. It is proved, theoretically and empirically, that inserting a priori models into this algorithm improves its capacity to obtain sparse signal approximations, mainly when coherent dictionaries are used. Another point discussed in this preliminary study of Matching Pursuit concerns the approach used in this work for the decomposition of video frames and images. The technique proposed in this thesis improves on previous work, where the authors had to resort to sub-optimal Matching Pursuit strategies (using Genetic Algorithms) given the size of the function library. In this work, full-search strategies are made possible, while approximation efficiency is significantly improved and computational complexity is reduced. Finally, a priori based Matching Pursuit geometric decompositions are investigated for geometric video representations. Regularity constraints are taken into account to recover the temporal evolution of spatial geometric signal components. The results obtained for coding and multi-modal (audio-visual) signal analysis clarify many unknowns, prove promising and encourage further research on the subject.
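For readers unfamiliar with Matching Pursuit, the greedy loop at its core can be sketched as follows. This is the plain algorithm over a tiny, hand-made redundant dictionary of unit-norm atoms. It does not include the a priori models, geometric dictionaries or full-search optimisations discussed in the abstract, and all names are assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, dictionary, num_iters):
    """Greedy decomposition of `signal` over unit-norm atoms.

    Returns the list of (atom index, coefficient) picks and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(num_iters):
        # Select the atom most correlated with the current residual.
        corrs = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        coef = corrs[best]
        picks.append((best, coef))
        # Subtract that atom's contribution from the residual.
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
    return picks, residual


s = 1 / math.sqrt(2)
# Redundant dictionary: three unit-norm atoms in a two-dimensional space.
dictionary = [[1.0, 0.0], [0.0, 1.0], [s, s]]
signal = [3.0, 3.0]
picks, residual = matching_pursuit(signal, dictionary, 1)
```

Here the diagonal atom captures the whole signal in one step, whereas the two axis-aligned atoms alone would need two. With coherent dictionaries the greedy choice is not always the sparsest one, which is the limitation the a priori models in the thesis aim to mitigate.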