11 research outputs found

    Distributed video coding for wireless video sensor networks: a review of the state-of-the-art architectures

    Distributed video coding (DVC) is a relatively new video coding architecture derived from two fundamental theorems, namely Slepian–Wolf and Wyner–Ziv. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews the state-of-the-art DVC architectures with a focus on understanding their opportunities and gaps in addressing the operational requirements and application needs of WVSNs.
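
For reference, the Slepian–Wolf result named in this abstract is usually stated as an achievable rate region for two correlated sources that are encoded separately and decoded jointly. The symbols below ($X$, $Y$ for the sources, $R_X$, $R_Y$ for their rates, $H$ for entropy) are notation assumed here, not taken from the paper:

```latex
R_X \ge H(X \mid Y), \qquad
R_Y \ge H(Y \mid X), \qquad
R_X + R_Y \ge H(X, Y)
```

The sum-rate bound equals the joint entropy, which is the same limit reached when the two sources are encoded together; this is the sense in which separate encoding need not cost extra rate under ideal conditions.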

    Decoder-driven mode decision in a block-based distributed video codec

    Distributed Video Coding (DVC) is a video coding paradigm in which the computational complexity is shifted from the encoder to the decoder. DVC is based on information-theoretic results suggesting that, under ideal conditions, the same rate-distortion performance can be achieved as for traditional video codecs. In practice, however, there is still a significant performance gap between the two coding architectures. One of the main reasons for this gap is the lack of multiple coding modes in current DVC solutions. In this paper, we propose a block-based distributed video codec that supports three coding modes: Wyner-Ziv, skip, and intra. The mode decision process is entirely decoder-driven. Skip blocks are selected based on the estimated accuracy of the side information. The choice between intra and Wyner-Ziv coding modes is made on a rate-distortion basis, by selecting the coding mode with the lowest rate while ensuring equal distortion for both modes. Experimental results illustrate that the proposed block-based architecture has some advantages over classical bitplane-based approaches. Introducing skip and intra coded blocks yields average bitrate gains of up to 33.7% over our basic configuration supporting Wyner-Ziv mode only, and up to 29.7% over the reference bitplane-based DISCOVER codec.
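
The three-way mode decision described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `BlockEstimate` fields, the `choose_mode` function, and the `skip_threshold` value are hypothetical names and numbers introduced only for illustration, and the sketch assumes the two rate estimates were computed for the same target distortion.

```python
from dataclasses import dataclass


@dataclass
class BlockEstimate:
    """Per-block quantities assumed to be available at the decoder (hypothetical)."""
    si_error: float    # estimated error of the side information for this block
    rate_wz: float     # estimated bits for Wyner-Ziv coding at the target distortion
    rate_intra: float  # estimated bits for intra coding at the same distortion


def choose_mode(block: BlockEstimate, skip_threshold: float = 1.0) -> str:
    """Pick 'skip', 'wz', or 'intra' for one block.

    skip_threshold is an illustrative value, not one reported in the paper.
    """
    # Skip: the side information is judged accurate enough to use as-is.
    if block.si_error <= skip_threshold:
        return "skip"
    # Otherwise pick the cheaper mode, given equal distortion for both.
    return "wz" if block.rate_wz <= block.rate_intra else "intra"


if __name__ == "__main__":
    blocks = [
        BlockEstimate(si_error=0.4, rate_wz=120.0, rate_intra=300.0),
        BlockEstimate(si_error=5.0, rate_wz=120.0, rate_intra=300.0),
        BlockEstimate(si_error=9.0, rate_wz=450.0, rate_intra=300.0),
    ]
    print([choose_mode(b) for b in blocks])
```

Because both rates are compared at equal distortion, the rate alone is enough to rank the intra and Wyner-Ziv modes; no Lagrangian trade-off is needed in this simplified form.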

    Dynamic key block decision with spatio-temporal analysis for Wyner-Ziv video coding

    Wyner-Ziv coding (WZC) has been recognized as the most popular distributed video coding method to date. In traditional WZC, side information is generated from intra-coded frames for use in the decoding of WZ frames. The unit for intra-coding is a frame, and the distance between key frames is kept constant. In this paper, the unit for intra-coding is a block, and the temporal distance between two consecutive key blocks can vary with time. A block is assigned a mode (WZ or intra-coded), depending on the result of spatio-temporal analysis, and encoded in an alternative manner. This strategy improves the overall coding efficiency while maintaining a low encoder complexity. The performance gain reaches up to 6 dB with respect to traditional pixel-domain WZC.

    A low-complexity and efficient encoder rate control solution for distributed residual video coding

    Existing encoder rate control (ERC) solutions have two technical limitations that prevent them from being widely used in real-world applications. One is that encoder side information (ESI) must be generated, which increases the complexity at the encoder. The other is that rate estimation is performed at the bit-plane level, which incurs computation overhead and latency when many bit planes exist. To achieve a low-complexity encoder, we propose a new ERC solution that incorporates an efficient encoder block mode decision (EBMD) for distributed residual video coding (DRVC). The main contributions of this paper are as follows: 1) ESI is not required, as our ERC is based on the analysis of the statistical characteristics of the decoder side information (DSI); 2) a simple EBMD is introduced which only employs the values of residual pixels at the encoder to classify blocks into Intra mode, Skip mode, and WZ mode; 3) an ERC solution using pseudo-random sequence scrambling is proposed to estimate rates for all WZ blocks at the frame level instead of at the bit-plane level, i.e., only one rate is estimated; and 4) a quantization-index estimation algorithm (QIEA) is proposed to solve the problem of rate underestimation. The simulation results show that the proposed solution is not only low in complexity but also efficient in both the block mode decision and the rate estimation. Also, compared with the DISCOVER system and the state-of-the-art ERC solution, our solution demonstrates competitive rate-distortion (RD) performance. Because it maintains the low-complexity nature of the encoder and achieves good RD performance, we believe that our ERC solution is promising in practice.

    Selected topics on distributed video coding

    Distributed Video Coding (DVC) is a new paradigm for video compression based on the information-theoretic results of Slepian and Wolf (SW), and Wyner and Ziv (WZ). While conventional coding has a rigid complexity allocation, as most of the complex tasks are performed at the encoder side, DVC enables a flexible complexity allocation between the encoder and the decoder. The most novel and interesting case is low complexity encoding and complex decoding, which is the opposite of conventional coding. While the latter is suitable for applications where the cost of the decoder is more critical than that of the encoder, DVC opens the door for a new range of applications where low complexity encoding is required and the decoder's complexity is not critical. This is of interest given the widespread deployment of small, battery-powered multimedia mobile devices in daily life. Further, since DVC operates as a reversed-complexity scheme when compared to conventional coding, DVC also enables the interesting scenario of low complexity encoding and decoding between two ends by transcoding between DVC and conventional coding. More specifically, low complexity encoding is possible by DVC at one end. Then, the resulting stream is decoded and conventionally re-encoded to enable low complexity decoding at the other end. Multiview video is attractive for a wide range of applications such as free viewpoint television, which is a system that allows viewing the scene from a viewpoint chosen by the viewer. Moreover, multiview can be beneficial for monitoring purposes in video surveillance. The increased use of multiview video systems is mainly due to the improvements in video technology and the reduced cost of cameras. While a multiview conventional codec will try to exploit the correlation among the different cameras at the encoder side, DVC allows for separate encoding of correlated video sources.
Therefore, DVC requires no communication between the cameras in a multiview scenario. This is an advantage since communication is time consuming (i.e. more delay) and requires complex networking. Another appealing feature of DVC is the fact that it is based on a statistical framework. Moreover, DVC behaves as a natural joint source-channel coding solution. This results in an improved error resilience performance when compared to conventional coding. Further, DVC-based scalable codecs do not require a deterministic knowledge of the lower layers. In other words, the enhancement layers are completely independent from the base layer codec. This is called the codec-independent scalability feature, which offers a high flexibility in the way the various layers are distributed in a network. This thesis addresses the following topics: First, the theoretical foundations of DVC as well as the practical DVC scheme used in this research are presented. The potential applications for DVC are also outlined. DVC-based schemes use conventional coding to compress parts of the data, while the rest is compressed in a distributed fashion. Thus, different conventional codecs are studied in this research as they are compared in terms of compression efficiency for a rich set of sequences. This includes fine tuning the compression parameters such that the best performance is achieved for each codec. Further, DVC tools for improved Side Information (SI) and Error Concealment (EC) are introduced for monoview DVC using a partially decoded frame. The improved SI results in a significant gain in reconstruction quality for video with high activity and motion. This is done by re-estimating the erroneous motion vectors using the partially decoded frame to improve the SI quality. The latter is then used to enhance the reconstruction of the finally decoded frame. 
Further, the introduced spatio-temporal EC improves the quality of decoded video in the case of erroneously received packets, outperforming both spatial and temporal EC. Moreover, it also outperforms error-concealed conventional coding in different modes. Then, multiview DVC is studied in terms of SI generation, which differentiates it from the monoview case. More specifically, different multiview prediction techniques for SI generation are described and compared in terms of prediction quality, complexity and compression efficiency. Further, a technique for iterative multiview SI is introduced, where the final SI is used in an enhanced reconstruction process. The iterative SI outperforms the other SI generation techniques, especially for high motion video content. Finally, fusion techniques of temporal and inter-view side information are introduced as well, which improve the performance of multiview DVC over monoview coding. DVC is also used to enable scalability for image and video coding. Since DVC is based on a statistical framework, the base and enhancement layers are completely independent, which is an interesting property called codec-independent scalability. Moreover, the introduced DVC scalable schemes show good robustness to errors, as the quality of decoded video decreases steadily as the error rate increases. On the other hand, conventional coding exhibits a cliff effect, as the performance drops dramatically after a certain error rate value. Further, the issue of privacy protection is addressed for DVC by transform domain scrambling, which is used to alter regions of interest in video so that the scene remains understandable while privacy is preserved. The proposed scrambling techniques are shown to provide a good level of security without impairing the performance of the DVC scheme when compared to the one without scrambling. This is particularly attractive for video surveillance scenarios, which is one of the most promising applications for DVC.
Finally, a practical DVC demonstrator built during this research is described, where the main requirements as well as the observed limitations are presented. Furthermore, it is set up to be as close as possible to a complete real application scenario. This shows that it is actually possible to implement a complete end-to-end practical DVC system relying only on realistic assumptions. Even though DVC is currently inferior to state-of-the-art conventional coding in terms of compression efficiency, the strengths of DVC reside in its good error resilience properties and the codec-independent scalability feature. Therefore, DVC offers promising possibilities for video compression when transmission over error-prone environments is required, as it significantly outperforms conventional coding in this case.

    Robust density modelling using the student's t-distribution for human action recognition

    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov model and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show that experiments on two well-known datasets (Weizmann, MuHAVi) yielded a marked improvement in classification accuracy. © 2011 IEEE
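
The robustness argument in this abstract can be seen in a small numerical sketch that compares the log-density a Gaussian and a Student's t-distribution assign to an outlying observation. This is only an illustration of the heavy-tail property, not the paper's HMM; the function names, the standard location and scale, and the choice of `nu=3.0` degrees of freedom are assumptions made here.

```python
import math


def gaussian_logpdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Log-density of a univariate Gaussian N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(x: float, mu: float = 0.0, sigma: float = 1.0,
                     nu: float = 3.0) -> float:
    """Log-density of a univariate location-scale Student's t.

    nu is the degrees of freedom; nu=3.0 is an illustrative assumption.
    """
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))


if __name__ == "__main__":
    # A typical point near the centre and an outlier eight scale units away.
    for x in (0.0, 8.0):
        print(x, gaussian_logpdf(x), student_t_logpdf(x))
```

The Gaussian log-density falls quadratically with distance from the mean, whereas the t log-density falls only logarithmically, so a single outlier penalises the likelihood far less under the t model and pulls the fitted parameters less.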