2,062 research outputs found

    Error and Congestion Resilient Video Streaming over Broadband Wireless

    Get PDF
    In this paper, error resilience is achieved by adaptive, application-layer rateless channel coding, which is used to protect H.264/Advanced Video Coding (AVC) codec data-partitioned videos. A packetization strategy is an effective tool to control error rates and, in the paper, source-coded data partitioning serves to allocate smaller packets to more important compressed video data. The scheme for doing this is applied to real-time streaming across a broadband wireless link. The advantages of rateless code rate adaptivity are then demonstrated in the paper. Because the data partitions of a video slice are each assigned to different network packets, in congestion-prone wireless networks the increased number of packets per slice and their size disparity may increase the packet loss rate from buffer overflows. As a form of congestion resilience, this paper recommends packet-size dependent scheduling as a relatively simple way of alleviating the buffer-overflow problem arising from data-partitioned packets. The paper also contributes an analysis of data partitioning and packet sizes as a prelude to considering scheduling regimes. The combination of adaptive channel coding and prioritized packetization for error resilience with packet-size dependent packet scheduling results in a robust streaming scheme specialized for broadband wireless and real-time streaming applications such as video conferencing, video telephony, and telemedicine

    Robust video coder solution for wireless streaming: applications in Gaussian channels

    Get PDF
    With the technological progress in wireless communications seen in the past decade, the miniaturization of personal computers was imminent. Due to the limited availability of resources in these small devices, it has been preferable to stream the media over widely deployed networks like the Internet. However, the conventional protocols used in physical and data-link layers are not adequate for reliable video streaming over noisy wireless channels. There are several popular and well-studied mechanisms for addressing this problem, one of them being Multiple-Description-Coding. However, proposed solutions are too specialized, focusing the coding of either motion or spatial information; thus failing to address the whole problem, that is - the robust video coding. In this thesis a novel MDC video coder is presented, which was developed during an internship at the I3S laboratory - France. The full coding scheme is capable of robust transmission of Motion-Vectors and wavelet-subband information over noisy wireless channels. The former is accomplished by using a MAP-based MD-decoding algorithm available in literature, while the robust transmission of wavelet-subbands is achieved using a state-of-the-art registry-based JPEG-2000 MDC. In order to e ciently balance MV information between multiple descriptions, a novel R/D-optimizing MD bitallocation scheme is presented. As it is also important to e ciently distribute bits between subband and motion information, a global subband/motion-vector bit-allocation technique found in literature was adopted and improved. Indeed, this thesis would not be complete without the presentation of produced streams as well as of a set of backing scienti c results

    3D high definition video coding on a GPU-based heterogeneous system

    Get PDF
    H.264/MVC is a standard for supporting the sensation of 3D, based on coding from 2 (stereo) to N views. H.264/MVC adopts many coding options inherited from single view H.264/AVC, and thus its complexity is even higher, mainly because the number of processing views is higher. In this manuscript, we aim at an efficient parallelization of the most computationally intensive video encoding module for stereo sequences. In particular, inter prediction and its collaborative execution on a heterogeneous platform. The proposal is based on an efficient dynamic load balancing algorithm and on breaking encoding dependencies. Experimental results demonstrate the proposed algorithm's ability to reduce the encoding time for different stereo high definition sequences. Speed-up values of up to 90× were obtained when compared with the reference encoder on the same platform. Moreover, the proposed algorithm also provides a more energy-efficient approach and hence requires less energy than the sequential reference algorith

    Side information exploitation, quality control and low complexity implementation for distributed video coding

    Get PDF
    Distributed video coding (DVC) is a new video coding methodology that shifts the highly complex motion search components from the encoder to the decoder, such a video coder would have a great advantage in encoding speed and it is still able to achieve similar rate-distortion performance as the conventional coding solutions. Applications include wireless video sensor networks, mobile video cameras and wireless video surveillance, etc. Although many progresses have been made in DVC over the past ten years, there is still a gap in RD performance between conventional video coding solutions and DVC. The latest development of DVC is still far from standardization and practical use. The key problems remain in the areas such as accurate and efficient side information generation and refinement, quality control between Wyner-Ziv frames and key frames, correlation noise modelling and decoder complexity, etc. Under this context, this thesis proposes solutions to improve the state-of-the-art side information refinement schemes, enable consistent quality control over decoded frames during coding process and implement highly efficient DVC codec. This thesis investigates the impact of reference frames on side information generation and reveals that reference frames have the potential to be better side information than the extensively used interpolated frames. Based on this investigation, we also propose a motion range prediction (MRP) method to exploit reference frames and precisely guide the statistical motion learning process. Extensive simulation results show that choosing reference frames as SI performs competitively, and sometimes even better than interpolated frames. Furthermore, the proposed MRP method is shown to significantly reduce the decoding complexity without degrading any RD performance. To minimize the block artifacts and achieve consistent improvement in both subjective and objective quality of side information, we propose a novel side information synthesis framework working on pixel granularity. We synthesize the SI at pixel level to minimize the block artifacts and adaptively change the correlation noise model according to the new SI. Furthermore, we have fully implemented a state-of-the-art DVC decoder with the proposed framework using serial and parallel processing technologies to identify bottlenecks and areas to further reduce the decoding complexity, which is another major challenge for future practical DVC system deployments. The performance is evaluated based on the latest transform domain DVC codec and compared with different standard codecs. Extensive experimental results show substantial and consistent rate-distortion gains over standard video codecs and significant speedup over serial implementation. In order to bring the state-of-the-art DVC one step closer to practical use, we address the problem of distortion variation introduced by typical rate control algorithms, especially in a variable bit rate environment. Simulation results show that the proposed quality control algorithm is capable to meet user defined target distortion and maintain a rather small variation for sequence with slow motion and performs similar to fixed quantization for fast motion sequence at the cost of some RD performance. Finally, we propose the first implementation of a distributed video encoder on a Texas Instruments TMS320DM6437 digital signal processor. The WZ encoder is efficiently implemented, using rate adaptive low-density-parity-check accumulative (LDPCA) codes, exploiting the hardware features and optimization techniques to improve the overall performance. Implementation results show that the WZ encoder is able to encode at 134M instruction cycles per QCIF frame on a TMS320DM6437 DSP running at 700MHz. This results in encoder speed 29 times faster than non-optimized encoder implementation. We also implemented a highly efficient DVC decoder using both serial and parallel technology based on a PC-HPC (high performance cluster) architecture, where the encoder is running in a general purpose PC and the decoder is running in a multicore HPC. The experimental results show that the parallelized decoder can achieve about 10 times speedup under various bit-rates and GOP sizes compared to the serial implementation and significant RD gains with regards to the state-of-the-art DISCOVER codec

    Arithmetic coding revisited

    Get PDF
    Over the last decade, arithmetic coding has emerged as an important compression tool. It is now the method of choice for adaptive coding on multisymbol alphabets because of its speed, low storage requirements, and effectiveness of compression. This article describes a new implementation of arithmetic coding that incorporates several improvements over a widely used earlier version by Witten, Neal, and Cleary, which has become a de facto standard. These improvements include fewer multiplicative operations, greatly extended range of alphabet sizes and symbol probabilities, and the use of low-precision arithmetic, permitting implementation by fast shift/add operations. We also describe a modular structure that separates the coding, modeling, and probability estimation components of a compression system. To motivate the improved coder, we consider the needs of a word-based text compression program. We report a range of experimental results using this and other models. Complete source code is available

    A highly scalable parallel implementation of H.264

    Get PDF
    Developing parallel applications that can harness and efficiently use future many-core architectures is the key challenge for scalable computing systems. We contribute to this challenge by presenting a parallel implementation of H.264 that scales to a large number of cores. The algorithm exploits the fact that independent macroblocks (MBs) can be processed in parallel, but whereas a previous approach exploits only intra-frame MB-level parallelism, our algorithm exploits intra-frame as well as inter-frame MB-level parallelism. It is based on the observation that inter-frame dependencies have a limited spatial range. The algorithm has been implemented on a many-core architecture consisting of NXP TriMedia TM3270 embedded processors. This required to develop a subscription mechanism, where MBs are subscribed to the kick-off lists associated with the reference MBs. Extensive simulation results show that the implementation scales very well, achieving a speedup of more than 54 on a 64-core processor, in which case the previous approach achieves a speedup of only 23. Potential drawbacks of the 3D-Wave strategy are that the memory requirements increase since there can be many frames in flight, and that the frame latency might increase. Scheduling policies to address these drawbacks are also presented. The results show that these policies combat memory and latency issues with a negligible effect on the performance scalability. Results analyzing the impact of the memory latency, L1 cache size, and the synchronization and thread management overhead are also presented. Finally, we present performance requirements for entropy (CABAC) decoding. This work was performed while the fourth author was with NXP Semiconductors.Peer ReviewedPostprint (author's final draft
    corecore