160 research outputs found

    In-layer multi-buffer framework for rate-controlled scalable video coding

    Temporal scalability is supported in scalable video coding (SVC) by means of hierarchical prediction structures, where the higher layers can be ignored for frame rate reduction. Nevertheless, this kind of scalability is not fully exploited by rate control (RC) algorithms, since the hypothetical reference decoder (HRD) requirement is only satisfied for the highest frame rate sub-stream of every dependency (spatial or coarse grain scalability) layer. In this paper we propose a novel RC approach that aims to deliver several HRD-compliant temporal resolutions within a particular dependency layer. Instead of using the common SVC encoder configuration consisting of one dependency layer per temporal resolution, a compact configuration that does not require additional dependency layers for providing different HRD-compliant temporal resolutions is proposed. Specifically, the proposed framework for rate-controlled SVC uses a set of virtual buffers within a dependency layer so that their levels can be simultaneously controlled for overflow and underflow prevention while minimizing the reconstructed video distortion of the corresponding sub-streams. This in-layer multi-buffer approach has been built on top of a baseline H.264/SVC RC algorithm for variable bit rate applications. The experimental results show that our proposal achieves good performance in terms of mean quality, quality consistency, and buffer control using a reduced number of layers. This work has been partially supported by the National Grant TEC2011-26807 of the Spanish Ministry of Science and Innovation.
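The idea of tracking several virtual buffers inside one dependency layer can be illustrated with a minimal sketch. It is not the paper's algorithm: the leaky-bucket update, the class `VirtualBuffer`, the helper `update_buffers`, and all parameter names are assumptions made for illustration. Each buffer models one temporal sub-stream, and a coded frame is charged to every sub-stream that contains it.

```python
from dataclasses import dataclass


@dataclass
class VirtualBuffer:
    """Leaky-bucket model of one temporal sub-stream (hypothetical names)."""
    max_temporal_id: int  # sub-stream holds frames with temporal_id <= this
    bit_rate: float       # target bits per second of the sub-stream
    frame_rate: float     # frames per second of the sub-stream
    size: float           # buffer capacity in bits
    level: float = 0.0    # current fullness in bits

    def push(self, frame_bits: float) -> str:
        """Add one coded frame, drain one frame interval, report the state."""
        self.level += frame_bits - self.bit_rate / self.frame_rate
        if self.level > self.size:
            return "overflow"
        if self.level < 0:
            return "underflow"
        return "ok"


def update_buffers(buffers, temporal_id, frame_bits):
    """Charge a frame to every sub-stream that includes its temporal layer."""
    return {
        b.max_temporal_id: b.push(frame_bits)
        for b in buffers
        if temporal_id <= b.max_temporal_id
    }
```

With two buffers (temporal layers 0 and 0-1), a frame at temporal layer 1 only touches the second buffer, while a frame at layer 0 touches both. A rate controller would then pick quantization parameters that keep every affected level inside its bounds.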

    Efficient algorithms for scalable video coding

    A scalable video bitstream specifically designed for the needs of various client terminals, network conditions, and user demands is much desired in current and future video transmission and storage systems. The scalable extension of the H.264/AVC standard (SVC) has been developed to satisfy the new challenges posed by heterogeneous environments, as it permits a single video stream to be decoded fully or partially with variable quality, resolution, and frame rate in order to adapt to a specific application. This thesis presents novel improved algorithms for SVC, including: 1) a fast inter-frame and inter-layer coding mode selection algorithm based on motion activity; 2) a hierarchical fast mode selection algorithm; 3) a two-part Rate Distortion (RD) model targeting the properties of different prediction modes for the SVC rate control scheme; and 4) an optimised Mean Absolute Difference (MAD) prediction model. The proposed fast inter-frame and inter-layer mode selection algorithm is based on the empirical observation that a macroblock (MB) with slow movement is more likely to be best matched by one in the same resolution layer. However, for a macroblock with fast movement, motion estimation between layers is required. Simulation results show that the algorithm can reduce the encoding time by up to 40%, with negligible degradation in RD performance. The proposed hierarchical fast mode selection scheme comprises four levels and makes full use of inter-layer, temporal and spatial correlation as well as the texture information of each macroblock. Overall, the new technique demonstrates the same coding performance in terms of picture quality and compression ratio as that of the SVC standard, yet produces a saving in encoding time of up to 84%. Compared with state-of-the-art SVC fast mode selection algorithms, the proposed algorithm achieves a superior computational time reduction under very similar RD performance conditions.
The existing SVC rate distortion model cannot accurately represent the RD properties of the prediction modes, because it is influenced by the use of inter-layer prediction. A separate RD model for inter-layer prediction coding in the enhancement layer(s) is therefore introduced. Overall, the proposed algorithms improve the average PSNR by up to 0.34 dB or produce an average saving in bit rate of up to 7.78%. Furthermore, the control accuracy is maintained to within 0.07% on average. As a MAD prediction error always exists and cannot be avoided, an optimised MAD prediction model for the spatial enhancement layers is proposed that considers the MAD from previous temporal frames and previous spatial frames together, to achieve a more accurate MAD prediction. Simulation results indicate that the proposed MAD prediction model reduces the MAD prediction error by up to 79% compared with the JVT-W043 implementation.
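As a rough illustration of combining temporal and spatial MAD information, the sketch below blends two linear estimates. The abstract does not give the model's form, so the function `predict_mad`, the blending `weight`, and the coefficient pairs `a` and `b` are all hypothetical.

```python
def predict_mad(mad_temporal, mad_spatial, weight=0.5, a=(1.0, 0.0), b=(1.0, 0.0)):
    """Blend a temporal and a spatial linear MAD estimate (assumed form).

    mad_temporal: MAD of the co-located unit in the previous frame of this layer.
    mad_spatial:  MAD of the corresponding unit in the lower spatial layer.
    weight:       share given to the temporal estimate, between 0 and 1.
    a, b:         (slope, offset) pairs of the two linear predictors.
    """
    temporal_estimate = a[0] * mad_temporal + a[1]
    spatial_estimate = b[0] * mad_spatial + b[1]
    return weight * temporal_estimate + (1.0 - weight) * spatial_estimate
```

Setting `weight=1.0` falls back to a purely temporal predictor, which makes it easy to compare the blended estimate against a temporal-only baseline on the same data.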

    A New Transcoding Scheme for Scalable Video Coding to H.264/AVC

    Requests from a variety of video terminals push video servers to offer scalability in how video content is distributed. Scalable Video Coding (SVC), the scalable extension of the H.264/AVC standard, provides this scalability by encoding video into one base layer and several enhancement layers. To let mobile devices that lack SVC support receive video at the best quality they can handle, converting bit-streams from SVC to H.264/AVC becomes a key technique. Bit-stream rewriting is the simplest way to do this without quality loss. However, rewriting is not a true transcoding scheme, since it requires modifying the SVC encoder. This paper proposes a novel transcoding approach that supports spatial scalability by minimizing the distortion introduced by the re-encoding process. The proposed scheme retains as much of the input bit-stream's information as possible and adopts a hybrid upsampling method for residue scaling, which minimizes the transcoding distortion. Experimental results demonstrate that the proposed transcoding scheme outperforms Full Decoding Re-encoding (FDR), which is generally regarded as giving the highest video quality, in rate-distortion (RD) performance, achieving up to 0.9 dB Y-PSNR gain while saving 95% to 97% of the processing time.

    Fast Implementation of the Scalable Video Coding Extension of the H.264/AVC Standard

    In order to improve coding efficiency in the scalable extension of H.264/AVC, an inter-layer prediction mechanism is incorporated. This exploits as much lower layer information as possible to inform the process of coding the enhancement layer(s). However, it also greatly increases the computational complexity. In this paper, a fast mode decision algorithm for efficient implementation of the SVC encoder is described. The proposed algorithm not only considers inter-layer correlation but also fully exploits both spatial and temporal correlation as well as an assessment of macroblock texture. All of these factors are organised within a hierarchical structure in the mode decision process. At each level of the structure, different strategies are implemented to eliminate inappropriate candidate modes. Simulation results show that the proposed algorithm reduces encoding time by up to 85% compared with the JSVM 9.18 implementation. This is achieved without any noticeable degradation in rate-distortion performance.

    Quality of Experience and Adaptation Techniques for Multimedia Communications

    The widespread use of multimedia services on the World Wide Web and the advances in end-user portable devices have recently increased user demand for better quality. Moreover, providing these services seamlessly and ubiquitously on wireless networks and with user mobility poses hard challenges. To meet these challenges and fulfill end-user requirements, suitable strategies need to be adopted at both the application level and the network level. At the application level, rate and quality have to be adapted to time-varying bandwidth limitations, whereas on the network side a mechanism for efficient use of the network resources has to be implemented, to provide a better end-user Quality of Experience (QoE) through better Quality of Service (QoS). The work in this thesis addresses these issues by first investigating multi-stream rate adaptation techniques for Scalable Video Coding (SVC) applications aimed at a fair provision of QoE to end-users. Rate Distortion (R-D) models for real-time and non-real-time video streaming are proposed, and a rate adaptation technique is developed to minimize, with fairness, the distortion of multiple videos with different complexities. To provide resilience against errors, the effect of Unequal Error Protection (UXP) based on Reed-Solomon (RS) encoding with erasure correction has also been included in the proposed R-D modelling. Moreover, to improve the support of QoE at the network level for multimedia applications sensitive to delay, jitter and packet drops, a technique to prioritise different traffic flows using specific QoS classes within an intermediate DiffServ network integrated with a WiMAX access system is investigated. Simulations were performed to test the network under different congestion scenarios.

    Fast Depth and Inter Mode Prediction for Quality Scalable High Efficiency Video Coding

    Scalable high efficiency video coding (SHVC) is an extension of high efficiency video coding (HEVC) that introduces multiple layers and inter-layer prediction, thus significantly increasing the coding complexity on top of the already complicated HEVC encoder. In inter prediction for quality SHVC, in order to determine the best possible mode at each depth level, a coding tree unit can be recursively split into four depth levels, and the candidate modes at each level include merge mode, inter2Nx2N, inter2NxN, interNx2N, interNxN, inter2NxnU, inter2NxnD, internLx2N and internRx2N, intra modes and inter-layer reference (ILR) mode. This can obtain the highest coding efficiency, but also results in very high coding complexity. Therefore, it is crucial to improve coding speed while maintaining coding efficiency. In this research, we propose a new depth level and inter mode prediction algorithm for quality SHVC. First, the depth level candidates are predicted based on inter-layer correlation, spatial correlation and its correlation degree. Second, for a given depth candidate, we divide mode prediction into square and non-square mode predictions. Third, in the square mode prediction, ILR and merge modes are predicted according to depth correlation, and are terminated early depending on whether the residual distribution follows a Gaussian distribution. Moreover, ILR mode, merge mode and inter2Nx2N are terminated early based on significant differences in Rate Distortion (RD) costs. Fourth, if the early termination condition cannot be satisfied, non-square modes are further predicted based on significant differences in expected values of residual coefficients. Finally, inter-layer and spatial correlations are combined with residual distribution to examine whether to terminate depth selection early. Experimental results demonstrate that, on average, the proposed algorithm achieves a time saving of 71.14%, with a bit rate increase of 1.27%.
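Early termination on RD cost can be sketched as follows. The abstract does not define what counts as a "significant difference", so the rule used here (stop once the lowest cost falls below `ratio` times the second-lowest) is an assumption, as are the function name `choose_mode` and its parameters.

```python
def choose_mode(candidates, rd_cost, ratio=0.8):
    """Evaluate modes in order and stop early on a clear RD-cost winner.

    candidates: ordered iterable of mode names, cheapest checks first.
    rd_cost:    callable mapping a mode name to its RD cost.
    ratio:      hypothetical threshold; stop when best < ratio * runner-up.
    Returns the chosen mode and the list of modes actually evaluated.
    """
    costs = {}
    for mode in candidates:
        costs[mode] = rd_cost(mode)
        if len(costs) >= 2:
            lowest, runner_up = sorted(costs.values())[:2]
            if lowest < ratio * runner_up:
                break  # remaining modes are skipped
    best = min(costs, key=costs.get)
    return best, list(costs)
```

For example, with costs of 100, 95, 60 and 58 for ILR, merge, inter2Nx2N and inter2NxN, the loop stops after inter2Nx2N and never evaluates inter2NxN. That saves time at the price of a slightly worse choice, which mirrors the time-saving versus bit-rate trade-off the abstract reports.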