
    Statistical framework for video decoding complexity modeling and prediction

    Video decoding complexity modeling and prediction is increasingly important for efficient resource utilization in a variety of applications, including task scheduling, receiver-driven complexity shaping, and adaptive dynamic voltage scaling. In this paper we present a novel view of this problem from a statistical perspective. We explore the statistical structure (clustering) of the execution time required by each video decoder module (entropy decoding, motion compensation, etc.) in conjunction with complexity features that are easily extractable at encoding time (representing the properties of each module's input source data). To this end, we employ Gaussian mixture models (GMMs) and an expectation-maximization algorithm to estimate the joint probability density function (PDF) of execution time and features. A training set of typical video sequences is used in an offline estimation process. The resulting GMM representation is combined with the complexity features of new video sequences to predict the execution time required to decode them. Several prediction approaches are discussed and compared. The potential mismatch between the training set and new video content is addressed by adaptive online joint-PDF re-estimation. An experimental comparison evaluates the different approaches and compares the proposed prediction scheme with related resource prediction schemes from the literature. The usefulness of the proposed complexity-prediction approaches is demonstrated in an application of rate-distortion-complexity optimized decoding.
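One way to read the prediction step described in this abstract is as a conditional-mean estimate under the joint GMM. The sketch below is only an illustration under stated assumptions: a single scalar feature, hand-picked (hypothetical) component parameters instead of an EM fit, and the conditional mean E[t | f] as the predictor. The abstract mentions several prediction approaches without detailing them, so this is not necessarily the one the authors use.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def predict_time(feature, components):
    """Conditional-mean prediction E[t | f] under a joint 2-D GMM over (f, t).

    Each component is a dict with keys weight, mean_f, mean_t, var_f, cov_ft.
    """
    # Posterior weight of each component given only the observed feature.
    resp = [c["weight"] * gaussian_pdf(feature, c["mean_f"], c["var_f"])
            for c in components]
    total = sum(resp)
    if total == 0.0:
        # Feature far from every cluster: fall back to the prior weights.
        resp = [c["weight"] for c in components]
        total = sum(resp)
    prediction = 0.0
    for r, c in zip(resp, components):
        # Per-component conditional mean of t given f (linear regression term).
        cond_mean = c["mean_t"] + c["cov_ft"] / c["var_f"] * (feature - c["mean_f"])
        prediction += (r / total) * cond_mean
    return prediction

# Hypothetical parameters: two clusters of (feature, execution time).
components = [
    {"weight": 0.6, "mean_f": 10.0, "mean_t": 2.0, "var_f": 4.0, "cov_ft": 0.8},
    {"weight": 0.4, "mean_f": 40.0, "mean_t": 9.0, "var_f": 9.0, "cov_ft": 1.8},
]
```

Calling `predict_time(12.0, components)` weights each cluster by how well it explains the observed feature, so a feature near the first cluster yields a prediction close to that cluster's regression line.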

    Energy Minimization of Portable Video Communication Devices Based on Power-Rate-Distortion Optimization

    Digital Object Identifier 10.1109/TCSVT.2008.918802
    Portable video communication devices operate on batteries with limited energy supply. However, video compression is computationally intensive and energy-demanding. Therefore, one of the central challenges in portable video communication system design is to minimize the energy consumption of video encoding so as to prolong the operational lifetime of portable video devices. In this work, based on power-rate-distortion (P-R-D) optimization, we develop a new approach for energy minimization by exploring the energy tradeoff between video encoding and wireless communication and exploiting the nonstationary characteristics of input video data. Both analytically and experimentally, we demonstrate that incorporating power consumption as a third dimension in conventional R-D analysis gives extra flexibility in resource allocation and allows us to achieve significant energy savings. Within the P-R-D analysis framework, power is tightly coupled with rate, enabling us to trade bits for joules and perform energy minimization through optimum bit allocation. Our experimental studies show that, for typical videos with nonstationary scene statistics, the proposed P-R-D optimization technology can reduce the energy consumption of video encoding significantly (by up to 50%), especially in delay-tolerant portable video communication applications.

    Efficient HEVC-based video adaptation using transcoding

    In a video transmission system, it is important to take into account the great diversity of network and end-user constraints. On the one hand, video content is typically streamed over networks with different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played on multiple devices such as PCs, laptops, and cell phones. Obviously, a single video cannot satisfy all of their constraints. This diversity of network and device capacities creates the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without changing the coding format, is well known as an efficient adaptation solution. However, this approach comes with high computational complexity, resulting in high energy consumption in the network and possibly network latency. This presentation provides several optimization strategies for the transcoding of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. We propose several techniques to speed up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy. Moreover, the motion estimation process of the encoder is optimized using decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder are addressed using machine-learning algorithms. Thanks to their performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications.

    Decoding-complexity-aware HEVC encoding using a complexity–rate–distortion model

    The energy consumption of Consumer Electronic (CE) devices during media playback is inexorably linked to the computational complexity of decoding compressed video. Reducing a CE device's energy consumption is therefore becoming ever more challenging with increasing video resolutions and the growing complexity of video coding algorithms. To this end, this paper proposes a framework that alters the video bit stream to reduce the decoding complexity while limiting the impact on coding efficiency. In this context, this paper (i) first performs an analysis to determine the trade-off between decoding complexity, video quality, and bit rate with respect to a reference decoder implementation on a General Purpose Processor (GPP) architecture. Thereafter, (ii) a novel generic decoding complexity-aware video coding algorithm is proposed to generate decoding complexity-rate-distortion optimized High Efficiency Video Coding (HEVC) bit streams. The experimental results reveal that the bit streams generated by the proposed algorithm achieve 29.43% and 13.22% decoding complexity reductions for a similar video quality, with minimal coding efficiency impact compared to state-of-the-art approaches, when applied to the HM16.0 and openHEVC decoder implementations, respectively. In addition, analysis of the energy consumption behavior for the same scenarios reveals energy consumption reductions of up to 20% at a video quality similar to that of HM 16.0 encoded HEVC bit streams.
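A common way to fold decoding complexity into encoder decisions is to add a complexity term to the usual Lagrangian rate-distortion cost. The sketch below assumes that form, J = D + lam * R + gamma * C, and uses made-up candidate modes; the abstract does not give the paper's actual model, so the cost function, the multipliers `lam` and `gamma`, and all numbers here are hypothetical.

```python
def best_mode(candidates, lam, gamma):
    """Return the candidate minimizing J = D + lam * R + gamma * C.

    D is distortion, R is rate in bits, and C is an estimate of the
    decoding complexity of the candidate (all in arbitrary units here).
    """
    return min(candidates, key=lambda m: m["D"] + lam * m["R"] + gamma * m["C"])

# Hypothetical candidate coding modes for one block.
modes = [
    {"name": "A", "D": 10.0, "R": 100.0, "C": 50.0},
    {"name": "B", "D": 11.0, "R": 102.0, "C": 30.0},
    {"name": "C", "D": 16.0, "R": 90.0, "C": 20.0},
]

# gamma = 0 is plain rate-distortion optimization; larger gamma favors
# modes that are cheaper to decode at some cost in distortion or rate.
rd_choice = best_mode(modes, lam=0.1, gamma=0.0)
crd_choice = best_mode(modes, lam=0.1, gamma=0.1)
```

With these numbers the choice shifts from mode A to mode B once complexity is weighted, which is the kind of trade-off the abstract describes.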

    Lowpass Filtering of Rate-Distortion Functions for Quality Smoothing in Real-Time Video Communication

    Digital Object Identifier 10.1109/TCSVT.2005.852417
    In variable-bit-rate (VBR) video coding, the video is pre-processed to collect sequence-level statistics, which are used for global bit allocation in the actual encoding stage to obtain a smoothed video presentation quality. However, in real-time video recording and network streaming, this type of two-pass encoding is not possible because future frames and global statistics are not available. To address this issue, we introduce the concept of low-pass filtering of rate-distortion (R-D) functions and develop a smoothed rate control (SRC) framework for real-time video recording and streaming. Theoretically, we prove that, using a geometric averaging filter, the SRC algorithm is able to maintain a smoothed video presentation quality while achieving the target bit rate automatically. We also analyze the buffer requirement of the SRC algorithm in real-time video streaming, and propose a scheme to seamlessly integrate robust buffer control into the SRC framework. The proposed SRC algorithm has very low computational complexity and implementation cost. Our extensive experimental results demonstrate that the SRC algorithm significantly reduces the picture quality variation in the encoded video clips.
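The abstract does not define its geometric averaging filter. As a rough illustration only, the sketch below assumes a causal first-order recursive filter whose weights on past inputs decay geometrically, applied to some per-frame quantity (for example, a parameter of a per-frame R-D model). The function name and the choice of what is filtered are assumptions, not the paper's SRC algorithm.

```python
def geometric_lowpass(values, alpha):
    """Causal filter y[n] = alpha * y[n-1] + (1 - alpha) * x[n].

    Unrolled, the weight on x[n-k] is (1 - alpha) * alpha**k, so the
    influence of past frames decays geometrically. Only past and current
    frames are used, which suits one-pass (real-time) encoding.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    smoothed = []
    state = None
    for x in values:
        # The first frame initializes the filter state.
        state = x if state is None else alpha * state + (1.0 - alpha) * x
        smoothed.append(state)
    return smoothed

# Hypothetical per-frame values that alternate between easy and hard frames.
raw = [10.0, 30.0, 10.0, 30.0, 10.0, 30.0]
smooth = geometric_lowpass(raw, alpha=0.5)
```

A larger `alpha` gives smoother output but reacts more slowly to genuine scene changes.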

    REGION-BASED ADAPTIVE DISTRIBUTED VIDEO CODING CODEC

    The recently developed Distributed Video Coding (DVC) is suited to applications where conventional video coding is not feasible because of its inherently high-complexity encoding. Examples include video surveillance using wireless/wired video sensor networks and applications using mobile cameras. With DVC, the complexity is shifted from the encoder to the decoder. The practical application of DVC is referred to as Wyner-Ziv video coding (WZ), where an estimate of the original frame called "side information" is generated using motion compensation at the decoder. Compression is achieved by sending only the extra information needed to correct this estimate. An error-correcting code is used under the assumption that the estimate is a noisy version of the original frame, and the rate needed is a certain number of parity bits. The side information is assumed to have become available at the decoder through a virtual channel. Due to the limitations of the compensation method, the predicted frame, or side information, is expected to have varying degrees of success. These limitations stem from location-specific non-stationary estimation noise. To avoid them, conventional video coders, like MPEG, use frame partitioning to allocate the optimum coder to each partition and hence achieve better rate-distortion performance. The same, however, has not been used in DVC, as it increases the encoder complexity. This work proposes partitioning the considered frame into many coding units (regions), where each unit is encoded differently. This partitioning is, however, done at the decoder while generating the side information, and the region map is sent to the encoder at very little rate penalty. The partitioning allows allocation of appropriate DVC coding parameters (virtual channel, rate, and quantizer) to each region.
The resulting regions map is compressed using a quadtree algorithm and communicated to the encoder via the feedback channel. The rate control in DVC is performed by channel coding techniques (turbo codes, LDPC, etc.). The performance of the channel code depends heavily on the accuracy of the virtual channel model that models the estimation error for each region. In this work, a turbo code has been used and an adaptive WZ DVC is designed both in the transform domain and in the pixel domain. Transform domain WZ video coding (TDWZ) performs distinctly better than the normal Pixel Domain Wyner-Ziv (PDWZ), since it exploits the spatial redundancy during encoding. The performance evaluations show that the proposed system is superior to the existing distributed video coding solutions. Although the proposed system requires extra bits representing the "regions map" to be transmitted, the rate gain is still noticeable, and it outperforms the state-of-the-art frame-based DVC by 0.6-1.9 dB. The role of the feedback channel (FC) is to adapt the bit rate to the changing statistics between the side information and the frame to be encoded. In the unidirectional scenario, the encoder must perform the rate control. To correctly estimate the rate, the encoder must calculate typical side information. However, the rate cannot be exactly calculated at the encoder; it can only be estimated. This work also proposes a feedback-free region-based adaptive DVC solution in the pixel domain, based on a machine-learning approach to estimate the side information. Although the performance evaluations show a rate penalty, it is acceptable considering the simplicity of the proposed algorithm.
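The quadtree compression of the regions map mentioned above can be illustrated with a small sketch. It assumes a square map whose side is a power of two and integer region labels; the nested-list representation and the function names are illustrative choices, not the thesis's actual bit-level format.

```python
def quadtree_encode(grid, row=0, col=0, size=None):
    """Encode a square 2^n x 2^n label map as a nested quadtree.

    A uniform block becomes its label; a mixed block becomes a list of four
    sub-blocks in the order top-left, top-right, bottom-left, bottom-right.
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(grid[r][c] == first
                  for r in range(row, row + size)
                  for c in range(col, col + size))
    if uniform:
        return first
    half = size // 2
    return [
        quadtree_encode(grid, row, col, half),
        quadtree_encode(grid, row, col + half, half),
        quadtree_encode(grid, row + half, col, half),
        quadtree_encode(grid, row + half, col + half, half),
    ]

def count_leaves(tree):
    """Number of leaf labels that would have to be transmitted."""
    if isinstance(tree, list):
        return sum(count_leaves(child) for child in tree)
    return 1

# Hypothetical 4x4 map with two region labels.
region_map = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = quadtree_encode(region_map)
```

Here `tree` is `[0, 1, 0, [0, 1, 1, 1]]`, so 7 leaves describe the 16 cells. Large uniform regions collapse to a single leaf, which is why the map can be sent at a small rate penalty.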

    Measuring And Improving Internet Video Quality Of Experience

    Streaming multimedia content over IP networks is poised to be the dominant Internet traffic for the coming decade, predicted to account for more than 91% of all consumer traffic in the coming years. Streaming multimedia content includes Internet television (IPTV), video on demand (VoD), peer-to-peer streaming, and 3D television over IP, to name a few. Widespread acceptance, growth, and subscriber retention are contingent upon network providers assuring superior Quality of Experience (QoE) on top of today's Internet. This work presents the first empirical understanding of the Internet’s video-QoE capabilities, and tools and protocols to efficiently infer and improve them. To infer video-QoE at arbitrary nodes in the Internet, we design and implement MintMOS: a lightweight, real-time, no-reference framework for capturing perceptual quality. We demonstrate that MintMOS’s projections closely match subjective surveys in assessing perceptual quality. We use MintMOS to characterize Internet video-QoE both at the link level and at the end-to-end path level. As an input to our study, we use extensive measurements from a large number of Internet paths obtained from various measurement overlays deployed using PlanetLab. Link-level degradations of intra- and inter-ISP Internet links are studied to create an empirical understanding of their shortcomings and ways to overcome them. Our studies show that intra-ISP links are often poorly engineered compared to peering links, and that degradations are induced by transient network load imbalance within an ISP. Initial results also indicate that overlay networks could be a promising way to avoid such ISPs in times of degradation. A large number of end-to-end Internet paths are probed, and we measure delay, jitter, and loss rates.
The measurement data is analyzed offline to identify ways to enable a source to select alternate paths in an overlay network to improve video-QoE, without the need for background monitoring or a priori knowledge of path characteristics. We establish that for any unstructured overlay of N nodes, it is sufficient to reroute key frames using a random subset of k nodes in the overlay, where k is bounded by O(ln N). We analyze various properties of such random subsets to derive a simple, scalable, and efficient path selection strategy that results in a k-fold increase in path options for any source-destination pair; options that consistently outperform Internet path selection. Finally, we design a prototype called source initiated frame restoration (SIFR) that employs random subsets to derive alternate paths and demonstrate its effectiveness in improving Internet video-QoE.
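The random-subset idea above can be sketched as follows: sample roughly ln N overlay nodes as candidate relays and compare the one-hop detours they offer against the direct path. This is a minimal illustration under stated assumptions, not the SIFR prototype: the constant `c`, the additive `cost` function, and the restriction to one-hop detours are all hypothetical choices.

```python
import math
import random

def pick_relays(nodes, source, destination, c=1.0, rng=random):
    """Sample about c * ln(N) candidate relay nodes uniformly at random."""
    candidates = [n for n in nodes if n not in (source, destination)]
    k = max(1, math.ceil(c * math.log(len(nodes))))
    return rng.sample(candidates, min(k, len(candidates)))

def best_path(source, destination, relays, cost):
    """Choose between the direct path and one-hop detours via the relays.

    Returns (total_cost, relay), where relay is None for the direct path.
    The cost of a detour is assumed to be the sum of its two segments.
    """
    options = [(cost(source, destination), None)]
    options += [(cost(source, r) + cost(r, destination), r) for r in relays]
    return min(options, key=lambda option: option[0])

# Hypothetical overlay of 100 nodes in which the direct path is degraded.
nodes = list(range(100))
relays = pick_relays(nodes, 0, 99, rng=random.Random(0))

def cost(a, b):
    return 100.0 if (a, b) == (0, 99) else 10.0

total, via = best_path(0, 99, relays, cost)
```

For N = 100 the sample contains 5 relays, so the source evaluates 5 alternate paths instead of probing all 98 possible one-hop detours.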

    Towards Hybrid-Optimization Video Coding

    Video coding is essentially a mathematical optimization problem of rate and distortion. To solve this complex optimization problem, two popular video coding frameworks have been developed: block-based hybrid video coding and end-to-end learned video coding. If we rethink video coding from the perspective of optimization, we find that the two existing frameworks represent two directions of optimization solutions. Block-based hybrid coding represents the discrete optimization solution because those irrelevant coding modes are discrete in mathematics. It searches for the best one among multiple starting points (i.e. modes). However, the search is not efficient enough. On the other hand, end-to-end learned coding represents the continuous optimization solution because gradient descent is based on a continuous function. It optimizes a group of model parameters efficiently by a numerical algorithm. However, limited to only one starting point, it easily falls into a local optimum. To better solve the optimization problem, we propose to regard video coding as a hybrid of discrete and continuous optimization problems, and to use both search and a numerical algorithm to solve it. Our idea is to provide multiple discrete starting points in the global space and efficiently find the local optimum around each point by a numerical algorithm. Finally, we search for the global optimum among those local optima. Guided by the hybrid optimization idea, we design a hybrid optimization video coding framework, which is built entirely on continuous deep networks and also contains some discrete modes. We conduct a comprehensive set of experiments. Compared to the continuous optimization framework, our method outperforms pure learned video coding methods. Meanwhile, compared to the discrete optimization framework, our method achieves performance comparable to the HEVC reference software HM16.10 in PSNR.
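The hybrid idea in this abstract (several discrete starting points, a numerical refinement around each, then a search over the refined results) can be shown on a toy problem. The sketch below uses a hypothetical one-dimensional objective with two basins and plain gradient descent; it illustrates the optimization pattern only and has nothing to do with the paper's network architecture.

```python
def gradient_descent(grad, x0, step=0.01, iters=2000):
    """Continuous refinement: follow the negative gradient from x0."""
    x = x0
    for _ in range(iters):
        x -= step * grad(x)
    return x

def hybrid_minimize(f, grad, starts):
    """Refine every discrete starting point locally, then pick the best."""
    refined = [gradient_descent(grad, x0) for x0 in starts]
    return min(refined, key=f)

# Toy non-convex objective: a shallow basin near x = 1.13 and a deeper
# one near x = -1.30.
def f(x):
    return x ** 4 - 3 * x ** 2 + x

def grad(x):
    return 4 * x ** 3 - 6 * x + 1

single = gradient_descent(grad, 2.0)                # one start: gets stuck
hybrid = hybrid_minimize(f, grad, [-2.0, 0.0, 2.0])  # several starts
```

Starting only from `2.0`, gradient descent settles in the shallower basin, while the multi-start version also explores the deeper basin and returns the lower of the two local minima.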

    Fast watermarking of MPEG-1/2 streams using compressed-domain perceptual embedding and a generalized correlator detector

    A novel technique is proposed for watermarking MPEG-1 and MPEG-2 compressed video streams. The proposed scheme is applied directly in the domain of MPEG-1 system streams and MPEG-2 program streams (multiplexed streams). Perceptual models are used during the embedding process to avoid degrading the video quality. The watermark is detected without the use of the original video sequence. A modified correlation-based detector is introduced that applies nonlinear preprocessing before correlation. Experimental evaluation demonstrates that the proposed scheme is able to withstand several common attacks. The resulting watermarking system is very fast and therefore suitable for copyright protection of compressed video.
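Blind correlation detection with a nonlinear preprocessing step can be sketched in a few lines. The example below is a generic illustration, not the paper's detector: it assumes an additive watermark on synthetic Gaussian samples standing in for transform coefficients, uses clipping as the nonlinearity, and picks the embedding strength, clipping limit, and threshold by hand.

```python
import random

def clip(value, limit):
    """Hypothetical nonlinearity: limit the influence of large samples."""
    return max(-limit, min(limit, value))

def detect(samples, watermark, limit, threshold):
    """Clip each sample, correlate with the watermark, compare to a threshold.

    The original (unmarked) samples are not needed, so detection is blind.
    """
    score = sum(clip(s, limit) * w for s, w in zip(samples, watermark))
    score /= len(watermark)
    return score > threshold, score

rng = random.Random(1)
n = 10000
watermark = [rng.choice((-1.0, 1.0)) for _ in range(n)]   # pseudo-random +/-1
host = [rng.gauss(0.0, 10.0) for _ in range(n)]            # stand-in host data
marked = [h + 2.0 * w for h, w in zip(host, watermark)]    # additive embedding

found_in_marked, _ = detect(marked, watermark, limit=15.0, threshold=0.8)
found_in_host, _ = detect(host, watermark, limit=15.0, threshold=0.8)
```

The correlation score concentrates near zero for unmarked data and well above the threshold for marked data, so the two cases separate cleanly at this sample size.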