9,286 research outputs found

    Perceptually-Driven Video Coding with the Daala Video Codec

    Full text link
    The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media.Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 201

    Error concealment for slice group based multiple description video coding

    Get PDF

    Combining open- and closed-loop architectures for H.264/AVC-TO-SVC transcoding

    Get PDF
    Scalable video coding (SVC) allows encoded bitstreams to be adapted. However, most bitstreams do not incorporate this scalability so bitstreams have to be adapted multiple times to accommodate for varying network conditions or end-user devices. Each adaptation incorporates an additional loss of quality due to transcoding. To overcome this issue, we propose a single transcoding step from H.264/AVC to SVC. Doing so, the resulting bitstream can be freely adapted without any additional quality reduction. Open-loop transcoding architectures can be used for H.264/AVC-to-SVC transcoding with a low complexity, although these architectures suffer from drift artifacts. Closed-loop transcoding, on the other hand, requires a higher complexity. To overcome the drawbacks of both systems, we propose combining both techniques

    Coding local and global binary visual features extracted from video sequences

    Get PDF
    Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks, while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW) model. Several applications, including for example visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget, while attaining a target level of efficiency. In this paper we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the Compress-Then-Analyze (CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: homography estimation and content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with CTA, especially in bandwidth limited scenarios.Comment: submitted to IEEE Transactions on Image Processin

    Enabling error-resilient internet broadcasting using motion compensated spatial partitioning and packet FEC for the dirac video codec

    Get PDF
    Video transmission over the wireless or wired network require protection from channel errors since compressed video bitstreams are very sensitive to transmission errors because of the use of predictive coding and variable length coding. In this paper, a simple, low complexity and patent free error-resilient coding is proposed. It is based upon the idea of using spatial partitioning on the motion compensated residual frame without employing the transform coefficient coding. The proposed scheme is intended for open source Dirac video codec in order to enable the codec to be used for Internet broadcasting. By partitioning the wavelet transform coefficients of the motion compensated residual frame into groups and independently processing each group using arithmetic coding and Forward Error Correction (FEC), robustness to transmission errors over the packet erasure wired network could be achieved. Using the Rate Compatibles Punctured Code (RCPC) and Turbo Code (TC) as the FEC, the proposed technique provides gracefully decreasing perceptual quality over packet loss rates up to 30%. The PSNR performance is much better when compared with the conventional data partitioning only methods. Simulation results show that the use of multiple partitioning of wavelet coefficient in Dirac can achieve up to 8 dB PSNR gain over its existing un-partitioned method

    Predictive Coding For Animation-Based Video Compression

    Full text link
    We address the problem of efficiently compressing video for conferencing-type applications. We build on recent approaches based on image animation, which can achieve good reconstruction quality at very low bitrate by representing face motions with a compact set of sparse keypoints. However, these methods encode video in a frame-by-frame fashion, i.e. each frame is reconstructed from a reference frame, which limits the reconstruction quality when the bandwidth is larger. Instead, we propose a predictive coding scheme which uses image animation as a predictor, and codes the residual with respect to the actual target frame. The residuals can be in turn coded in a predictive manner, thus removing efficiently temporal dependencies. Our experiments indicate a significant bitrate gain, in excess of 70% compared to the HEVC video standard and over 30% compared to VVC, on a datasetof talking-head videosComment: Accepted paper: ICIP 202

    Graph-based transform with weighted self-loops for predictive transform coding based on template matching

    Get PDF
    This paper introduces the GBT-L, a novel class of Graph-based Transform within the con- text of block-based predictive transform coding. The GBT-L is constructed using a 2D graph with unit edge weights and weighted self-loops in every vertex. The weighted self- loops are selected based on the residual values to be transformed. To avoid signalling any additional information required to compute the inverse GBT-L, we also introduce a coding framework that uses a template-based strategy to predict residual blocks in the pixel and residual domains. Evaluation results on several video frames and medical images, in terms of the percentage of preserved energy and mean square error, show that the GBT-L can outperform the DST, DCT and the Graph-based Separable Transfor

    Design of a digital compression technique for shuttle television

    Get PDF
    The determination of the performance and hardware complexity of data compression algorithms applicable to color television signals, were studied to assess the feasibility of digital compression techniques for shuttle communications applications. For return link communications, it is shown that a nonadaptive two dimensional DPCM technique compresses the bandwidth of field-sequential color TV to about 13 MBPS and requires less than 60 watts of secondary power. For forward link communications, a facsimile coding technique is recommended which provides high resolution slow scan television on a 144 KBPS channel. The onboard decoder requires about 19 watts of secondary power
    corecore