81 research outputs found

    Dyadic spatial resolution reduction transcoding for H.264/AVC

    In this paper, we examine spatial resolution downscaling transcoding for H.264/AVC video coding. A number of advanced coding tools limit the applicability of techniques that were developed for previous video coding standards. We present a spatial resolution reduction transcoding architecture for H.264/AVC, which extends open-loop transcoding with a low-complexity compensation technique in the reduced-resolution domain. The proposed architecture addresses these H.264/AVC-specific problems and avoids visual artifacts in the transcoded sequence, while keeping complexity significantly lower than that of more traditional cascaded decoder-encoder architectures. The refinement step of the proposed architecture can be used to further improve rate-distortion performance, at the cost of additional complexity. In this way, a dynamic-complexity transcoder becomes possible. We present a thorough investigation of the problems related to motion and residual data mapping, leading to a transcoding solution that produces fully compliant reduced-size H.264/AVC bitstreams.
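The abstract does not say how motion data are mapped to the reduced resolution. As a rough illustration only, the sketch below uses a mapping common in the wider downscaling literature: average the motion vectors of the four source macroblocks that collapse into one output macroblock, then halve the result for the 2:1 (dyadic) size reduction. The function names, the quarter-pel integer units, and the rounding rule are all assumptions, not the paper's method.

```python
def round_div(a, b):
    """Integer division of a by b (b > 0), rounding half away from zero."""
    q = (abs(a) + b // 2) // b
    return q if a >= 0 else -q


def map_dyadic_mv(source_mvs):
    """Map source motion vectors to one reduced-resolution vector.

    source_mvs: list of (dx, dy) integer pairs in quarter-pel units, one per
    source macroblock covered by the output macroblock (hypothetical layout).
    Averaging over n vectors and halving for the 2:1 downscale is a single
    division by 2 * n.
    """
    n = len(source_mvs)
    sum_x = sum(dx for dx, _ in source_mvs)
    sum_y = sum(dy for _, dy in source_mvs)
    return round_div(sum_x, 2 * n), round_div(sum_y, 2 * n)
```

A real transcoder would also have to handle differing partition sizes, reference indices, and intra-coded neighbours, which this sketch ignores.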

    An Efficient Motion Estimation Method for H.264-Based Video Transcoding with Arbitrary Spatial Resolution Conversion

    As wireless and wired network connectivity rapidly expands and the number of network users steadily increases, it has become increasingly important to support universal access to multimedia content over the whole network. A big challenge, however, is the great diversity of network devices, from full-screen computers to small smartphones. This has led to research on transcoding, which involves efficiently reformatting compressed data from its original high resolution to a desired spatial resolution supported by the displaying device. In particular, there is great momentum in the multimedia industry for H.264-based transcoding, as H.264 has been widely employed as a mandatory player feature in applications ranging from television broadcast to video for mobile devices. While H.264 contains many new features for effective video coding with excellent rate-distortion (RD) performance, a major issue for transcoding H.264 compressed video from one spatial resolution to another is computational complexity, specifically that of motion-compensated prediction (MCP). MCP is the main contributor to the excellent RD performance of H.264 video compression, yet it is very time-consuming. In general, a brute-force search is used to find the best motion vectors for MCP. In the transcoding scenario, however, a natural way to improve MCP efficiency in the re-encoding procedure is to reuse the motion vectors in the original compressed stream. Intuitively, motion in the high-resolution scene is highly related to that in the down-scaled scene. In this thesis, we study homogeneous video transcoding from H.264 to H.264. Specifically, for video transcoding with arbitrary spatial resolution conversion, we propose a motion vector estimation algorithm based on a multiple linear regression model, which systematically utilizes the motion information in the original scenes. 
We also propose a practical solution for efficiently determining a reference frame, to take advantage of the multiple-reference-frame feature of H.264. The performance of the algorithm was assessed in an H.264 transcoder. Experimental results show that, compared with a benchmark solution, the proposed method significantly reduces transcoding complexity without substantially degrading video quality.
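The abstract names a multiple linear regression model but does not give its regressors or how it is trained. The following is a minimal sketch of the general idea, assuming that one component of the down-scaled motion vector is modeled as an intercept plus a weighted sum of motion-vector components taken from the original stream, with weights fitted by ordinary least squares. The function names and the choice of regressors are hypothetical.

```python
import numpy as np


def fit_mv_regression(source_mvs, target_mv):
    """Least-squares fit of target_mv ~ b0 + sum_i b_i * source_mvs[:, i].

    source_mvs: (n_samples, n_regressors) motion-vector components from the
                original stream (hypothetical choice of regressors).
    target_mv:  (n_samples,) the matching component obtained by a full
                search at the reduced resolution (training data).
    """
    design = np.column_stack([np.ones(len(source_mvs)), source_mvs])
    coef, *_ = np.linalg.lstsq(design, target_mv, rcond=None)
    return coef


def predict_mv(coef, source_mvs):
    """Predict the down-scaled component from source regressors."""
    design = np.column_stack([np.ones(len(source_mvs)), source_mvs])
    return design @ coef
```

In a transcoder, a prediction of this kind would typically seed a small refinement search rather than replace motion estimation outright; whether the thesis does so is not stated in the abstract.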

    Compressed-domain transcoding of H.264/AVC and SVC video streams


    DCT-based video downscaling transcoder using split and merge technique

    2005-2006 > Academic research: refereed > Publication in refereed journal; Version of Record; Publishe

    Algorithms & implementation of advanced video coding standards

    Advanced video coding standards have become widely deployed coding techniques used in numerous products and services, such as broadcasting, video conferencing, mobile television, and Blu-ray Disc. New compression techniques are gradually included in video coding standards so that a 50% compression rate reduction is achievable every five years. However, this trend has also brought problems, such as dramatically increased computational complexity, multiple co-existing standards, and gradually increasing development time. To address these problems, this thesis investigates efficient algorithms for the latest video coding standard, H.264/AVC. Two aspects of the H.264/AVC standard are examined: (1) speeding up intra 4x4 prediction with a parallel architecture, and (2) applying an efficient rate control algorithm, based on a deviation measure, to intra frames. Another aim of this thesis is to develop low-complexity algorithms for an MPEG-2 to H.264/AVC transcoder. The thesis focuses on three mapping algorithms and one complexity-reduction algorithm: motion vector mapping, block mapping, field-frame mapping, and efficient mode ranking. Finally, a new video coding framework methodology to reduce development time is examined. This thesis explores the implementation of the MPEG-4 Simple Profile with the RVC framework. A key problem, automatically generating the variable-length decoder table, is solved in this thesis. Moreover, another important video coding standard, DV/DVCPRO, is also modeled with the RVC framework. Consequently, besides the existing MPEG-4 Simple Profile and the Chinese audio/video standard, a new member is added to the RVC framework family. The research presented in this thesis targets algorithms and implementations of video coding standards; within this broad topic, three main problems are investigated. The results show that the methodologies presented in this thesis are efficient and encourage

    On transcoding a B-frame to a P-frame in the compressed domain

    2007-2008 > Academic research: refereed > Publication in refereed journal; Version of Record; Publishe

    Video transcoding: an overview of various techniques and research issues


    Image and Video Coding/Transcoding: A Rate Distortion Approach

    Due to the lossy nature of image/video compression and the expensive bandwidth and computation resources in a multimedia system, one of the key design issues for image and video coding/transcoding is to optimize the trade-off among distortion, rate, and/or complexity. This thesis studies the application of rate-distortion (RD) optimization approaches to image and video coding/transcoding, both to explore the best RD performance of a video codec compatible with the newest video coding standard, H.264, and to design computationally efficient down-sampling algorithms with high visual fidelity in the discrete cosine transform (DCT) domain. RD optimization for video coding in this thesis has two objectives: to achieve the best encoding efficiency by minimizing the actual RD cost, and to maintain decoding compatibility with H.264. By the actual RD cost, we mean a cost based on the final reconstruction error and the entire coding rate. Specifically, an operational RD method is proposed based on a soft decision quantization (SDQ) mechanism, which has its roots in a fundamental RD theoretic study on fixed-slope lossy data compression. Using SDQ instead of hard decision quantization, we establish a general framework in which motion prediction, quantization, and entropy coding in a hybrid video coding scheme such as H.264 are jointly designed to minimize the actual RD cost on a frame basis. The proposed framework can be applied to optimize any hybrid video coding scheme, provided that specific algorithms are designed for the coding syntax of the given standard codec, so as to maintain compatibility with the standard. 
Corresponding to the baseline profile syntax and the main profile syntax of H.264, respectively, we have proposed three RD algorithms: a graph-based algorithm for SDQ given motion prediction and quantization step sizes, an algorithm for residual coding optimization given motion prediction, and an iterative overall algorithm for jointly optimizing motion prediction, quantization, and entropy coding, nested in the order listed. Among the three algorithms, the SDQ design is the core, and it is developed for a given entropy coding method. Specifically, two SDQ algorithms have been developed, based on context adaptive variable length coding (CAVLC) in the H.264 baseline profile and context adaptive binary arithmetic coding (CABAC) in the H.264 main profile, respectively. Experimental results for the H.264 baseline codec optimization show that, for a set of typical test sequences, the proposed RD method for H.264 baseline coding achieves a better trade-off between rate and distortion, i.e., a 12% rate reduction on average at the same distortion (ranging from 30 dB to 38 dB in PSNR) when compared with the RD optimization method implemented in the H.264 baseline reference codec. Experimental results for optimizing H.264 main profile coding with CABAC show a 10% rate reduction over a main profile reference codec using CABAC, which also suggests a 20% rate reduction over the RD optimization method implemented in the H.264 baseline reference codec, leading to our claim of having developed the best codec in terms of RD performance while maintaining compatibility with H.264. By investigating the trade-off between distortion and complexity, we have also proposed a design framework for image/video transcoding with spatial resolution reduction, i.e., for down-sampling compressed images/video with an arbitrary ratio in the DCT domain. 
First, we derive a set of DCT-domain down-sampling methods, which can be represented by a linear transform with double-sided matrix multiplication (LTDS) in the DCT domain. Then, for a pre-selected pixel-domain down-sampling method, we formulate an optimization problem for finding an LTDS that approximates the given pixel-domain method with the best trade-off between visual quality and computational complexity. The problem is then solved by modeling an LTDS with a multi-layer perceptron network and training the network with a structural learning with forgetting algorithm. Finally, by selecting a pixel-domain reference method that uses the popular Butterworth low-pass filtering and cubic B-spline interpolation, the proposed framework discovers an LTDS with better visual quality and lower computational complexity than state-of-the-art methods in the literature.
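To make the LTDS idea concrete, the sketch below shows how a separable pixel-domain down-sampler Y = A X A^T becomes a double-sided matrix multiplication on DCT coefficients: if C = T_N X T_N^T, then the DCT of Y equals L C R with L = T_M A T_N^T and R = L^T. The 2:1 pair-averaging filter A is an assumption chosen for brevity; the thesis uses Butterworth filtering with cubic B-spline interpolation as its reference and learns the LTDS with a network, neither of which is reproduced here. All function names are illustrative.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix T, so that the 2-D DCT of X is T @ X @ T.T."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    t = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    t[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale factor
    return t


def pair_average_matrix(n):
    """(n/2 x n) matrix that averages adjacent sample pairs (assumed filter)."""
    a = np.zeros((n // 2, n))
    rows = np.arange(n // 2)
    a[rows, 2 * rows] = 0.5
    a[rows, 2 * rows + 1] = 0.5
    return a


def ltds_operators(n):
    """Left and right matrices of the LTDS equivalent to pair averaging."""
    left = dct_matrix(n // 2) @ pair_average_matrix(n) @ dct_matrix(n).T
    return left, left.T


def downsample_dct(coeffs, left, right):
    """Apply the LTDS directly to an n x n block of DCT coefficients."""
    return left @ coeffs @ right
```

For a fixed block size the two matrices can be precomputed once, so down-sampling never leaves the DCT domain; the optimization described in the abstract can then be read as a search for matrices of this form that are cheaper to apply while staying close to the chosen pixel-domain reference.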