22 research outputs found

    Efficient HEVC-based video adaptation using transcoding

    In a video transmission system, it is important to take into account the great diversity of network and end-user constraints. On the one hand, video content is typically streamed over networks with different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played on multiple devices such as PCs, laptops, and cell phones. A single version of the video cannot satisfy their different constraints. This diversity in network and device capacity creates the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without changing the coding format, is well known as an efficient adaptation solution. However, this approach comes with high computational complexity, resulting in high energy consumption in the network and possibly added network latency. This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. Several techniques are proposed to speed up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy. Moreover, the motion estimation process of the encoder has been optimized using decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been solved by using machine-learning algorithms. Thanks to their strong performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications

    Compressed-domain transcoding of H.264/AVC and SVC video streams


    H.264/AVC to HEVC Video Transcoder Based on Dynamic Thresholding and Content Modeling


    Towards one video encoder per individual : guided High Efficiency Video Coding


    Algorithms & implementation of advanced video coding standards

    Advanced video coding standards have become widely deployed coding techniques used in numerous products, such as broadcast, video conferencing, mobile television, and Blu-ray Disc. New compression techniques are gradually included in video coding standards so that a 50% compression rate reduction is achievable every five years. However, this trend has also brought many problems, such as dramatically increased computational complexity, multiple co-existing standards, and gradually increasing development time. To address these problems, this thesis investigates efficient algorithms for the latest video coding standard, H.264/AVC. Two aspects of the H.264/AVC standard are examined: (1) speeding up intra4x4 prediction with a parallel architecture; (2) applying an efficient rate control algorithm based on a deviation measure to intra frames. Another aim of this thesis is to develop low-complexity algorithms for an MPEG-2 to H.264/AVC transcoder. Three main mapping algorithms and a computational complexity reduction algorithm are the focus: motion vector mapping, block mapping, field-frame mapping, and efficient mode ranking. Finally, a new video coding framework methodology to reduce development time is examined. This thesis explores the implementation of the MPEG-4 simple profile with the RVC framework. A key problem, automatically generating the variable-length decoder table, is solved in this thesis. Moreover, another important video coding standard, DV/DVCPRO, is modeled with the RVC framework. Consequently, besides the available MPEG-4 simple profile and the China audio/video standard, a new member is added to the RVC framework family. The research presented in this thesis targets algorithms and implementations of video coding standards. Within this broad topic, three main problems are investigated. The results show that the methodologies presented in this thesis are efficient and encourage

    Algorithms and methods for video transcoding.

    Video transcoding is the process of dynamic video adaptation. Dynamic video adaptation can be defined as the process of converting video from one format to another, changing the bit rate, frame rate or resolution of the encoded video, which is mainly necessitated by the end user requirements. H.264 has been the predominantly used video compression standard for the last 15 years. HEVC (High Efficiency Video Coding) is the latest video compression standard finalised in 2013, which is an improvement over H.264 video compression standard. HEVC performs significantly better than H.264 in terms of the Rate-Distortion performance. As H.264 has been widely used in the last decade, a large amount of video content exists in H.264 format. There is a need to convert H.264 video content to HEVC format to achieve better Rate-Distortion performance and to support legacy video formats on newer devices. However, the computational complexity of HEVC encoder is 2-10 times higher than that of H.264 encoder. This makes it necessary to develop low complexity video transcoding algorithms to transcode from H.264 to HEVC format. This research work proposes low complexity algorithms for H.264 to HEVC video transcoding. The proposed algorithms reduce the computational complexity of H.264 to HEVC video transcoding significantly, with negligible loss in Rate-Distortion performance. This work proposes three different video transcoding algorithms. The MV-based mode merge algorithm uses the block mode and MV variances to estimate the split/non-split decision as part of the HEVC block prediction process. The conditional probability-based mode mapping algorithm models HEVC blocks of sizes 16×16 and lower as a function of H.264 block modes, H.264 and HEVC Quantisation Parameters (QP). The motion-compensated MB residual-based mode mapping algorithm makes the split/non-split decision based on content-adaptive classification models. 
With a combination of the proposed algorithms, the computational complexity of the HEVC encoder is reduced by around 60%, with negligible loss in Rate-Distortion performance, outperforming existing state-of-the-art algorithms by 20-25% in terms of computational complexity. The proposed algorithms can be used in computation-constrained video transcoding applications, to support video format conversion in smart devices, migration of large-scale H.264 video content from host servers to HEVC, and cloud computing-based transcoding applications, and also to support high quality videos over bandwidth-constrained networks
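As a rough illustration of the idea behind an MV-based split/non-split decision, the sketch below flags a block for splitting when the motion vectors of its H.264 sub-blocks disagree strongly. The function names and the threshold value are hypothetical and chosen only for illustration; the abstract does not give the actual decision rule, which also uses block modes.

```python
def mv_variance(mvs):
    """Spread of a list of (x, y) motion vectors around their mean."""
    n = len(mvs)
    mean_x = sum(x for x, _ in mvs) / n
    mean_y = sum(y for _, y in mvs) / n
    return sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in mvs) / n


def should_split(mvs, threshold=1.0):
    """Hypothetical rule: split the HEVC block when the MV variance is high.

    The threshold is an assumed tuning parameter, not a value from the source.
    """
    return mv_variance(mvs) > threshold


# Uniform motion suggests one large block; divergent motion suggests a split.
uniform = [(2, 1), (2, 1), (2, 1), (2, 1)]
divergent = [(8, 0), (-8, 0), (0, 8), (0, -8)]
print(should_split(uniform), should_split(divergent))
```

Skipping the full rate-distortion search for blocks classified this way is where a transcoder of this kind would save computation, at the cost of occasional wrong decisions.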

    Advanced heterogeneous video transcoding

    Video transcoding is an essential tool to promote inter-operability between different video communication systems. This thesis presents two novel video transcoders, both operating on bitstreams of the current H.264/AVC standard. The first transcoder converts H.264/AVC bitstreams to a Wavelet Scalable Video Codec (W-SVC), while the second targets the emerging High Efficiency Video Coding (HEVC). Scalable Video Coding (SVC) enables low complexity adaptation of compressed video, providing an efficient solution for content delivery through heterogeneous networks. The transcoder proposed here aims at exploiting the advantages offered by SVC technology when dealing with conventional coders and legacy video, efficiently reusing information found in the H.264/AVC bitstream to achieve a high rate-distortion performance at a low complexity cost. Its main features include new mode mapping algorithms that exploit the W-SVC larger macroblock sizes, and a new state-of-the-art motion vector composition algorithm that is able to tackle different coding configurations in the H.264/AVC bitstream, including IPP or IBBP with multiple reference frames. The emerging video coding standard, HEVC, is currently approaching the final stage of development prior to standardization. This thesis proposes and evaluates several transcoding algorithms for the HEVC codec. In particular, a transcoder based on a new method that is capable of complexity scalability, trading off rate-distortion performance for complexity reduction, is proposed. Furthermore, other transcoding solutions are explored, based on a novel content-based modeling approach, in which the transcoder adapts its parameters based on the contents of the sequence being encoded. 
Finally, the application of this research is not constrained to these transcoders, as many of the techniques developed aim to advance research in this field and have the potential to be incorporated in different video transcoding architectures

    Image and Video Coding/Transcoding: A Rate Distortion Approach

    Due to the lossy nature of image/video compression and the expensive bandwidth and computation resources in a multimedia system, one of the key design issues for image and video coding/transcoding is to optimize the trade-off among distortion, rate, and/or complexity. This thesis studies the application of rate distortion (RD) optimization approaches to image and video coding/transcoding for exploring the best RD performance of a video codec compatible with the newest video coding standard H.264 and for designing computationally efficient down-sampling algorithms with high visual fidelity in the discrete cosine transform (DCT) domain. RD optimization for video coding in this thesis considers two objectives, i.e., to achieve the best encoding efficiency in terms of minimizing the actual RD cost and to maintain decoding compatibility with the newest video coding standard H.264. By the actual RD cost, we mean a cost based on the final reconstruction error and the entire coding rate. Specifically, an operational RD method is proposed based on a soft decision quantization (SDQ) mechanism, which has its root in a fundamental RD theoretic study on fixed-slope lossy data compression. Using SDQ instead of hard decision quantization, we establish a general framework in which motion prediction, quantization, and entropy coding in a hybrid video coding scheme such as H.264 are jointly designed to minimize the actual RD cost on a frame basis. The proposed framework is applicable to optimize any hybrid video coding scheme, provided that specific algorithms are designed corresponding to coding syntaxes of a given standard codec, so as to maintain compatibility with the standard. 
Corresponding to the baseline profile syntaxes and the main profile syntaxes of H.264, respectively, we have proposed three RD algorithms, each embedded in the next in the indicated order: a graph-based algorithm for SDQ given motion prediction and quantization step sizes, an algorithm for residual coding optimization given motion prediction, and an iterative overall algorithm for jointly optimizing motion prediction, quantization, and entropy coding. Among the three algorithms, the SDQ design is the core, which is developed based on a given entropy coding method. Specifically, two SDQ algorithms have been developed based on the context adaptive variable length coding (CAVLC) in H.264 baseline profile and the context adaptive binary arithmetic coding (CABAC) in H.264 main profile, respectively. Experimental results for the H.264 baseline codec optimization show that for a set of typical testing sequences, the proposed RD method for H.264 baseline coding achieves a better trade-off between rate and distortion, i.e., 12% rate reduction on average at the same distortion (ranging from 30 dB to 38 dB by PSNR) when compared with the RD optimization method implemented in the H.264 baseline reference codec. Experimental results for optimizing H.264 main profile coding with CABAC show 10% rate reduction over a main profile reference codec using CABAC, which also suggests 20% rate reduction over the RD optimization method implemented in the H.264 baseline reference codec, leading to our claim of having developed the best codec in terms of RD performance, while maintaining the compatibility with H.264. By investigating the trade-off between distortion and complexity, we have also proposed a design framework for image/video transcoding with spatial resolution reduction, i.e., to down-sample compressed images/video with an arbitrary ratio in the DCT domain. 
First, we derive a set of DCT-domain down-sampling methods, which can be represented by a linear transform with double-sided matrix multiplication (LTDS) in the DCT domain. Then, for a pre-selected pixel-domain down-sampling method, we formulate an optimization problem for finding an LTDS to approximate the given pixel-domain method to achieve the best trade-off between visual quality and computational complexity. The problem is then solved by modeling an LTDS with a multi-layer perceptron network and using a structural learning with forgetting algorithm for training the network. Finally, by selecting a pixel-domain reference method with the popular Butterworth lowpass filtering and cubic B-spline interpolation, the proposed framework discovers an LTDS with better visual quality and lower computational complexity when compared with state-of-the-art methods in the literature
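To make the "linear transform with double-sided matrix multiplication" concrete, the sketch below applies Y = L X R to an 8x8 block of DCT coefficients to obtain a 4x4 block. It does not reproduce the learned LTDS described in the abstract: as a stand-in, it uses plain low-frequency truncation with a scaling factor, which is one simple fixed choice of L and R. All function names are illustrative assumptions.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def truncation_ltds(n_in, n_out):
    """Assumed L and R: keep the n_out lowest frequencies, rescaled.

    The sqrt(n_out / n_in) factor keeps the mean pixel value unchanged.
    """
    s = math.sqrt(n_out / n_in)
    left = [[s if i == j else 0.0 for j in range(n_in)] for i in range(n_out)]
    return left, transpose(left)


def downsample_dct(coeffs, n_out):
    """Compute Y = L X R entirely in the DCT domain."""
    left, right = truncation_ltds(len(coeffs), n_out)
    return matmul(matmul(left, coeffs), right)


flat = [[10.0] * 8 for _ in range(8)]
small = idct2(downsample_dct(dct2(flat), 4))
print(len(small), round(small[0][0], 6))
```

Because the whole operation is two matrix products on the coefficients, no inverse transform of the full-resolution block is needed; the framework in the abstract searches for L and R that better approximate a chosen pixel-domain filter than this truncation does.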

    Development of Some Spatial-domain Preprocessing and Post-processing Algorithms for Better 2-D Up-scaling

    Image super-resolution has been an area of great interest in recent years and is extensively used in applications such as video streaming, multimedia, internet technologies, consumer electronics, and the display and printing industries. Image super-resolution is the process of increasing the resolution of a given image without losing its integrity. Its most common application is to provide a better visual effect after resizing a digital image for display or printing. One method of improving image resolution is 2-D interpolation. An up-scaled image should retain all the image details with very little blurring for better visual quality. The literature contains many efficient 2-D interpolation schemes that preserve image details well in the up-scaled images, particularly in regions with edges and fine details. Nevertheless, these existing interpolation schemes still blur the up-scaled images because of high frequency (HF) degradation during the up-sampling process. Hence, there is scope to develop efficient but simple spatial-domain pre-processing, post-processing and composite schemes that effectively restore the HF content in the up-scaled images for various online and off-line applications. The efficient and widely used Lanczos-3 interpolation is taken as the basis for further performance improvement through the incorporation of the proposed algorithms. The various pre-processing algorithms developed in this thesis are summarized here. The term pre-processing refers to processing the low-resolution input image prior to image up-scaling. 
The various pre-processing algorithms proposed in this thesis are: Laplacian of Laplacian based global pre-processing (LLGP) scheme; Hybrid global pre-processing (HGP); Iterative Laplacian of Laplacian based global pre-processing (ILLGP); Unsharp masking based pre-processing (UMP); Iterative unsharp masking (IUM); Error based up-sampling (EU) scheme. LLGP, HGP and ILLGP are three spatial domain pre-processing algorithms based on 4th, 6th and 8th order derivatives to alleviate non-uniform blurring in up-scaled images. These algorithms obtain the high frequency (HF) extracts from an image by employing higher order derivatives and perform precise sharpening on a low resolution image to alleviate the blurring in its 2-D up-sampled counterpart. In the unsharp masking based pre-processing (UMP) scheme, a blurred version of the low resolution image is subtracted from the original to extract the HF content. A weighted version of the HF extract is then superimposed on the original image to produce a sharpened image prior to image up-scaling, which counters blurring effectively. IUM uses many iterations to generate an unsharp mask that contains very high frequency (VHF) components. The VHF extract is the result of signal decomposition into sub-bands using the concept of an analysis filter bank. Since the degradation of VHF components is greatest, restoring these components yields much better restoration performance. EU is another pre-processing scheme in which the HF degradation due to image up-scaling is extracted and is called the prediction error. The prediction error contains the lost high frequency components. When this error is superimposed on the low resolution image prior to image up-sampling, blurring is considerably reduced in the up-scaled images. The post-processing algorithms developed in this thesis are summarized below. 
The term post-processing refers to processing the high resolution up-scaled image. The various post-processing algorithms proposed in this thesis are: Local adaptive Laplacian (LAL); Fuzzy weighted Laplacian (FWL); Legendre functional link artificial neural network (LFLANN). LAL is a non-fuzzy, local scheme. The local regions of an up-scaled image with high variance are sharpened more than the regions with moderate or low variance by employing a local adaptive Laplacian kernel. The weights of the LAL kernel are varied according to the normalized local variance so as to provide a greater degree of HF enhancement to high variance regions than to their low variance counterparts, effectively countering the non-uniform blurring. Furthermore, the FWL post-processing scheme, with a higher degree of non-linearity, is proposed to further improve on the performance of LAL. FWL, being a fuzzy mapping scheme, is highly nonlinear and resolves the blurring problem more effectively than LAL, which employs a linear mapping. An LFLANN based post-processing scheme is also proposed to minimize the cost function so as to reduce the blurring in a 2-D up-scaled image. Legendre polynomials are used for functional expansion of the input pattern-vector and provide a high degree of nonlinearity. Therefore, multiple layers can be replaced by a single layer LFLANN architecture that reduces the cost function effectively for better restoration performance. The single layer architecture reduces the computational complexity and is hence suitable for various real-time applications. There is scope for further improvement of the stand-alone pre-processing and post-processing schemes by combining them into composite schemes. Here, two spatial domain composite schemes, CS-I and CS-II, are proposed to tackle non-uniform blurring in an up-scaled image. CS-I is developed by combining the global iterative Laplacian (GIL) pre-processing scheme with the LAL post-processing scheme. 
Another highly nonlinear composite scheme, CS-II, is proposed, which combines the ILLGP scheme with a fuzzy weighted Laplacian post-processing scheme for better performance than the stand-alone schemes. Finally, it is observed that the proposed algorithms ILLGP, IUM, FWL, LFLANN and CS-II are the better algorithms in their respective categories for effectively reducing blurring in the up-scaled images
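The unsharp masking pre-processing step described in this abstract (blur, subtract to get the HF extract, add a weighted copy back) can be sketched in a few lines. This is a minimal illustration only: the 3x3 box blur, the edge handling, and the `weight` parameter are assumptions, since the abstract does not specify the blur kernel or the weighting used in the thesis.

```python
def box_blur(img):
    """3x3 mean filter with edge replication (assumed blur kernel)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def unsharp_mask(img, weight=1.0):
    """Sharpen by adding weight * (original - blurred) to the original."""
    blurred = box_blur(img)
    return [[img[y][x] + weight * (img[y][x] - blurred[y][x])
             for x in range(len(img[0]))] for y in range(len(img))]


# A vertical step edge: sharpening exaggerates the contrast on both sides.
step = [[0.0, 0.0, 10.0, 10.0] for _ in range(4)]
sharp = unsharp_mask(step)
print([round(v, 2) for v in sharp[1]])
```

The output is not clipped to a valid pixel range, so a real pipeline would clamp values before passing the sharpened image to the interpolation stage.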