
    Steered mixture-of-experts for light field images and video : representation and coding

    Research in light field (LF) processing has increased heavily over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays arriving at a certain region from any angle. The global model thus consists of a set of kernels that define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application to 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art at low-to-mid range bitrates with respect to the subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently offers desired functionality for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
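The idea of a set of kernels defining a continuous approximation can be illustrated with a minimal sketch. The abstract does not give the model equations, so everything here is an assumption made for illustration: Gaussian kernels acting as soft gates, one linear ("steered") expert per kernel, and the hypothetical names `smoe_reconstruct`, `centers`, `covs`, `slopes`, `offsets`, and `priors`.

```python
import numpy as np

def smoe_reconstruct(coords, centers, covs, slopes, offsets, priors):
    """Evaluate a kernel mixture at arbitrary continuous coordinates.

    Assumed shapes: coords (N, D), centers (K, D), covs (K, D, D),
    slopes (K, D), offsets (K,), priors (K,).
    """
    diff = coords[:, None, :] - centers[None, :, :]            # (N, K, D)
    inv = np.linalg.inv(covs)                                  # (K, D, D)
    # Squared Mahalanobis distance of every coordinate to every kernel.
    maha = np.einsum("nkd,kde,nke->nk", diff, inv, diff)       # (N, K)
    # Shift by the per-row minimum for numerical stability; the shift
    # cancels when the gates are normalized below.
    maha = maha - maha.min(axis=1, keepdims=True)
    gate = priors * np.exp(-0.5 * maha) / np.sqrt(np.linalg.det(covs))
    gate = gate / gate.sum(axis=1, keepdims=True)              # soft assignment
    # Each expert is a linear function anchored at its kernel center.
    experts = offsets + np.einsum("nkd,kd->nk", diff, slopes)  # (N, K)
    return (gate * experts).sum(axis=1)                        # (N,)
```

Because the model is evaluated per coordinate, any sample can be reconstructed independently of the others, which is one way to read the random-access and pixel-parallel properties listed in the abstract. With `D = 2` the coordinates are pixel positions; higher `D` would add angular and temporal axes.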

    Improvement of Decision on Coding Unit Split Mode and Intra-Picture Prediction by Machine Learning

    High Efficiency Video Coding (HEVC) is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The reference software (HM) implements the guidelines in compliance with the new standard and includes both encoder and decoder functionality. Machine learning (ML) processes data to discover patterns that can later be used to analyze new trends. ML can play a key role in a wide range of critical applications, such as data mining, natural language processing, image recognition, and expert systems. In this research project, in compliance with the H.265 standard, we focus on improving encoding/decoding performance by optimizing the partitioning of prediction blocks within a coding unit with the help of supervised machine learning. We used the Keras library as the main tool to implement the experiments. Key parameters were tuned for the model in our convolutional neural network. The coding tree unit mode decision time produced by the model was compared with that produced by the HM software and was shown to improve significantly. The intra-picture prediction mode decision was also investigated with a modified model and yielded satisfactory results.

    Key-point Detection based Fast CU Decision for HEVC Intra Encoding

    As the most recent video coding standard, High Efficiency Video Coding (HEVC) adopts various novel techniques, including a quad-tree based coding unit (CU) structure and additional angular modes used for intra encoding. These new techniques achieve a notable improvement in coding efficiency at the cost of a significant increase in computational complexity. Thus, a fast HEVC coding algorithm is highly desirable. In this paper, we propose a fast intra CU decision algorithm for HEVC that reduces the coding complexity, based mainly on key-point detection. A CU block is considered to have multiple gradients and is split early if corner points are detected inside the block. On the other hand, a CU block without corner points is terminated early when its RD cost is also small according to statistics of the previous frames. The proposed fast algorithm achieves over 62% encoding time reduction with 3.66%, 2.82%, and 2.53% BD-Rate loss for the Y, U, and V components, respectively, on average. The experimental results show that the proposed method decides the CU size in HEVC intra coding quickly and efficiently, even though only static parameters are applied to all test sequences.
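The two decision rules in this abstract can be sketched as follows. The abstract does not name the key-point detector or give thresholds, so this sketch assumes a Harris-style corner response and illustrative values; `box_sum3`, `has_corner`, `decide_cu`, `corner_threshold`, and `rd_threshold` are hypothetical names, and the `"full_search"` fallback for the case the abstract leaves open is also an assumption.

```python
import numpy as np

def box_sum3(a):
    """Sum each 3x3 neighbourhood (edge-replicated padding)."""
    p = np.pad(a, 1, mode="edge")
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

def has_corner(block, k=0.04, corner_threshold=1e6):
    """Harris-style test (an assumed detector): True if any pixel in the
    block has a corner response above corner_threshold."""
    gy, gx = np.gradient(block.astype(np.float64))
    sxx, syy, sxy = box_sum3(gx * gx), box_sum3(gy * gy), box_sum3(gx * gy)
    response = sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2
    return bool(response.max() > corner_threshold)

def decide_cu(block, rd_cost, rd_threshold):
    """Return 'split', 'terminate', or 'full_search' for one CU.

    rd_threshold stands in for the value derived from previous-frame
    statistics; how it is derived is not specified in the abstract.
    """
    if has_corner(block):
        return "split"        # corner points inside the block: split early
    if rd_cost < rd_threshold:
        return "terminate"    # no corners and small RD cost: stop splitting
    return "full_search"      # assumed fallback: regular RD evaluation
```

A flat block or a block containing a single straight edge yields no positive corner response, so only blocks with gradients in more than one direction trigger the early split.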

    HEVC-based 3D holoscopic video coding using self-similarity compensated prediction

    Holoscopic imaging, also known as integral, light field, and plenoptic imaging, is an appealing technology for glasses-free 3D video systems, which has recently emerged as a prospective candidate for future image and video applications, such as 3D television. However, to successfully introduce 3D holoscopic video applications into the market, adequate coding tools that can efficiently handle 3D holoscopic video are necessary. In this context, this paper discusses the requirements and challenges for 3D holoscopic video coding, and presents an efficient 3D holoscopic coding scheme based on High Efficiency Video Coding (HEVC). The proposed 3D holoscopic codec makes use of the self-similarity (SS) compensated prediction concept to efficiently exploit the inherent correlation of the 3D holoscopic content in Intra- and Inter-coded frames, as well as a novel vector prediction scheme to take advantage of the peculiar characteristics of the SS prediction data. Extensive experiments were conducted and have shown that the proposed solution is able to outperform HEVC as well as other coding solutions proposed in the literature. Moreover, a consistently better performance is also observed for a set of different quality metrics proposed in the literature for 3D holoscopic content, as well as for the visual quality of views synthesized from decompressed 3D holoscopic content.
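As a rough illustration of self-similarity compensated prediction, the sketch below searches the already-coded part of the same frame for the block that best matches the current one. It is a simplified stand-in rather than the codec described above: the SAD criterion, the causal-area rule, the use of the original frame in place of reconstructed samples, and the names `ss_search`, `bs`, and `search` are all assumptions.

```python
import numpy as np

def ss_search(frame, y, x, bs=8, search=16):
    """Find the best-matching block in the causal area of the same frame.

    Returns (sad, dy, dx), where (dy, dx) is the self-similarity vector,
    or None if no candidate lies in the causal area.
    """
    h, w = frame.shape
    cur = frame[y:y + bs, x:x + bs].astype(np.int64)
    best = None
    for cy in range(max(0, y - search), y + 1):
        for cx in range(max(0, x - search), min(w - bs, x + search) + 1):
            # Assumed causal rule: the candidate lies entirely above the
            # current block, or entirely to its left.
            if not (cy + bs <= y or cx + bs <= x):
                continue
            cand = frame[cy:cy + bs, cx:cx + bs].astype(np.int64)
            sad = int(np.abs(cur - cand).sum())
            if best is None or sad < best[0]:
                best = (sad, cy - y, cx - x)
    return best
```

Holoscopic frames consist of a regular array of similar micro-images, so a good match is often found at a displacement close to the micro-image pitch, which is the correlation the SS vector is meant to capture.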