360 research outputs found

    CNN-based Prediction of Partition Path for VVC Fast Inter Partitioning Using Motion Fields

    The Versatile Video Coding (VVC) standard was recently finalized by the Joint Video Experts Team (JVET). Compared to the High Efficiency Video Coding (HEVC) standard, VVC offers about 50% compression efficiency gain, in terms of Bjontegaard Delta-Rate (BD-rate), at the cost of a 10-fold increase in encoding complexity. In this paper, we propose a method based on a Convolutional Neural Network (CNN) to speed up the inter partitioning process in VVC. First, a novel representation for the quadtree with nested multi-type tree (QTMT) partition is introduced, derived from the partition path. Second, we develop a U-Net-based CNN taking a multi-scale motion vector field as input at the Coding Tree Unit (CTU) level. The purpose of CNN inference is to predict the optimal partition path during the Rate-Distortion Optimization (RDO) process. To achieve this, we divide the CTU into a grid and predict the Quaternary Tree (QT) depth and Multi-type Tree (MT) split decisions for each cell of the grid. Third, an efficient partition pruning algorithm is introduced that uses the CNN predictions at each partitioning level to skip RDO evaluations of unnecessary partition paths. Finally, an adaptive threshold selection scheme is designed, making the trade-off between complexity and efficiency scalable. Experiments show that the proposed method achieves acceleration ranging from 16.5% to 60.2% under the RandomAccess Group of Pictures 32 (RAGOP32) configuration, with a reasonable efficiency drop ranging from 0.44% to 4.59% in terms of BD-rate, which surpasses other state-of-the-art solutions. Additionally, our method stands out as one of the lightest approaches in the field, which ensures its applicability to other encoders.
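
    As a rough illustration of threshold-based partition pruning, the sketch below keeps only the split types whose predicted score reaches a threshold, so that RDO would be run on those alone. The abstract does not say how the CNN outputs are thresholded; the function name, the split labels, the scores, and the fallback to the best-scoring split are all assumptions made for this example, not the paper's algorithm.

```python
from typing import Dict, List


def select_splits_to_evaluate(split_scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the split types that RDO should still evaluate (hypothetical helper).

    split_scores maps a split label to a predicted score in [0, 1].
    A higher threshold prunes more splits (faster, riskier);
    a lower threshold prunes fewer (slower, safer).
    """
    kept = [name for name, score in split_scores.items() if score >= threshold]
    if not kept:
        # Assumed fallback: never prune everything, keep the best-scoring split.
        kept = [max(split_scores, key=split_scores.get)]
    return kept


# Hypothetical scores for one block at one partitioning level.
scores = {"no_split": 0.05, "QT": 0.55, "BT_H": 0.25, "BT_V": 0.10, "TT_H": 0.03, "TT_V": 0.02}
print(select_splits_to_evaluate(scores, threshold=0.2))
```

    Raising or lowering the threshold is what would make the complexity/efficiency trade-off adjustable in a scheme of this kind.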

    3D-CE5.h: Merge candidate list for disparity compensated prediction

    HEVC implements a candidate vector list for merge and skip modes. The construction of this list has been extensively studied in the JCT-VC group (see, for instance, JCTVC-G039). JCTVC-I0293 showed that HEVC coding performance can be improved by adding to the merge list copies of the first candidate shifted by an arbitrary offset. This document takes the same approach and applies it to disparity compensation. An average gain of 0.3% is obtained on side views.
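
    The idea of extending a merge list with shifted copies of its first candidate can be sketched as follows. This is a minimal sketch under stated assumptions: the vector representation, the offset values, the list size limit, appending the copies at the end, and the duplicate check are all hypothetical choices for illustration, not the procedure defined in the proposal.

```python
from typing import List, Tuple

Vector = Tuple[int, int]  # (horizontal, vertical) components, hypothetical units


def extend_merge_list(candidates: List[Vector], offsets: List[Vector], max_size: int) -> List[Vector]:
    """Append shifted copies of the first candidate until the list is full."""
    extended = list(candidates[:max_size])
    if not extended:
        return extended  # nothing to copy from
    first_x, first_y = extended[0]
    for dx, dy in offsets:
        if len(extended) >= max_size:
            break
        shifted = (first_x + dx, first_y + dy)
        if shifted not in extended:  # assumed pruning of duplicate candidates
            extended.append(shifted)
    return extended


# Hypothetical disparity vectors and offsets.
print(extend_merge_list([(8, 0), (6, 1)], offsets=[(4, 0), (-4, 0)], max_size=5))
```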

    Can we prevent antimicrobial resistance by using antimicrobials better?

    Since their development over 60 years ago, antimicrobials have become an integral part of healthcare practice worldwide. Recently, this has been put in jeopardy by the emergence of widespread antimicrobial resistance, which is one of the major problems facing modern medicine. In the past, the development of new antimicrobials kept us one step ahead of the problem of resistance, but only three new classes of antimicrobials have reached the market in the last thirty years. A time is therefore approaching when we may not have effective treatment against bacterial infections, particularly those caused by Gram-negative organisms. An important strategy to reduce the development of antimicrobial resistance is to use antimicrobials more appropriately, in ways that will prevent resistance. This involves considering the pharmacokinetic and pharmacodynamic properties of antimicrobials, the possible use of combinations, and a more appropriate choice of antimicrobials, which may include rapid diagnostic testing and antimicrobial cycling. Examples given in this review include Mycobacterium tuberculosis, Gram-negative and Gram-positive organisms. We shall summarise the current evidence for these strategies and outline areas for future development.

    CE5.h related: Merge candidate list extension for disparity compensated prediction

    HEVC implements a candidate vector list for merge and skip modes. The construction of this list has been extensively studied in the JCT-VC group (see, for instance, JCTVC-G039). JCTVC-I0293 showed that HEVC coding performance can be improved by adding to the merge list copies of the first candidate shifted by an arbitrary offset. This document takes the same approach and applies it to disparity compensation. An average gain of 0.4% is obtained on side views.

    Prediction and Sampling with Local Graph Transforms for Quasi-Lossless Light Field Compression


    Motion Compensation-based Low-Complexity Decoder Side Depth Estimation for MPEG Immersive Video

    Decoder-Side Depth Estimation (DSDE) is a system first enabled in the novel MPEG Immersive Video (MIV) coding standard. In DSDE, only texture components are coded, while the depth is estimated at the decoder side. This is motivated by previous work, which has shown high coding gain and pixel rate savings in DSDE. However, computational complexity remains a concern, as a high-quality depth search has high runtime and memory requirements. In this work, we extend the concept of depth estimation to depth recovery. In this mode, the decoder-side depth information is recovered through motion compensation using the displacement vectors contained in the texture bitstream. This strategy enables us to replace most of the complex depth estimation processes with a simple motion compensation step, a decision that is made on the encoder side and signaled per coding unit. With only minor losses in terms of synthesis PSNR and similar perceptual quality in terms of MS-SSIM, the complexity is significantly reduced. Depending on the acceptable loss, up to 80% of the depth of moving objects may be motion compensated instead of estimated by a depth estimator, translating into a speed-up of a factor of 104 for inter-frames compared to the reference depth estimator.

    Omni-nerf: neural radiance field from 360° image captures

    This paper tackles the problem of novel view synthesis (NVS) from 360° images with imperfect camera poses or intrinsic parameters. We propose a novel end-to-end framework for training Neural Radiance Field (NeRF) models given only 360° RGB images and their rough poses, which we refer to as Omni-NeRF. We extend the pinhole camera model of NeRF to a more general camera model that better fits omnidirectional fisheye lenses. The approach jointly learns the scene geometry and optimizes the camera parameters without knowing the fisheye projection.

    Rate-Distortion Optimized Super-Ray Merging for Light Field Compression

    In this paper, we focus on the problem of compressing dense light fields, which represent very large volumes of highly redundant data. In our scheme, view synthesis based on convolutional neural networks (CNN) is used as a first prediction step to exploit inter-view correlation. Super-rays are then constructed to capture the inter-view and spatial redundancy remaining in the prediction residues. To ensure that the super-ray segmentation is highly correlated with the residues to be encoded, the super-rays are computed on synthesized residues (the difference between the four transmitted corner views and their corresponding synthesized views) instead of the synthesized views. Neighboring super-rays are merged into a larger super-ray according to a rate-distortion cost. A 4D shape-adaptive discrete cosine transform (SA-DCT) is applied per super-ray on the prediction residues in both the spatial and angular dimensions. A traditional coding scheme consisting of quantization and entropy coding is then used to encode the transformed coefficients. Experimental results show that the proposed coding scheme outperforms HEVC-based schemes at low bitrate.

    Decoder Side Multiplane Images using Geometry Assistance SEI for MPEG Immersive Video

    The MPEG Immersive Video (MIV) standard enables a novel technology denoted as decoder side depth estimation (DSDE) by introducing a dedicated Geometry Absent profile. In DSDE, only texture information is coded and the corresponding geometry is reconstructed on the decoder side. MIV further enables the coding of side information useful to the geometry reconstruction, denoted as the Geometry Assistance SEI message. Multiplane Images are an emerging format for immersive video, and their feasibility in coding systems is being investigated because of their promising rendering quality with complex sequences. In this work, we show that MIV can be used to construct block-based Multiplane Images on the decoder side and to enhance view synthesis performance using the Geometry Assistance SEI. In a complexity-aware setting using only 32 planes, up to 6 dB of quality improvement is achieved compared to the reference.