28 research outputs found

    Deep learning-based switchable network for in-loop filtering in high efficiency video coding

    Video codecs are moving toward intelligent, learning-based tools, yet the effect of deep learning on video compression has not been fully investigated. The goal of this paper is to reduce the ringing and other artifacts that loop filtering causes when high efficiency video coding (HEVC) is used. Although much research aims to lessen this effect, considerable room for improvement remains. In this paper we focus on an intelligent solution for improving in-loop filtering in HEVC using a deep convolutional neural network (CNN). The paper proposes the design and implementation of deep CNN-based loop filtering using a series of 15 CNN networks followed by a combine-and-squeeze network that improves feature extraction. The resulting output is free from double enhancement, and the peak signal-to-noise ratio (PSNR) is improved by 0.5 dB compared to existing techniques. The experiments further show that pipelining this network with the existing network and applying it at higher quantization parameters (QP) improves coding efficiency more than using it separately. Coding efficiency is improved by an average of 8.3% with the switching-based deep CNN in-loop filtering.
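
    The abstract above reports its gain as a 0.5 dB improvement in peak signal-to-noise ratio. As a reminder of what that metric measures, here is a minimal sketch of the standard PSNR formula, 10·log10(peak² / MSE), for 8-bit samples. The function name `psnr` and the flat-list input are assumptions made for illustration and are not taken from the paper.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Return PSNR in dB between two equal-length sample sequences.

    Hypothetical helper for illustration: inputs are flat sequences of
    pixel values, and `peak` is the maximum sample value (255 for 8-bit).
    """
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between original and reconstructed samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak * peak / mse)
```

    Because the scale is logarithmic, a 0.5 dB gain corresponds to roughly an 11% reduction in mean squared error (a factor of 10^(-0.05)).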

    Deep learning-based artifacts removal in video compression

    Title from PDF of title page viewed December 15, 2021. Dissertation advisor: Zhu Li. Vita. Includes bibliographical references (pages 112-129). Thesis (Ph.D.)--School of Computing and Engineering. University of Missouri--Kansas City, 2021.
    The block-based coding structure in the hybrid video coding framework inevitably introduces compression artifacts such as blocking and ringing. To compensate for those artifacts, extensive filtering techniques have been proposed in the loop of video codecs, which can boost the subjective and objective quality of reconstructed videos. Recently, neural network-based filters have been presented that draw on the power of deep learning from large amounts of data. Though coding efficiency has improved over traditional methods in High-Efficiency Video Coding (HEVC), the rich features and information generated by the compression pipeline have not been fully utilized in the design of neural networks. Therefore, we propose a learning-based method to further improve the coding efficiency. In addition, the point cloud is an essential format for capturing and communicating three-dimensional (3-D) objects in Augmented Reality (AR) and Virtual Reality (VR) applications. In the current state-of-the-art video-based point cloud compression (V-PCC), a dynamic point cloud is projected onto geometry and attribute videos patch by patch, each represented by its texture, depth, and occupancy map for reconstruction. To deal with occlusion, each patch is projected onto near and far depth fields in the geometry video. Any artifacts in the compressed two-dimensional (2-D) geometry video are propagated to the 3-D point cloud frames. Moreover, lossy compression always involves a rate-distortion (RD) tradeoff between the bitstream rate and the distortion. Although some methods have been proposed to attenuate these artifacts and improve the coding efficiency, the non-linear representation ability of the Convolutional Neural Network (CNN) has not been fully considered. Therefore, we propose a learning-based approach to remove the geometry artifacts and improve the compression efficiency. We also propose using a CNN to improve the accuracy of the occupancy map video in V-PCC. To the best of our knowledge, these are the first learning-based solutions for geometry artifact removal in HEVC and occupancy map enhancement in V-PCC. Extensive experimental results show that the proposed approaches achieve significant gains in HEVC and V-PCC compared to the state-of-the-art schemes.
    Residual-Guided In-Loop Filter Using Convolution Neural Network -- Deep learning geometry compression artifacts removal for video-based point cloud compression -- Convolutional Neural Network-Based Occupancy Map Accuracy Improvement for Video-based Point Cloud Compressio

    A Research on Enhancing Reconstructed Frames in Video Codecs

    A series of video codecs, combining encoder and decoder, have been developed to improve the human experience of video-on-demand: higher quality videos at lower bitrates. Despite being at the forefront of the compression race, High Efficiency Video Coding (HEVC or H.265), the latest Versatile Video Coding (VVC) standard, and compressive sensing (CS) still suffer from the degradation of lossy compression. Lossy compression algorithms approximate input signals at a smaller file size but degrade the reconstructed data, leaving room for further improvement. This work aims to develop hybrid codecs that take advantage of both state-of-the-art video coding technologies and deep learning techniques: traditional non-learning components are either replaced by or combined with various deep learning models. Because related studies have not made the most of coding information, this work studies and utilizes more of the available resources in both encoder and decoder to further improve different codecs.
    In the encoder, motion compensated prediction (MCP) is one of the key components that bring high compression ratios to video codecs. To enhance MCP performance, modern video codecs offer interpolation filters for fractional motions. However, these handcrafted fractional interpolation filters are designed on ideal signals, which limits the codecs in dealing with real-world video data. This proposal introduces a deep learning approach for all luma and chroma fractional pixels, aiming for more accurate motion compensation and higher coding efficiency.
    One extraordinary feature of CS compared to other codecs is that it can recover multiple images at the decoder by applying various algorithms to a single set of coded data. Because related works have not made use of this property, this work presents a deep learning-based compressive sensing image enhancement framework using multiple reconstructed signals. Learning to enhance from multiple reconstructed images provides a valuable mechanism for training deep neural networks while requiring no additional transmitted data.
    In the encoder and decoder of modern video coding standards, in-loop filters (ILF) play the most important role in determining the final reconstructed image quality and compression rate. This work introduces a deep learning approach for improving the handcrafted ILF of modern video coding standards. We first utilize various coding resources and present a novel deep learning-based ILF. Related works perform rate-distortion-based ILF mode selection at the coding-tree-unit (CTU) level to further enhance the deep learning-based ILF, and the corresponding bits are encoded and transmitted to the decoder. In this work, we go a step further: a reinforcement learning-based autonomous ILF mode selection scheme is presented that can adapt to different coding unit (CU) levels. Using this approach, we require no additional bits while ensuring the best image quality at local levels beyond the CTU level.
    While this research mainly targets improving the recent video coding standard VVC and sparsity-based CS, it is also designed to adapt flexibly to previous and future video coding standards with minor modifications.
    Doctor of Engineering, Hosei University
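
    The abstract above contrasts its scheme with rate-distortion-based ILF mode selection at the CTU level, where the chosen mode is signalled to the decoder. A minimal sketch of that baseline idea follows, assuming the usual Lagrangian cost J = D + λ·R. The function name, the mode labels, and the per-CTU (distortion, bits) inputs are hypothetical and are not taken from the thesis.

```python
def select_filter_modes(ctus, lam):
    """Pick, for each CTU, the filter mode with the lowest RD cost.

    Hypothetical illustration: `ctus` is a list with one dict per CTU,
    mapping a mode name to a (distortion, bits) pair, where `bits` counts
    the signalling overhead of that mode. `lam` is the Lagrange multiplier.
    """
    choices = []
    for candidates in ctus:
        # Lagrangian rate-distortion cost: J = D + lambda * R.
        best_mode = min(
            candidates,
            key=lambda mode: candidates[mode][0] + lam * candidates[mode][1],
        )
        choices.append(best_mode)
    return choices


# Two CTUs, each choosing between the handcrafted filter and a CNN filter.
example_ctus = [
    {"handcrafted": (100.0, 1), "cnn": (60.0, 1)},
    {"handcrafted": (50.0, 1), "cnn": (55.0, 1)},
]
modes = select_filter_modes(example_ctus, lam=10.0)
```

    Each selected mode costs signalling bits in this baseline, which is the overhead the abstract says its reinforcement-learning scheme avoids.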