16 research outputs found

    CAwa-NeRF: Instant Learning of Compression-Aware NeRF Features

    Full text link
    Modeling 3D scenes with volumetric feature grids is one of the promising directions for neural approximations that improve Neural Radiance Fields (NeRF). Instant-NGP (INGP) introduced multi-resolution hash encoding from a lookup table of trainable feature grids, which enabled learning high-quality neural graphics primitives in a matter of seconds. However, this improvement came at the cost of a larger storage size. In this paper, we address this challenge by introducing instant learning of compression-aware NeRF features (CAwa-NeRF), which allows exporting zip-compressed feature grids at the end of model training with a negligible extra time overhead, without changing either the storage architecture or the parameters used in the original INGP paper. Nonetheless, the proposed method is not limited to INGP and could be adapted to any model. Extensive simulations show that the proposed instant learning pipeline achieves impressive results on different kinds of static scenes, such as single-object masked-background scenes and real-life scenes captured in our studio. In particular, for single-object masked-background scenes, CAwa-NeRF compresses the feature grids down to 6% (1.2 MB) of the original size without any loss in PSNR (33 dB), or down to 2.4% (0.53 MB) with a slight loss (32.31 dB).
    Comment: 10 pages, 9 figures
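    The quantity this abstract targets is the size of the zip-compressed feature grids. The minimal sketch below (not the authors' code; the table shape, quantization step and use of zlib are illustrative assumptions) shows how such a size can be measured for an INGP-style table of trainable features.

        # Hedged sketch: quantize a toy hash table of feature-grid parameters and
        # measure its zip (DEFLATE) size, the quantity a compression-aware training
        # objective aims to keep small.
        import numpy as np
        import zlib

        rng = np.random.default_rng(0)
        table = rng.normal(scale=0.1, size=(2**18, 2)).astype(np.float32)  # toy feature table

        def zipped_size_bytes(features: np.ndarray, step: float = 1.0 / 256.0) -> int:
            """Uniformly quantize the features and return their zlib-compressed size."""
            q = np.round(features / step).astype(np.int16)
            return len(zlib.compress(q.tobytes(), 9))

        print("raw float32 size:", table.nbytes, "bytes")
        print("zip after quant.:", zipped_size_bytes(table), "bytes")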

    COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec

    Full text link
    We introduce COOL-CHIC, a Coordinate-based Low Complexity Hierarchical Image Codec. It is a learned alternative to autoencoders, with approximately 2000 parameters and 2500 multiplications per decoded pixel. Despite its low complexity, COOL-CHIC offers compression performance close to modern conventional MPEG codecs such as HEVC and VVC. The method is inspired by coordinate-based neural representations, in which an image is represented as a learned function mapping pixel coordinates to RGB values. The parameters of the mapping function are then sent using entropy coding. On the receiver side, the compressed image is obtained by evaluating the mapping function at all pixel coordinates. The COOL-CHIC implementation is made available upon request.
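    To make the coordinate-based idea concrete, here is a minimal PyTorch sketch of a tiny learned function f(x, y) -> (R, G, B). The layer widths are illustrative choices picked only so the parameter count lands near 2000; they are not the actual COOL-CHIC architecture, which also relies on hierarchical latents.

        # Hedged sketch of a coordinate-based decoder: decoding the image amounts to
        # evaluating a small learned function at every pixel coordinate.
        import torch
        import torch.nn as nn

        class CoordinateDecoder(nn.Module):
            def __init__(self, hidden: int = 40):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(2, hidden), nn.ReLU(),
                    nn.Linear(hidden, hidden), nn.ReLU(),
                    nn.Linear(hidden, 3),
                )

            def forward(self, xy: torch.Tensor) -> torch.Tensor:
                return torch.sigmoid(self.net(xy))  # RGB in [0, 1]

        decoder = CoordinateDecoder()
        print(sum(p.numel() for p in decoder.parameters()))  # about 1.9k parameters

        # Decoding: evaluate the function on every coordinate of an H x W grid.
        H, W = 64, 64
        ys, xs = torch.meshgrid(torch.linspace(0, 1, H), torch.linspace(0, 1, W), indexing="ij")
        image = decoder(torch.stack([xs, ys], dim=-1).reshape(-1, 2)).reshape(H, W, 3)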

    Low-complexity Overfitted Neural Image Codec

    Full text link
    We propose a reduced-complexity neural image codec that overfits the decoder parameters to each input image. While autoencoders perform up to a million multiplications per decoded pixel, the proposed approach requires only 2300 multiplications per pixel. Despite its low complexity, the method rivals autoencoder performance and surpasses HEVC performance under various coding conditions. Additional lightweight modules and an improved training process provide a 14% rate reduction with respect to previous overfitted codecs, at similar complexity. This work is made open-source at https://orange-opensource.github.io/Cool-Chic/
    Comment: Accepted at IEEE MMSP 202
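    The defining ingredient is that the decoder is trained on the very image it will decode, under a rate-distortion objective. A minimal sketch of that per-image overfitting loop follows; the rate proxy, the lambda value and the optimizer settings are illustrative assumptions, not the paper's recipe. The decoder argument could be, for instance, a small coordinate-based network like the one sketched above.

        # Hedged sketch of per-image overfitting: the decoder weights are the
        # optimization variables, fitted to a single target image with a distortion
        # term plus a crude rate proxy on the weights to be transmitted.
        import torch
        import torch.nn.functional as F

        def overfit_to_image(decoder, coords, target, lam: float = 2e-3, steps: int = 1000):
            opt = torch.optim.Adam(decoder.parameters(), lr=1e-2)
            for _ in range(steps):
                opt.zero_grad()
                rec = decoder(coords)                     # decoded pixels for this image
                distortion = F.mse_loss(rec, target)
                rate_proxy = sum(p.abs().sum() for p in decoder.parameters())
                loss = distortion + lam * rate_proxy      # rate-distortion trade-off
                loss.backward()
                opt.step()
            return decoder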

    Développement de schémas de compression vidéo basés apprentissage (Development of learning-based video compression schemes)

    No full text
    The ever-growing amount of images and videos conveyed over the Internet has a substantial impact on climate change. This impact can be mitigated through the design of better compression algorithms, which aim to reduce the size of videos while maintaining an acceptable quality for the user. Since the 1990s, video coding standards have been devised to reduce video size through a succession of linear operations. These standards are designed incrementally, leading to a separate optimization of the different operations. Recently, neural networks have emerged as a relevant solution for a wide variety of problems, thanks to their ability to learn non-linear functions through an end-to-end optimization process. While learned approaches already achieve state-of-the-art performance for still-image compression, video coding remains a more challenging task. This thesis proposes to design a learned video coder that leverages the promising abilities of neural networks. Given the novelty of this field, the design of the proposed coding scheme starts from a blank page, and the different elements of the coder are studied in detail. In the end, the proposed learned coder is shown to achieve performance equivalent to modern video coding standards in a realistic video coding setup.

    Coding standards as anchors for the CVPR CLIC video track

    No full text
    In 2021, a new track was initiated in the Challenge on Learned Image Compression (CLIC): the video track. This category explores technologies for the compression of short video clips at 1 Mbit/s. This paper proposes to generate coded videos using the latest standardized video coders, in particular Versatile Video Coding (VVC). The objective is not only to measure the progress made by learned techniques compared to state-of-the-art video coders, but also to quantify their progress from year to year. With this in mind, this paper documents how to generate, in a reproducible way, video sequences that fulfil the requirements of this challenge while targeting the maximum performance of VVC.
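    The one hard numerical requirement mentioned above is the 1 Mbit/s budget. As a small illustration (not taken from the paper; the clip parameters are made up), the check below verifies that a coded clip fits that budget given its frame count and frame rate.

        # Hedged sketch: does a bitstream respect a 1 Mbit/s budget for its clip duration?
        def fits_budget(bitstream_bytes: int, num_frames: int, fps: float,
                        target_bps: float = 1_000_000) -> bool:
            duration_s = num_frames / fps
            return bitstream_bytes * 8 <= target_bps * duration_s

        print(fits_budget(bitstream_bytes=730_000, num_frames=300, fps=50))  # 6 s clip -> True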

    Cool-chic video: Learned video coding with 800 parameters

    No full text
    We propose a lightweight learned video codec with 900 multiplications per decoded pixel and 800 parameters overall. To the best of our knowledge, this is one of the neural video codecs with the lowest decoding complexity. It builds upon the overfitted image codec Cool-chic and supplements it with an inter coding module to leverage the video's temporal redundancies. The proposed model can compress videos in both low-delay and random-access configurations and achieves rate-distortion performance close to AVC while outperforming other overfitted codecs such as FFNeRV. The system is made open-source: orange-opensource.github.io/Cool-Chic
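    A multiplication-per-decoded-pixel figure like the one quoted above can be estimated by summing the multiply-accumulates of the decoder layers applied per output pixel. The sketch below illustrates this for a toy per-pixel MLP; the layer widths are made up and do not reproduce Cool-chic video's actual architecture or its 900-multiplication count.

        # Hedged sketch: multiply-accumulate count of an MLP evaluated at every pixel.
        def mults_per_pixel(layer_widths):
            return sum(w_in * w_out for w_in, w_out in zip(layer_widths[:-1], layer_widths[1:]))

        print(mults_per_pixel([16, 16, 16, 3]))  # 16*16 + 16*16 + 16*3 = 560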

    Binary Probability Model for Learning Based Image Compression

    No full text
    In this paper, we propose to enhance learned image compression systems with a richer probability model for the latent variables. Previous works model the latents with a Gaussian or a Laplace distribution. Inspired by binary arithmetic coding, we propose to signal the latents with three binary values and one integer, each with its own probability model. A relaxation method is designed to enable gradient-based training. The richer probability model results in better entropy coding and therefore a lower rate. Experiments under the Challenge on Learned Image Compression (CLIC) test conditions demonstrate that this method achieves an 18% rate saving compared to Gaussian or Laplace models.
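    To see why binary signalling changes the rate, recall that an ideal binary arithmetic coder spends about -log2(p) bits on a symbol of probability p. The sketch below estimates the cost of one latent split into a zero flag, a sign flag, a greater-than-one flag and an integer remainder; this particular binarization and the probabilities are illustrative assumptions, not the paper's exact scheme.

        # Hedged sketch: ideal bit cost of a latent signalled with a few binary
        # symbols, each under its own Bernoulli model, plus an integer remainder.
        import math

        def flag_bits(flag: bool, p_true: float) -> float:
            p = p_true if flag else 1.0 - p_true
            return -math.log2(p)

        def latent_bits(v: int, p_zero=0.6, p_pos=0.5, p_gt1=0.3, remainder_bits=4.0):
            bits = flag_bits(v == 0, p_zero)          # zero flag
            if v == 0:
                return bits
            bits += flag_bits(v > 0, p_pos)           # sign flag
            bits += flag_bits(abs(v) > 1, p_gt1)      # greater-than-one flag
            if abs(v) > 1:
                bits += remainder_bits                # integer remainder, fixed cost here
            return bits

        print(latent_bits(0), latent_bits(1), latent_bits(-3))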

    Conditional Coding for Flexible Learned Video Compression

    No full text
    This paper introduces a novel framework for end-to-end learned video coding. Image compression is generalized through conditional coding, which exploits information from reference frames and makes it possible to process intra and inter frames with the same coder. The system is trained by minimizing a rate-distortion cost, with no pre-training or proxy loss. Its flexibility is assessed under three coding configurations (All Intra, Low-delay P and Random Access), where it achieves performance competitive with the state-of-the-art video codec HEVC.
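    Processing intra and inter frames with the same coder is the essence of conditional coding: the reference frame enters the network as an extra input rather than being subtracted as in residual coding. Below is a hedged PyTorch sketch of that idea; the layer shapes, the additive-noise quantization proxy and the rate term are illustrative and do not reproduce the paper's network.

        # Hedged sketch of conditional coding: one autoencoder codes every frame,
        # conditioned on a reference frame (a zero reference can stand in for intra frames).
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ConditionalCoder(nn.Module):
            def __init__(self, c: int = 32):
                super().__init__()
                self.enc = nn.Conv2d(3 + 3, c, 5, stride=2, padding=2)      # frame + reference in
                self.dec = nn.ConvTranspose2d(c + 3, 3, 5, stride=2, padding=2, output_padding=1)

            def forward(self, frame, reference):
                y = self.enc(torch.cat([frame, reference], dim=1))          # latent to be coded
                y_hat = y + torch.rand_like(y) - 0.5                        # quantization noise proxy
                ref_small = F.avg_pool2d(reference, 2)                      # conditioning at latent scale
                return self.dec(torch.cat([y_hat, ref_small], dim=1)), y_hat

        coder = ConditionalCoder()
        frame, reference = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
        rec, y_hat = coder(frame, reference)
        loss = F.mse_loss(rec, frame) + 0.01 * y_hat.abs().mean()           # rate-distortion cost (rate term is a proxy)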