CAwa-NeRF: Instant Learning of Compression-Aware NeRF Features
Modeling 3D scenes by volumetric feature grids is one of the promising
directions of neural approximations to improve Neural Radiance Fields (NeRF).
Instant-NGP (INGP) introduced multi-resolution hash encoding from a lookup
table of trainable feature grids which enabled learning high-quality neural
graphics primitives in a matter of seconds. However, this improvement came at
the cost of higher storage size. In this paper, we address this challenge by
introducing instant learning of compression-aware NeRF features (CAwa-NeRF),
which allows exporting the zip-compressed feature grids at the end of model
training with negligible extra time overhead, without changing either the
storage architecture or the parameters used in the original INGP paper.
Nonetheless, the proposed method is not limited to INGP but could also be
adapted to any model. By means of extensive simulations, our proposed instant
learning pipeline can achieve impressive results on different kinds of static
scenes such as single object masked background scenes and real-life scenes
captured in our studio. In particular, for single object masked background
scenes CAwa-NeRF compresses the feature grids down to 6% (1.2 MB) of the
original size without any loss in the PSNR (33 dB) or down to 2.4% (0.53 MB)
with a slight virtual loss (32.31 dB).
Comment: 10 pages, 9 figures
COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec
We introduce COOL-CHIC, a Coordinate-based Low Complexity Hierarchical Image
Codec. It is a learned alternative to autoencoders with approximately 2000
parameters and 2500 multiplications per decoded pixel. Despite its low
complexity, COOL-CHIC offers compression performance close to modern
conventional MPEG codecs such as HEVC and VVC. This method is inspired by the
Coordinate-based Neural Representation, where an image is represented as a
learned function which maps pixel coordinates to RGB values. The parameters of
the mapping function are then sent using entropy coding. At the receiver side,
the compressed image is obtained by evaluating the mapping function for all
pixel coordinates. The COOL-CHIC implementation is made available upon request.
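The coordinate-based idea described in this abstract (a learned function mapping pixel coordinates to RGB values, evaluated at every pixel by the receiver) can be sketched as follows. This is only an illustration and not the COOL-CHIC architecture: the two-layer network, its hidden size, the random weights, and the names `make_coordinate_grid`, `init_mlp` and `decode_image` are all assumptions made for the example.

```python
import numpy as np


def make_coordinate_grid(height, width):
    """Return an (H*W, 2) array of (y, x) pixel coordinates scaled to [0, 1]."""
    ys, xs = np.meshgrid(
        np.linspace(0.0, 1.0, height),
        np.linspace(0.0, 1.0, width),
        indexing="ij",
    )
    return np.stack([ys.ravel(), xs.ravel()], axis=1)


def init_mlp(rng, hidden=16):
    """Hypothetical tiny mapping function: 2 inputs -> hidden -> 3 outputs."""
    return {
        "w1": rng.normal(0.0, 1.0, (2, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 1.0, (hidden, 3)),
        "b2": np.zeros(3),
    }


def decode_image(params, height, width):
    """Evaluate the mapping function at every pixel coordinate."""
    coords = make_coordinate_grid(height, width)
    hidden = np.maximum(coords @ params["w1"] + params["b1"], 0.0)  # ReLU
    logits = hidden @ params["w2"] + params["b2"]
    rgb = 1.0 / (1.0 + np.exp(-logits))  # squash to [0, 1]
    return rgb.reshape(height, width, 3)


# In a real codec the parameters would be fitted to one image and then
# entropy coded; here they are random, purely to show the decoding step.
params = init_mlp(np.random.default_rng(0))
image = decode_image(params, height=4, width=6)
```

In this sketch the "bitstream" would consist of the entries of `params`, and decoding costs one small network evaluation per pixel.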
Low-complexity Overfitted Neural Image Codec
We propose a neural image codec at reduced complexity which overfits the
decoder parameters to each input image. While autoencoders perform up to a
million multiplications per decoded pixel, the proposed approach only requires
2300 multiplications per pixel. Despite its low complexity, the method rivals
autoencoder performance and surpasses HEVC performance under various coding
conditions. Additional lightweight modules and an improved training process
provide a 14% rate reduction with respect to previous overfitted codecs, while
offering a similar complexity. This work is made open-source at
https://orange-opensource.github.io/Cool-Chic/
Comment: Accepted at IEEE MMSP 202
Développement de schémas de compression vidéo basés apprentissage
The ever-growing amount of images and videos conveyed over the Internet has a substantial impact on climate change. This impact can be mitigated through the design of better compression algorithms, which aim to reduce the size of videos while maintaining an acceptable quality for the user. Since the 90s, video coding standards have been devised to reduce the video size through a succession of linear operations. Video coding standards are incrementally designed, leading to a separate optimization of the different operations. Recently, neural networks have emerged as a relevant solution for a wide variety of issues, due to their ability to learn non-linear functions through an end-to-end optimization process. While learned approaches already achieve state-of-the-art performance for still image compression, video coding remains a more challenging task. This thesis proposes to design a learned video coder, to leverage the promising abilities of neural networks. Due to the novelty of this field, the design of the proposed learned coding scheme starts from a blank page. The different elements of the coder are thoroughly considered. In the end, it is shown that the proposed learned coder is able to achieve performance equivalent to modern video coding standards in a realistic video coding setup.
Coding standards as anchors for the CVPR CLIC video track
In 2021, a new track was initiated in the Challenge for Learned Image Compression: the video track. This category proposes to explore technologies for the compression of short video clips at 1 Mbit/s. This paper proposes to generate coded videos using the latest standardized video coders, especially Versatile Video Coding (VVC). The objective is not only to measure the progress made by learning techniques compared to state-of-the-art video coders, but also to quantify their progress from year to year. With this in mind, this paper documents how to generate the video sequences fulfilling the requirements of this challenge, in a reproducible way, targeting the maximum performance for VVC.
Cool-chic video: Learned video coding with 800 parameters
We propose a lightweight learned video codec with 900 multiplications per decoded pixel and 800 parameters overall. To the best of our knowledge, this is one of the neural video codecs with the lowest decoding complexity. It is built upon the overfitted image codec Cool-chic and supplements it with an inter coding module to leverage the video’s temporal redundancies. The proposed model is able to compress videos using both low-delay and random access configurations and achieves rate-distortion performance close to AVC while outperforming other overfitted codecs such as FFNeRV. The system is made open-source: orange-opensource.github.io/Cool-Chic
Binary Probability Model for Learning Based Image Compression
In this paper, we propose to enhance learned image compression systems with a richer probability model for the latent variables. Previous works model the latents with a Gaussian or a Laplace distribution. Inspired by binary arithmetic coding, we propose to signal the latents with three binary values and one integer, with different probability models. A relaxation method is designed to perform gradient-based training. The richer probability model results in better entropy coding, leading to a lower rate. Experiments under the Challenge on Learned Image Compression (CLIC) test conditions demonstrate that this method achieves an 18% rate saving compared to Gaussian or Laplace models.
Conditional Coding for Flexible Learned Video Compression
This paper introduces a novel framework for end-to-end learned video coding. Image compression is generalized through conditional coding to exploit information from reference frames, allowing intra and inter frames to be processed with the same coder. The system is trained through the minimization of a rate-distortion cost, with no pre-training or proxy loss. Its flexibility is assessed under three coding configurations (All Intra, Low-delay P and Random Access), where it is shown to achieve performance competitive with the state-of-the-art video codec HEVC.
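The abstract states only that training minimizes a rate-distortion cost. A common way to write such a cost is the Lagrangian form J = D + λR, and the sketch below assumes that form, with mean squared error as the distortion D and bits per pixel as the rate R. The function name `rate_distortion_cost`, the choice of MSE, and the parameter `lam` are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def rate_distortion_cost(original, decoded, total_bits, lam):
    """Assumed Lagrangian cost J = D + lam * R for one frame.

    original, decoded: arrays of shape (H, W, C) with matching shapes.
    total_bits: estimated number of bits spent on the frame.
    lam: hypothetical trade-off weight between rate and distortion.
    """
    original = np.asarray(original, dtype=np.float64)
    decoded = np.asarray(decoded, dtype=np.float64)
    distortion = np.mean((original - decoded) ** 2)  # D: mean squared error
    num_pixels = original.shape[0] * original.shape[1]
    rate = total_bits / num_pixels  # R: bits per pixel
    return distortion + lam * rate


# Toy frame: 2x2 pixels, 3 channels, every sample off by exactly 1.
frame = np.zeros((2, 2, 3))
reconstruction = np.ones((2, 2, 3))
cost = rate_distortion_cost(frame, reconstruction, total_bits=8, lam=0.5)
```

A larger `lam` penalizes rate more heavily and pushes training toward smaller bitstreams at the expense of reconstruction quality.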