From Rank Estimation to Rank Approximation: Rank Residual Constraint for Image Restoration
In this paper, we propose a novel approach to the rank minimization problem,
termed rank residual constraint (RRC) model. Different from existing low-rank
based approaches, such as the well-known nuclear norm minimization (NNM) and
the weighted nuclear norm minimization (WNNM), which estimate the underlying
low-rank matrix directly from the corrupted observations, we progressively
approximate the underlying low-rank matrix via minimizing the rank residual.
Through integrating the image nonlocal self-similarity (NSS) prior with the
proposed RRC model, we apply it to image restoration tasks, including image
denoising and image compression artifacts reduction. Towards this end, we first
obtain a good reference of the original image groups by using the image NSS
prior, and then the rank residual of the image groups between this reference
and the degraded image is minimized to achieve a better estimate of the desired
image. In this manner, both the reference and the estimated image are updated
gradually and jointly in each iteration. Based on the group-based sparse
representation model, we further provide a theoretical analysis on the
feasibility of the proposed RRC model. Experimental results demonstrate that
the proposed RRC model outperforms many state-of-the-art schemes in both
objective and perceptual quality.
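For context, the NNM baseline that the abstract contrasts with RRC is commonly posed as the proximal problem below, whose closed-form solution soft-thresholds the singular values of the observation. The symbols Y (degraded patch group), X (estimate), and λ (regularization weight) are notation assumed here for illustration. The RRC objective itself is not reproduced, since the abstract does not state it.

```latex
\hat{X} = \arg\min_{X} \tfrac{1}{2}\lVert Y - X \rVert_F^2 + \lambda \lVert X \rVert_*,
\qquad
\hat{X} = U \,\operatorname{diag}\!\big(\max(\sigma_i - \lambda,\, 0)\big)\, V^{\top},
\quad \text{where } Y = U \operatorname{diag}(\sigma_i)\, V^{\top}.
```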
Data-Driven Image Restoration
Every day many images are taken by digital cameras, and people
demand visually accurate and pleasing results. Noise and
blur degrade images captured by modern cameras, and high-level
vision tasks (such as segmentation, recognition, and tracking)
require high-quality images. Therefore, image restoration
(specifically, image deblurring and image denoising) is a
critical preprocessing step.
A fundamental problem in image deblurring is to recover reliably
distinct spatial frequencies that have been suppressed by the
blur kernel. Existing image deblurring techniques often rely on
generic image priors that only help recover part of the frequency
spectrum, such as the frequencies near the high-end. To this end,
we pose the following specific questions: (i) Does class-specific
information offer an advantage over existing generic priors for
image quality restoration? (ii) If a class-specific prior exists,
how should it be encoded into a deblurring framework to recover
attenuated image frequencies? Throughout this work, we devise a
class-specific prior based on the band-pass filter responses and
incorporate it into a deblurring strategy. Specifically, we show
that the subspace of band-pass filtered images and their
intensity distributions serve as useful priors for recovering
image frequencies.
Next, we present a novel image denoising algorithm that uses an
external, category-specific image database. In contrast to
existing noisy image restoration algorithms, our method selects
clean image “support patches” similar to the noisy patch from
an external database. We employ a content adaptive distribution
model for each patch where we derive the parameters of the
distribution from the support patches. Our objective function is
composed of a Gaussian fidelity term that imposes category-specific
information, and a low-rank term that encourages the
similarity between the noisy and the support patches in a robust
manner.
Finally, we propose to learn a fully-convolutional network model
that consists of a Chain of Identity Mapping Modules (CIMM) for
image denoising. The CIMM structure possesses two distinctive
features that are important for the noise removal task. Firstly,
each residual unit employs identity mappings as the skip
connections and receives pre-activated input to preserve the
gradient magnitude propagated in both the forward and backward
directions. Secondly, by utilizing dilated kernels for the
convolution layers in the residual branch, each neuron in the
last convolution layer of each module can observe the full
receptive field of the first layer.
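To make the effect of dilated kernels concrete, the short sketch below computes the receptive field of a stack of stride-1 convolutions using the standard formula, in which each layer adds (kernel_size - 1) * dilation pixels. The layer configurations are hypothetical and chosen only for illustration; they are not the CIMM settings, which the abstract does not give.

```python
def receptive_field(layers):
    """Receptive field (in pixels, along one axis) of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs, ordered from first to last layer.
    """
    rf = 1  # a single output pixel sees itself before any convolution is applied
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical configurations (assumptions, not taken from the paper).
plain = [(3, 1)] * 4                        # four ordinary 3x3 layers
dilated = [(3, 1), (3, 2), (3, 4), (3, 8)]  # same depth, growing dilation

print(receptive_field(plain))    # 9
print(receptive_field(dilated))  # 31
```

With the same number of layers and parameters, the dilated stack covers a much wider context, which is the general motivation for using dilation in a residual branch.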
Nouvelles méthodes de prédiction inter-images pour la compression d’images et de vidéos (New inter-image prediction methods for image and video compression)
Due to the wide availability of video cameras and new social media practices, as well as the emergence of cloud services, images and videos today constitute a significant share of the data transmitted over the internet. Video streaming applications account for more than 70% of the world's internet bandwidth, while billions of images are already stored in the cloud and millions are uploaded every day. The ever-growing streaming and storage requirements of these media call for constant improvement of image and video coding tools. This thesis explores novel approaches for improving current inter-prediction methods. Such methods leverage redundancies between similar frames and were originally developed in the context of video compression. In a first approach, novel global and local inter-prediction tools are combined to improve the efficiency of image-set compression schemes based on video codecs. By combining a global geometric and photometric compensation with a locally linear prediction, significant improvements can be obtained. A second approach is then proposed, which introduces a region-based inter-prediction scheme. The proposed method improves coding performance compared to existing solutions by estimating and compensating geometric and photometric distortions at a semi-local level. This approach is then adapted and validated in the context of video compression. Bit-rate improvements are obtained, especially for sequences displaying complex real-world motions such as zooms and rotations. The last part of the thesis focuses on deep learning approaches for inter-prediction. Deep neural networks have shown striking results for a large number of computer vision tasks in recent years. Deep learning based methods proposed for frame interpolation applications are studied here in the context of video compression.
Coding performance improvements over traditional motion estimation and compensation methods highlight the potential of these deep architectures.
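As a point of reference for the traditional motion estimation that the abstract uses as a baseline, the sketch below shows exhaustive block matching with a sum-of-absolute-differences (SAD) cost on small grayscale frames stored as lists of lists. It is a generic textbook illustration, not the thesis's method; the function names, the frame size, and the search range are all assumptions made for the example.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_motion_vector(ref, cur, top, left, size, search):
    """Exhaustively search `ref` for the block of `cur` at (top, left).

    Returns (cost, dy, dx), where (dy, dx) is the displacement into the
    reference frame with the lowest SAD within +/- `search` pixels.
    """
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall partly outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(get_block(ref, y, x, size), target)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


def frame_with_square(top, left, height=6, width=6, size=2, value=9):
    """Hypothetical test frame: a bright square on a black background."""
    return [[value if top <= y < top + size and left <= x < left + size else 0
             for x in range(width)]
            for y in range(height)]


ref = frame_with_square(1, 1)  # square at row 1, column 1 in the reference
cur = frame_with_square(2, 3)  # same square moved to row 2, column 3
print(best_motion_vector(ref, cur, top=2, left=3, size=2, search=2))  # (0, -1, -2)
```

A purely translational search like this cannot represent zooms or rotations, which is consistent with the abstract's observation that gains are largest on sequences with such complex motions.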