30 research outputs found

    Underwater image restoration: super-resolution and deblurring via sparse representation and denoising by means of marine snow removal

    Underwater imaging is widely used as a tool in many fields; however, a major issue is the quality of the resulting images and videos. Due to light's interaction with water and its constituents, the acquired underwater images and videos often suffer from a significant amount of scatter (blur, haze) and noise. In light of these issues, this thesis considers the problems of low-resolution, blurred, and noisy underwater images and proposes several approaches to improve the quality of such images and video frames. Quantitative and qualitative experiments validate the success of the proposed algorithms.

    Layer Decomposition Learning Based on Gaussian Convolution Model and Residual Deblurring for Inverse Halftoning

    Layer decomposition, which separates an input image into base and detail layers, has been steadily used for image restoration. Existing residual networks based on an additive model require residual layers with a small output range for fast convergence and visual quality improvement. However, in inverse halftoning, homogeneous dot patterns prevent the residual layers from having a small output range. Therefore, a new layer decomposition network based on the Gaussian convolution model (GCM) and a structure-aware deblurring strategy is presented to achieve residual learning for both the base and detail layers. For the base layer, a new GCM-based residual subnetwork is presented. The GCM utilizes a statistical distribution, in which the image difference between a blurred continuous-tone image and a blurred halftoned image with a Gaussian filter can result in a narrow output range. Subsequently, the GCM-based residual subnetwork uses a Gaussian-filtered halftoned image as input and outputs the image difference as the residual, thereby generating the base layer, i.e., the Gaussian-blurred continuous-tone image. For the detail layer, a new structure-aware residual deblurring subnetwork (SARDS) is presented. To remove the Gaussian blurring of the base layer, the SARDS uses the predicted base layer as input and outputs the deblurred version. To more effectively restore image structures such as lines and text, a new image structure map predictor is incorporated into the deblurring network to induce structure-adaptive learning. This paper provides a method to realize the residual learning of both the base and detail layers based on the GCM and SARDS. In addition, it is verified that the proposed method surpasses state-of-the-art methods based on U-Net, direct deblurring networks, and progressively residual networks.

    Subspace Representations for Robust Face and Facial Expression Recognition

    Analyzing human faces and modeling their variations have always been of interest to the computer vision community. Face analysis based on 2D intensity images is a challenging problem, complicated by variations in pose, lighting, blur, and non-rigid facial deformations due to facial expressions. Among the different sources of variation, facial expressions are of interest as important channels of non-verbal communication. Facial expression analysis is also affected by changes in view-point and inter-subject variations in performing different expressions. This dissertation makes an attempt to address some of the challenges involved in developing robust algorithms for face and facial expression recognition by exploiting the idea of proper subspace representations for data. Variations in the visual appearance of an object mostly arise due to changes in illumination and pose. So we first present a video-based sequential algorithm for estimating the face albedo as an illumination-insensitive signature for face recognition. We show that by knowing/estimating the pose of the face at each frame of a sequence, the albedo can be efficiently estimated using a Kalman filter. Then we extend this to the case of unknown pose by simultaneously tracking the pose as well as updating the albedo through an efficient Bayesian inference method performed using a Rao-Blackwellized particle filter. Since understanding the effects of blur, especially motion blur, is an important problem in unconstrained visual analysis, we then propose a blur-robust recognition algorithm for faces with spatially varying blur. We model a blurred face as a weighted average of geometrically transformed instances of its clean face. We then build a matrix, for each gallery face, whose column space spans the space of all the motion blurred images obtained from the clean face. This matrix representation is then used to define a proper objective function and perform blur-robust face recognition. 
To develop robust and generalizable models for expression analysis, one needs to break the dependence of the models on the choice of the coordinate frame of the camera. To this end, we build models for expressions on the affine shape-space (Grassmann manifold), as an approximation to the projective shape-space, by using a Riemannian interpretation of the deformations that facial expressions cause on different parts of the face. This representation enables us to perform various expression analysis and recognition algorithms without the need for pose normalization as a preprocessing step. There is a large degree of inter-subject variation in performing various expressions. This poses an important challenge for developing robust facial expression recognition algorithms. To address this challenge, we propose a dictionary-based approach for facial expression analysis by decomposing expressions in terms of action units (AUs). First, we construct an AU-dictionary using domain experts' knowledge of AUs. To incorporate the high-level knowledge regarding expression decomposition and AUs, we then perform structure-preserving sparse coding by imposing two layers of grouping over AU-dictionary atoms as well as over the test image matrix columns. We use the computed sparse code matrix for each expressive face to perform expression decomposition and recognition. Most of the existing methods for the recognition of faces and expressions consider either the expression-invariant face recognition problem or the identity-independent facial expression recognition problem. We propose joint face and facial expression recognition using a dictionary-based component separation algorithm (DCS). In this approach, the given expressive face is viewed as a superposition of a neutral face component with a facial expression component, which is sparse with respect to the whole image. 
This assumption leads to a dictionary-based component separation algorithm, which benefits from the idea of sparsity and morphological diversity. The DCS algorithm uses data-driven dictionaries to decompose an expressive test face into its constituent components. The sparse codes we obtain as a result of this decomposition are then used for joint face and expression recognition.

    Algorithms for super-resolution of images based on Sparse Representation and Manifolds

    Image super-resolution is defined as a class of techniques that enhance the spatial resolution of images. Super-resolution methods can be subdivided into single-image and multi-image methods. This thesis focuses on developing algorithms based on mathematical theories for single-image super-resolution problems. Indeed, in order to estimate an output image, we adopt a mixed approach: i.e., we use both a dictionary of patches with sparsity constraints (typical of learning-based methods) and regularization terms (typical of reconstruction-based methods). Although the existing methods already perform well, they do not take into account the geometry of the data to: regularize the solution, cluster data samples (samples are often clustered using algorithms with the Euclidean distance as a dissimilarity metric), learn dictionaries (they are often learned using PCA or K-SVD). Thus, state-of-the-art methods still suffer from shortcomings. In this work, we proposed three new methods to overcome these deficiencies. First, we developed SE-ASDS (a structure tensor based regularization term) in order to improve the sharpness of edges. SE-ASDS achieves much better results than many state-of-the-art algorithms. Then, we proposed the AGNN and GOC algorithms for determining a local subset of training samples from which a good local model can be computed for reconstructing a given input test sample, where we take into account the underlying geometry of the data. The AGNN and GOC methods outperform spectral clustering, soft clustering, and geodesic distance based subset selection in most settings. Next, we proposed the aSOB strategy, which takes into account the geometry of the data and the dictionary size. The aSOB strategy outperforms both PCA and PGA methods. Finally, we combine all our methods into a single algorithm, named G2SR. 
Our proposed G2SR algorithm shows better visual and quantitative results (in terms of PSNR, SSIM, and FSIM) when compared to the results of state-of-the-art methods. Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior. Doctoral thesis.

    Sparse Modeling for Image and Vision Processing

    In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision.
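
The sparse coding idea in this abstract (representing a signal as a linear combination of a few dictionary elements) can be illustrated with a minimal sketch. It is not taken from the monograph: it assumes a fixed random dictionary `D`, an l1-penalized least-squares objective, and the iterative shrinkage-thresholding algorithm (ISTA) as one possible solver. All names and parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(D, x, a, lam):
    """Lasso objective: 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

def ista(D, x, lam=0.05, n_iter=2000):
    """Sparse code of x over dictionary D via ISTA (illustrative solver choice)."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of the quadratic term (squared spectral norm of D).
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Assumed toy setup: 20-dimensional signals, 50 unit-norm random atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)

# Signal built from only three atoms.
a_true = np.zeros(50)
a_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
x = D @ a_true

a_hat = ista(D, x)
print("nonzero coefficients:", np.count_nonzero(a_hat))
print("objective:", objective(D, x, a_hat, 0.05))
```

In the dictionary-learning setting the abstract focuses on, `D` would itself be updated from data, typically by alternating a sparse coding step like this one with a dictionary update step.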

    Data-Driven Image Restoration

    Every day, many images are taken by digital cameras, and people demand visually accurate and pleasing results. Noise and blur degrade images captured by modern cameras, and high-level vision tasks (such as segmentation, recognition, and tracking) require high-quality images. Therefore, image restoration, specifically image deblurring and image denoising, is a critical preprocessing step. A fundamental problem in image deblurring is to reliably recover distinct spatial frequencies that have been suppressed by the blur kernel. Existing image deblurring techniques often rely on generic image priors that only help recover part of the frequency spectrum, such as the frequencies near the high end. To this end, we pose the following specific questions: (i) Does class-specific information offer an advantage over existing generic priors for image quality restoration? (ii) If a class-specific prior exists, how should it be encoded into a deblurring framework to recover attenuated image frequencies? Throughout this work, we devise a class-specific prior based on band-pass filter responses and incorporate it into a deblurring strategy. Specifically, we show that the subspace of band-pass filtered images and their intensity distributions serve as useful priors for recovering image frequencies. Next, we present a novel image denoising algorithm that uses an external, category-specific image database. In contrast to existing noisy image restoration algorithms, our method selects clean image “support patches” similar to the noisy patch from an external database. We employ a content-adaptive distribution model for each patch, where we derive the parameters of the distribution from the support patches. Our objective function is composed of a Gaussian fidelity term that imposes category-specific information and a low-rank term that encourages similarity between the noisy and the support patches in a robust manner. 
Finally, we propose to learn a fully convolutional network model that consists of a Chain of Identity Mapping Modules (CIMM) for image denoising. The CIMM structure possesses two distinctive features that are important for the noise removal task. Firstly, each residual unit employs identity mappings as the skip connections and receives pre-activated input to preserve the gradient magnitude propagated in both the forward and backward directions. Secondly, by utilizing dilated kernels for the convolution layers in the residual branch, each neuron in the last convolution layer of each module can observe the full receptive field of the first layer.
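
To make the effect of dilated kernels on the receptive field concrete, the following sketch applies the standard receptive-field formula for stacked stride-1 convolutions. The kernel size and dilation rates are assumptions chosen for illustration and are not the CIMM configuration described in the thesis.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of a stack of stride-1 convolutions.

    Each layer with dilation d and kernel size k widens the receptive
    field by (k - 1) * d pixels.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Hypothetical three-layer stacks of 3x3 convolutions.
plain = receptive_field(3, [1, 1, 1])    # no dilation
dilated = receptive_field(3, [1, 2, 4])  # assumed dilation schedule

print(plain, dilated)  # prints "7 15"
```

With the same number of layers and parameters, the dilated stack covers roughly twice the spatial extent, which is why dilation is a common way to enlarge context in denoising networks.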

    Super-resolution: A comprehensive survey


    Variable Splitting as a Key to Efficient Image Reconstruction

    The problem of reconstruction of digital images from their degraded measurements has always been a problem of central importance in numerous applications of imaging sciences. In real life, acquired imaging data is typically contaminated by various types of degradation phenomena which are usually related to the imperfections of image acquisition devices and/or environmental effects. Accordingly, given the degraded measurements of an image of interest, the fundamental goal of image reconstruction is to recover its close approximation, thereby "reversing" the effect of image degradation. Moreover, the massive production and proliferation of digital data across different fields of applied sciences creates the need for methods of image restoration which would be both accurate and computationally efficient. Developing such methods, however, has never been a trivial task, as improving the accuracy of image reconstruction is generally achieved at the expense of an elevated computational burden. Accordingly, the main goal of this thesis has been to develop an analytical framework which allows one to tackle a wide scope of image reconstruction problems in a computationally efficient manner. To this end, we generalize the concept of variable splitting, as a tool for simplifying complex reconstruction problems through their replacement by a sequence of simpler and therefore easily solvable ones. Moreover, we consider two different types of variable splitting and demonstrate their connection to a number of existing approaches which are currently used to solve various inverse problems. In particular, we refer to the first type of variable splitting as Bregman Type Splitting (BTS) and demonstrate its applicability to the solution of complex reconstruction problems with composite, cross-domain constraints. 
As specific applications of practical importance, we consider the problem of reconstructing diffusion MRI signals from sub-critically sampled, incomplete data, as well as the problem of blind deconvolution of medical ultrasound images. Further, we refer to the second type of variable splitting as Fuzzy Clustering Splitting (FCS) and show its application to the problem of image denoising. Specifically, we demonstrate how this splitting technique allows us to generalize the concept of neighbourhood operation as well as to derive a unifying approach to denoising of imaging data under a variety of different noise scenarios.