27 research outputs found

    Sparse and Redundant Representations for Inverse Problems and Recognition

    Sparse and redundant representation of data enables the description of signals as linear combinations of a few atoms from a dictionary. In this dissertation, we study applications of sparse and redundant representations in inverse problems and object recognition. Furthermore, we propose two novel imaging modalities based on the recently introduced theory of Compressed Sensing (CS). This dissertation consists of four major parts. In the first part of the dissertation, we study a new type of deconvolution algorithm that is based on estimating the image from a shearlet decomposition. Shearlets provide a multi-directional and multi-scale decomposition that has been mathematically shown to represent distributed discontinuities such as edges better than traditional wavelets. We develop a deconvolution algorithm that allows the approximate inversion operator to be controlled on a multi-scale and multi-directional basis. Furthermore, we develop a method, based on generalized cross validation, for automatically determining the noise-shrinkage threshold values for each scale and direction without explicit knowledge of the noise variance. In the second part of the dissertation, we study a reconstruction method that recovers highly undersampled images, assumed to have a sparse representation in a gradient domain, from partial measurement samples collected in the Fourier domain. Our method makes use of a robust generalized Poisson solver that greatly aids in achieving significantly improved performance over similar proposed methods. Experiments demonstrate that this technique handles both random and restricted sampling scenarios more flexibly than its competitors. 
In the third part of the dissertation, we introduce a novel Synthetic Aperture Radar (SAR) imaging modality that can provide a high-resolution map of the spatial distribution of targets and terrain using a significantly reduced number of transmitted and/or received electromagnetic waveforms. We demonstrate that this new imaging scheme requires no new hardware components and allows the aperture to be compressed. It also offers new applications and advantages, including strong resistance to countermeasures and interception, imaging of much wider swaths, and reduced on-board storage requirements. The last part of the dissertation deals with object recognition based on learning dictionaries for simultaneous sparse signal approximation and feature extraction. A dictionary is learned for each object class from the given training examples by minimizing the representation error under a sparseness constraint. A novel test image is then projected onto the span of the atoms in each learned dictionary. The residual vectors, along with the coefficients, are then used for recognition. Applications to illumination-robust face recognition and automatic target recognition are presented.
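The recognition step described above (project a test sample onto the span of each class dictionary, then compare residuals) can be sketched as follows. This is only an illustration under stated assumptions: the dictionaries `D0` and `D1` are hand-made toy matrices rather than learned ones, the projection is a plain least-squares fit without a sparsity constraint, and only the residual norm is used for the decision, whereas the dissertation also uses the coefficients.

```python
import numpy as np

def residual_norm(D, y):
    """Norm of what remains of y after least-squares projection onto span(D)."""
    coeffs, *_ = np.linalg.lstsq(D, y, rcond=None)
    return float(np.linalg.norm(y - D @ coeffs))

def classify(dictionaries, y):
    """Return the index of the class dictionary that leaves the smallest residual."""
    residuals = [residual_norm(D, y) for D in dictionaries]
    return int(np.argmin(residuals))

# Hypothetical toy dictionaries: two classes, two atoms each, signals in R^4.
D0 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
D1 = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

# Test sample that lies mostly in the span of D0.
y = np.array([0.9, 0.2, 0.05, 0.0])
label = classify([D0, D1], y)
```

In this toy case the sample is assigned to the first class because almost all of its energy lies in the span of `D0`.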

    Feature Selection in Image Databases

    Even though the problem of determining the number of features required to provide an acceptable classification performance has been a topic of interest to researchers in the pattern recognition community for a few decades, a formal method for solving this problem still does not exist. For instance, the well-known dimensionality reduction method of principal component analysis (PCA) sorts the features it generates in order of their importance, but it does not provide a mechanism for determining the number of sorted features that need to be retained for a meaningful classification. The discrete wavelet transform (DWT) is another linear transformation used for data compaction, in which the coefficients in the transform domain can be sorted in different orders depending on their importance. However, the question of determining the number of features to be retained for a good classification of the data remains unanswered. The objective of this study is to develop schemes for determining the number of features in the PCA and DWT domains that are sufficient for a classifier to provide the maximum possible classifiability of the samples in these transform domains. The energy content of the DWT and PCA coefficients of practical signals follows a specific pattern. The proposed schemes, by exploiting this property of the signals, develop criteria that are based on maintaining the energy of the ensemble of the feature vectors as their dimensionality is reduced. Within this unifying theme, in this thesis, the problem of dimension reduction is investigated when the features are generated by the linear transformation techniques of the discrete wavelet transform and principal component analysis, and by the nonlinear technique of kernel principal component analysis. The first part of this study is concerned with developing a criterion for determining the number of coefficients when the features are represented as wavelet coefficients. 
The reduction in the dimensionality of the feature vectors is performed by subjecting the matrices of the wavelet coefficients of the data samples to Morton scanning and choosing from these matrices a fixed number of coefficients whose energy content approaches that of the original set of all the samples. In the second part of the thesis, the problem of determining a reduced dimensionality of feature vectors is investigated when the features are PCA generated. The proposed method of finding a reduced dimensionality of feature vectors is based on evaluating a cumulative distance between all the pairs of distinct clusters with a reduced set of features and examining its proximity to the distance when all the features are included. The PCA methods for data classification work well when the distinct clusters are linearly separable. For clusters that are nonlinearly separable, the kernel versions of PCA (KPCA) prove to be more efficient for generating features. The method developed in the second part of this thesis for obtaining the reduced dimensionality of the PCA-based feature vectors cannot be readily extended to the kernel space, because the feature vectors are not available in an explicit form in this space. Therefore, the third part of this study develops a suitable criterion for obtaining a reduced dimensionality of the feature vectors when they are generated by kernel PCA. Extensive experiments are performed on a series of image databases to demonstrate the effectiveness of the criteria developed in this study for predicting the number of features to be retained. It is shown that there is a direct correlation between the expressions developed for the criteria and the classification accuracy as functions of the number of features retained. 
The results of the experiments show that, with the use of the three feature selection techniques, a classifier can provide its maximum classifiability, that is, the classifiability attained with the uncompressed feature vectors, using only a small fraction of the original features. The robustness of the proposed methods is also investigated by applying them to noise-corrupted images.
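As a rough illustration of the energy-preservation idea in the abstract above, the sketch below counts how many principal components are needed to retain a chosen fraction of the ensemble energy. It is the generic cumulative-energy rule, not the criteria developed in the thesis; the function name, the fixed `fraction` threshold, and the toy data are assumptions made for the example.

```python
import numpy as np

def num_components_for_energy(X, fraction=0.95):
    """Smallest number of principal components whose cumulative energy
    (sum of squared singular values) reaches `fraction` of the total."""
    Xc = X - X.mean(axis=0)                  # center the samples
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values, descending
    energy = s ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    k = int(np.searchsorted(cumulative, fraction)) + 1
    return min(k, len(s))                    # guard against rounding at fraction = 1

# Toy data set: energy 18 along the first axis, 2 along the second, 0 along the third.
X = np.array([[3.0, 0.0, 0.0],
              [-3.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, -1.0, 0.0]])

k_85 = num_components_for_energy(X, fraction=0.85)
k_95 = num_components_for_energy(X, fraction=0.95)
```

Here the first component alone carries about 90% of the energy, so the 85% target needs one component and the 95% target needs two.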

    No intruders - securing face biometric systems from spoofing attacks

    Face verification systems have become common as a primary means of authentication over the past few years, and better and more reliable face recognition systems continue to appear. Despite these advances, many security gaps remain open in this domain. One practical challenge is securing face biometric systems against intruder attacks, in which an unauthorized person tries to gain access by presenting counterfeit evidence to the face biometric system. A face biometric system with only a single 2-D camera is unaware that it is facing an attack by an unauthorized person. The idea here is to propose a solution that can be easily integrated into existing systems without deploying any additional hardware. Detecting imposter attempts is still an open research problem, as more sophisticated and advanced spoofing attempts come into play. In this thesis, the problem of securing biometric systems against these unauthorized or spoofing attacks is addressed. Moreover, an independent multi-view face detection framework is also proposed. We propose three different countermeasures that can detect these imposter attempts and can be easily integrated into existing systems. The proposed solutions can run in parallel with the face recognition module. These countermeasures mainly target digital photo, printed photo, and dynamic video attacks. To exploit the characteristics of these attacks, we use a large set of features in the proposed solutions, namely local binary patterns, gray-level co-occurrence matrices, Gabor wavelet features, space-time autocorrelation of gradients, and image-quality-based features. We further perform extensive evaluations of these approaches on two different datasets. A Support Vector Machine (SVM) with a linear kernel and Partial Least Squares (PLS) regression are used as the classifiers. 
The experimental results improve on the current state-of-the-art reference techniques under the same attack categories.
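Of the texture features listed in the abstract above, local binary patterns are the simplest to show in code. The following is a minimal sketch of the basic 3x3, 8-neighbour LBP histogram; the thesis does not specify which LBP variant, neighbourhood, or bit ordering it uses, so those choices (and the function name) are assumptions.

```python
import numpy as np

def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes,
    computed on interior pixels only."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise from the top-left pixel (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

# A flat patch: every neighbour equals the center, so every code is 255.
flat_hist = lbp_histogram(np.full((5, 5), 7.0))

# A single bright pixel: all neighbours are darker, so its code is 0.
peak = np.zeros((3, 3))
peak[1, 1] = 9.0
peak_hist = lbp_histogram(peak)
```

A histogram like this would typically be fed to a classifier such as the linear SVM mentioned in the abstract.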

    Nonlinear Adaptive Diffusion Models for Image Denoising

    Most digital image applications demand high image quality. Unfortunately, images are often degraded by noise during formation, transmission, and recording. Hence, image denoising is an essential processing step preceding visual and automated analyses. However, image denoising methods can reduce image contrast and create block or ring artifacts. In this dissertation, we develop high-performance nonlinear diffusion-based image denoising methods that preserve edges and maintain high visual quality. This is attained through several approaches. First, a nonlinear diffusion is presented with robust M-estimators as diffusivity functions. Second, knowledge of textons derived from Local Binary Patterns (LBP), which unify divergent statistical and structural models of region analysis, is used to adjust the time step of the diffusion process. Next, the role of nonlinear diffusion that is adaptive to the local context in the wavelet domain is investigated, and the stationary wavelet context based diffusion (SWCD) is developed for performing the iterative shrinkage. Finally, we develop a locally- and feature-adaptive diffusion (LFAD) method, in which each image patch/region is diffused individually and the diffusivity function is modified to incorporate the Inverse Difference Moment as a local estimate of the gradient. Experiments have been conducted to evaluate the performance of each of the developed methods and to compare it with the reference group and with state-of-the-art methods.
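To make the notion of edge-preserving nonlinear diffusion concrete, here is a minimal sketch of the classic Perona-Malik scheme on a 4-neighbour grid. It is not one of the methods developed in the dissertation: the diffusivity `1 / (1 + (d / kappa)^2)`, the parameter values, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def diffusivity(d, kappa):
    """Edge-stopping function: close to 1 for small differences, close to 0 for large ones."""
    return 1.0 / (1.0 + (d / kappa) ** 2)

def diffuse(img, n_iter=20, kappa=0.1, dt=0.2):
    """Explicit nonlinear diffusion; dt <= 0.25 keeps the 4-neighbour scheme stable."""
    u = np.asarray(img, dtype=float).copy()
    for _ in range(n_iter):
        p = np.pad(u, 1, mode="edge")  # replicate borders (zero flux at the boundary)
        # Differences to the north, south, west, and east neighbours.
        diffs = (p[:-2, 1:-1] - u, p[2:, 1:-1] - u,
                 p[1:-1, :-2] - u, p[1:-1, 2:] - u)
        u = u + dt * sum(diffusivity(d, kappa) * d for d in diffs)
    return u

# Synthetic test image: a vertical step edge plus mild Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0
noisy = clean + 0.02 * rng.standard_normal(clean.shape)
denoised = diffuse(noisy)
```

With these settings the small noise differences are smoothed almost like linear diffusion, while the large difference across the step edge receives a diffusivity near zero, so the edge is largely kept.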

    Robust density modelling using the student's t-distribution for human action recognition

    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model since it is highly sensitive to outliers. The Gaussian distribution is also often used as base component of graphical models for recognising human actions in the videos (hidden Markov model and others) and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show how experiments over two well-known datasets (Weizmann, MuHAVi) reported a remarkable improvement in classification accuracy. © 2011 IEEE
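The robustness argument above can be checked numerically by comparing log-densities at an outlying observation. The sketch below uses single univariate densities with standard formulas; the paper itself uses mixtures of t-distributions as HMM observation models, and the degrees of freedom `nu = 3` and the test point are assumptions for illustration.

```python
import math

def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a univariate Gaussian."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def student_t_logpdf(x, nu=3.0, mu=0.0, sigma=1.0):
    """Log-density of a univariate location-scale Student's t-distribution."""
    z = (x - mu) / sigma
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi) - math.log(sigma))
    return log_norm - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)

# An observation eight scale units away from the mean (a clear outlier).
outlier = 8.0
gauss_ll = gaussian_logpdf(outlier)
t_ll = student_t_logpdf(outlier)
```

The Gaussian log-density falls off quadratically with distance while the t log-density falls off only logarithmically, so a single outlier penalizes a t-based model far less and pulls its parameter estimates less strongly.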

    Enhanced processing methods for light field imaging

    The light field camera provides rich textural and geometric information, but it is still challenging to use it efficiently and accurately to solve computer vision problems. Light field image processing is divided into multiple levels. First, low-level processing mainly includes the acquisition of light field images and their preprocessing. Second, middle-level processing consists of depth estimation, light field encoding, and the extraction of cues from the light field. Third, high-level processing involves 3D reconstruction, target recognition, visual odometry, image reconstruction, and other advanced applications. We propose a series of improved algorithms for each of these levels. The light field signal contains rich angular information. By contrast, traditional computer vision methods, as used for 2D images, often cannot make full use of the high-frequency part of the light field angular information. We propose a fast pre-estimation algorithm that enhances the light field features to improve speed and accuracy while making full use of the angular information. Light field filtering and refocusing are essential operations in light field signal processing. Modern frequency domain filtering and wavelet techniques have effectively improved light field filtering accuracy but may fail at object edges. We adapt sub-window filtering to the light field to improve the reconstruction of object edges. Light field images can be used to analyze the effects of scattering and refraction phenomena, but there are still insufficient metrics to evaluate the results. Therefore, we propose a physical rendering-based light field dataset that simulates the distorted light field image seen through a transparent medium, such as atmospheric turbulence or a water surface. Neural networks are an essential tool for processing complex light field data. We propose an efficient 3D convolutional autoencoder network for the light field structure. 
This network overcomes the severe distortion caused by high-intensity turbulence with limited angular resolution and solves the difficulty of pixel matching between distorted images. This work emphasizes the application and usefulness of light field imaging in computer vision whilst improving light field image processing speed and accuracy through signal processing, computer graphics, computer vision, and artificial neural networks.

    Restoration and Domain Adaptation for Unconstrained Face Recognition

    Face recognition (FR) has received great attention, and tremendous progress has been made during the past two decades. While FR at close range under controlled acquisition conditions has achieved a high level of performance, FR at a distance in unconstrained environments remains a largely unsolved problem. This is because images collected from a distance usually suffer from blur, poor illumination, pose variation, and similar degradations. In this dissertation, we present models and algorithms to compensate for these variations and improve the performance of FR at a distance. Blur is a common factor contributing to the degradation of images collected from a distance, e.g., defocus blur due to long-range acquisition and motion blur due to movement of subjects. To address it, we study the image deconvolution problem. This is an ill-posed problem, and solutions are usually obtained by exploiting prior information about the desired output image to reduce ambiguity, typically through the Bayesian framework. In this dissertation, we consider the role of an example-driven manifold prior in addressing the deconvolution problem. Specifically, we incorporate unlabeled image data of the object class in the form of a patch manifold to effectively regularize the inverse problem. We propose both parametric and non-parametric approaches to implicitly estimate the manifold prior from the given unlabeled data. Extensive experiments show that our method performs better than many competitive image deconvolution methods. More often, variations in images collected at a distance are difficult to address through physical models of individual degradations. For this problem, we utilize domain adaptation methods to adapt recognition systems to the test data. Domain adaptation addresses the problem where data instances of a source domain have a different distribution from those of a target domain. We focus on the unsupervised domain adaptation problem, where labeled data are not available in the target domain. 
We propose to interpolate subspaces through dictionary learning to link the source and target domains. These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross-domain recognition. Experimental results on publicly available datasets demonstrate the effectiveness of our approach for face recognition across pose, blur, and illumination variations, and for cross-dataset object classification. Most existing domain adaptation methods assume a homogeneous source domain, which is usually modeled by a single subspace. Yet in practice, we are often given mixed source data with different inner characteristics. Modeling these source data as a single domain could deteriorate the adaptation performance, as the adaptation procedure needs to account for the large within-class variations in the source domain. For this problem, we propose two approaches to mitigate the heterogeneity in source data. We first present an approach for selecting a subset of source samples that is more similar to the target domain, to avoid negative knowledge transfer. We then consider the scenario in which the heterogeneous source data are due to multiple latent domains. For this purpose, we derive a domain clustering framework to recover the latent domains for improved adaptation. Moreover, we formulate submodular objective functions that can be solved by an efficient greedy method. Experimental results show that our approaches compare favorably with the state of the art.

    Irish Machine Vision and Image Processing Conference Proceedings 2017


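    Análise de propriedades intrínsecas e extrínsecas de amostras biométricas para detecção de ataques de apresentação (Analysis of intrinsic and extrinsic properties of biometric samples for presentation attack detection)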

    Advisors: Anderson de Rezende Rocha, Hélio Pedrini. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Computação. Abstract: Recent advances in biometrics, information forensics, and security have improved the recognition effectiveness of biometric systems. However, a still-open challenge is the vulnerability of such systems to presentation attacks, in which impostor users create synthetic samples from the original biometric information of a legitimate user and present them to the acquisition sensor, seeking to authenticate themselves as legitimate users. Depending on the biometric trait used for authentication, the attack types vary with the type of material used to build the synthetic samples. For instance, in facial biometric systems, an attempted attack is characterized by an impostor presenting to the acquisition sensor a photograph, a digital video, or a 3D mask with the facial information of a target user. In iris-based biometrics, presentation attacks can be carried out with printed photographs or with contact lenses containing the iris patterns of a target user, or even synthetic texture patterns. In fingerprint biometric systems, impostor users can deceive the biometric sensor using replicas of the fingerprint patterns built with synthetic materials such as latex, play-doh, and silicone, among others. This research aimed at developing presentation attack detection (PAD) solutions for facial, iris, and fingerprint biometric systems, covering the different attack types in each modality. 
The lines of investigation presented in this thesis aimed at devising and developing representations based on spatial, temporal, and spectral information from the noise signature; on intrinsic properties of the biometric data (e.g., albedo, reflectance, and depth maps); and on supervised feature learning techniques, taking into account different testing scenarios, including cross-sensor, intra-dataset, and inter-dataset scenarios. The main findings and contributions presented in this thesis include: the creation of a large, publicly available benchmark containing approximately 17K videos of simulated presentation attacks and bona fide presentations in a facial biometric system, whose collection was formally authorized by the Research Ethics Committee at Unicamp; the development of novel approaches to modeling and analyzing extrinsic properties of biometric samples related to artifacts added during the manufacturing of the synthetic samples and their capture by the acquisition sensor, whose results were superior to several approaches published in the literature that use traditional methods for image analysis (e.g., texture-based analysis); the investigation of an approach based on the analysis of intrinsic properties of faces, estimated from the information of shadows present on their surface; and the investigation of different approaches based on convolutional neural networks for automatically learning representations related to our problem, whose results were superior or competitive to state-of-the-art methods for the biometric modalities considered in this thesis. 
We also considered in this research the design of efficient neural networks with shallow architectures capable of learning characteristics related to our problem from the small sets of data available to develop and evaluate PAD solutions. Degree: Doctorate in Computer Science (Doutor em Ciência da Computação). Grant numbers: 140069/2016-0 CNPq, 142110/2017-5. Funding agencies: CAPES, CNP