183 research outputs found

    Performance analysis of different matrix decomposition methods on face recognition

    Get PDF
    Applications using face biometric are ubiquitous in various domains. We propose an efficient method using Discrete Wavelet Transform (DWT), Extended Directional Binary codes (EDBC), three matrix decompositions and Singular Value Decomposition (SVD) for face recognition. The combined effect of Schur, Hessenberg and QR matrix decompositions are utilized with existing algorithm. The discrimination power between two different persons is justified using Average Overall Deviation (AOD) parameter. Fused EDBC and SVD features are considered for performance calculation. City-block and Euclidean Distance (ED) measure is used for matching. Performance is improved on YALE, GTAV and ORL face databases compared with existing methods

    Performance analysis of different matrix decomposition methods on face recognition

    Full text link

    Contribution to supervised representation learning: algorithms and applications.

    Get PDF
    278 p.In this thesis, we focus on supervised learning methods for pattern categorization. In this context, itremains a major challenge to establish efficient relationships between the discriminant properties of theextracted features and the inter-class sparsity structure.Our first attempt to address this problem was to develop a method called "Robust Discriminant Analysiswith Feature Selection and Inter-class Sparsity" (RDA_FSIS). This method performs feature selectionand extraction simultaneously. The targeted projection transformation focuses on the most discriminativeoriginal features while guaranteeing that the extracted (or transformed) features belonging to the sameclass share a common sparse structure, which contributes to small intra-class distances.In a further study on this approach, some improvements have been introduced in terms of theoptimization criterion and the applied optimization process. In fact, we proposed an improved version ofthe original RDA_FSIS called "Enhanced Discriminant Analysis with Class Sparsity using GradientMethod" (EDA_CS). The basic improvement is twofold: on the first hand, in the alternatingoptimization, we update the linear transformation and tune it with the gradient descent method, resultingin a more efficient and less complex solution than the closed form adopted in RDA_FSIS.On the other hand, the method could be used as a fine-tuning technique for many feature extractionmethods. The main feature of this approach lies in the fact that it is a gradient descent based refinementapplied to a closed form solution. This makes it suitable for combining several extraction methods andcan thus improve the performance of the classification process.In accordance with the above methods, we proposed a hybrid linear feature extraction scheme called"feature extraction using gradient descent with hybrid initialization" (FE_GD_HI). This method, basedon a unified criterion, was able to take advantage of several powerful linear discriminant methods. Thelinear transformation is computed using a descent gradient method. The strength of this approach is thatit is generic in the sense that it allows fine tuning of the hybrid solution provided by different methods.Finally, we proposed a new efficient ensemble learning approach that aims to estimate an improved datarepresentation. The proposed method is called "ICS Based Ensemble Learning for Image Classification"(EM_ICS). Instead of using multiple classifiers on the transformed features, we aim to estimate multipleextracted feature subsets. These were obtained by multiple learned linear embeddings. Multiple featuresubsets were used to estimate the transformations, which were ranked using multiple feature selectiontechniques. The derived extracted feature subsets were concatenated into a single data representationvector with strong discriminative properties.Experiments conducted on various benchmark datasets ranging from face images, handwritten digitimages, object images to text datasets showed promising results that outperformed the existing state-ofthe-art and competing methods

    Detección y localización automática de la superficie facial en imágenes 3d de una profundidad

    Get PDF
    La detección del rostro es uno de los primeros pasos en las aplicaciones de procesamiento facial, cuyo propósito es identificar y localizar todos los rostros que se encuentran en una imagen, independientemente de su posición, escala y orientaciones. En esta tesis se reporta la investigación realizada en la Maestría en Ciencias de la Ingeniería de la Universidad Autónoma del Estado de México para detectar y localizar automáticamente la superficie facial utilizando imágenes 3D de una profundidad con variaciones en número y posición de las personas en la escena. Para esto se revisó la literatura relacionada y se discute la problemática actual en la detección de la superficie facial en imágenes 3D. Se propone y evalúa experimentalmente una técnica para la detección de la superficie facial automáticamente con variaciones en el número de usuarios y su posición con respecto a la cámara. Esta técnica consta de cuatro etapas: a) Eliminación de partes planas en la imagen 3D utilizando el algoritmo MSAC, b) Análisis de curvatura para detectar puntos convexos, c) Selección automática de la punta de la nariz utilizando SpinImages y d) Clasificación rostro o no rostro. En esta investigación se utilizó el sensor Kinect One™, para colectar una base de datos de 1020 imágenes 2D/3D variando el número y posición en la escena de treinta y dos sujetos. Las imágenes 3D colectadas fueron utilizadas para la evaluación de nuestro proceso de detección de la superficie facial. Este proceso ha sido capaz de detectar con éxito el 91% de los rostros en las imágenes colectadas que contienen más de un sujeto en la escena, y el 100% y 98% de los rostros contenidos en las bases de datos Face Recognition Grand Challenge y CurtinFaces, respectivamente. En ese sentido, se obtuvieron resultados que son alentadores para abordar el problema general en la detección de la superficie facial en condiciones no controladas, con más de una persona en la escena y variación en su posición con el objetivo de crear sistemas de procesamiento facial automáticos.CONACYT (CVU: 634473

    Machine Learning

    Get PDF
    Machine Learning can be defined in various ways related to a scientific domain concerned with the design and development of theoretical and implementation tools that allow building systems with some Human Like intelligent behavior. Machine learning addresses more specifically the ability to improve automatically through experience

    Efficient Learning Framework for Training Deep Learning Models with Limited Supervision

    Get PDF
    In recent years, deep learning has shown tremendous success in different applications, however these modes mostly need a large labeled dataset for training their parameters. In this work, we aim to explore the potentials of efficient learning frameworks for training deep models on different problems in the case of limited supervision or noisy labels. For the image clustering problem, we introduce a new deep convolutional autoencoder with an unsupervised learning framework. We employ a relative entropy minimization as the clustering objective regularized by the frequency of cluster assignments and a reconstruction loss. In the case of noisy labels obtained by crowdsourcing platforms, we proposed a novel deep hybrid model for sentiment analysis of text data like tweets based on noisy crowd labels. The proposed model consists of a crowdsourcing aggregation model and a deep text autoencoder. We combine these sub-models based on a probabilistic framework rather than a heuristic way, and derive an efficient optimization algorithm to jointly solve the corresponding problem. In order to improve the performance of unsupervised deep hash functions on image similarity search in big datasets, we adopt generative adversarial networks to propose a new deep image retrieval model, where the adversarial loss is employed as a data-dependent regularization in our objective function. We also introduce a balanced self-paced learning algorithm for training a GAN-based model for image clustering, where the input samples are gradually included into training from easy to difficult, while the diversity of selected samples from all clusters are also considered. In addition, we explore adopting discriminative approaches for unsupervised visual representation learning rather than the generative algorithms, such as maximizing the mutual information between an input image and its representation and a contrastive loss for decreasing the distance between the representations of original and augmented image data

    Reconnaissance de visage robuste aux occultations

    Get PDF
    Face recognition is an important technology in computer vision, which often acts as an essential component in biometrics systems, HCI systems, access control systems, multimedia indexing applications, etc. Partial occlusion, which significantly changes the appearance of part of a face, cannot only cause large performance deterioration of face recognition, but also can cause severe security issues. In this thesis, we focus on the occlusion problem in automatic face recognition in non-controlled environments. Toward this goal, we propose a framework that consists of applying explicit occlusion analysis and processing to improve face recognition under different occlusion conditions. We demonstrate in this thesis that the proposed framework is more efficient than the methods based on non-explicit occlusion treatments from the literature. We identify two new types of facial occlusions, namely the sparse occlusion and dynamic occlusion. Solutions are presented to handle the identified occlusion problems in more advanced surveillance context. Recently, the emerging Kinect sensor has been successfully applied in many computer vision fields. We introduce this new sensor in the context of face recognition, particularly in presence of occlusions, and demonstrate its efficiency compared with traditional 2D cameras. Finally, we propose two approaches based on 2D and 3D to improve the baseline face recognition techniques. Improving the baseline methods can also have the positive impact on the recognition results when partial occlusion occurs.La reconnaissance faciale est une technologie importante en vision par ordinateur, avec un rôle central en biométrie, interface homme-machine, contrôle d’accès, indexation multimédia, etc. L’occultation partielle, qui change complétement l’apparence d’une partie du visage, ne provoque pas uniquement une dégradation des performances en reconnaissance faciale, mai peut aussi avoir des conséquences en termes de sécurité. Dans cette thèse, nous concentrons sur le problème des occultations en reconnaissance faciale en environnements non contrôlés. Nous proposons une séquence qui consiste à analyser de manière explicite les occultations et à fiabiliser la reconnaissance faciale soumises à diverses occultations. Nous montrons dans cette thèse que l’approche proposée est plus efficace que les méthodes de l’état de l’art opérant sans traitement explicite dédié aux occultations. Nous identifions deux nouveaux types d’occultations, à savoir éparses et dynamiques. Des solutions sont introduites pour gérer ces problèmes d’occultation nouvellement identifiés dans un contexte de vidéo surveillance avancé. Récemment, le nouveau capteur Kinect a été utilisé avec succès dans de nombreuses applications en vision par ordinateur. Nous introduisons ce nouveau capteur dans le contexte de la reconnaissance faciale, en particulier en présence d’occultations, et démontrons son efficacité par rapport aux caméras traditionnelles. Finalement, nous proposons deux approches basées 2D et 3D permettant d’améliorer les techniques de base en reconnaissance de visages. L’amélioration des méthodes de base peut alors générer un impact positif sur les résultats de reconnaissance en présence d’occultations

    Learning from noisy data through robust feature selection, ensembles and simulation-based optimization

    Get PDF
    The presence of noise and uncertainty in real scenarios makes machine learning a challenging task. Acquisition errors or missing values can lead to models that do not generalize well on new data. Under-fitting and over-fitting can occur because of feature redundancy in high-dimensional problems as well as data scarcity. In these contexts the learning task can show difficulties in extracting relevant and stable information from noisy features or from a limited set of samples with high variance. In some extreme cases, the presence of only aggregated data instead of individual samples prevents the use of instance-based learning. In these contexts, parametric models can be learned through simulations to take into account the inherent stochastic nature of the processes involved. This dissertation includes contributions to different learning problems characterized by noise and uncertainty. In particular, we propose i) a novel approach for robust feature selection based on the neighborhood entropy, ii) an approach based on ensembles for robust salary prediction in the IT job market, and iii) a parametric simulation-based approach for dynamic pricing and what-if analyses in hotel revenue management when only aggregated data are available

    Neural network based image representation for small scale object recognition

    Get PDF
    Object recognition can be abstractedly viewed as a two-stage process. The features learning stage selects key information that can represent the input image in a compact, robust, and discriminative manner in some feature space. Then the classification stage learns the rules to differentiate object classes based on the representations of their images in feature space. Consequently, if the first stage can produce a highly separable features set, simple and cost-effective classifiers can be used to make the recognition system more applicable in practice. Features, or representations, used to be engineered manually with different assumptions about the data population to limit the complexity in a manageable range. As more practical problems are tackled, those assumptions are no longer valid, and so are the representations built on them. More parameters and test cases have to be considered in those new challenges, that causes manual engineering to become too complicated. Machine learning approaches ease those difficulties by allowing computer to learn to identify the appropriate representation automatically. As the number of parameters increases with the divergence of data, it is always beneficial to eliminate irrelevant information from input data to reduce the complexity of learning. Chapter 3 of the thesis reports the study case where removal of colour leads to an improvement in recognition accuracy. Deep learning appears to be a very strong representation learner with new achievements coming in monthly basic. While training the phase of deep structures requires huge amount of data, tremendous calculation, and careful calibration, the inferencing phase is affordable and straightforward. Utilizing knowledge in trained deep networks is therefore promising for efficient feature extraction in smaller systems. Many approaches have been proposed under the name of “transfer learning”, aimed to take advantage of that “deep knowledge”. However, the results achieved so far could be classified as a learning room for improvement. Chapter 4 presents a new method to utilize a trained deep convolutional structure as a feature extractor and achieved state-of-the-art accuracy on the Washington RGBD dataset. Despite some good results, the potential of transfer learning is just barely exploited. On one hand, a dimensionality reduction can be used to make the deep neural network representation even more computationally efficient and allow a wider range of use cases. Inspired by the structure of the network itself, a new random orthogonal projection method for the dimensionality reduction is presented in the first half of Chapter 5. The t-SNE mimicking neural network for low-dimensional embedding is also discussed in this part with promising results. In another approach, feature encoding can be used to improve deep neural network features for classification applications. Thanks to the spatially organized structure, deep neural network features can be considered as local image descriptors, and thus the traditional feature encoding approaches such as the Fisher vector can be applied to improve those features. This method combines the advantages of both discriminative learning and generative learning to boost the features performance in difficult scenarios such as when data is noisy or incomplete. The problem of high dimensionality in deep neural network features is alleviated with the use of the Fisher vector based on sparse coding, where infinite number of Gaussian mixtures was used to model the feature space. In the second half of Chapter 5, the regularized Fisher encoding was shown to be effective in improving classification results on difficult classes. Also, the low cost incremental k-means learning was shown to be a potential dictionary learning approach that can be used to replace the slow and computationally expensive sparse coding method
    corecore