89 research outputs found

    Biometric face recognition using multilinear projection and artificial intelligence

    Get PDF
    PhD ThesisNumerous problems of automatic facial recognition in the linear and multilinear subspace learning have been addressed; nevertheless, many difficulties remain. This work focuses on two key problems for automatic facial recognition and feature extraction: object representation and high dimensionality. To address these problems, a bidirectional two-dimensional neighborhood preserving projection (B2DNPP) approach for human facial recognition has been developed. Compared with 2DNPP, the proposed method operates on 2-D facial images and performs reductions on the directions of both rows and columns of images. Furthermore, it has the ability to reveal variations between these directions. To further improve the performance of the B2DNPP method, a new B2DNPP based on the curvelet decomposition of human facial images is introduced. The curvelet multi- resolution tool enhances the edges representation and other singularities along curves, and thus improves directional features. In this method, an extreme learning machine (ELM) classifier is used which significantly improves classification rate. The proposed C-B2DNPP method decreases error rate from 5.9% to 3.5%, from 3.7% to 2.0% and from 19.7% to 14.2% using ORL, AR, and FERET databases compared with 2DNPP. Therefore, it achieves decreases in error rate more than 40%, 45%, and 27% respectively with the ORL, AR, and FERET databases. Facial images have particular natural structures in the form of two-, three-, or even higher-order tensors. Therefore, a novel method of supervised and unsupervised multilinear neighborhood preserving projection (MNPP) is proposed for face recognition. This allows the natural representation of multidimensional images 2-D, 3-D or higher-order tensors and extracts useful information directly from tensotial data rather than from matrices or vectors. As opposed to a B2DNPP which derives only two subspaces, in the MNPP method multiple interrelated subspaces are obtained over different tensor directions, so that the subspaces are learned iteratively by unfolding the tensor along the different directions. The performance of the MNPP has performed in terms of the two modes of facial recognition biometrics systems of identification and verification. The proposed supervised MNPP method achieved decrease over 50.8%, 75.6%, and 44.6% in error rate using ORL, AR, and FERET databases respectively, compared with 2DNPP. Therefore, the results demonstrate that the MNPP approach obtains the best overall performance in various learning scenarios

    A Survey on Ear Biometrics

    No full text
    Recognizing people by their ear has recently received significant attention in the literature. Several reasons account for this trend: first, ear recognition does not suffer from some problems associated with other non contact biometrics, such as face recognition; second, it is the most promising candidate for combination with the face in the context of multi-pose face recognition; and third, the ear can be used for human recognition in surveillance videos where the face may be occluded completely or in part. Further, the ear appears to degrade little with age. Even though, current ear detection and recognition systems have reached a certain level of maturity, their success is limited to controlled indoor conditions. In addition to variation in illumination, other open research problems include hair occlusion; earprint forensics; ear symmetry; ear classification; and ear individuality. This paper provides a detailed survey of research conducted in ear detection and recognition. It provides an up-to-date review of the existing literature revealing the current state-of-art for not only those who are working in this area but also for those who might exploit this new approach. Furthermore, it offers insights into some unsolved ear recognition problems as well as ear databases available for researchers

    Applying novel machine learning technology to optimize computer-aided detection and diagnosis of medical images

    Get PDF
    The purpose of developing Computer-Aided Detection (CAD) schemes is to assist physicians (i.e., radiologists) in interpreting medical imaging findings and reducing inter-reader variability more accurately. In developing CAD schemes, Machine Learning (ML) plays an essential role because it is widely used to identify effective image features from complex datasets and optimally integrate them with the classifiers, which aims to assist the clinicians to more accurately detect early disease, classify disease types and predict disease treatment outcome. In my dissertation, in different studies, I assess the feasibility of developing several novel CAD systems in the area of medical imaging for different purposes. The first study aims to develop and evaluate a new computer-aided diagnosis (CADx) scheme based on analysis of global mammographic image features to predict the likelihood of cases being malignant. CADx scheme is applied to pre-process mammograms, generate two image maps in the frequency domain using discrete cosine transform and fast Fourier transform, compute bilateral image feature differences from left and right breasts, and apply a support vector machine (SVM) method to predict the likelihood of the case being malignant. This study demonstrates the feasibility of developing a new global image feature analysis based CADx scheme of mammograms with high performance. This new CADx approach is more efficient in development and potentially more robust in future applications by avoiding difficulty and possible errors in breast lesion segmentation. In the second study, to automatically identify a set of effective mammographic image features and build an optimal breast cancer risk stratification model, I investigate advantages of applying a machine learning approach embedded with a locally preserving projection (LPP) based feature combination and regeneration algorithm to predict short-term breast cancer risk. To this purpose, a computer-aided image processing scheme is applied to segment fibro-glandular tissue depicted on mammograms and initially compute 44 features related to the bilateral asymmetry of mammographic tissue density distribution between left and right breasts. Next, an embedded LLP algorithm optimizes the feature space and regenerates a new operational vector with 4 features using a maximal variance approach. This study demonstrates that applying the LPP algorithm effectively reduces feature dimensionality, and yields higher and potentially more robust performance in predicting short-term breast cancer risk. In the third study, to more precisely classify malignant lesions, I investigate the feasibility of applying a random projection algorithm to build an optimal feature vector from the initially CAD-generated large feature pool and improve the performance of the machine learning model. In this process, a CAD scheme is first applied to segment mass regions and initially compute 181 features. An SVM model embedded with the feature dimensionality reduction method is then built to predict the likelihood of lesions being malignant. This study demonstrates that the random project algorithm is a promising method to generate optimal feature vectors to improve the performance of machine learning models of medical images. The last study aims to develop and test a new CAD scheme of chest X-ray images to detect coronavirus (COVID-19) infected pneumonia. To this purpose, the CAD scheme first applies two image preprocessing steps to remove the majority of diaphragm regions, process the original image using a histogram equalization algorithm, and a bilateral low-pass filter. Then, the original image and two filtered images are used to form a pseudo color image. This image is fed into three input channels of a transfer learning-based convolutional neural network (CNN) model to classify chest X-ray images into 3 classes of COVID-19 infected pneumonia, other community-acquired no-COVID-19 infected pneumonia, and normal (non-pneumonia) cases. This study demonstrates that adding two image preprocessing steps and generating a pseudo color image plays an essential role in developing a deep learning CAD scheme of chest X-ray images to improve accuracy in detecting COVID-19 infected pneumonia. In summary, I developed and presented several image pre-processing algorithms, feature extraction methods, and data optimization techniques to present innovative approaches for quantitative imaging markers based on machine learning systems in all these studies. The studies' simulation and results show the discriminative performance of the proposed CAD schemes on different application fields helpful to assist radiologists on their assessments in diagnosing disease and improve their overall performance

    Individual and Inter-related Action Unit Detection in Videos for Affect Recognition

    Get PDF
    The human face has evolved to become the most important source of non-verbal information that conveys our affective, cognitive and mental state to others. Apart from human to human communication facial expressions have also become an indispensable component of human-machine interaction (HMI). Systems capable of understanding how users feel allow for a wide variety of applications in medical, learning, entertainment and marketing technologies in addition to advancements in neuroscience and psychology research and many others. The Facial Action Coding System (FACS) has been built to objectively define and quantify every possible facial movement through what is called Action Units (AU), each representing an individual facial action. In this thesis we focus on the automatic detection and exploitation of these AUs using novel appearance representation techniques as well as incorporation of the prior co-occurrence information between them. Our contributions can be grouped in three parts. In the first part, we propose to improve the detection accuracy of appearance features based on local binary patterns (LBP) for AU detection in videos. For this purpose, we propose two novel methodologies. The first one uses three fundamental image processing tools as a pre-processing step prior to the application of the LBP transform on the facial texture. These tools each enhance the descriptive ability of LBP by emphasizing different transient appearance characteristics, and are proven to increase the AU detection accuracy significantly in our experiments. The second one uses multiple local curvature Gabor binary patterns (LCGBP) for the same problem and achieves state-of-the-art performance on a dataset of mostly posed facial expressions. The curvature information of the face, as well as the proposed multiple filter size scheme is very effective in recognizing these individual facial actions. In the second part, we propose to take advantage of the co-occurrence relation between the AUs, that we can learn through training examples. We use this information in a multi-label discriminant Laplacian embedding (DLE) scheme to train our system with SIFT features extracted around the salient and transient landmarks on the face. The system is first validated on a challenging (containing lots of occlusions and head pose variations) dataset without the DLE, then we show the performance of the full system on the FERA 2015 challenge on AU occurence detection. The challenge consists of two difficult datasets that contain spontaneous facial actions at different intensities. We demonstrate that our proposed system achieves the best results on these datasets for detecting AUs. The third and last part of the thesis contains an application on how this automatic AU detection system can be used in real-life situations, particularly for detecting cognitive distraction. Our contribution in this part is two-fold: First, we present a novel visual database of people driving a simulator while being induced visual and cognitive distraction via secondary tasks. The subjects have been recorded using three near-infrared camera-lighting systems, which makes it a very suitable configuration to use in real driving conditions, i.e. with large head pose and ambient light variations. Secondly, we propose an original framework to automatically discriminate cognitive distraction sequences from baseline sequences by extracting features from continuous AU signals and by exploiting the cross-correlations between them. We achieve a very high classification accuracy in our subject-based experiments and a lower yet acceptable performance for the subject-independent tests. Based on these results we discuss how facial expressions related to this complex mental state are individual, rather than universal, and also how the proposed system can be used in a vehicle to help decrease human error in traffic accidents

    Reconnaissance de visage robuste aux occultations

    Get PDF
    Face recognition is an important technology in computer vision, which often acts as an essential component in biometrics systems, HCI systems, access control systems, multimedia indexing applications, etc. Partial occlusion, which significantly changes the appearance of part of a face, cannot only cause large performance deterioration of face recognition, but also can cause severe security issues. In this thesis, we focus on the occlusion problem in automatic face recognition in non-controlled environments. Toward this goal, we propose a framework that consists of applying explicit occlusion analysis and processing to improve face recognition under different occlusion conditions. We demonstrate in this thesis that the proposed framework is more efficient than the methods based on non-explicit occlusion treatments from the literature. We identify two new types of facial occlusions, namely the sparse occlusion and dynamic occlusion. Solutions are presented to handle the identified occlusion problems in more advanced surveillance context. Recently, the emerging Kinect sensor has been successfully applied in many computer vision fields. We introduce this new sensor in the context of face recognition, particularly in presence of occlusions, and demonstrate its efficiency compared with traditional 2D cameras. Finally, we propose two approaches based on 2D and 3D to improve the baseline face recognition techniques. Improving the baseline methods can also have the positive impact on the recognition results when partial occlusion occurs.La reconnaissance faciale est une technologie importante en vision par ordinateur, avec un rôle central en biométrie, interface homme-machine, contrôle d’accès, indexation multimédia, etc. L’occultation partielle, qui change complétement l’apparence d’une partie du visage, ne provoque pas uniquement une dégradation des performances en reconnaissance faciale, mai peut aussi avoir des conséquences en termes de sécurité. Dans cette thèse, nous concentrons sur le problème des occultations en reconnaissance faciale en environnements non contrôlés. Nous proposons une séquence qui consiste à analyser de manière explicite les occultations et à fiabiliser la reconnaissance faciale soumises à diverses occultations. Nous montrons dans cette thèse que l’approche proposée est plus efficace que les méthodes de l’état de l’art opérant sans traitement explicite dédié aux occultations. Nous identifions deux nouveaux types d’occultations, à savoir éparses et dynamiques. Des solutions sont introduites pour gérer ces problèmes d’occultation nouvellement identifiés dans un contexte de vidéo surveillance avancé. Récemment, le nouveau capteur Kinect a été utilisé avec succès dans de nombreuses applications en vision par ordinateur. Nous introduisons ce nouveau capteur dans le contexte de la reconnaissance faciale, en particulier en présence d’occultations, et démontrons son efficacité par rapport aux caméras traditionnelles. Finalement, nous proposons deux approches basées 2D et 3D permettant d’améliorer les techniques de base en reconnaissance de visages. L’amélioration des méthodes de base peut alors générer un impact positif sur les résultats de reconnaissance en présence d’occultations

    Automatic handwriter identification using advanced machine learning

    Get PDF
    Handwriter identification a challenging problem especially for forensic investigation. This topic has received significant attention from the research community and several handwriter identification systems were developed for various applications including forensic science, document analysis and investigation of the historical documents. This work is part of an investigation to develop new tools and methods for Arabic palaeography, which is is the study of handwritten material, particularly ancient manuscripts with missing writers, dates, and/or places. In particular, the main aim of this research project is to investigate and develop new techniques and algorithms for the classification and analysis of ancient handwritten documents to support palaeographic studies. Three contributions were proposed in this research. The first is concerned with the development of a text line extraction algorithm on colour and greyscale historical manuscripts. The idea uses a modified bilateral filtering approach to adaptively smooth the images while still preserving the edges through a nonlinear combination of neighboring image values. The proposed algorithm aims to compute a median and a separating seam and has been validated to deal with both greyscale and colour historical documents using different datasets. The results obtained suggest that our proposed technique yields attractive results when compared against a few similar algorithms. The second contribution proposes to deploy a combination of Oriented Basic Image features and the concept of graphemes codebook in order to improve the recognition performances. The proposed algorithm is capable to effectively extract the most distinguishing handwriter’s patterns. The idea consists of judiciously combining a multiscale feature extraction with the concept of grapheme to allow for the extraction of several discriminating features such as handwriting curvature, direction, wrinkliness and various edge-based features. The technique was validated for identifying handwriters using both Arabic and English writings captured as scanned images using the IAM dataset for English handwriting and ICFHR 2012 dataset for Arabic handwriting. The results obtained clearly demonstrate the effectiveness of the proposed method when compared against some similar techniques. The third contribution is concerned with an offline handwriter identification approach based on the convolutional neural network technology. At the first stage, the Alex-Net architecture was employed to learn image features (handwritten scripts) and the features obtained from the fully connected layers of the model. Then, a Support vector machine classifier is deployed to classify the writing styles of the various handwriters. In this way, the test scripts can be classified by the CNN training model for further classification. The proposed approach was evaluated based on Arabic Historical datasets; Islamic Heritage Project (IHP) and Qatar National Library (QNL). The obtained results demonstrated that the proposed model achieved superior performances when compared to some similar method

    Robust density modelling using the student's t-distribution for human action recognition

    Full text link
    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model since it is highly sensitive to outliers. The Gaussian distribution is also often used as base component of graphical models for recognising human actions in the videos (hidden Markov model and others) and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show how experiments over two well-known datasets (Weizmann, MuHAVi) reported a remarkable improvement in classification accuracy. © 2011 IEEE

    Novel Computer-Aided Diagnosis Schemes for Radiological Image Analysis

    Get PDF
    The computer-aided diagnosis (CAD) scheme is a powerful tool in assisting clinicians (e.g., radiologists) to interpret medical images more accurately and efficiently. In developing high-performing CAD schemes, classic machine learning (ML) and deep learning (DL) algorithms play an essential role because of their advantages in capturing meaningful patterns that are important for disease (e.g., cancer) diagnosis and prognosis from complex datasets. This dissertation, organized into four studies, investigates the feasibility of developing several novel ML-based and DL-based CAD schemes for different cancer research purposes. The first study aims to develop and test a unique radiomics-based CT image marker that can be used to detect lymph node (LN) metastasis for cervical cancer patients. A total of 1,763 radiomics features were first computed from the segmented primary cervical tumor depicted on one CT image with the maximal tumor region. Next, a principal component analysis algorithm was applied on the initial feature pool to determine an optimal feature cluster. Then, based on this optimal cluster, machine learning models (e.g., support vector machine (SVM)) were trained and optimized to generate an image marker to detect LN metastasis. The SVM based imaging marker achieved an AUC (area under the ROC curve) value of 0.841 ± 0.035. This study initially verifies the feasibility of combining CT images and the radiomics technology to develop a low-cost image marker for LN metastasis detection among cervical cancer patients. In the second study, the purpose is to develop and evaluate a unique global mammographic image feature analysis scheme to identify case malignancy for breast cancer. From the entire breast area depicted on the mammograms, 59 features were initially computed to characterize the breast tissue properties in both the spatial and frequency domain. Given that each case consists of two cranio-caudal and two medio-lateral oblique view images of left and right breasts, two feature pools were built, which contain the computed features from either two positive images of one breast or all the four images of two breasts. For each feature pool, a particle swarm optimization (PSO) method was applied to determine the optimal feature cluster followed by training an SVM classifier to generate a final score for predicting likelihood of the case being malignant. The classification performances measured by AUC were 0.79±0.07 and 0.75±0.08 when applying the SVM classifiers trained using image features computed from two-view and four-view images, respectively. This study demonstrates the potential of developing a global mammographic image feature analysis-based scheme to predict case malignancy without including an arduous segmentation of breast lesions. In the third study, given that the performance of DL-based models in the medical imaging field is generally bottlenecked by a lack of sufficient labeled images, we specifically investigate the effectiveness of applying the latest transferring generative adversarial networks (GAN) technology to augment limited data for performance boost in the task of breast mass classification. This transferring GAN model was first pre-trained on a dataset of 25,000 mammogram patches (without labels). Then its generator and the discriminator were fine-tuned on a much smaller dataset containing 1024 labeled breast mass images. A supervised loss was integrated with the discriminator, such that it can be used to directly classify the benign/malignant masses. Our proposed approach improved the classification accuracy by 6.002%, when compared with the classifiers trained without traditional data augmentation. This investigation may provide a new perspective for researchers to effectively train the GAN models on a medical imaging task with only limited datasets. Like the third study, our last study also aims to alleviate DL models’ reliance on large amounts of annotations but uses a totally different approach. We propose employing a semi-supervised method, i.e., virtual adversarial training (VAT), to learn and leverage useful information underlying in unlabeled data for better classification of breast masses. Accordingly, our VAT-based models have two types of losses, namely supervised and virtual adversarial losses. The former loss acts as in supervised classification, while the latter loss works towards enhancing the model’s robustness against virtual adversarial perturbation, thus improving model generalizability. A large CNN and a small CNN were used in this investigation, and both were trained with and without the adversarial loss. When the labeled ratios were 40% and 80%, VAT-based CNNs delivered the highest classification accuracy of 0.740±0.015 and 0.760±0.015, respectively. The experimental results suggest that the VAT-based CAD scheme can effectively utilize meaningful knowledge from unlabeled data to better classify mammographic breast mass images. In summary, several innovative approaches have been investigated and evaluated in this dissertation to develop ML-based and DL-based CAD schemes for the diagnosis of cervical cancer and breast cancer. The promising results demonstrate the potential of these CAD schemes in assisting radiologists to achieve a more accurate interpretation of radiological images

    Automatic Segmentation and Classification of Red and White Blood cells in Thin Blood Smear Slides

    Get PDF
    In this work we develop a system for automatic detection and classification of cytological images which plays an increasing important role in medical diagnosis. A primary aim of this work is the accurate segmentation of cytological images of blood smears and subsequent feature extraction, along with studying related classification problems such as the identification and counting of peripheral blood smear particles, and classification of white blood cell into types five. Our proposed approach benefits from powerful image processing techniques to perform complete blood count (CBC) without human intervention. The general framework in this blood smear analysis research is as follows. Firstly, a digital blood smear image is de-noised using optimized Bayesian non-local means filter to design a dependable cell counting system that may be used under different image capture conditions. Then an edge preservation technique with Kuwahara filter is used to recover degraded and blurred white blood cell boundaries in blood smear images while reducing the residual negative effect of noise in images. After denoising and edge enhancement, the next step is binarization using combination of Otsu and Niblack to separate the cells and stained background. Cells separation and counting is achieved by granulometry, advanced active contours without edges, and morphological operators with watershed algorithm. Following this is the recognition of different types of white blood cells (WBCs), and also red blood cells (RBCs) segmentation. Using three main types of features: shape, intensity, and texture invariant features in combination with a variety of classifiers is next step. The following features are used in this work: intensity histogram features, invariant moments, the relative area, co-occurrence and run-length matrices, dual tree complex wavelet transform features, Haralick and Tamura features. Next, different statistical approaches involving correlation, distribution and redundancy are used to measure of the dependency between a set of features and to select feature variables on the white blood cell classification. A global sensitivity analysis with random sampling-high dimensional model representation (RS-HDMR) which can deal with independent and dependent input feature variables is used to assess dominate discriminatory power and the reliability of feature which leads to an efficient feature selection. These feature selection results are compared in experiments with branch and bound method and with sequential forward selection (SFS), respectively. This work examines support vector machine (SVM) and Convolutional Neural Networks (LeNet5) in connection with white blood cell classification. Finally, white blood cell classification system is validated in experiments conducted on cytological images of normal poor quality blood smears. These experimental results are also assessed with ground truth manually obtained from medical experts
    corecore