
    A MACHINE LEARNING APPROACH TO EYE BLINK DETECTION IN LOW-LIGHT VIDEOS

    Inadequate lighting conditions can harm the accuracy of blink detection systems, which play a crucial role in fatigue detection technology and in transportation and security applications. While some video capture devices are now equipped with flashlight technology to enhance lighting, users occasionally forget to activate this feature, resulting in darker videos. Consequently, there is a pressing need to improve blink detection systems so that they can accurately detect eye blinks in low-light videos. This research proposes a machine learning-based blink detection system for detecting blinks in low-light videos. A confusion matrix was used to evaluate the effectiveness of the proposed system. The tests involved 31 videos ranging from 5 to 10 seconds in duration, with male and female test subjects aged between 20 and 22. The results indicate that, by leveraging a machine learning approach, the blink detection system achieved 100% accuracy in detecting blinks within low-light videos. However, this research requires further development to account for more complex and diverse real-life situations. Future studies could focus on developing more sophisticated algorithms and expanding the pool of test subjects to improve the performance of the blink detection system in low-light conditions. Such advancements would contribute to the practical application of the system in a broader range of scenarios, ultimately enhancing its effectiveness in fatigue detection technology.
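
    The confusion-matrix evaluation described above can be illustrated with a short Python sketch. The labels below are hypothetical stand-ins (the paper's data are not published), and scikit-learn is assumed only for convenience.

        # Minimal sketch of a confusion-matrix evaluation of blink detection.
        # y_true / y_pred are hypothetical per-video labels (1 = blink, 0 = no blink).
        from sklearn.metrics import confusion_matrix

        y_true = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # ground-truth annotations
        y_pred = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # system outputs

        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        print(f"TP={tp} TN={tn} FP={fp} FN={fn} accuracy={accuracy:.2%}")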

    Construction de masques faciaux pour améliorer la reconnaissance d'expressions

    This work proposes a method to automatically detect the regions that contribute most to correct classification of faces with respect to predefined expressions (joy, surprise, etc.). Our method determines the regions with the most (respectively the least) discriminative power using a MultiLayer Perceptron (MLP) neural network. From regions of arbitrary shape and size, we build masks that are applied to the images before classification. These masks eliminate face areas that are not relevant to the classification process, thereby increasing the performance of the system. We conducted experiments on the FERET, GENKI, and JAFFE image databases. The results show an increase in the classification rate when using masks that designate the pixels of interest.
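
    A minimal sketch of the masking idea follows, assuming grayscale face images already aligned to a fixed size; the mask region, image size, and MLP settings are illustrative and are not taken from the paper.

        # Illustrative sketch: keep only discriminative face regions with a binary
        # mask, then classify the masked pixels with an MLP.
        # Image size, mask region, and network settings are assumptions.
        import numpy as np
        from sklearn.neural_network import MLPClassifier

        h, w = 64, 64
        rng = np.random.default_rng(0)
        images = rng.random((200, h, w))          # stand-in for aligned face images
        labels = rng.integers(0, 2, size=200)     # stand-in expression labels

        mask = np.zeros((h, w), dtype=bool)
        mask[20:44, 8:56] = True                  # e.g. keep an eye/mouth band only

        X = (images * mask).reshape(len(images), -1)   # zero out irrelevant pixels
        clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300, random_state=0)
        clf.fit(X, labels)
        print("training accuracy on masked images:", clf.score(X, labels))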

    Gender recognition from facial images: Two or three dimensions?

    This paper seeks to compare encoded features from both two-dimensional (2D) and three-dimensional (3D) face images in order to achieve automatic gender recognition with high accuracy and robustness. The Fisher vector encoding method is employed to produce 2D, 3D, and fused features with enhanced discriminative power. For 3D face analysis, a two-source photometric stereo (PS) method is introduced that enables 3D surface reconstructions with accurate details as well as desirable efficiency. Moreover, a 2D + 3D imaging device, taking the two-source PS method as its core, has been developed that can simultaneously gather color images for 2D evaluations and PS images for 3D analysis. This system inherits the superior reconstruction accuracy of the standard (three or more light) PS method but simplifies the reconstruction algorithm as well as the hardware design by requiring only two light sources. It also offers great potential for facilitating human-computer interaction by being accurate, cheap, efficient, and nonintrusive. Ten types of low-level 2D and 3D features have been tested and encoded for Fisher vector gender recognition. Evaluations of the Fisher vector encoding method have been performed on the FERET, Color FERET, LFW, and FRGCv2 databases, yielding 97.7%, 98.0%, 92.5%, and 96.7% accuracy, respectively. In addition, the comparison of 2D and 3D features has been drawn from a self-collected dataset, constructed with the aid of the 2D + 3D imaging device in a series of data capture experiments. With a variety of experiments and evaluations, it can be shown that the Fisher vector encoding method outperforms most state-of-the-art gender recognition methods. It has also been observed that 3D features reconstructed by the two-source PS method are able to further boost the Fisher vector gender recognition performance, i.e., up to a 6% increase on the self-collected database.
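
    As a rough illustration of the Fisher vector encoding step (not the authors' implementation), the sketch below fits a diagonal-covariance GMM on local descriptors and computes the usual first- and second-order gradient statistics with power and L2 normalization; the descriptor dimensionality and number of Gaussians are arbitrary choices.

        # Sketch of Fisher vector encoding over a set of local descriptors.
        # GMM size and descriptor dimensionality are assumptions.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def fisher_vector(descriptors, gmm):
            """First- and second-order Fisher vector for one image."""
            n, d = descriptors.shape
            gamma = gmm.predict_proba(descriptors)                  # (n, K) posteriors
            mu, var, w = gmm.means_, gmm.covariances_, gmm.weights_
            diff = (descriptors[:, None, :] - mu) / np.sqrt(var)    # (n, K, d)
            g_mu = (gamma[:, :, None] * diff).sum(0) / (n * np.sqrt(w)[:, None])
            g_var = (gamma[:, :, None] * (diff ** 2 - 1)).sum(0) / (n * np.sqrt(2 * w)[:, None])
            fv = np.hstack([g_mu.ravel(), g_var.ravel()])
            fv = np.sign(fv) * np.sqrt(np.abs(fv))                  # power normalization
            return fv / (np.linalg.norm(fv) + 1e-12)                # L2 normalization

        rng = np.random.default_rng(0)
        train_desc = rng.random((5000, 32))                         # stand-in local features
        gmm = GaussianMixture(n_components=16, covariance_type="diag",
                              random_state=0).fit(train_desc)
        print(fisher_vector(rng.random((300, 32)), gmm).shape)      # (2 * 16 * 32,)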

    Ultrasound Image Despeckling using Local Binary Pattern Weighted Linear Filtering


    Mineral texture identification using local binary patterns equipped with a Classification and Recognition Updating System (CARUS)

    In this paper, a rotation-invariant local binary pattern operator equipped with a local contrast measure (riLBPc) is employed to characterize the type of mineral twinning by inspecting the texture properties of crystals. The proposed method uses photomicrographs of minerals and produces LBP histograms, which can be compared with those included in a predefined database using a Kullback–Leibler divergence-based metric. The paper proposes a new LBP-based scheme for concurrent classification and recognition tasks, followed by a novel online updating routine to enhance the locally developed mineral LBP database. The discriminatory power of the proposed Classification and Recognition Updating System (CARUS) texture identification scheme is verified for plagioclase, orthoclase, microcline, and quartz minerals, with sensitivity (TPR) near 99.9%, 87%, 99.9%, and 96%, and accuracy (ACC) of about 99%, 97%, 99%, and 99%, respectively. According to the results, the introduced CARUS system is a promising approach that can be applied in a variety of fields dealing with classification and feature recognition tasks.
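
    The core matching step, rotation-invariant LBP histograms compared by the Kullback–Leibler divergence, can be sketched as follows; the (P, R) parameters and the uniform-pattern variant are illustrative choices, and the local contrast component of riLBPc is omitted.

        # Sketch: rotation-invariant uniform LBP histogram of a photomicrograph,
        # matched against reference histograms with the Kullback-Leibler divergence.
        # P, R, and the 'uniform' method are assumptions; the riLBPc contrast term
        # used in the paper is not included here.
        import numpy as np
        from skimage.feature import local_binary_pattern
        from scipy.stats import entropy

        def lbp_histogram(gray, P=8, R=1.0):
            lbp = local_binary_pattern(gray, P, R, method="uniform")
            n_bins = P + 2                                  # uniform patterns + "other"
            hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
            return hist + 1e-10                             # avoid zero bins for KL

        rng = np.random.default_rng(0)
        query = rng.integers(0, 256, (128, 128)).astype(np.uint8)
        reference_db = {
            "plagioclase": lbp_histogram(rng.integers(0, 256, (128, 128)).astype(np.uint8)),
            "quartz": lbp_histogram(rng.integers(0, 256, (128, 128)).astype(np.uint8)),
        }

        q = lbp_histogram(query)
        best = min(reference_db, key=lambda name: entropy(q, reference_db[name]))
        print("closest mineral class:", best)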

    Human gait identification and analysis

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Human gait identification has become an active area of research due to increased security requirements. Human gait identification is a potential new tool for identifying individuals beyond traditional methods. The emergence of motion capture techniques offers the potential for highly accurate identification because, unlike security cameras, they record complete gait information. The aim of this research was to build a practical method of gait identification and to investigate the individual characteristics of gait. For this purpose, a gait identification approach was proposed, identification results were compared across different methods, and several studies of the individual characteristics of gait were performed. This research included the following: (1) a novel, effective set of gait features was proposed; (2) gait signatures were extracted by three different methods: a statistical method, principal component analysis, and a Fourier expansion method; (3) gait identification results were compared across these methods; (4) two indicators were proposed to evaluate gait features for identification; (5) novel and clear definitions of gait phases and the gait cycle were proposed; (6) gait features were investigated by gait phase; (7) principal component analysis and the fixing root method were used to elucidate which features were used to represent gait and why; (8) gait similarity was investigated; (9) gait attractiveness was investigated. This research proposed an efficient framework for identifying individuals from gait via a novel feature set based on 3D motion capture data, together with a novel method for evaluating gait signatures for identification. Three different gait signature extraction methods were applied and compared; the average identification rate was over 93%, with the best result close to 100%. This research also proposed a novel method for dividing gait phases, and the appearance of gait features in the eight gait phases was investigated. Based on the proposed gait phase division, similarities and asymmetries between left-side and right-side body movements in gait were identified. This research also initiated an analysis method for gait feature extraction based on the fixing root method. A prediction model of gait attractiveness was built with reasonable accuracy using principal component analysis and linear regression on the natural logarithm of the parameters. A systematic relationship was observed between the motions of individual markers and the attractiveness ratings, and the lower legs and feet were extracted as features of attractiveness by the fixing root method. As an extension of the gait research, human seated motion was also investigated. This study was funded by the Dorothy Hodgkin Postgraduate Awards and Beijing East Gallery Co. Ltd.
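
    One of the three signature-extraction routes mentioned above, principal component analysis, is sketched below on synthetic motion-capture data; the marker count, cycle resampling, and nearest-neighbour matching rule are assumptions for illustration only, not the thesis's exact pipeline.

        # Sketch of PCA-based gait signature extraction and nearest-neighbour
        # identification. Marker count, cycle length, and matching rule are
        # illustrative assumptions.
        import numpy as np
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(0)
        n_subjects, n_cycles, n_frames, n_markers = 10, 5, 100, 15

        # Stand-in data: each gait cycle resampled to n_frames 3D marker positions.
        cycles = rng.random((n_subjects * n_cycles, n_frames, n_markers, 3))
        subject_ids = np.repeat(np.arange(n_subjects), n_cycles)

        X = cycles.reshape(len(cycles), -1)              # flatten each cycle
        signatures = PCA(n_components=20).fit_transform(X)

        # Identify a probe cycle by its nearest gallery signature.
        probe, gallery = signatures[0], signatures[1:]
        nearest = np.argmin(np.linalg.norm(gallery - probe, axis=1))
        print("predicted subject:", subject_ids[1:][nearest], "true subject:", subject_ids[0])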

    Learning from imbalanced data in face re-identification using ensembles of classifiers

    Face re-identification is a video surveillance application where systems for video-to-video face recognition are designed using faces of individuals captured from video sequences, and seek to recognize them when they appear in archived or live videos captured over a network of video cameras. Video-based face recognition applications encounter challenges due to variations in capture conditions such as pose, illumination, etc. Other challenges in this application are twofold: 1) the imbalanced data distributions between the face captures of the individuals to be re-identified and those of other individuals, and 2) the varying degree of imbalance during operations with respect to the design data. Learning from imbalanced data is challenging in general, in part because most two-class classification systems are intended to be used under balanced data conditions and their performance is therefore biased towards correct classification of the majority (negative, or non-target) class (face images/frames captured from individuals not to be re-identified) rather than the minority (positive, or target) class (face images/frames captured from the individual to be re-identified). Several techniques have been proposed in the literature to learn from imbalanced data, which either use data-level techniques to rebalance data (by under-sampling the majority class, up-sampling the minority class, or both) for training classifiers, or use algorithm-level methods to guide the learning process (with or without cost-sensitive approaches) such that the bias of performance towards correct classification of the majority class is neutralized. Ensemble techniques such as Bagging and Boosting algorithms have been shown to efficiently utilize these methods to address imbalance. However, these techniques face several issues in the literature: (1) some informative samples may be neglected by random under-sampling, and adding synthetic positive samples through up-sampling adds to training complexity; (2) cost factors must be pre-known or found; (3) classification systems are often optimized and compared using performance measures (like accuracy) that are unsuitable for imbalance problems; and (4) most learning algorithms are designed and tested on a fixed imbalance level of data, which may differ from operational scenarios. The objective of this thesis is to design specialized classifier ensembles to address the issue of imbalance in the face re-identification application while, as sub-goals, avoiding the above-mentioned issues faced in the literature. In addition, achieving an efficient classifier ensemble requires a learning algorithm that designs and combines component classifiers with a suitable diversity-accuracy trade-off. To reach the objective of the thesis, four major contributions are made, presented in three chapters and summarized in the following. In Chapter 3, a new application-based sampling method is proposed to group samples for under-sampling in order to improve the diversity-accuracy trade-off between classifiers of the ensemble. The proposed sampling method takes advantage of the fact that, in face re-identification applications, facial regions of the same person appearing in a camera's field of view may be regrouped based on the trajectories found by a face tracker. A partitional Bagging ensemble method is proposed that accounts for possible variations in the imbalance level of the operational data by combining classifiers that are trained on different imbalance levels.
In this method, all samples are used for training classifiers and information loss is therefore avoided. In Chapter 4, a new ensemble learning algorithm called Progressive Boosting (PBoost) is proposed that progressively inserts uncorrelated groups of samples into a Boosting procedure to avoid losing information while generating a diverse pool of classifiers. From one iteration to the next, the PBoost algorithm accumulates these uncorrelated groups of samples into a set that grows gradually in size and imbalance. This algorithm is more sophisticated than the one proposed in Chapter 3 because, instead of training the base classifiers on this set, the base classifiers are trained on balanced subsets sampled from this set and validated on the whole set. Therefore, the base classifiers are more accurate while the robustness to imbalance is not jeopardized. In addition, the sample selection is based on the weights assigned to samples, which correspond to their importance. The computational complexity of PBoost is also lower than that of Boosting ensemble techniques in the literature for learning from imbalanced data, because not all of the base classifiers are validated on all negative samples. A new loss factor is also proposed for use in PBoost to avoid biasing performance towards the negative class. Using this loss factor, the weight update of samples and the contribution of classifiers to final predictions are set according to the ability of classifiers to recognize both classes. Comparing the performance of the classifier systems in Chapters 3 and 4 requires an evaluation space that compares classifiers in terms of a suitable performance metric over all of their decision thresholds, different imbalance levels of test data, and different preferences between classes. The F-measure is often used to evaluate two-class classifiers on imbalanced data, yet no global evaluation space was available in the literature for this measure. Therefore, in Chapter 5, a new global evaluation space for the F-measure is proposed that is analogous to the cost curves for expected cost. In this space, a classifier is represented as a curve that shows its performance over all of its decision thresholds and a range of possible imbalance levels, for the desired preference of true positive rate to precision. These properties are missing in ROC and precision-recall spaces. This space also allows us to empirically improve the performance of specialized ensemble learning methods for imbalance under a given operating condition. Through validation, the base classifiers are combined using a modified version of the iterative Boolean combination algorithm, such that the selection criterion in this algorithm is the F-measure instead of the AUC, and the combination is carried out for each operating condition. The proposed approaches in this thesis were validated and compared using synthetic data and videos from the Faces In Action and COX datasets that emulate face re-identification applications. Results show that the proposed techniques outperform state-of-the-art techniques over different levels of imbalance and overlap between classes.
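
    The spirit of the imbalance-aware ensembles described above, base classifiers trained on subsets with different negative-to-positive ratios and combined by score averaging, can be sketched as follows; the base learner, the ratios, and the synthetic data are illustrative assumptions, and the sketch does not reproduce the partitional Bagging or PBoost algorithms.

        # Sketch of an ensemble whose base classifiers see different imbalance
        # levels and are combined by averaging their positive-class scores.
        # Base learner, ratios, and data are assumptions; this is not PBoost.
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
        pos_idx, neg_idx = np.where(y == 1)[0], np.where(y == 0)[0]
        rng = np.random.default_rng(0)

        ensemble = []
        for ratio in (1, 2, 5, 10):                      # negatives per positive
            neg_sample = rng.choice(neg_idx, size=ratio * len(pos_idx), replace=False)
            idx = np.concatenate([pos_idx, neg_sample])
            ensemble.append(DecisionTreeClassifier(max_depth=5, random_state=0).fit(X[idx], y[idx]))

        scores = np.mean([clf.predict_proba(X)[:, 1] for clf in ensemble], axis=0)
        y_pred = (scores >= 0.5).astype(int)
        recall = np.sum((y_pred == 1) & (y == 1)) / np.sum(y == 1)
        print("recall of the target (minority) class:", recall)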

    Local quality-based matching of faces for watchlist screening applications

    Video surveillance systems are often exploited by safety organizations for enhanced security and situational awareness. A key application in video surveillance is watchlist screening, where target individuals are enrolled to a still-to-video Face Recognition (FR) system using single still images captured a priori under controlled conditions. Watchlist screening is a very challenging application. Indeed, it must provide accurate decisions and timely recognition using a limited number of reference faces for the system's enrolment. This issue is often called the "Single Sample Per Person" (SSPP) problem. In addition, uncontrolled factors such as variations in illumination, pose, and occlusion are unavoidable in real-world video surveillance, which degrades the FR system's performance. Another major problem in such applications is camera interoperability: there is a large gap, in terms of quality and resolution, between the camera used for taking the still images and the camera used for the video surveillance footage. This issue hinders the classification process and thus decreases the system's performance. Controlled and uniform lighting is indispensable for obtaining good facial captures that contribute to the recognition performance of the system. In reality, however, facial captures are often poorly illuminated, which severely affects the system's performance. This is why it is important to implement an FR system that is invariant to illumination changes. The first part of this thesis investigates different illumination normalization (IN) techniques applied at the pre-processing level of the still-to-video FR system. The IN techniques are then compared to each other in order to pinpoint the most suitable technique for illumination invariance. In addition, patch-based methods for template matching extract facial features from different regions, which offers more discriminative information and deals with occlusion issues. Thus, local matching is applied for the still-to-video FR system. For that, a thorough examination is needed of the manner of applying these IN techniques. Two different approaches were investigated: the global approach, which performs IN on the whole image and then performs local matching, and the local approach, which first divides the images into non-overlapping patches and then applies each IN technique to each patch individually. The results obtained from these experiments show that Tan and Triggs (TT) and Multi-Scale Weberfaces are likely to offer better illumination invariance for the still-to-video FR system. Moreover, these outperforming IN techniques, applied locally on each patch, have been shown to improve the performance of the FR system compared to the global approach. The performance of an FR system is good when the training data and the operational data come from the same distribution. Unfortunately, in still-to-video FR systems this is not the case: the training data are still, high-quality, high-resolution, frontal images, whereas the testing data are low-quality, low-resolution video frames with varying head pose, so the two do not share the same distribution. To address this domain shift, the second part of this thesis presents a new technique of dynamic regional weighting that exploits unsupervised domain adaptation and quality-based contextual information.
The main contribution consists in assigning dynamic weights that are specific to a camera domain, replacing the static and predefined manner of assigning weights. In order to assess the impact of applying local weights dynamically, the results are compared to a baseline (no weights) and to a static weighting technique. This context-based approach has proven to increase the system's performance compared to the static weighting, which is dependent on the dataset, and to the baseline technique, which uses no weights. These experiments are conducted and validated using the ChokePoint dataset. The performance of the still-to-video FR system is evaluated using performance measures based on Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curve analysis.
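
    For reference, the Tan and Triggs (TT) normalization mentioned above follows a well-known sequence of gamma correction, difference-of-Gaussians filtering, and contrast equalization; the sketch below uses the commonly cited default parameters, which may differ from the settings used in this thesis.

        # Sketch of Tan & Triggs illumination normalization: gamma correction,
        # difference-of-Gaussians (DoG) filtering, and two-stage contrast
        # equalization with a final tanh compression. Parameter values are the
        # commonly cited defaults, not necessarily those used in the thesis.
        import numpy as np
        from scipy.ndimage import gaussian_filter

        def tan_triggs(gray, gamma=0.2, sigma0=1.0, sigma1=2.0, alpha=0.1, tau=10.0):
            img = gray.astype(np.float64) ** gamma                                  # gamma correction
            img = gaussian_filter(img, sigma0) - gaussian_filter(img, sigma1)       # DoG filtering
            img /= np.mean(np.abs(img) ** alpha) ** (1.0 / alpha)                   # equalization, stage 1
            img /= np.mean(np.minimum(np.abs(img), tau) ** alpha) ** (1.0 / alpha)  # stage 2
            return tau * np.tanh(img / tau)                                         # compress extremes

        rng = np.random.default_rng(0)
        face = rng.integers(0, 256, (96, 96)) / 255.0      # stand-in grayscale face crop
        normalized = tan_triggs(face)
        print(normalized.min(), normalized.max())          # values lie within (-tau, tau)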