
12 research outputs found

Facial analysis in video : detection and recognition

Author: Shih Peichung
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2006

Biometric authentication systems automatically identify or verify individuals using physiological (e.g., face, fingerprint, hand geometry, retina scan) or behavioral (e.g., speaking pattern, signature, keystroke dynamics) characteristics. Among these biometrics, facial patterns have the major advantage of being the least intrusive. Automatic face recognition systems thus have great potential in a wide spectrum of application areas. Focusing on facial analysis, this dissertation presents a face detection method and numerous feature extraction methods for face recognition. Concerning face detection, a video-based frontal face detection method has been developed using motion analysis and color information to derive regions of interest, and distribution-based distance (DBD) and support vector machine (SVM) for classification. When applied to 92 still images (containing 282 faces), this method achieves a 98.2% face detection rate with two false detections, a performance comparable to state-of-the-art face detection methods; when applied to video streams, this method detects faces reliably and efficiently. Regarding face recognition, extensive assessments of face recognition performance in twelve color spaces have been performed, and a color feature extraction method defined by color component images across different color spaces is shown to help improve the baseline performance of the Face Recognition Grand Challenge (FRGC) problems. The experimental results show that some color configurations, such as YV in the YUV color space and YI in the YIQ color space, help improve face recognition performance. Based on these improved results, a novel feature extraction method implementing genetic algorithms (GAs) and the Fisher linear discriminant (FLD) is designed to derive the optimal discriminating features that lead to an effective image representation for face recognition.
This method noticeably improves FRGC ver1.0 Experiment 4 baseline recognition rate from 37% to 73%, and significantly elevates FRGC xxxx Experiment 4 baseline verification rate from 12% to 69%. Finally, four two-dimensional (2D) convolution filters are derived for feature extraction, and a 2D+3D face recognition system implementing both 2D and 3D imaging modalities is designed to address the FRGC problems. This method improves FRGC ver2.0 Experiment 3 baseline performance from 54% to 72%.
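As a rough illustration of what a color configuration such as YV involves, the sketch below converts RGB pixels to YUV and keeps only the Y and V component images. It is not taken from the dissertation: the coefficients are the standard analog YUV definition, and the function names `rgb_to_yuv` and `yv_configuration` are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to YUV.

    Uses the standard analog YUV definition; the dissertation may
    use a different variant (assumption).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v


def yv_configuration(image):
    """Return the Y and V component images of an RGB image.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    y_img, v_img = [], []
    for row in image:
        yuv_row = [rgb_to_yuv(*pixel) for pixel in row]
        y_img.append([p[0] for p in yuv_row])
        v_img.append([p[2] for p in yuv_row])
    return y_img, v_img


# Usage: a 1x2 image with a white pixel and a pure red pixel.
y_img, v_img = yv_configuration([[(1.0, 1.0, 1.0), (1.0, 0.0, 0.0)]])
```

The two component images would then be fed to a feature extractor in place of a single grayscale image.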

Digital Commons @ New Jersey Institute of Technology (NJIT)

Face Image and Video Analysis in Biometrics and Health Applications

Author: Zhang Na
Publication venue: West Virginia University Libraries
Publication date: 01/01/2023

Computer Vision (CV) enables computers and systems to derive meaningful information from acquired visual inputs, such as images and videos, and make decisions based on the extracted information. Its goal is to acquire, process, analyze, and understand the information by developing a theoretical and algorithmic model. Biometrics are distinctive and measurable human characteristics used to label or describe individuals by combining computer vision with knowledge of human physiology (e.g., face, iris, fingerprint) and behavior (e.g., gait, gaze, voice). The face is one of the most informative biometric traits. Many studies have investigated the human face from the perspectives of different disciplines, ranging from computer vision and deep learning to neuroscience and biometrics. In this work, we analyze face characteristics from digital images and videos in the areas of morphing attack and defense, and autism diagnosis. For face morphing attack generation, we proposed a transformer-based generative adversarial network to generate more visually realistic morphing attacks by combining different losses, such as face matching distance, facial landmark based loss, perceptual loss and pixel-wise mean square error. In the face morphing attack detection study, we designed a fusion-based few-shot learning (FSL) method to learn discriminative features from face images for few-shot morphing attack detection (FS-MAD), and extended the current binary detection into multiclass classification, namely, few-shot morphing attack fingerprinting (FS-MAF). In the autism diagnosis study, we developed a discriminative few-shot learning method to analyze hour-long video data and explored the fusion of facial dynamics for facial trait classification of autism spectrum disorder (ASD) in three severity levels. The results show outstanding performance of the proposed fusion-based few-shot framework on the dataset.
In addition, we explored the possibility of performing face micro-expression spotting and feature analysis on autism video data to classify ASD and control groups. The results indicate the effectiveness of subtle facial expression changes for autism diagnosis.

The Research Repository @ WVU (West Virginia University)

Techniques for Ocular Biometric Recognition Under Non-ideal Conditions

Author: Jillela Raghavender Reddy
Publication venue: The Research Repository @ WVU
Publication date: 01/12/2013

The use of the ocular region as a biometric cue has gained considerable traction due to recent advances in automated iris recognition. However, a multitude of factors can negatively impact ocular recognition performance under unconstrained conditions (e.g., non-uniform illumination, occlusions, motion blur, image resolution). This dissertation develops techniques to perform iris and ocular recognition under challenging conditions. The first contribution is an image-level fusion scheme to improve iris recognition performance in low-resolution videos. Information fusion is facilitated by the use of Principal Components Transform (PCT), thereby requiring modest computational effort. The proposed approach provides improved recognition accuracy when low-resolution iris images are compared against high-resolution iris images. The second contribution is a study demonstrating the effectiveness of the ocular region in improving face recognition under plastic surgery. A score-level fusion approach that combines information from the face and ocular regions is proposed. The proposed approach, unlike previous methods in this application, is not learning-based, and has modest computational requirements while resulting in better recognition performance. The third contribution is a study on matching ocular regions extracted from RGB face images against those from near-infrared iris images. Face and iris images are typically acquired using sensors operating in visible and near-infrared wavelengths of light, respectively. To this end, a sparse representation approach which generates a joint dictionary from corresponding pairs of face and iris images is designed. The proposed joint dictionary approach is observed to outperform classical ocular recognition techniques. In summary, the techniques presented in this dissertation can be used to improve iris and ocular recognition in practical, unconstrained environments.

The Research Repository @ WVU (West Virginia University)

Multi-system Biometric Authentication: Optimal Fusion and User-Specific Information

Author: Poh Norman
Publication venue: École Polytechnique Fédérale de Lausanne
Publication date: 11/02/2010

Verifying a person's identity claim by combining multiple biometric systems (fusion) is a promising solution to identity theft and automatic access control. This thesis contributes to the state-of-the-art of multimodal biometric fusion by improving the understanding of fusion and by enhancing fusion performance using information specific to a user. One problem in score-level fusion is how to combine system outputs of different types. Two statistically sound representations of scores are probability and log-likelihood ratio (LLR). While they are equivalent in theory, LLR is much more useful in practice because its distribution can be approximated by a Gaussian distribution, which makes it convenient for analyzing the fusion problem. Furthermore, its score statistics (mean and covariance) conditioned on the claimed user identity can be better exploited. Our first contribution is to estimate the fusion performance given the class-conditional score statistics and given a particular fusion operator/classifier. Thanks to the score statistics, we can predict fusion performance with reasonable accuracy, identify conditions which favor a particular fusion operator, study the joint phenomenon of combining system outputs with different degrees of strength and correlation, and possibly correct the adverse effect of bias (due to the score-level mismatch between training and test sets) on fusion. While in practice the class-conditional Gaussian assumption is not always true, the estimated performance is found to be acceptable. Our second contribution is to exploit the user-specific prior knowledge by limiting the class-conditional Gaussian assumption to each user. We exploit this hypothesis in two strategies. In the first strategy, we combine a user-specific fusion classifier with a user-independent fusion classifier by means of two LLR scores, which are then weighted to obtain a single output.
We show that combining both user-specific and user-independent LLR outputs always results in better performance than using the better of the two. In the second strategy, we propose a statistic called the user-specific F-ratio, which measures the discriminative power of a given user based on the Gaussian assumption. Although similar class separability measures exist, e.g., the Fisher-ratio for a two-class problem and the d-prime statistic, F-ratio is more suitable because it is related to Equal Error Rate in a closed form. F-ratio is used in the following applications: a user-specific score normalization procedure, a user-specific criterion to rank users and a user-specific fusion operator that selectively considers a subset of systems for fusion. The resultant fusion operator leads to a statistically significant increase in performance with respect to the state-of-the-art fusion approaches. Even though the applications are different, the proposed methods share the following common advantages. Firstly, they are robust to deviation from the Gaussian assumption. Secondly, they are robust to having few training samples, thanks to Bayesian adaptation. Finally, they consider both the client and impostor information simultaneously.
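The abstract does not spell out the F-ratio or its closed-form link to the Equal Error Rate (EER). The sketch below assumes one common definition under the Gaussian assumption: client and impostor scores are each Gaussian, the F-ratio is the difference of their means divided by the sum of their standard deviations, and the EER follows from the Gaussian cumulative distribution. The function names are hypothetical and the formulas should be read as an assumption, not as the thesis's exact formulation.

```python
import math


def f_ratio(mu_client, sigma_client, mu_impostor, sigma_impostor):
    """Class separability under the Gaussian assumption (assumed definition)."""
    return (mu_client - mu_impostor) / (sigma_client + sigma_impostor)


def eer_from_f_ratio(f):
    """EER at the threshold where false accept and false reject rates meet.

    With Gaussian class-conditional scores this equals 1 - Phi(f),
    written here with the error function.
    """
    return 0.5 - 0.5 * math.erf(f / math.sqrt(2))


# Usage: client scores ~ N(2, 1), impostor scores ~ N(0, 1).
f = f_ratio(2.0, 1.0, 0.0, 1.0)
eer = eer_from_f_ratio(f)
```

Under this definition a larger F-ratio always gives a lower EER, which is what makes the statistic usable for ranking users or selecting systems.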

Graph Analysis and Applications in Clustering and Content-based Image Retrieval

Author: Zhang Honglei
Publication venue: Tampere University
Publication date: 09/08/2019

About 300 years ago, when studying the Seven Bridges of Königsberg problem, a famous problem concerning paths on graphs, the great mathematician Leonhard Euler said, “This question is very banal, but seems to me worthy of attention”. Since then, graph theory and graph analysis have not only become one of the most important branches of mathematics, but have also found an enormous range of important applications in many other areas. A graph is a mathematical model that abstracts entities and the relationships between them as nodes and edges. Many types of interactions between entities can be modeled by graphs, for example, social interactions between people, communications between entities in computer networks, and relations between biological species. Many other types of data that do not appear to be graphs can be converted into graphs by certain operations, for example, the k-nearest neighborhood graph built from pixels in an image. Cluster structure is a common phenomenon in many real-world graphs, for example, social networks. Finding the clusters in a large graph is important for understanding the underlying relationships between the nodes. Graph clustering is a technique that partitions nodes into clusters such that connections among nodes in a cluster are dense and connections between nodes in different clusters are sparse. Various approaches have been proposed to solve graph clustering problems. A common approach is to optimize a predefined clustering metric using different optimization methods. However, most of these optimization problems are NP-hard due to the discrete set-up of hard clustering. These optimization problems can be relaxed, and a sub-optimal solution can be found. A different approach is to apply data clustering algorithms to graph clustering problems. With this approach, one must first find appropriate features for each node that represent the local structure of the graph.
The Limited Random Walk algorithm uses the random walk procedure to explore the graph and extracts efficient features for the nodes. It follows the embarrassingly parallel paradigm and can therefore process large graph data efficiently using modern high-performance computing facilities. This thesis gives the details of this algorithm and analyzes its stability issues. Based on the study of the cluster structures in a graph, we define the authenticity score of an edge as the difference between the actual and the expected number of edges that connect the two groups of neighboring nodes of the two end nodes. The authenticity score can be used in many important applications, such as graph clustering, outlier detection, and graph data preprocessing. In particular, a data clustering algorithm that uses the authenticity scores on a mutual k-nearest neighborhood graph achieves more reliable and superior performance compared to other popular algorithms. This thesis also theoretically proves that this algorithm can asymptotically achieve complete recovery of the ground truth of graphs generated by a stochastic r-block model. Content-based image retrieval (CBIR) is an important application in computer vision, media information retrieval, and data mining. Given a query image, a CBIR system ranks the images in a large image database by their “similarities” to the query image. However, because of ambiguity in the definition of “similarity”, it is very difficult for a CBIR system to select the optimal feature set and ranking algorithm to satisfy the purpose of the query. Graph technologies have been used to improve the performance of CBIR systems in various ways. In this thesis, a novel method is proposed to construct a visual-semantic graph, a graph where nodes represent semantic concepts and edges represent visual associations between concepts.
The constructed visual-semantic graph not only helps the user to locate the target images quickly but also helps answer questions related to the query image. Experiments show that the effort required to locate the target image is reduced by 25% with the help of visual-semantic graphs. Graph analysis will continue to play an important role in future data analysis. In particular, the visual-semantic graph that captures important and interesting visual associations between the concepts is worthy of further attention.
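The mutual k-nearest neighborhood graph mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and a brute-force neighbor search; it does not implement the authenticity score or the Limited Random Walk algorithm, and the function name `mutual_knn_graph` is hypothetical.

```python
import math


def mutual_knn_graph(points, k):
    """Return the edge set of the mutual k-nearest neighbor graph.

    An undirected edge (i, j) with i < j is kept only when j is among
    the k nearest neighbors of i AND i is among those of j.
    """
    n = len(points)
    neighbors = []
    for i in range(n):
        # Sort all other points by Euclidean distance to point i.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbors.append({j for _, j in dists[:k]})

    edges = set()
    for i in range(n):
        for j in neighbors[i]:
            if i < j and i in neighbors[j]:
                edges.add((i, j))
    return edges


# Usage: two tight pairs and one point in between that no pair claims.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 0)]
edges = mutual_knn_graph(points, k=1)
```

Requiring the neighbor relation to hold in both directions leaves isolated points without edges, which is one reason such graphs are a convenient starting point for outlier detection and clustering.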