240 research outputs found

    A study on different experimental configurations for age, race, and gender estimation problems

    Get PDF
    This paper presents a detailed study about different algorithmic configurations for estimating soft biometric traits. In particular, a recently introduced common framework is the starting point of the study: it includes an initial facial detection, the subsequent facial traits description, the data reduction step, and the final classification step. The algorithmic configurations are featured by different descriptors and different strategies to build the training dataset and to scale the data in input to the classifier. Experimental proofs have been carried out on both publicly available datasets and image sequences specifically acquired in order to evaluate the performance even under real-world conditions, i.e., in the presence of scaling and rotation

    Unifying the Visible and Passive Infrared Bands: Homogeneous and Heterogeneous Multi-Spectral Face Recognition

    Get PDF
    Face biometrics leverages tools and technology in order to automate the identification of individuals. In most cases, biometric face recognition (FR) can be used for forensic purposes, but there remains the issue related to the integration of technology into the legal system of the court. The biggest challenge with the acceptance of the face as a modality used in court is the reliability of such systems under varying pose, illumination and expression, which has been an active and widely explored area of research over the last few decades (e.g. same-spectrum or homogeneous matching). The heterogeneous FR problem, which deals with matching face images from different sensors, should be examined for the benefit of military and law enforcement applications as well. In this work we are concerned primarily with visible band images (380-750 nm) and the infrared (IR) spectrum, which has become an area of growing interest.;For homogeneous FR systems, we formulate and develop an efficient, semi-automated, direct matching-based FR framework, that is designed to operate efficiently when face data is captured using either visible or passive IR sensors. Thus, it can be applied in both daytime and nighttime environments. First, input face images are geometrically normalized using our pre-processing pipeline prior to feature-extraction. Then, face-based features including wrinkles, veins, as well as edges of facial characteristics, are detected and extracted for each operational band (visible, MWIR, and LWIR). Finally, global and local face-based matching is applied, before fusion is performed at the score level. Although this proposed matcher performs well when same-spectrum FR is performed, regardless of spectrum, a challenge exists when cross-spectral FR matching is performed. The second framework is for the heterogeneous FR problem, and deals with the issue of bridging the gap across the visible and passive infrared (MWIR and LWIR) spectrums. Specifically, we investigate the benefits and limitations of using synthesized visible face images from thermal and vice versa, in cross-spectral face recognition systems when utilizing canonical correlation analysis (CCA) and locally linear embedding (LLE), a manifold learning technique for dimensionality reduction. Finally, by conducting an extensive experimental study we establish that the combination of the proposed synthesis and demographic filtering scheme increases system performance in terms of rank-1 identification rate

    Inferring Facial and Body Language

    Get PDF
    Machine analysis of human facial and body language is a challenging topic in computer vision, impacting on important applications such as human-computer interaction and visual surveillance. In this thesis, we present research building towards computational frameworks capable of automatically understanding facial expression and behavioural body language. The thesis work commences with a thorough examination in issues surrounding facial representation based on Local Binary Patterns (LBP). Extensive experiments with different machine learning techniques demonstrate that LBP features are efficient and effective for person-independent facial expression recognition, even in low-resolution settings. We then present and evaluate a conditional mutual information based algorithm to efficiently learn the most discriminative LBP features, and show the best recognition performance is obtained by using SVM classifiers with the selected LBP features. However, the recognition is performed on static images without exploiting temporal behaviors of facial expression. Subsequently we present a method to capture and represent temporal dynamics of facial expression by discovering the underlying low-dimensional manifold. Locality Preserving Projections (LPP) is exploited to learn the expression manifold in the LBP based appearance feature space. By deriving a universal discriminant expression subspace using a supervised LPP, we can effectively align manifolds of different subjects on a generalised expression manifold. Different linear subspace methods are comprehensively evaluated in expression subspace learning. We formulate and evaluate a Bayesian framework for dynamic facial expression recognition employing the derived manifold representation. However, the manifold representation only addresses temporal correlations of the whole face image, does not consider spatial-temporal correlations among different facial regions. We then employ Canonical Correlation Analysis (CCA) to capture correlations among face parts. To overcome the inherent limitations of classical CCA for image data, we introduce and formalise a novel Matrix-based CCA (MCCA), which can better measure correlations in 2D image data. We show this technique can provide superior performance in regression and recognition tasks, whilst requiring significantly fewer canonical factors. All the above work focuses on facial expressions. However, the face is usually perceived not as an isolated object but as an integrated part of the whole body, and the visual channel combining facial and bodily expressions is most informative. Finally we investigate two understudied problems in body language analysis, gait-based gender discrimination and affective body gesture recognition. To effectively combine face and body cues, CCA is adopted to establish the relationship between the two modalities, and derive a semantic joint feature space for the feature-level fusion. Experiments on large data sets demonstrate that our multimodal systems achieve the superior performance in gender discrimination and affective state analysis.Research studentship of Queen Mary, the International Travel Grant of the Royal Academy of Engineering, and the Royal Society International Joint Project

    Face recognition using multiple features in different color spaces

    Get PDF
    Face recognition as a particular problem of pattern recognition has been attracting substantial attention from researchers in computer vision, pattern recognition, and machine learning. The recent Face Recognition Grand Challenge (FRGC) program reveals that uncontrolled illumination conditions pose grand challenges to face recognition performance. Most of the existing face recognition methods use gray-scale face images, which have been shown insufficient to tackle these challenges. To overcome this challenging problem in face recognition, this dissertation applies multiple features derived from the color images instead of the intensity images only. First, this dissertation presents two face recognition methods, which operate in different color spaces, using frequency features by means of Discrete Fourier Transform (DFT) and spatial features by means of Local Binary Patterns (LBP), respectively. The DFT frequency domain consists of the real part, the imaginary part, the magnitude, and the phase components, which provide the different interpretations of the input face images. The advantage of LBP in face recognition is attributed to its robustness in terms of intensity-level monotonic transformation, as well as its operation in the various scale image spaces. By fusing the frequency components or the multi-resolution LBP histograms, the complementary feature sets can be generated to enhance the capability of facial texture description. This dissertation thus uses the fused DFT and LBP features in two hybrid color spaces, the RIQ and the VIQ color spaces, respectively, for improving face recognition performance. Second, a method that extracts multiple features in the CID color space is presented for face recognition. As different color component images in the CID color space display different characteristics, three different image encoding methods, namely, the patch-based Gabor image representation, the multi-resolution LBP feature fusion, and the DCT-based multiple face encodings, are presented to effectively extract features from the component images for enhancing pattern recognition performance. To further improve classification performance, the similarity scores due to the three color component images are fused for the final decision making. Finally, a novel image representation is also discussed in this dissertation. Unlike a traditional intensity image that is directly derived from a linear combination of the R, G, and B color components, the novel image representation adapted to class separability is generated through a PCA plus FLD learning framework from the hybrid color space instead of the RGB color space. Based upon the novel image representation, a multiple feature fusion method is proposed to address the problem of face recognition under the severe illumination conditions. The aforementioned methods have been evaluated using two large-scale databases, namely, the Face Recognition Grand Challenge (FRGC) version 2 database and the FERET face database. Experimental results have shown that the proposed methods improve face recognition performance upon the traditional methods using the intensity images by large margins and outperform some state-of-the-art methods

    Adaptive visual sampling

    Get PDF
    PhDVarious visual tasks may be analysed in the context of sampling from the visual field. In visual psychophysics, human visual sampling strategies have often been shown at a high-level to be driven by various information and resource related factors such as the limited capacity of the human cognitive system, the quality of information gathered, its relevance in context and the associated efficiency of recovering it. At a lower-level, we interpret many computer vision tasks to be rooted in similar notions of contextually-relevant, dynamic sampling strategies which are geared towards the filtering of pixel samples to perform reliable object association. In the context of object tracking, the reliability of such endeavours is fundamentally rooted in the continuing relevance of object models used for such filtering, a requirement complicated by realworld conditions such as dynamic lighting that inconveniently and frequently cause their rapid obsolescence. In the context of recognition, performance can be hindered by the lack of learned context-dependent strategies that satisfactorily filter out samples that are irrelevant or blunt the potency of models used for discrimination. In this thesis we interpret the problems of visual tracking and recognition in terms of dynamic spatial and featural sampling strategies and, in this vein, present three frameworks that build on previous methods to provide a more flexible and effective approach. Firstly, we propose an adaptive spatial sampling strategy framework to maintain statistical object models for real-time robust tracking under changing lighting conditions. We employ colour features in experiments to demonstrate its effectiveness. The framework consists of five parts: (a) Gaussian mixture models for semi-parametric modelling of the colour distributions of multicolour objects; (b) a constructive algorithm that uses cross-validation for automatically determining the number of components for a Gaussian mixture given a sample set of object colours; (c) a sampling strategy for performing fast tracking using colour models; (d) a Bayesian formulation enabling models of object and the environment to be employed together in filtering samples by discrimination; and (e) a selectively-adaptive mechanism to enable colour models to cope with changing conditions and permit more robust tracking. Secondly, we extend the concept to an adaptive spatial and featural sampling strategy to deal with very difficult conditions such as small target objects in cluttered environments undergoing severe lighting fluctuations and extreme occlusions. This builds on previous work on dynamic feature selection during tracking by reducing redundancy in features selected at each stage as well as more naturally balancing short-term and long-term evidence, the latter to facilitate model rigidity under sharp, temporary changes such as occlusion whilst permitting model flexibility under slower, long-term changes such as varying lighting conditions. This framework consists of two parts: (a) Attribute-based Feature Ranking (AFR) which combines two attribute measures; discriminability and independence to other features; and (b) Multiple Selectively-adaptive Feature Models (MSFM) which involves maintaining a dynamic feature reference of target object appearance. We call this framework Adaptive Multi-feature Association (AMA). Finally, we present an adaptive spatial and featural sampling strategy that extends established Local Binary Pattern (LBP) methods and overcomes many severe limitations of the traditional approach such as limited spatial support, restricted sample sets and ad hoc joint and disjoint statistical distributions that may fail to capture important structure. Our framework enables more compact, descriptive LBP type models to be constructed which may be employed in conjunction with many existing LBP techniques to improve their performance without modification. The framework consists of two parts: (a) a new LBP-type model known as Multiscale Selected Local Binary Features (MSLBF); and (b) a novel binary feature selection algorithm called Binary Histogram Intersection Minimisation (BHIM) which is shown to be more powerful than established methods used for binary feature selection such as Conditional Mutual Information Maximisation (CMIM) and AdaBoost

    Texture analysis and Its applications in biomedical imaging: a survey

    Get PDF
    Texture analysis describes a variety of image analysis techniques that quantify the variation in intensity and pattern. This paper provides an overview of several texture analysis approaches addressing the rationale supporting them, their advantages, drawbacks, and applications. This survey’s emphasis is in collecting and categorising over five decades of active research on texture analysis.Brief descriptions of different approaches are presented along with application examples. From a broad range of texture analysis applications, this survey’s final focus is on biomedical image analysis. An up-to-date list of biological tissues and organs in which disorders produce texture changes that may be used to spot disease onset and progression is provided. Finally, the role of texture analysis methods as biomarkers of disease is summarised.Manuscript received February 3, 2021; revised June 23, 2021; accepted September 21, 2021. Date of publication September 27, 2021; date of current version January 24, 2022. This work was supported in part by the Portuguese Foundation for Science and Technology (FCT) under Grants PTDC/EMD-EMD/28039/2017, UIDB/04950/2020, PestUID/NEU/04539/2019, and CENTRO-01-0145-FEDER-000016 and by FEDER-COMPETE under Grant POCI-01-0145-FEDER-028039. (Corresponding author: Rui Bernardes.)info:eu-repo/semantics/publishedVersio
    corecore