240 research outputs found
A study on different experimental configurations for age, race, and gender estimation problems
This paper presents a detailed study of different algorithmic configurations for estimating soft biometric traits. In particular, a recently introduced common framework is the starting point of the study: it comprises an initial face detection step, the subsequent description of facial traits, a data reduction step, and a final classification step. The algorithmic configurations differ in the descriptors used and in the strategies adopted to build the training dataset and to scale the data fed to the classifier. Experimental evaluations have been carried out both on publicly available datasets and on image sequences acquired specifically to assess performance under real-world conditions, i.e., in the presence of scaling and rotation.
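As a rough illustration of such a configuration, the sketch below wires the four stages together with common open-source tools (OpenCV detection, LBP description, PCA reduction, SVM classification). It is a minimal sketch under assumed parameter choices, not the paper's evaluated configuration.

    # Minimal sketch of the four-stage pipeline (detection -> description ->
    # reduction -> classification); all parameter choices are illustrative.
    import cv2
    import numpy as np
    from skimage.feature import local_binary_pattern
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def lbp_descriptor(gray_face, P=8, R=1, grid=4):
        """Concatenated LBP histograms over a grid of face patches."""
        face = cv2.resize(gray_face, (64, 64))
        codes = local_binary_pattern(face, P, R, method="uniform")
        hists, step = [], 64 // grid
        for i in range(grid):
            for j in range(grid):
                patch = codes[i*step:(i+1)*step, j*step:(j+1)*step]
                h, _ = np.histogram(patch, bins=P + 2, range=(0, P + 2))
                hists.append(h / (h.sum() + 1e-8))
        return np.concatenate(hists)

    def describe(image_bgr):
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        x, y, w, h = detector.detectMultiScale(gray, 1.1, 5)[0]  # assumes a face is found
        return lbp_descriptor(gray[y:y+h, x:x+w])

    # Data scaling, reduction, and classification as one estimator.
    clf = make_pipeline(StandardScaler(), PCA(n_components=50), SVC(kernel="rbf"))
    # clf.fit(np.stack([describe(img) for img in train_images]), train_labels)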
Unifying the Visible and Passive Infrared Bands: Homogeneous and Heterogeneous Multi-Spectral Face Recognition
Face biometrics leverages tools and technology to automate the identification of individuals. Biometric face recognition (FR) can be used for forensic purposes, but the integration of the technology into the legal system of the court remains an open issue. The biggest challenge to the acceptance of the face as a modality used in court is the reliability of such systems under varying pose, illumination, and expression, which has been an active and widely explored area of research over the last few decades (e.g., same-spectrum or homogeneous matching). The heterogeneous FR problem, which deals with matching face images from different sensors, should be examined for the benefit of military and law enforcement applications as well. In this work we are concerned primarily with visible-band images (380-750 nm) and the infrared (IR) spectrum, which has become an area of growing interest.

For homogeneous FR systems, we formulate and develop an efficient, semi-automated, direct matching-based FR framework designed to operate efficiently when face data is captured using either visible or passive IR sensors; it can therefore be applied in both daytime and nighttime environments. First, input face images are geometrically normalized by our pre-processing pipeline prior to feature extraction. Then, face-based features, including wrinkles, veins, and the edges of facial characteristics, are detected and extracted for each operational band (visible, MWIR, and LWIR). Finally, global and local face-based matching is applied, before fusion is performed at the score level. Although this proposed matcher performs well when same-spectrum FR is performed, regardless of spectrum, a challenge remains when cross-spectral FR matching is performed.

The second framework addresses the heterogeneous FR problem and deals with bridging the gap between the visible and passive infrared (MWIR and LWIR) spectra. Specifically, we investigate the benefits and limitations of using visible face images synthesized from thermal ones, and vice versa, in cross-spectral face recognition systems when utilizing canonical correlation analysis (CCA) and locally linear embedding (LLE), a manifold learning technique for dimensionality reduction. Finally, through an extensive experimental study we establish that the combination of the proposed synthesis and demographic filtering scheme increases system performance in terms of rank-1 identification rate.
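For readers unfamiliar with the CCA step, the following is a minimal sketch of cross-spectral matching in a learned common subspace, assuming scikit-learn's CCA; the feature matrices X_vis and X_ir and all dimensions are illustrative placeholders, not the authors' setup.

    # Hedged sketch: project visible and thermal features into a shared
    # CCA subspace, then match a thermal probe against a visible gallery.
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)
    n_pairs, d = 200, 128
    X_vis = rng.normal(size=(n_pairs, d))   # visible-band face features
    X_ir = rng.normal(size=(n_pairs, d))    # thermal (MWIR/LWIR) features

    # Learn paired projections that maximise correlation between the bands.
    cca = CCA(n_components=32)
    cca.fit(X_vis, X_ir)

    vis_c, ir_c = cca.transform(X_vis, X_ir)  # both bands in the shared space
    probe, gallery = ir_c[0], vis_c           # one thermal probe vs visible gallery
    scores = gallery @ probe / (
        np.linalg.norm(gallery, axis=1) * np.linalg.norm(probe) + 1e-12)
    rank1 = int(np.argmax(scores))  # index of the best-matching gallery face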
Inferring Facial and Body Language
Machine analysis of human facial and body language is a challenging topic in computer vision, impacting on important applications such as human-computer interaction and visual surveillance. In this thesis, we present research building towards computational frameworks capable of automatically understanding facial expression and behavioural body language.
The thesis work commences with a thorough examination of issues surrounding facial representation based on Local Binary Patterns (LBP). Extensive experiments with different machine learning techniques demonstrate that LBP features are efficient and effective for person-independent facial expression recognition, even in low-resolution settings. We then present and evaluate a conditional mutual information based algorithm to efficiently learn the most discriminative LBP features, and show that the best recognition performance is obtained by using SVM classifiers with the selected LBP features. However, the recognition is performed on static images without exploiting the temporal behaviour of facial expressions.
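A hedged sketch of the selection idea follows: greedy CMIM-style selection over discretised features, in the spirit of the conditional mutual information criterion described above. This is a generic rendition, not the thesis's exact algorithm.

    # Greedy conditional-mutual-information feature selection over
    # discretised (e.g. binarised LBP) features; names are assumptions.
    import numpy as np
    from sklearn.metrics import mutual_info_score

    def conditional_mi(x, y, z):
        """I(x; y | z) for discrete arrays, estimated by averaging over z."""
        cmi, n = 0.0, len(z)
        for v in np.unique(z):
            m = z == v
            cmi += (m.sum() / n) * mutual_info_score(x[m], y[m])
        return cmi

    def cmim_select(X, y, k):
        """Greedily pick k features maximising min over selected I(f; y | s)."""
        selected = []
        remaining = list(range(X.shape[1]))
        mi = [mutual_info_score(X[:, j], y) for j in remaining]
        while len(selected) < k:
            best, best_score = None, -np.inf
            for j in remaining:
                score = mi[j] if not selected else min(
                    conditional_mi(X[:, j], y, X[:, s]) for s in selected)
                if score > best_score:
                    best, best_score = j, score
            selected.append(best)
            remaining.remove(best)
        return selected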
Subsequently we present a method to capture and represent the temporal dynamics of facial expression by discovering the underlying low-dimensional manifold. Locality Preserving Projections (LPP) is exploited to learn the expression manifold in the LBP-based appearance feature space. By deriving a universal discriminant expression subspace using a supervised LPP, we can effectively align the manifolds of different subjects on a generalised expression manifold. Different linear subspace methods are comprehensively evaluated for expression subspace learning. We formulate and evaluate a Bayesian framework for dynamic facial expression recognition employing the derived manifold representation. However, the manifold representation only addresses temporal correlations of the whole face image; it does not consider spatio-temporal correlations among different facial regions.

We then employ Canonical Correlation Analysis (CCA) to capture correlations among face parts. To overcome the inherent limitations of classical CCA for image data, we introduce and formalise a novel Matrix-based CCA (MCCA), which can better measure correlations in 2D image data. We show this technique provides superior performance in regression and recognition tasks whilst requiring significantly fewer canonical factors. All the above work focuses on facial expressions. However, the face is usually perceived not as an isolated object but as an integrated part of the whole body, and the visual channel combining facial and bodily expressions is most informative.
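The LPP step described above admits a compact rendition as a generalised eigenproblem. The sketch below is a standard LPP formulation with heat-kernel weights on a k-NN graph; the graph construction and all parameters are assumptions, not the thesis's settings.

    # Standard LPP: solve X^T L X a = lam X^T D X a for the smallest lam.
    import numpy as np
    from scipy.linalg import eigh
    from sklearn.neighbors import kneighbors_graph

    def lpp(X, n_components=2, n_neighbors=5, t=1.0):
        # Adjacency with heat-kernel weights on a k-NN graph.
        W = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
        W = np.where(W > 0, np.exp(-W**2 / t), 0.0)
        W = np.maximum(W, W.T)                  # symmetrise
        D = np.diag(W.sum(axis=1))
        L = D - W                               # graph Laplacian
        A = X.T @ L @ X
        B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])  # regularised for stability
        vals, vecs = eigh(A, B)                 # ascending eigenvalues
        return vecs[:, :n_components]           # projection matrix

    # Y = X @ lpp(X, n_components=10)  # low-dimensional manifold coordinates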
Finally we investigate two understudied problems in body language analysis: gait-based gender discrimination and affective body gesture recognition. To effectively combine face and body cues, CCA is adopted to establish the relationship between the two modalities and to derive a semantic joint feature space for feature-level fusion. Experiments on large datasets demonstrate that our multimodal systems achieve superior performance in gender discrimination and affective state analysis.

This work was supported by a research studentship of Queen Mary, the International Travel Grant of the Royal Academy of Engineering, and the Royal Society International Joint Project.
Face recognition using multiple features in different color spaces
Face recognition, as a particular problem of pattern recognition, has been attracting substantial attention from researchers in computer vision, pattern recognition, and machine learning. The recent Face Recognition Grand Challenge (FRGC) program reveals that uncontrolled illumination conditions pose grand challenges to face recognition performance. Most existing face recognition methods use gray-scale face images, which have been shown to be insufficient to tackle these challenges. To overcome this problem, this dissertation applies multiple features derived from color images rather than from intensity images alone.
First, this dissertation presents two face recognition methods, operating in different color spaces, that use frequency features obtained via the Discrete Fourier Transform (DFT) and spatial features obtained via Local Binary Patterns (LBP), respectively. The DFT frequency domain consists of the real part, the imaginary part, the magnitude, and the phase components, which provide different interpretations of the input face images. The advantage of LBP in face recognition is attributed to its robustness to monotonic intensity transformations, as well as its operation at multiple image scales. By fusing the frequency components or the multi-resolution LBP histograms, complementary feature sets can be generated to enhance the capability of facial texture description. This dissertation therefore uses the fused DFT and LBP features in two hybrid color spaces, the RIQ and the VIQ color spaces, respectively, to improve face recognition performance.
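As a rough sketch of the fusion idea, the snippet below extracts low-frequency DFT components and multi-resolution LBP histograms and concatenates them into one descriptor; the number of retained frequencies and the LBP radii are illustrative assumptions.

    # Frequency (DFT) plus spatial (multi-resolution LBP) descriptor fusion.
    import numpy as np
    from skimage.feature import local_binary_pattern

    def dft_features(gray, keep=16):
        """Low-frequency real, imaginary, magnitude, and phase components."""
        F = np.fft.fftshift(np.fft.fft2(gray))
        c = np.array(F.shape) // 2
        block = F[c[0]-keep//2:c[0]+keep//2, c[1]-keep//2:c[1]+keep//2]
        return np.concatenate([block.real.ravel(), block.imag.ravel(),
                               np.abs(block).ravel(), np.angle(block).ravel()])

    def multiresolution_lbp(gray, radii=(1, 2, 3)):
        """LBP histograms at several radii, concatenated."""
        feats = []
        for R in radii:
            P = 8 * R
            codes = local_binary_pattern(gray, P, R, method="uniform")
            h, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
            feats.append(h)
        return np.concatenate(feats)

    def fused_descriptor(gray):
        return np.concatenate([dft_features(gray), multiresolution_lbp(gray)])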
Second, a method that extracts multiple features in the CID color space is presented for face recognition. As the different color component images in the CID color space display different characteristics, three image encoding methods, namely the patch-based Gabor image representation, multi-resolution LBP feature fusion, and DCT-based multiple face encodings, are presented to extract features effectively from the component images. To further improve classification performance, the similarity scores derived from the three color component images are fused for the final decision.
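The decision-level step might look like the following minimal sum-rule sketch, with z-score normalisation per component image; the normalisation choice is an assumption, not necessarily the dissertation's scheme.

    # Score-level fusion across the three colour-component matchers.
    import numpy as np

    def fuse_scores(score_lists):
        """score_lists: one gallery-score vector per colour component."""
        fused = np.zeros_like(score_lists[0], dtype=float)
        for s in score_lists:
            fused += (s - s.mean()) / (s.std() + 1e-12)  # z-normalise, then sum
        return fused  # argmax gives the final identity decision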
Finally, a novel image representation is also discussed in this dissertation. Unlike a traditional intensity image, which is directly derived from a linear combination of the R, G, and B color components, the novel representation, adapted to class separability, is generated through a PCA plus FLD learning framework from a hybrid color space rather than the RGB color space. Based upon this representation, a multiple feature fusion method is proposed to address face recognition under severe illumination conditions.
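A minimal sketch of a PCA-plus-FLD pipeline follows, using scikit-learn's PCA and LinearDiscriminantAnalysis as stand-ins; the component count is an illustrative assumption.

    # PCA for dimensionality reduction, then Fisher LDA for class separability.
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.pipeline import make_pipeline

    # X: stacked colour-component pixel vectors, y: subject identities.
    pca_fld = make_pipeline(
        PCA(n_components=200),           # remove the null space first
        LinearDiscriminantAnalysis())    # then maximise class separation
    # Z = pca_fld.fit_transform(X, y)    # discriminant image representation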
The aforementioned methods have been evaluated on two large-scale databases, namely the Face Recognition Grand Challenge (FRGC) version 2 database and the FERET face database. Experimental results show that the proposed methods improve face recognition performance over traditional intensity-image methods by large margins and outperform some state-of-the-art methods.
Adaptive visual sampling
Various visual tasks may be analysed in the context of sampling from the visual field. In visual psychophysics, human visual sampling strategies have often been shown, at a high level, to be driven by various information- and resource-related factors, such as the limited capacity of the human cognitive system, the quality of the information gathered, its relevance in context, and the associated efficiency of recovering it. At a lower level, we interpret many computer vision tasks as rooted in similar notions of contextually relevant, dynamic sampling strategies geared towards filtering pixel samples to perform reliable object association. In the context of object tracking, the reliability of such endeavours rests fundamentally on the continuing relevance of the object models used for such filtering, a requirement complicated by real-world conditions, such as dynamic lighting, that inconveniently and frequently cause their rapid obsolescence. In the context of recognition, performance can be hindered by the lack of learned context-dependent strategies that satisfactorily filter out samples that are irrelevant or that blunt the potency of the models used for discrimination. In this thesis we interpret the problems of visual tracking and recognition in terms of dynamic spatial and featural sampling strategies and, in this vein, present three frameworks that build on previous methods to provide a more flexible and effective approach.
Firstly, we propose an adaptive spatial sampling strategy framework to maintain statistical object models for real-time robust tracking under changing lighting conditions; we employ colour features in experiments to demonstrate its effectiveness. The framework consists of five parts: (a) Gaussian mixture models for semi-parametric modelling of the colour distributions of multi-colour objects; (b) a constructive algorithm that uses cross-validation to automatically determine the number of components of a Gaussian mixture given a sample set of object colours; (c) a sampling strategy for performing fast tracking using colour models; (d) a Bayesian formulation enabling models of the object and the environment to be employed together in filtering samples by discrimination; and (e) a selectively adaptive mechanism enabling the colour models to cope with changing conditions and permit more robust tracking.
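Parts (a) and (b) can be sketched with scikit-learn's GaussianMixture, using held-out likelihood as a stand-in for the thesis's cross-validated constructive algorithm; all names and thresholds here are illustrative.

    # Fit a colour GMM and pick the component count by validation likelihood.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.model_selection import train_test_split

    def fit_colour_model(pixels_rgb, max_components=8):
        """pixels_rgb: (n, 3) array of object-pixel colours."""
        train, val = train_test_split(pixels_rgb, test_size=0.3, random_state=0)
        best, best_ll = None, -np.inf
        for k in range(1, max_components + 1):
            gmm = GaussianMixture(n_components=k, covariance_type="full",
                                  random_state=0).fit(train)
            ll = gmm.score(val)              # held-out mean log-likelihood
            if ll > best_ll:
                best, best_ll = gmm, ll
        return best

    # Sample filtering: threshold best.score_samples(pixels) to keep
    # pixels plausibly drawn from the object's colour distribution.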
Secondly, we extend the concept to an adaptive spatial and featural sampling strategy to deal with very difficult conditions, such as small target objects in cluttered environments undergoing severe lighting fluctuations and extreme occlusions. This builds on previous work on dynamic feature selection during tracking by reducing redundancy in the features selected at each stage and by more naturally balancing short-term and long-term evidence, the latter to facilitate model rigidity under sharp, temporary changes such as occlusion whilst permitting model flexibility under slower, long-term changes such as varying lighting conditions. This framework consists of two parts: (a) Attribute-based Feature Ranking (AFR), which combines two attribute measures, discriminability and independence from other features; and (b) Multiple Selectively-adaptive Feature Models (MSFM), which maintains a dynamic feature reference of target object appearance. We call this framework Adaptive Multi-feature Association (AMA).

Finally, we present an adaptive spatial and featural sampling strategy that extends established Local Binary Pattern (LBP) methods and overcomes severe limitations of the traditional approach, such as limited spatial support, restricted sample sets, and ad hoc joint and disjoint statistical distributions that may fail to capture important structure. Our framework enables more compact, descriptive LBP-type models to be constructed, which may be employed in conjunction with many existing LBP techniques to improve their performance without modification. The framework consists of two parts: (a) a new LBP-type model known as Multiscale Selected Local Binary Features (MSLBF); and (b) a novel binary feature selection algorithm called Binary Histogram Intersection Minimisation (BHIM), which is shown to be more powerful than established methods for binary feature selection such as Conditional Mutual Information Maximisation (CMIM) and AdaBoost.
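The abstract names BHIM without defining it; purely as a hedged two-class illustration of the idea suggested by the name, the sketch below keeps the binary features whose class-conditional histograms intersect least. This is an interpretation, not the thesis's algorithm.

    # Illustrative only: rank binary (0/1) features by the intersection of
    # their two class-conditional histograms, keeping the most separable.
    import numpy as np

    def histogram_intersection(x, y):
        """Intersection of the two class-conditional histograms of one feature."""
        h0 = np.bincount(x[y == 0], minlength=2) / max((y == 0).sum(), 1)
        h1 = np.bincount(x[y == 1], minlength=2) / max((y == 1).sum(), 1)
        return np.minimum(h0, h1).sum()

    def select_binary_features(X, y, k):
        """X: (n, d) binary int matrix; return k lowest-intersection columns."""
        scores = [histogram_intersection(X[:, j], y) for j in range(X.shape[1])]
        return np.argsort(scores)[:k]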
Texture Analysis and Its Applications in Biomedical Imaging: A Survey
Texture analysis describes a variety of image analysis techniques that quantify variation in intensity and pattern. This paper provides an overview of several texture analysis approaches, addressing the rationale supporting them, their advantages, drawbacks, and applications. The survey's emphasis is on collecting and categorising over five decades of active research on texture analysis. Brief descriptions of different approaches are presented along with application examples. From the broad range of texture analysis applications, the survey's final focus is on biomedical image analysis. An up-to-date list is provided of biological tissues and organs in which disorders produce texture changes that may be used to spot disease onset and progression. Finally, the role of texture analysis methods as biomarkers of disease is summarised.
This work was supported in part by the Portuguese Foundation for Science and Technology (FCT) under Grants PTDC/EMD-EMD/28039/2017, UIDB/04950/2020, PestUID/NEU/04539/2019, and CENTRO-01-0145-FEDER-000016, and by FEDER-COMPETE under Grant POCI-01-0145-FEDER-028039.
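As one concrete example of a classical texture-analysis family such a survey covers, the sketch below computes grey-level co-occurrence matrix (GLCM) statistics with scikit-image; the distances, angles, and chosen properties are illustrative.

    # Haralick-style statistics from a 4-direction GLCM of an 8-bit image.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def glcm_features(gray_u8):
        """gray_u8: 2D uint8 image region, e.g. a tissue ROI."""
        glcm = graycomatrix(gray_u8, distances=[1],
                            angles=[0, np.pi/4, np.pi/2, 3*np.pi/4],
                            levels=256, symmetric=True, normed=True)
        props = ("contrast", "homogeneity", "energy", "correlation")
        return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

    # e.g. feed glcm_features(roi) into a classifier to quantify
    # disorder-related texture change in a tissue region.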