Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher accuracy rate and lower computational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are discussed and compared.
Comment: Published 201
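As a concrete illustration of the two families of measures contrasted above, the sketch below compares a simple constant metric with a learned Mahalanobis-style one; the descriptors and the matrix M are invented here, whereas real methods learn M from labelled pairs.

```python
import numpy as np

def euclidean_dist(x, y):
    """A simple constant metric, used directly by several surveyed methods."""
    return float(np.linalg.norm(x - y))

def learned_metric_dist(x, y, M):
    """A Mahalanobis-style metric parameterised by a PSD matrix M; metric
    learning methods optimise M from labelled pairs (not shown here)."""
    d = x - y
    return float(d @ M @ d)

# Toy appearance descriptors (e.g. normalised colour histograms).
a = np.array([0.2, 0.5, 0.3])
b = np.array([0.25, 0.45, 0.30])

# With M = I the learned metric reduces to the squared Euclidean distance.
M = np.eye(3)
```

The learned metric subsumes the constant one as a special case, which is why metric-learning approaches can only match or improve on fixed metrics given enough training pairs.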
Non-ideal iris recognition
Of the many biometrics that exist, iris recognition is attracting more attention than any other due to its potential for improved accuracy, permanence, and acceptance. Current iris recognition systems operate on good-quality frontal-view images. Due to the small area of the iris, user co-operation is required. In this work, a new system capable of processing iris images that are not necessarily in frontal view is described. This overcomes one of the major hurdles with current iris recognition systems and enhances user convenience and accuracy. The proposed system operates in two steps: (i) preprocessing and estimation of the gaze direction, and (ii) processing and encoding of the rotated iris image. Two objective functions are used to estimate the gaze direction. The off-angle iris image then undergoes geometric transformations involving the estimated angle and is further processed as if it were a frontal-view image. Two methods, (i) PCA and (ii) ICA, are used for encoding. Three different datasets are used to quantify the performance of the proposed non-ideal recognition system.
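The encoding step can be sketched as below with an illustrative PCA encoder over synthetic data; the gaze estimation and geometric correction are assumed to have been applied already, and the vector dimensions and component count are arbitrary choices, not the system's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gallery of gaze-corrected, unwrapped iris strips,
# flattened to vectors.
gallery = rng.random((20, 64))

# PCA encoding (the ICA encoder mentioned above would be used analogously).
mean = gallery.mean(axis=0)
centred = gallery - mean
_, _, Vt = np.linalg.svd(centred, full_matrices=False)
basis = Vt[:8]                               # keep 8 principal components

def encode(iris_vec):
    """Project a gaze-corrected iris vector onto the PCA basis."""
    return (iris_vec - mean) @ basis.T

codes = centred @ basis.T                    # enrolled iris codes
probe = gallery[3] + 0.01 * rng.random(64)   # noisy re-capture of subject 3
match = int(np.argmin(np.linalg.norm(codes - encode(probe), axis=1)))
```

Matching then reduces to a nearest-neighbour search in the low-dimensional code space.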
Adaptive visual sampling
PhD
Various visual tasks may be analysed in the context of sampling from the visual field. In visual
psychophysics, human visual sampling strategies have often been shown at a high-level to
be driven by various information and resource related factors such as the limited capacity of
the human cognitive system, the quality of information gathered, its relevance in context and
the associated efficiency of recovering it. At a lower-level, we interpret many computer vision
tasks to be rooted in similar notions of contextually-relevant, dynamic sampling strategies
which are geared towards the filtering of pixel samples to perform reliable object association. In
the context of object tracking, the reliability of such endeavours is fundamentally rooted in the
continuing relevance of object models used for such filtering, a requirement complicated by real-world
conditions such as dynamic lighting that inconveniently and frequently cause their rapid
obsolescence. In the context of recognition, performance can be hindered by the lack of learned
context-dependent strategies that satisfactorily filter out samples that are irrelevant or blunt the
potency of models used for discrimination. In this thesis we interpret the problems of visual
tracking and recognition in terms of dynamic spatial and featural sampling strategies and, in this
vein, present three frameworks that build on previous methods to provide a more flexible and
effective approach.
Firstly, we propose an adaptive spatial sampling strategy framework to maintain statistical object
models for real-time robust tracking under changing lighting conditions. We employ colour
features in experiments to demonstrate its effectiveness. The framework consists of five parts:
(a) Gaussian mixture models for semi-parametric modelling of the colour distributions of multicolour
objects; (b) a constructive algorithm that uses cross-validation for automatically determining
the number of components for a Gaussian mixture given a sample set of object colours; (c) a
sampling strategy for performing fast tracking using colour models; (d) a Bayesian formulation
enabling models of the object and the environment to be employed together in filtering samples by
discrimination; and (e) a selectively-adaptive mechanism to enable colour models to cope with
changing conditions and permit more robust tracking.
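Part (d), the Bayesian filtering of colour samples, can be sketched with single Gaussians standing in for the mixtures of parts (a) and (b); all hues, variances and priors below are invented for illustration.

```python
import numpy as np

def gauss(x, mu, var):
    """1-D Gaussian likelihood; a single stand-in component for the Gaussian
    mixtures that the framework fits with cross-validated EM."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Illustrative hue models: a reddish object against a greenish background.
p_obj, p_bg = 0.3, 0.7          # assumed priors for object vs environment

def posterior_object(hue):
    """Bayesian formulation: P(object | colour sample)."""
    l_obj = gauss(hue, mu=0.05, var=0.01)
    l_bg = gauss(hue, mu=0.40, var=0.05)
    return l_obj * p_obj / (l_obj * p_obj + l_bg * p_bg)

keep = posterior_object(0.05)   # sample near the object hue: high posterior
drop = posterior_object(0.40)   # sample near the background hue: low posterior
```

Thresholding this posterior is what discriminates pixel samples belonging to the object from those of the environment during tracking.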
Secondly, we extend the concept to an adaptive spatial and featural sampling strategy to deal
with very difficult conditions such as small target objects in cluttered environments undergoing
severe lighting fluctuations and extreme occlusions. This builds on previous work on dynamic
feature selection during tracking by reducing redundancy in features selected at each stage as
well as more naturally balancing short-term and long-term evidence, the latter to facilitate model
rigidity under sharp, temporary changes such as occlusion whilst permitting model flexibility
under slower, long-term changes such as varying lighting conditions. This framework consists of
two parts: (a) Attribute-based Feature Ranking (AFR), which combines two attribute measures,
discriminability and independence from other features; and (b) Multiple Selectively-adaptive Feature
Models (MSFM) which involves maintaining a dynamic feature reference of target object
appearance. We call this framework Adaptive Multi-feature Association (AMA).
Finally, we present an adaptive spatial and featural sampling strategy that extends established
Local Binary Pattern (LBP) methods and overcomes many severe limitations of the traditional
approach such as limited spatial support, restricted sample sets and ad hoc joint and disjoint statistical
distributions that may fail to capture important structure. Our framework enables more
compact, descriptive LBP type models to be constructed which may be employed in conjunction
with many existing LBP techniques to improve their performance without modification. The
framework consists of two parts: (a) a new LBP-type model known as Multiscale Selected Local
Binary Features (MSLBF); and (b) a novel binary feature selection algorithm called Binary Histogram
Intersection Minimisation (BHIM) which is shown to be more powerful than established
methods used for binary feature selection such as Conditional Mutual Information Maximisation
(CMIM) and AdaBoost.
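For context, the classic 3x3 LBP operator that MSLBF generalises can be sketched as below; the binary-feature selection step (BHIM) itself is not reproduced, and MSLBF instead selects which binary comparisons to keep rather than using the fixed eight.

```python
import numpy as np

def lbp_codes(img):
    """Classic 3x3 LBP: threshold each pixel's 8 neighbours against the
    centre pixel and pack the comparison bits into an 8-bit code."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:-1, 1:-1]
    code = np.zeros_like(centre, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the centre pixels.
        neigh = img[1 + dy:img.shape[0] - 1 + dy,
                    1 + dx:img.shape[1] - 1 + dx]
        code |= ((neigh >= centre).astype(np.uint8) << bit)
    return code

img = np.array([[0, 0, 0],
                [0, 5, 9],
                [0, 0, 0]], dtype=np.int32)
# Only the neighbour with value 9 (offset (0, 1), bit 3) exceeds the centre.
print(lbp_codes(img))  # [[8]]
```

Histograms of such codes over a region form the texture descriptor; the limitations listed above (fixed spatial support, fixed sample set) are visible in the hard-coded offsets.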
Discriminative preprocessing of speech : towards improving biometric authentication
In the context of the SecurePhone project, a multimodal user authentication system was developed for implementation on a PDA. Extending this system, we investigate biometric feature enhancement and multi-feature fusion with the aim of improving user authentication accuracy.
In this dissertation, a general framework for feature enhancement is proposed which uses a multilayer perceptron (MLP) to achieve optimal speaker discrimination.
First, to train this MLP a subset of speakers (speaker basis) is used to represent the underlying characteristics of the given acoustic feature space.
Second, the size of the speaker basis is found to be among the crucial factors affecting the performance of a speaker recognition system.
Third, it is found that the selection of the speaker basis can also influence system performance. Based on this observation, an automatic speaker selection approach is proposed on the basis of the maximal average between-class variance. Tests in a variety of conditions, including clean and noisy as well as telephone speech, show that this approach can improve the performance of speaker recognition systems. This approach, which is applied here to feature enhancement for speaker recognition, can be expected to also be effective with other biometric modalities besides speech.
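The basis-selection criterion can be sketched as follows; the greedy search and the toy speaker means are illustrative assumptions, not the dissertation's exact procedure.

```python
import numpy as np

def avg_between_class_variance(means):
    """Average between-class variance of a candidate basis: mean squared
    distance of each speaker's mean vector to the basis centroid."""
    centroid = means.mean(axis=0)
    return float(((means - centroid) ** 2).sum(axis=1).mean())

def select_basis(speaker_means, k):
    """Greedy sketch: seed with the speaker farthest from the global
    centroid, then repeatedly add the speaker that maximises the
    between-class-variance criterion."""
    centroid = speaker_means.mean(axis=0)
    chosen = [int(np.argmax(((speaker_means - centroid) ** 2).sum(axis=1)))]
    while len(chosen) < k:
        rest = [i for i in range(len(speaker_means)) if i not in chosen]
        chosen.append(max(rest, key=lambda i: avg_between_class_variance(
            speaker_means[chosen + [i]])))
    return sorted(chosen)

# Two tight clusters of speakers: a maximally varied basis spans both.
speaker_means = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]])
print(select_basis(speaker_means, 2))  # [0, 3]
```

A basis spanning the acoustic space in this way gives the MLP more discriminative training material than a basis of near-duplicate speakers.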
Further, an alternative feature representation is proposed in this dissertation, which is derived from what we call speaker voice signatures (SVS). These are trajectories in a Kohonen self organising map (SOM) which has been trained to represent the acoustic space. This feature representation is found to be somewhat complementary to the baseline feature set, suggesting that they can be fused to achieve improved performance in speaker recognition.
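A minimal sketch of extracting such a trajectory follows; the SOM here is randomly initialised rather than trained on speech frames, and the map size and feature dimension are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy 4x4 SOM over 2-D acoustic features; in practice the map is trained
# on speech frames beforehand (training omitted here).
som = rng.random((4, 4, 2))

def bmu(frame):
    """Grid coordinates of the best-matching unit for one feature frame."""
    d = ((som - frame) ** 2).sum(axis=2)
    return np.unravel_index(int(np.argmin(d)), d.shape)

# The speaker voice signature (SVS) is the BMU trajectory traced by the
# frames of an utterance across the map.
utterance = rng.random((5, 2))
svs = [bmu(f) for f in utterance]
```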
Finally, this dissertation finishes with a number of potential extensions of the proposed approaches.
Keywords: feature enhancement, MLP, SOM, speaker basis selection, speaker recognition, biometric, authentication, verification
Pedestrian soft biometrics recognition using deep learning on thermal images in smart cities
With technological advancement and the rise of the Internet of Things, our society is becoming more interconnected than ever before. Our computers and devices are getting smaller, while their computing power and memory keep increasing. These advances, coupled with the leaps in artificial intelligence brought by the deep learning revolution of recent years, have led to a rising interest in the field of pervasive intelligence. Intelligence in the environment has been used in smart homes to bring assistance to semi-autonomous people by performing activity recognition based on sensor data. As technology keeps improving, we may start to investigate the extension of assistive technologies beyond the boundaries of smart homes and into our smart cities. To bring assistance to semi-autonomous people, the first step is to be able to recognize profiles of vulnerable people. To leverage technology and artificial intelligence to make our cities smarter, safer and more accessible, this thesis investigates the use of environmental sensors such as thermal cameras to perform pedestrian soft biometrics recognition (age, gender and mobility) in the city. It explores the process of building prototypes from scratch to collect thermal gait data in the city, the use and optimization of deep learning algorithms to perform soft biometrics recognition, and the feasibility of implementing these algorithms on resource-limited boards. The use of unprocessed thermal images allows a higher degree of privacy for citizens and is novel in the field of human profile recognition. This thesis aims to set the foundation for future work, both in thermal image-based soft biometrics recognition and in pervasive intelligence in our cities, in order to make them smarter and move towards an interconnected society.
Handbook of Vascular Biometrics
This open access handbook provides the first comprehensive overview of biometrics exploiting the shape of human blood vessels for biometric recognition, i.e. vascular biometrics, including finger vein recognition, hand/palm vein recognition, retina recognition, and sclera recognition. After an introductory chapter summarizing the state of the art in, and the availability of, commercial systems, open datasets and open-source software, individual chapters focus on specific aspects of one of the biometric modalities, including questions of usability, security, and privacy. The book features contributions from both academia and major industrial manufacturers.
Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring
Artificially intelligent perception is increasingly present in the lives of
every one of us. Vehicles are no exception, (...) In the near future, pattern
recognition will have an even stronger role in vehicles, as self-driving cars
will require automated ways to understand what is happening around (and within)
them and act accordingly. (...) This doctoral work focused on advancing
in-vehicle sensing through the research of novel computer vision and pattern
recognition methodologies for both biometrics and wellbeing monitoring. The
main focus has been on electrocardiogram (ECG) biometrics, a trait well-known
for its potential for seamless driver monitoring. Major efforts were devoted to
achieving improved performance in identification and identity verification in
off-the-person scenarios, well-known for increased noise and variability. Here,
end-to-end deep learning ECG biometric solutions were proposed and important
topics were addressed such as cross-database and long-term performance,
waveform relevance through explainability, and interlead conversion. Face
biometrics, a natural complement to the ECG in seamless unconstrained
scenarios, was also studied in this work. The open challenges of masked face
recognition and interpretability in biometrics were tackled in an effort to
evolve towards algorithms that are more transparent, trustworthy, and robust to
significant occlusions. Within the topic of wellbeing monitoring, improved
solutions to multimodal emotion recognition in groups of people and
activity/violence recognition in in-vehicle scenarios were proposed. At last,
we also proposed a novel way to learn template security within end-to-end
models, dismissing additional separate encryption processes, and a
self-supervised learning approach tailored to sequential data, in order to
ensure data security and optimal performance. (...)
Comment: Doctoral thesis presented and approved on the 21st of December 2022
to the University of Port
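At decision time, identity verification over learned embeddings, as pursued for ECG and face in this work, reduces to a thresholded similarity test; the embeddings and threshold below are stand-ins for the outputs of the end-to-end deep models.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(probe_emb, template_emb, threshold=0.8):
    """Accept the claimed identity when the similarity between the probe's
    embedding and the enrolled template clears a threshold (0.8 is an
    arbitrary illustrative value)."""
    return cosine_sim(probe_emb, template_emb) >= threshold

# Stand-in embeddings; in the thesis these come from deep networks trained
# end-to-end on ECG or face data.
template = np.array([0.9, 0.1, 0.4])
genuine = template + 0.02               # a later capture of the same person
impostor = np.array([0.1, 0.9, 0.2])
```

The off-the-person noise and variability discussed above manifest as lower genuine similarities, which is why cross-database and long-term performance required dedicated attention.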
Resource-constrained re-identification in camera networks
PhD
In multi-camera surveillance, association of people detected in different camera views over
time, known as person re-identification, is a fundamental task. Re-identification is a challenging
problem because of changes in the appearance of people under varying camera conditions. Existing
approaches focus on improving the re-identification accuracy, while no specific effort has
yet been put into efficiently utilising the available resources that are normally limited in a camera
network, such as storage, computation and communication capabilities. In this thesis, we aim to
perform and improve the task of re-identification under constrained resources. More specifically,
we reduce the data needed to represent the appearance of an object through a proposed feature
selection method and a difference-vector representation method.
The proposed feature-selection method considers the computational cost of feature extraction
and the cost of storing the feature descriptor jointly with the feature’s re-identification performance
to select the most cost-effective and well-performing features. This selection allows us
to improve inter-camera re-identification while reducing storage and computation requirements
within each camera. The selected features are ranked in order of effectiveness, which enables
a further reduction by dropping the least effective features when application constraints
require it. We also reduce the communication overhead in the camera network by transferring
only a difference vector, obtained from the extracted features of an object and the reference
features within a camera, as an object representation for the association.
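The two reductions can be sketched as follows; all feature names, gains, costs and reference vectors are invented for illustration and do not reflect the thesis's measured values.

```python
import numpy as np

# Hypothetical per-feature profiles: re-identification benefit versus the
# joint extraction/storage cost (units and numbers are illustrative).
features = {
    "colour_hist": {"gain": 0.30, "cost": 1.0},
    "lbp_texture": {"gain": 0.25, "cost": 2.0},
    "hog_shape":   {"gain": 0.20, "cost": 4.0},
    "gait":        {"gain": 0.35, "cost": 10.0},
}

# Rank by cost-effectiveness so the least effective features are dropped
# first when application constraints tighten.
ranked = sorted(features,
                key=lambda f: features[f]["gain"] / features[f]["cost"],
                reverse=True)

# Difference-vector representation: only the offset from the camera's
# reference features is transmitted for association.
reference = np.array([0.5, 0.5, 0.5])   # per-camera reference features
observed = np.array([0.6, 0.4, 0.5])    # features of a detected object
diff_vector = observed - reference      # the only data sent over the network
```

Since the reference features are known to both sides, the difference vector carries the same discriminative information at a fraction of the communication cost.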
In order to reduce the number of possible matches per association, we group the objects appearing
within a defined time-interval in uncalibrated camera pairs. Such a grouping improves
re-identification, since only those objects that appear within the same time-interval in a camera
pair need to be associated. For temporal alignment of cameras, we exploit differences
between the frame numbers of the detected objects in a camera pair. Finally, in contrast to
pairwise camera associations used in literature, we propose a many-to-one camera association
method for re-identification, where multiple cameras can be candidates for having generated the
previous detections of an object. We obtain camera-invariant matching scores from the scores
obtained using the pairwise re-identification approaches. These scores measure the chances of a
correct match between the objects detected in a group of cameras.
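The time-interval grouping and frame-number alignment can be sketched as follows; the detections, offset and window values are invented for illustration.

```python
# Detections as (frame_number, object_id) in an uncalibrated camera pair.
cam_a = [(100, "a1"), (180, "a2"), (400, "a3")]
cam_b = [(130, "b1"), (215, "b2"), (430, "b3")]

# Temporal alignment: the fixed frame offset between the cameras is
# estimated from frame-number differences of corresponding detections
# (assumed known here); the window defines the grouping time-interval.
offset = 30
window = 50

def candidates(det_a):
    """Only objects seen in the same aligned time-interval need associating."""
    frame_a, _ = det_a
    return [oid for frame_b, oid in cam_b
            if abs((frame_b - offset) - frame_a) <= window]

print(candidates((100, "a1")))  # ['b1']
```

Shrinking each detection's candidate set this way reduces both the association cost and the chance of a wrong match.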
Experimental results on publicly available and in-lab multi-camera image and video datasets
show that the proposed methods successfully reduce storage, computation and communication
requirements while improving the re-identification rate compared to existing re-identification
approaches.