98 research outputs found
QUIS-CAMPI: Biometric Recognition in Surveillance Scenarios
The concerns about individuals' security have justified the increasing number of surveillance
cameras deployed both in private and public spaces. However, contrary to popular belief,
these devices are in most cases used solely for recording, instead of feeding intelligent analysis
processes capable of extracting information about the observed individuals. Thus, even though
video surveillance has already proved to be essential for solving multiple crimes, obtaining relevant
details about the subjects that took part in a crime depends on the manual inspection
of recordings. As such, the current goal of the research community is the development of
automated surveillance systems capable of monitoring and identifying subjects in surveillance
scenarios. Accordingly, the main goal of this thesis is to improve the performance of biometric
recognition algorithms in data acquired from surveillance scenarios. In particular, we aim at
designing a visual surveillance system capable of acquiring biometric data at a distance (e.g.,
face, iris or gait) without requiring human intervention in the process, as well as devising biometric
recognition methods robust to the degradation factors resulting from the unconstrained
acquisition process.
Regarding the first goal, the analysis of the data acquired by typical surveillance systems
shows that large acquisition distances significantly decrease the resolution of biometric samples,
and thus their discriminability is not sufficient for recognition purposes. In the literature,
diverse works point out Pan Tilt Zoom (PTZ) cameras as the most practical way for acquiring
high-resolution imagery at a distance, particularly when using a master-slave configuration. In
the master-slave configuration, the video acquired by a typical surveillance camera is analyzed
for obtaining regions of interest (e.g., car, person) and these regions are subsequently imaged
at high-resolution by the PTZ camera. Several methods have already shown that this configuration
can be used for acquiring biometric data at a distance. Nevertheless, these methods
failed at providing effective solutions to the typical challenges of this strategy, restraining its
use in surveillance scenarios. Accordingly, this thesis proposes two methods to support the development
of a biometric data acquisition system based on the cooperation of a PTZ camera
with a typical surveillance camera. The first proposal is a camera calibration method capable
of accurately mapping the coordinates of the master camera to the pan/tilt angles of the PTZ
camera. The second proposal is a camera scheduling method for determining - in real-time -
the sequence of acquisitions that maximizes the number of different targets obtained, while
minimizing the cumulative transition time. In order to achieve the first goal of this thesis,
both methods were combined with state-of-the-art approaches of the human monitoring field
to develop a fully automated surveillance system capable of acquiring biometric data at a distance and
without human cooperation, designated as the QUIS-CAMPI system.
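The scheduling problem described above can be illustrated with a minimal sketch. It is not the method proposed in the thesis: it swaps in a plain greedy nearest-neighbour heuristic, and the `Target` class, the cost model (pan and tilt moving in parallel at a constant speed) and the default speed are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    pan: float   # degrees; hypothetical PTZ pan angle that centres this target
    tilt: float  # degrees; hypothetical PTZ tilt angle that centres this target


def transition_time(a, b, speed_deg_s=60.0):
    """Assumed cost model: pan and tilt move in parallel at a constant speed."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) / speed_deg_s


def greedy_schedule(start, targets, speed_deg_s=60.0):
    """Visit every target once, always moving to the closest unvisited one."""
    position = start
    remaining = list(targets)
    order, total = [], 0.0
    while remaining:
        nxt = min(
            remaining,
            key=lambda t: transition_time(position, (t.pan, t.tilt), speed_deg_s),
        )
        total += transition_time(position, (nxt.pan, nxt.tilt), speed_deg_s)
        position = (nxt.pan, nxt.tilt)
        order.append(nxt.name)
        remaining.remove(nxt)
    return order, total


# Toy scene: three targets, camera initially at pan=0, tilt=0.
targets = [Target("A", 30.0, 0.0), Target("B", 10.0, 5.0), Target("C", 60.0, 0.0)]
order, total = greedy_schedule((0.0, 0.0), targets)
print(order, total)
```

A greedy order is cheap enough to recompute whenever targets move, but it does not guarantee the minimum cumulative transition time, and it ignores how long each target remains in view.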
The QUIS-CAMPI system is the basis for pursuing the second goal of this thesis. The analysis
of the performance of the state-of-the-art biometric recognition approaches shows that these
approaches attain almost ideal recognition rates in unconstrained data. However, this performance
is incongruous with the recognition rates observed in surveillance scenarios. Taking into
account the drawbacks of current biometric datasets, this thesis introduces a novel dataset comprising
biometric samples (face images and gait videos) acquired by the QUIS-CAMPI system at a
distance ranging from 5 to 40 meters and without human intervention in the acquisition process.
This set allows an objective assessment of the performance of state-of-the-art biometric recognition
methods in data that truly encompass the covariates of surveillance scenarios. As such, this set
was exploited for promoting the first international challenge on biometric recognition in the wild. This thesis describes the evaluation protocols adopted, along with the results obtained
by the nine methods specially designed for this competition. In addition, the data acquired by
the QUIS-CAMPI system were crucial for accomplishing the second goal of this thesis, i.e., the
development of methods robust to the covariates of surveillance scenarios. The first proposal
regards a method for detecting corrupted features in biometric signatures inferred by a redundancy
analysis algorithm. The second proposal is a caricature-based face recognition approach
capable of enhancing the recognition performance by automatically generating a caricature
from a 2D photo. The experimental evaluation of these methods shows that both approaches
contribute to improving the recognition performance in unconstrained data.
The growing concern about the security of individuals has justified the increase
in the number of video surveillance cameras installed in both private and public spaces.
However, contrary to popular belief, these devices are in most cases used only for recording,
and are not connected to any kind of intelligent software
capable of inferring, in real time, information about the observed individuals. Thus, although
video surveillance has proved essential for solving several crimes, its use is still
confined to providing videos that must be manually inspected to extract
relevant information about the subjects involved in the crime. As such, the main challenge
currently facing the scientific community is the development of automated systems capable of
monitoring and identifying individuals in video surveillance environments.
The main goal of this thesis is to extend the applicability of biometric recognition
systems to video surveillance environments. More specifically, it aims to
1) design a video surveillance system that can acquire biometric data at long distances
(e.g., face images, iris images, or gait videos) without requiring the cooperation of
individuals in the process; and 2) develop biometric recognition methods robust to the
degradation factors inherent in the data acquired by this kind of system.
Regarding the first goal, the analysis of the data acquired by typical video surveillance
systems shows that, owing to the capture distance, the sampled biometric traits
are not discriminative enough to guarantee acceptable recognition rates.
In the literature, several works advocate the use of Pan Tilt Zoom (PTZ) cameras to acquire
high-resolution images at a distance, particularly the use of these devices in master-slave mode.
In the master-slave configuration, an intelligent analysis module selects regions of interest
(e.g., cars, people) from the video acquired by a video surveillance camera,
and the PTZ camera is steered to acquire those regions of interest at high resolution. Several
methods have already shown that this configuration can be used to acquire biometric data
at a distance; even so, they were unable to solve some problems related
to this strategy, thus preventing its use in video surveillance environments. Accordingly,
this thesis proposes two methods to enable the acquisition of biometric data in
video surveillance environments using a PTZ camera assisted by a typical video surveillance camera. The
first is a calibration method capable of accurately mapping the coordinates of the master
camera to the angle of the PTZ (slave) camera without the aid of other optical devices. The
second method determines the order in which a set of subjects will be observed by the
PTZ camera. The proposed method can determine, in real time, the sequence of observations
that maximizes the number of different subjects observed while minimizing the
total transition time between subjects. To achieve the first goal of this thesis, the
two proposed methods were combined with advances in the field of human monitoring
to develop the first fully automated video surveillance system
capable of acquiring biometric data at long distances without requiring the cooperation
of individuals in the process, designated the QUIS-CAMPI system.
The QUIS-CAMPI system is the starting point for the research related
to the second goal of this thesis. The analysis of the performance of state-of-the-art
biometric recognition methods shows that they achieve almost perfect
recognition rates on data acquired without constraints (e.g., recognition rates
above 99% on the LFW dataset). However, this performance is not corroborated by the results
observed in video surveillance environments, which suggests that current datasets
do not truly contain the degradation factors typical of
video surveillance environments. Given the weaknesses of current biometric datasets,
this thesis introduces a new biometric dataset (face images and gait
videos) acquired by the QUIS-CAMPI system at a maximum distance of 40 m and without the cooperation
of the subjects in the acquisition process. This set allows an objective evaluation of the performance
of state-of-the-art methods in recognizing individuals in images/videos
captured in a real video surveillance environment. As such, this set was used to
promote the first competition on biometric recognition in uncontrolled environments.
This thesis describes the evaluation protocols used, as well as the results obtained by 9
methods specially designed for this competition. In addition, the data acquired
by the QUIS-CAMPI system were essential for the development of two methods to
increase robustness to the degradation factors observed in video surveillance environments. The
first is a method for detecting corrupted features in biometric signatures by
analyzing the redundancy between subsets of features. The second is a
face recognition method based on caricatures automatically generated from a single
photo of the subject. The experiments carried out show that both methods reduce
error rates on data acquired in an uncontrolled manner
3D Morphable Face Models -- Past, Present and Future
In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation, and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research and highlighting the broad range of current and future applications
Geometric Expression Invariant 3D Face Recognition using Statistical Discriminant Models
Currently there is no complete face recognition system that is invariant to all facial expressions.
Although humans find it easy to identify and recognise faces regardless of changes in illumination,
pose and expression, producing a computer system with a similar capability has proved to
be particularly difficult. Three-dimensional face models are geometric in nature and therefore
have the advantage of being invariant to head pose and lighting. However, they are still susceptible
to facial expressions. This can be seen in the decrease in the recognition results using
principal component analysis when expressions are added to a data set.
In order to achieve expression-invariant face recognition systems, we have employed a tensor
algebra framework to represent 3D face data with facial expressions in a parsimonious
space. Face variation factors are organised in particular subject and facial expression modes.
We manipulate this using singular value decomposition on sub-tensors representing one variation
mode. This framework possesses the ability to deal with the shortcomings of PCA in less constrained
environments and still preserves the integrity of the 3D data. The results show improved
recognition rates for faces and facial expressions, even recognising high intensity expressions
that are not in the training datasets.
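The idea of decomposing one variation mode at a time can be sketched as follows. This is an illustrative higher-order SVD step rather than the thesis's implementation; the tensor layout (subjects x expressions x shape coordinates), the random toy data and the helper names `unfold` and `mode_basis` are assumptions.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns everything else."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_basis(tensor, mode):
    """Left singular vectors and singular values of the mode-n unfolding."""
    u, s, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return u, s


rng = np.random.default_rng(0)
# Hypothetical toy data: 4 subjects x 3 expressions x 6 shape coordinates.
faces = rng.normal(size=(4, 3, 6))

u_subject, s_subject = mode_basis(faces, 0)        # basis for the subject mode
u_expression, s_expression = mode_basis(faces, 1)  # basis for the expression mode
print(u_subject.shape, u_expression.shape)
```

Each basis spans one variation factor on its own, so a face can be described by subject coefficients that do not depend on the expression mode.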
We have determined, experimentally, a set of anatomical landmarks that best describe facial
expressions. We found that the best placement of landmarks to distinguish different
facial expressions is in areas around the prominent features, such as the cheeks and eyebrows.
Recognition results using landmark-based face recognition could be improved with better placement.
We looked into the possibility of achieving expression-invariant face recognition by reconstructing
and manipulating realistic facial expressions. We proposed a tensor-based statistical
discriminant analysis method to reconstruct facial expressions and in particular to neutralise
facial expressions. The results of the synthesised facial expressions are visually more realistic
than facial expressions generated using conventional active shape modelling (ASM). We
then used reconstructed neutral faces in the sub-tensor framework for recognition purposes.
The recognition results showed slight improvement. Besides biometric recognition, this novel
tensor-based synthesis approach could be used in computer games and real-time animation
applications
Analysis and Manipulation of Repetitive Structures of Varying Shape
Self-similarity and repetitions are ubiquitous in man-made and natural objects. Such structural regularities often relate to form, function, aesthetics, and design considerations. Discovering structural redundancies along with their dominant variations from 3D geometry not only allows us to better understand the underlying objects, but is also beneficial for several geometry processing tasks including compact representation, shape completion, and intuitive shape manipulation. To identify these repetitions, we present a novel detection algorithm based on analyzing a graph of surface features. We combine general feature detection schemes with a RANSAC-based randomized subgraph searching algorithm in order to reliably detect recurring patterns of locally unique structures. A subsequent segmentation step based on a simultaneous region growing is applied to verify that the actual data supports the patterns detected in the feature graphs. We introduce our graph based detection algorithm on the example of rigid repetitive structure detection. Then we extend the approach to allow more general deformations between the detected parts. We introduce subspace symmetries whereby we characterize similarity by requiring the set of repeating structures to form a low dimensional shape space. We discover these structures based on detecting linearly correlated correspondences among graphs of invariant features. The found symmetries along with the modeled variations are useful for a variety of applications including non-local and non-rigid denoising. Employing subspace symmetries for shape editing, we introduce a morphable part model for smart shape manipulation. The input geometry is converted to an assembly of deformable parts with appropriate boundary conditions. 
Our method uses self-similarities from a single model or corresponding parts of shape collections as training input and also allows the user to reassemble the identified parts in new configurations, thus exploiting both the discrete and continuous learned variations while ensuring appropriate boundary conditions across part boundaries. We obtain an interactive yet intuitive shape deformation framework producing realistic deformations on classes of objects that are difficult to edit using repetition-unaware deformation techniques
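The notion of repeated parts forming a low-dimensional shape space can be illustrated with a small numerical check. This is only a sketch under stated assumptions: the parts are taken to be already in correspondence and flattened to coordinate vectors, and the function name `shape_space_dimension`, the tolerance and the toy data are hypothetical.

```python
import numpy as np


def shape_space_dimension(parts, rel_tol=1e-8):
    """Number of significant directions of variation among corresponding parts."""
    X = np.asarray(parts, dtype=float)
    X = X - X.mean(axis=0)                  # remove the mean shape
    s = np.linalg.svd(X, compute_uv=False)  # singular values, largest first
    if s.size == 0 or s[0] == 0.0:
        return 0
    return int(np.sum(s > rel_tol * s[0]))


# Toy example: a triangle (flattened 2D vertices) repeated with varying stretch.
base = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
stretch = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0])  # moves two vertices along x
parts = [base + a * stretch for a in (0.0, 0.5, 1.0, 2.0)]

dim = shape_space_dimension(parts)
print(dim)
```

Here all instances differ by a single stretch parameter, so the centered data has one significant singular value.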
Qualifying 4D Deforming Surfaces by Registered Differential Features
Institute of Perception, Action and Behaviour
Recent advances in 4D data acquisition systems in the field of Computer Vision have opened up many exciting new possibilities for the interpretation of complex moving surfaces. However, a fundamental problem is that this has also led to a huge increase in the volume of data to be handled. Attempting to make sense of this wealth of information is then a core issue to be addressed if such data are to be applied to more complex tasks. Similar problems have been historically encountered in the analysis of 3D static surfaces, leading to the extraction of higher-level features based on analysis of the differential geometry.
Our central hypothesis is that there exists a compact set of similarly useful descriptors for the analysis of dynamic 4D surfaces. The primary advantages in considering localised changes are that they provide a naturally useful set of invariant characteristics. We seek a constrained set of terms - a vocabulary - for describing all types of deformation. By using this, we show how to describe what the surface is doing more effectively; and thereby enable better characterisation, and consequently more effective visualisation and comparison.
This thesis investigates this claim. We adopt a bottom-up approach to the problem, in which we acquire raw data from a newly constructed commercial 4D data capture system developed by our industrial partners. A crucial first step resolves the temporal non-linear registration between instances of the captured surface. We employ a combined optical/range flow to guide a conformation over a sequence. By extending the use of aligned colour information alongside the depth data we improve this estimation in the case of local surface motion ambiguities. By employing a KLT/thin-plate-spline method we also seek to preserve global deformation for regions with no estimate.
We then extend aspects of differential geometry theory for existing static surface analysis to the temporal domain.
Our initial formulation considers the possible intrinsic transitions from the set of shapes defined by the variations in the magnitudes of the principal curvatures. This gives rise to a total of 15 basic types of deformation. The change in the combined magnitudes also gives an indication of the extent of change. We then extend this to surface characteristics associated with expanding, rotating and shearing; to derive a full set of differential features.
Our experimental results include qualitative assessment of deformations for short episodic registered sequences of both synthetic and real data. The higher-level distinctions extracted are furthermore a useful first step for parsimonious feature extraction, which we then proceed to demonstrate can be used as a basis for further analysis. We ultimately evaluate this approach by considering shape transition features occurring within the human face, and the applicability for identification and expression analysis tasks
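As a rough illustration of describing a deformation as a transition between local shape types, the sketch below labels a surface point from the signs of its two principal curvatures at two time instants. It is not the thesis's formulation: the six labels, the sign convention (positive curvature taken as concave) and the threshold `eps` are assumptions.

```python
def shape_type(k1, k2, eps=1e-6):
    """Label a surface point from its principal curvatures (assumed sign convention)."""
    s1 = 0 if abs(k1) < eps else (1 if k1 > 0 else -1)
    s2 = 0 if abs(k2) < eps else (1 if k2 > 0 else -1)
    hi, lo = max(s1, s2), min(s1, s2)
    labels = {
        (0, 0): "flat",
        (1, 1): "pit",      # both curvatures positive under the assumed convention
        (-1, -1): "peak",   # both curvatures negative
        (1, 0): "valley",
        (0, -1): "ridge",
        (1, -1): "saddle",
    }
    return labels[(hi, lo)]


def transition(before, after):
    """Shape-type pair for one registered point observed at two time instants."""
    return shape_type(*before), shape_type(*after)


# A flat patch that buckles into a saddle between two frames.
print(transition((0.0, 0.0), (0.4, -0.4)))
```

Applying `transition` to every registered point of a sequence would give a per-point label of how the local shape changes over time; the magnitudes of the curvature change, which the abstract also uses, are ignored in this sketch.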
- …