Human posture recognition by fusion of silhouette and shadow in the infrared
Human posture recognition (HPR) from video sequences is one of the major active
research areas of computer vision. It is one step of the global process of human activity
recognition (HAR) for behaviour analysis. Many HPR application systems have
been developed, including video surveillance, human-machine interaction, and video
retrieval. Generally, applications related to HPR can be achieved using mainly two
approaches: single-camera or multi-camera. Despite the interesting performance achieved
by multi-camera systems, their complexity and the huge information to be processed
greatly limit their widespread use for HPR.
The main goal of this thesis is to simplify the multi-camera system by replacing a
camera by a light source. In fact, a light source can be seen as a virtual camera, which
generates a cast shadow image representing the silhouette of the person that blocks the
light. Our system will consist of a single camera and one or more infrared light sources.
Despite some technical difficulties with cast shadow segmentation, and with cast shadow
deformation and occlusion caused by walls and furniture, our system offers several
advantages. Indeed, we can avoid the synchronization and calibration problems of multiple
cameras, and reduce both the cost of the system and the amount of processed data, by
replacing cameras with simple light sources.
We introduce two different approaches to automatically recognize human
postures. The first approach directly combines the person's silhouette and cast shadow
information, and uses a 2D silhouette descriptor to extract discriminative features
useful for HPR. The second approach is inspired by the shape-from-silhouette technique:
it reconstructs the visual hull of the posture from a set of cast shadow silhouettes,
and extracts informative features through a 3D shape descriptor. Using these approaches,
our goal is to prove the utility of combining the person's silhouette and cast shadow
information for recognizing elementary human postures (stand, bend, crouch, fall, ...).
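The shape-from-silhouette idea behind the second approach can be sketched as voxel carving: a voxel belongs to the visual hull only if its projection falls inside every available silhouette (the camera silhouette plus each cast-shadow silhouette, the light source acting as a virtual camera). A minimal sketch, assuming toy orthographic projection functions rather than the calibration used in the thesis:

```python
import numpy as np

def carve_visual_hull(silhouettes, project, grid):
    """Keep the voxels whose projection lies inside every silhouette.

    silhouettes: list of 2D boolean masks (camera view plus shadow views).
    project:     function (points, view_index) -> (N, 2) integer pixel
                 coordinates (x, y) for that view.
    grid:        (N, 3) array of candidate voxel centres.
    """
    inside = np.ones(len(grid), dtype=bool)
    for i, sil in enumerate(silhouettes):
        px = project(grid, i)
        h, w = sil.shape
        valid = (px[:, 0] >= 0) & (px[:, 0] < w) & (px[:, 1] >= 0) & (px[:, 1] < h)
        hit = np.zeros(len(grid), dtype=bool)
        hit[valid] = sil[px[valid, 1], px[valid, 0]]
        inside &= hit          # carve away voxels outside this silhouette
    return grid[inside]

# toy demo: two orthographic views of a 10x10x10 voxel lattice
grid = np.stack(np.meshgrid(np.arange(10), np.arange(10), np.arange(10),
                            indexing="ij"), axis=-1).reshape(-1, 3)

def project(points, i):
    p = points.astype(int)
    return p[:, [0, 1]] if i == 0 else p[:, [0, 2]]  # front view, top view

front = np.zeros((10, 10), dtype=bool); front[:, :5] = True  # x < 5 is inside
top = np.zeros((10, 10), dtype=bool); top[:3, :] = True      # z < 3 is inside
hull = carve_visual_hull([front, top], project, grid)
```

The carved hull is the intersection of the back-projected silhouettes; with real data the projection functions come from camera and light-source calibration.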
The proposed system can be used for video surveillance of uncluttered areas such as
a corridor in a seniors' residence (for example, for the detection of falls) or in a company (for security). Its low cost may allow greater use of video surveillance for the benefit of
society.
Analysis of 3D human gait reconstructed with a depth camera and mirrors
The problem of assessing human gaits has received great attention in the literature, since gait analysis is one of the key components in healthcare. Marker-based and multi-camera systems are widely employed to deal with this problem. However, such systems usually require specific equipment with a high price and/or a high computational cost. In order to reduce the cost of devices, we focus on a system of gait analysis which employs only one depth sensor. The principle of our work is similar to multi-camera systems, but the collection of cameras is replaced by one depth sensor and mirrors. Each mirror in our setup plays the role of a camera which captures the scene from a different viewpoint. Since we use only one camera, the step of synchronization can thus be avoided and the cost of devices is also reduced.
Our studies can be separated into two categories: 3D reconstruction and gait analysis. The result of the former is used as the input of the latter. Our system for 3D reconstruction is built with a depth camera and two mirrors. Two types of depth sensor, distinguished by their depth-estimation scheme, have been employed in our work. With the structured light (SL) technique integrated into the Kinect 1, we perform the 3D reconstruction based on geometrical optics. In order to increase the level of detail of the reconstructed 3D model, the Kinect 2, with time-of-flight (ToF) depth measurement, is then used for image acquisition instead of the previous generation. However, due to multiple reflections on the mirrors, depth distortion occurs in our setup. We thus propose a simple approach for reducing such distortion before applying geometrical optics to reconstruct a point cloud of the 3D object.
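The geometrical-optics step can be illustrated with the mirror-unfolding operation: the points seen "inside" a mirror are the reflection of the real surface across the mirror plane, so reflecting them back recovers the extra viewpoint. A minimal sketch, assuming a calibrated mirror given by a point and a normal (the plane parameters below are illustrative, not the thesis's calibration):

```python
import numpy as np

def unfold_mirror_view(points, plane_point, plane_normal):
    """Reflect a point cloud seen in a mirror back to its true position.

    The mirror is the plane through `plane_point` with normal
    `plane_normal`; reflection uses the standard Householder formula
    p' = p - 2 * dot(p - o, n) * n  (with n a unit vector).
    """
    n = np.asarray(plane_normal, dtype=float)
    n = n / np.linalg.norm(n)
    d = (points - np.asarray(plane_point, dtype=float)) @ n  # signed distances
    return points - 2.0 * np.outer(d, n)

# demo: the mirror is the plane x = 2; a point at x = 3 unfolds to x = 1
pts = np.array([[3.0, 0.0, 0.0], [2.5, 1.0, -1.0]])
unfolded = unfold_mirror_view(pts, plane_point=[2.0, 0.0, 0.0],
                              plane_normal=[1.0, 0.0, 0.0])
```

Reflecting twice is the identity, which gives a quick sanity check on a calibrated plane.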
For the task of gait analysis, we propose several approaches focusing on the problem of gait normality/symmetry measurement. They are expected to be useful for clinical applications such as monitoring a patient's recovery after surgery. These methods comprise model-free and model-based approaches that have different pros and cons. In this dissertation, we present three methods that directly process the point clouds reconstructed in the previous work. The first one uses cross-correlation of the left and right half-bodies to assess gait symmetry, while the other two employ deep auto-encoders to measure gait normality.
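The cross-correlation idea in the first method can be sketched on one-dimensional signals: for a symmetric gait, a left and a right half-body signal are the same curve shifted by half a gait cycle, so their normalized cross-correlation peaks near 1 at that lag. A sketch under that assumption, with synthetic sinusoids standing in for the real half-body measurements:

```python
import numpy as np

def gait_symmetry_score(left, right):
    """Peak normalized cross-correlation of two per-frame signals
    (e.g. left/right half-body silhouette widths) and the lag at which
    it occurs. A symmetric gait gives a peak near 1 at half a cycle."""
    l = (left - left.mean()) / (left.std() + 1e-9)
    r = (right - right.mean()) / (right.std() + 1e-9)
    corr = np.correlate(l, r, mode="full") / len(l)
    lag = int(np.argmax(corr)) - (len(l) - 1)
    return corr.max(), lag

# demo: the right signal is the left one delayed by half a 20-frame cycle
t = np.arange(200)
left_sig = np.sin(2 * np.pi * t / 20)
right_sig = np.sin(2 * np.pi * (t - 10) / 20)
score, lag = gait_symmetry_score(left_sig, right_sig)
```

An asymmetric gait lowers the peak and shifts the lag away from half a cycle, which is the quantity a clinician would monitor.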
Freeform 3D interactions in everyday environments
PhD thesis.
Personal computing is continuously moving away from traditional input using
mouse and keyboard, as new input technologies emerge. Recently, natural user interfaces
(NUI) have led to interactive systems that are inspired by our physical interactions
in the real world, and focus on enabling dexterous freehand input in 2D or 3D. Another
recent trend is Augmented Reality (AR), which follows a similar goal to further reduce
the gap between the real and the virtual, but predominantly focuses on output, by overlaying
virtual information onto a tracked real-world 3D scene.
Whilst AR and NUI technologies have been developed for both immersive 3D output as
well as seamless 3D input, these have mostly been looked at separately. NUI focuses on
sensing the user and enabling new forms of input; AR traditionally focuses on capturing
the environment around us and enabling new forms of output that are registered to the
real world. The output of NUI systems is mainly presented on a 2D display, while
the input technologies for AR experiences, such as data gloves and body-worn motion
trackers are often uncomfortable and restricting when interacting in the real world.
NUI and AR can be seen as very complementary, and bringing these two fields together
can lead to new user experiences that radically change the way we interact with
our everyday environments. The aim of this thesis is to enable real-time, low latency,
dexterous input and immersive output without heavily instrumenting the user. The
main challenge is to retain and to meaningfully combine the positive qualities that are
attributed to both NUI and AR systems.
I review work in the intersecting research fields of AR and NUI, and explore freehand
3D interactions with varying degrees of expressiveness, directness and mobility
in various physical settings. There are a number of technical challenges that arise when
designing a mixed NUI/AR system, which I will address in this work: What can we capture,
and how? How do we represent the real in the virtual? And how do we physically
couple input and output? This is achieved by designing new systems, algorithms, and
user experiences that explore the combination of AR and NUI.
3D object reconstruction using computer vision : reconstruction and characterization applications for external human anatomical structures
Doctoral thesis. Informatics Engineering. Faculty of Engineering, Universidade do Porto. 201
An Investigation into the Relationship between Static and Dynamic Gait Features: A Biometrics Perspective
A biometric is a unique physical or behavioural characteristic of a person. Such a unique attribute, for instance a fingerprint or gait, can be used for identification or verification purposes. Gait is an emerging biometric with great potential. Gait recognition is based on recognizing a person by the manner in which they walk. Its potential lies in the fact that it can be captured at a distance and does not require the cooperation of the subject. This advantage makes it a very attractive tool for forensic cases and applications, where it can assist in identifying a suspect when other evidence such as DNA, fingerprints, or a facial image is not attainable. Gait can be used for recognition in a direct manner when the two samples are captured at similar camera resolution, position, and conditions. Yet in some cases, the only sample available is of an incomplete gait cycle, low resolution, low frame rate, a partially visible subject, or a single static image. Most of these conditions have one thing in common: static measurements. A gait signature is usually formed from a number of dynamic and static features. Static features are physical measurements of height, length, or build, while dynamic features are representations of joint rotations or trajectories.
The aim of this thesis is to study the potential of predicting dynamic features from static features. In this thesis, we have created a database that uses a 3D laser scanner to capture the accurate shape and volume of a person, and a motion capture system to accurately record motion data. The first analysis focused on the correlation between twenty-one 2D static features and eight dynamic features. Eleven pairs of features were regarded as significant with the criterion of a P-value less than 0.05. Other features also showed a strong correlation that indicated the potential of their predictive power. The second analysis focused on 3D static and dynamic features. Through the correlation analysis, 1196 pairs of features were found to be significantly correlated. Based on these results, a linear regression analysis was used to predict a dynamic gait signature. The predictors were chosen using two adaptive methods developed in this thesis: the "top-x" method and the "mixed" method. The predictions were assessed both for their accuracy and for their classification potential for gait recognition. The top results produced a 59.21% mean matching percentile. This result will act as a baseline for future research in predicting a dynamic gait signature from static features. The results of this thesis bear potential for applications in biomechanics, biometrics, forensics, and 3D animation.
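The "top-x" idea can be sketched as correlation-based feature selection followed by ordinary least squares: keep the x static features most correlated with the target dynamic feature, then fit a linear model on them. A minimal sketch on synthetic data (the selection rule and variable names here are illustrative assumptions, not the thesis's implementation):

```python
import numpy as np

def top_x_regression(static, dynamic, x=2):
    """Select the x static features most correlated (in absolute value)
    with the dynamic target, then fit least squares with an intercept.
    Returns the chosen column indices and the fitted coefficients."""
    corr = np.array([abs(np.corrcoef(static[:, j], dynamic)[0, 1])
                     for j in range(static.shape[1])])
    chosen = np.argsort(corr)[::-1][:x]
    X = np.column_stack([np.ones(len(dynamic)), static[:, chosen]])
    coef, *_ = np.linalg.lstsq(X, dynamic, rcond=None)
    return chosen, coef

def predict_dynamic(static, chosen, coef):
    """Apply a fitted model to new static measurements."""
    X = np.column_stack([np.ones(static.shape[0]), static[:, chosen]])
    return X @ coef

# synthetic data: the target depends on static features 0 and 2 only
rng = np.random.default_rng(0)
static = rng.normal(size=(100, 6))
dynamic = 2.0 * static[:, 0] - static[:, 2] + 1.0
chosen, coef = top_x_regression(static, dynamic, x=2)
```

On real data the fit is of course not exact, and the predicted dynamic signature would then be scored by its matching percentile against the gallery.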
Articulated human tracking and behavioural analysis in video sequences
Recently, there has been a dramatic growth of interest in the observation and tracking
of human subjects through video sequences. Arguably, the principal impetus has come
from the perceived demand for technological surveillance, however applications in entertainment,
intelligent domiciles and medicine are also increasing. This thesis examines
human articulated tracking and the classification of human movement, first separately
and then as a sequential process.
First, this thesis considers the development and training of a 3D model of human body
structure and dynamics. To process video sequences, an observation model is also designed
with a multi-component likelihood based on edge, silhouette and colour. This is defined on
the articulated limbs, and visible from a single or multiple cameras, each of which may be
calibrated from that sequence. Second, for behavioural analysis, we develop a methodology
in which actions and activities are described by semantic labels generated from a Movement
Cluster Model (MCM). Third, a Hierarchical Partitioned Particle Filter (HPPF) was
developed for human tracking that allows multi-level parameter search consistent with the
body structure. This tracker relies on the articulated motion prediction provided by the
MCM at pose or limb level. Fourth, tracking and movement analysis are integrated to
generate a probabilistic activity description with action labels.
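The particle-filter machinery underlying the HPPF can be sketched in its simplest bootstrap form on a one-dimensional state; the hierarchical, partitioned search over body parts builds on this same predict-weight-resample loop. A toy sketch (the Gaussian motion and observation models are illustrative assumptions, not the thesis's likelihood):

```python
import numpy as np

def pf_step(particles, observation, motion_std, obs_std, rng):
    """One predict-weight-resample step of a bootstrap particle filter."""
    # predict: diffuse each hypothesis through the motion model
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # weight: Gaussian likelihood of the observation under each hypothesis
    w = np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    w /= w.sum()
    # resample: draw particles in proportion to their weights
    return particles[rng.choice(len(particles), size=len(particles), p=w)]

# toy demo: track a static 1D "pose" parameter of 5.0 from noisy observations
rng = np.random.default_rng(1)
particles = rng.uniform(0.0, 10.0, size=1000)
for _ in range(30):
    obs = 5.0 + rng.normal(0.0, 0.3)
    particles = pf_step(particles, obs, motion_std=0.1, obs_std=0.3, rng=rng)
estimate = particles.mean()
```

In the full tracker the state is a high-dimensional pose vector, the likelihood combines edge, silhouette and colour cues, and the partitioned search runs such steps per limb.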
The implemented algorithms for tracking and behavioural analysis are tested extensively
and independently against ground truth on human tracking and surveillance
datasets. Dynamic models are shown to predict and generate synthetic motion, while
MCM recovers both periodic and non-periodic activities, defined either on the whole body
or at the limb level. Tracking results are comparable with the state of the art; however,
the integrated behaviour analysis adds to the value of the approach.
Overseas Research Students Awards Scheme (ORSAS)
Extending quality and covariate analyses for gait biometrics
Recognising humans by the way they walk has attracted significant interest in recent years due to its potential use in a number of applications such as automated visual surveillance. Technologies utilising gait biometrics have the potential to provide a safer society and improve quality of life. However, automated gait recognition is a very challenging research problem, and some fundamental issues remain unsolved. At the moment, gait recognition performs well only when samples acquired in similar conditions are matched. An operational automated gait recognition system does not yet exist. The primary aim of the research presented in this thesis is to understand the main challenges associated with the deployment of gait recognition and to propose novel solutions to some of the most fundamental issues.
There has been a lack of understanding of the effect of some subject-dependent covariates on gait recognition performance. We have proposed a novel dataset that allows analyses of various covariates in a principled manner. The results of the database evaluation revealed that elapsed time does not affect recognition in the short to medium term, contrary to what other studies have concluded. The analyses show how other factors related to the subject affect recognition performance.
Only a few gait recognition approaches have been validated in real-world conditions. We have collected a new dataset at two realistic locations. Using the database, we have shown that there are many environment-related factors that can affect performance. The quality of silhouettes has been identified as one of the most important issues for translating gait recognition research to the 'real world'. The existing quality algorithms proved insufficient, and we therefore extended the quality metrics and proposed new ways of improving signature quality and hence performance.
A new, fully working automated system has been implemented. Experiments using the system in 'real-world' conditions have revealed additional challenges not present when analysing datasets of fixed size. In conclusion, the research has investigated many of the factors that affect current gait recognition algorithms and has presented novel approaches to dealing with some of the most important issues related to translating gait recognition to real-world environments.
Carried baggage detection and recognition in video surveillance with foreground segmentation
Security cameras installed in public spaces or in private organizations continuously
record video data with the aim of detecting and preventing crime. For that reason,
video content analysis applications, for either real-time (i.e. analytic) or post-event
(i.e. forensic) analysis, have gained high interest in recent years. In this thesis,
the primary focus is on two key aspects of video analysis, reliable moving object
segmentation and carried object detection & identification.
A novel moving object segmentation scheme by background subtraction is presented
in this thesis. The scheme relies on background modelling which is based
on multi-directional gradient and phase congruency. As a post processing step,
the detected foreground contours are refined by classifying the edge segments as
either belonging to the foreground or the background. Furthermore, a contour completion
technique based on anisotropic diffusion is introduced in this area for the first time. The proposed
method targets cast shadow removal, gradual illumination change invariance, and
closed contour extraction.
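The proposed model relies on multi-directional gradients and phase congruency; as a baseline illustration of the background-subtraction principle itself, here is a minimal running-average scheme (a deliberately simplified stand-in, not the proposed method):

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Exponential running-average background model."""
    return (1.0 - alpha) * bg + alpha * frame

def foreground_mask(bg, frame, thresh=25.0):
    """Pixels whose intensity differs from the background by more than
    `thresh` are labelled foreground."""
    return np.abs(frame - bg) > thresh

# demo: a static grey scene with a bright 10x10 object entering
scene = np.full((40, 40), 100.0)
bg = scene.copy()
frame = scene.copy()
frame[10:20, 10:20] = 200.0        # the moving object
mask = foreground_mask(bg, frame)  # detect before adapting
bg = update_background(bg, frame)  # then let the model absorb the frame
```

Such intensity differencing is exactly what cast shadows and gradual illumination changes break, which motivates the gradient- and phase-congruency-based model developed in the thesis.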
A state of the art carried object detection method is employed as a benchmark
algorithm. This method includes silhouette analysis by comparing human temporal
templates with unencumbered human models. The implementation aspects of
the algorithm are improved by automatically estimating the viewing direction of
the pedestrian and are extended by a carried luggage identification module. As
the temporal template is a frequency template and the information that it provides
is not sufficient, a colour temporal template is introduced. The standard
steps followed by the state-of-the-art algorithm are approached from a different
perspective, extended by colour information, resulting in more accurate carried
object segmentation.
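A frequency temporal template of the kind described above can be sketched as the per-pixel occupancy rate over a window of aligned silhouettes; the colour extension would additionally aggregate colour over the frames in which a pixel is foreground. A minimal sketch of the frequency part:

```python
import numpy as np

def temporal_template(silhouettes):
    """Frequency temporal template: for each pixel, the fraction of the
    (aligned) frames in which it belongs to the silhouette."""
    return np.mean(np.stack(silhouettes).astype(float), axis=0)

# toy demo on 4x4 masks: stable body pixels score 1.0, flicker scores 0.5
frames = [np.zeros((4, 4), dtype=bool) for _ in range(4)]
for f in frames:
    f[1:3, 1:3] = True                        # body region, in every frame
frames[0][0, 0] = frames[1][0, 0] = True      # pixel present in 2 of 4 frames
template = temporal_template(frames)
```

Regions attached to the body in every frame score near 1, while swinging limbs and carried objects leave intermediate frequencies, which is what the template comparison against unencumbered models exploits.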
The experiments conducted in this research show that the proposed closed
foreground segmentation technique attains all the aforementioned goals. The incremental
improvements applied to the state of the art carried object detection
algorithm revealed the full potential of the scheme. The experiments demonstrate
the ability of the proposed carried object detection algorithm to supersede the
state-of-the-art method.