
    GAIT Technology for Human Recognition using CNN

    Gait is a distinctive biometric characteristic that can be detected from a distance; as a result, it has many applications in social security, forensic identification, and crime prevention. Existing gait recognition techniques either use a gait template, which makes it difficult to preserve temporal information, or a gait sequence, which imposes unnecessary sequential constraints and loses flexibility in representing a gait. Our technique instead treats gait as a set of independent frames; based on this deep-set viewpoint, it is invariant to frame permutations and can seamlessly combine frames from many videos taken in different contexts, such as different viewing angles, different outfits, or different carrying conditions. In experiments, our single-model approach obtains an average rank-1 accuracy of 96.1% on the CASIA-B gait dataset and an accuracy of 87.9% on the OU-MVLP gait dataset under normal walking conditions. The model also demonstrates strong robustness under several challenging conditions: when the subject walks carrying a bag or wearing a coat, it reaches accuracies of 90.8% and 70.3% on CASIA-B, respectively, clearly surpassing the best existing approach. The proposed method also remains accurate when only a few frames are available in the test samples; for instance, it achieves 85.0% on CASIA-B with only 7 frames.
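    The set-based idea described above can be illustrated with a small sketch: each frame is encoded independently by a CNN and the per-frame features are merged with a permutation-invariant pooling operation, so frames from different videos and viewpoints can be mixed freely. The layer sizes and the names FrameEncoder and SetGaitNet below are illustrative assumptions, not the architecture evaluated in the abstract.

```python
# Minimal sketch of permutation-invariant gait recognition over a set of frames.
# Assumptions: silhouettes are 64x44 grayscale; encoder and embedding sizes are
# illustrative, not the architecture reported in the abstract above.
import torch
import torch.nn as nn

class FrameEncoder(nn.Module):
    """CNN that maps one silhouette frame to a feature vector."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, x):                    # x: (num_frames, 1, H, W)
        return self.fc(self.conv(x).flatten(1))

class SetGaitNet(nn.Module):
    """Encodes each frame independently, then max-pools over the frame set,
    so the output is invariant to frame order and to the number of frames."""
    def __init__(self, feat_dim=128, embed_dim=64):
        super().__init__()
        self.encoder = FrameEncoder(feat_dim)
        self.head = nn.Linear(feat_dim, embed_dim)

    def forward(self, frames):               # frames: (num_frames, 1, H, W)
        per_frame = self.encoder(frames)     # (num_frames, feat_dim)
        set_feat, _ = per_frame.max(dim=0)   # permutation-invariant pooling
        return self.head(set_feat)           # identity embedding for matching

# Frames gathered from different videos or views can simply be concatenated:
frames = torch.rand(30, 1, 64, 44)           # 30 silhouettes in arbitrary order
embedding = SetGaitNet()(frames)             # compare embeddings with e.g. cosine distance
```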

    Using the Microsoft Kinect to assess human bimanual coordination

    Optical marker-based systems are the gold standard for capturing three-dimensional (3D) human kinematics. However, these systems have several drawbacks, including time-consuming marker placement, soft-tissue movement artifact, high cost, and lack of portability. The Microsoft Kinect is an inexpensive, portable depth camera that can be used to capture 3D human movement kinematics. Numerous investigations have assessed the Kinect's ability to capture postural control and gait, but to date no study has evaluated its capabilities for measuring spatiotemporal coordination. To investigate human coordination and coordination stability with the Kinect, a well-studied bimanual coordination paradigm (Kelso, 1984; Kelso, Scholz, & Schöner, 1986) was adapted. Nineteen participants performed ten trials of coordinated hand movements in either in-phase or anti-phase patterns of coordination to the beat of a metronome that was incrementally sped up and slowed down. Continuous relative phase (CRP) and the standard deviation of CRP were used to assess coordination and coordination stability, respectively. Data from the Kinect were compared to a Vicon motion capture system using a mixed-model, repeated-measures analysis of variance and intraclass correlation coefficients (ICC(2,1)). The Kinect significantly underestimated CRP for the anti-phase coordination pattern (p < .0001) and overestimated it for the in-phase pattern (p < .0001). However, a high ICC value (r = .097) was found between the systems. For the standard deviation of CRP, the Kinect exhibited significantly higher variability than the Vicon (p < .0001) but was able to distinguish significant differences between patterns of coordination, with anti-phase variability being higher than in-phase (p < .0001). Additionally, the Kinect was unable to accurately capture the structure of coordination stability for the anti-phase pattern. Finally, agreement between the systems was assessed using the ICC (r = .37). In conclusion, the Kinect was unable to accurately capture mean CRP. However, the high ICC between the two systems is promising, and the Kinect was able to distinguish between the coordination stability of in-phase and anti-phase coordination. The structure of variability as movement speed increased, however, was dissimilar to the Vicon, particularly for the anti-phase pattern. Some aspects of coordination are captured well by the Kinect while others are not. Detecting differences between bimanual coordination patterns and the stability of those patterns can be achieved with the Kinect, but researchers interested in the structure of coordination stability should exercise caution, since poor agreement was found between systems.
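    The abstract does not state how continuous relative phase was computed from the hand trajectories, but a common approach is to take the instantaneous phase of each signal via the Hilbert transform and subtract. The sketch below follows that assumption; the signal shapes and sampling details are illustrative.

```python
# Minimal sketch: continuous relative phase (CRP) between two hand-position
# signals and its standard deviation as a coordination-stability measure.
# The Hilbert-transform approach is an assumption; the study above does not
# specify how CRP was computed from the Kinect and Vicon time series.
import numpy as np
from scipy.signal import hilbert

def continuous_relative_phase(left, right):
    """CRP (degrees) between two 1-D position signals of equal length."""
    left = left - np.mean(left)                    # center before Hilbert transform
    right = right - np.mean(right)
    phase_l = np.angle(hilbert(left))              # instantaneous phase, left hand
    phase_r = np.angle(hilbert(right))             # instantaneous phase, right hand
    diff = phase_l - phase_r
    crp = (diff + np.pi) % (2 * np.pi) - np.pi     # wrap to [-pi, pi)
    return np.degrees(crp)

# Example: anti-phase movement should give CRP near 180 deg, in-phase near 0 deg.
t = np.linspace(0, 10, 1000)
left = np.sin(2 * np.pi * 1.0 * t)
right = np.sin(2 * np.pi * 1.0 * t + np.pi)        # anti-phase partner
crp = continuous_relative_phase(left, right)
print(np.mean(np.abs(crp)), np.std(crp))           # mean CRP and its variability
```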

    Automatic learning of gait signatures for people identification

    This work targets people identification in video based on the way they walk (i.e., gait). While classical methods typically derive gait signatures from sequences of binary silhouettes, in this work we explore the use of convolutional neural networks (CNN) for learning high-level descriptors from low-level motion features (i.e., optical flow components). We carry out a thorough experimental evaluation of the proposed CNN architecture on the challenging TUM-GAID dataset. The experimental results indicate that using spatio-temporal cuboids of optical flow as input to the CNN yields state-of-the-art results on the gait task at an image resolution eight times lower than that of previously reported results (i.e., 80x60 pixels). Comment: proof-of-concept paper; technical report on the use of ConvNets (CNN) for gait recognition. Data and code: http://www.uco.es/~in1majim/research/cnngaitof.htm
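    As a rough sketch of the input representation described above, the snippet below stacks the x and y optical-flow components of consecutive frames at 80x60 resolution into a spatio-temporal cuboid using OpenCV's Farnebäck flow. The 25-frame cuboid length, the flow parameters, and the file name are assumptions, not details taken from the paper.

```python
# Minimal sketch: build a low-resolution spatio-temporal cuboid of optical flow
# (stacked x/y flow components) that could feed a CNN, as described above.
# The 80x60 resolution follows the abstract; the 25-frame cuboid length and the
# Farneback parameters are illustrative assumptions.
import cv2
import numpy as np

def flow_cuboid(video_path, num_frames=25, size=(80, 60)):
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    prev = cv2.cvtColor(cv2.resize(prev, size), cv2.COLOR_BGR2GRAY)
    channels = []
    while len(channels) < 2 * num_frames:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(cv2.resize(frame, size), cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        channels.extend([flow[..., 0], flow[..., 1]])   # x and y flow components
        prev = gray
    cap.release()
    return np.stack(channels)        # shape: (2 * num_frames, 60, 80)

# cuboid = flow_cuboid("subject01_walk.avi")   # hypothetical file name
# The cuboid is then the multi-channel input to the CNN.
```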

    The use of the Kinect in gait recognition research

    Gait recognition is a biometric identification method that can be used to recognize and identify a person. In addition to authentication and access control, gait recognition can be applied, for example, in clinical diagnostics or virtual reality applications. The thesis notes that the advantages of gait recognition as a biometric identification method stem from the fact that a person can be identified by their gait even from long distances. A person can also be identified by their gait without their knowledge, so recognition can be performed without the person consciously interacting with the system. The Kinect is a motion-sensing device released by Microsoft in 2010 for the Xbox game console, and it can be used to collect data for gait recognition, among other purposes. Thanks to its low price, good availability, and ready-made software development kit, it has become a popular tool among researchers. This thesis examines gait recognition, gait recognition methods, the Kinect sensor, and the use of the Kinect in gait recognition research. The thesis answers the question of what kinds of studies have used the Kinect for gait recognition, and it also discusses the results of those studies. The studies presented use gait recognition methods that fall into model-based, model-free, view-dependent, and view-independent categories. The reviewed studies examine gait recognition mainly from the perspective of person identification, but they also include, for example, a study on detecting abnormal gait. The thesis shows that the Kinect is a practical tool for gait recognition, because its sensors and cameras capture versatile data that can easily be used for gait recognition. In addition, the software development kit shipped with the Kinect can convert the data into a usable form, so researchers do not need to build complex software for data processing. However, the thesis also notes that the Kinect's weaknesses include, for example, errors in analyzing the data stream, so it is not a perfect tool for gait recognition.
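    As a hedged illustration of how Kinect skeleton streams are typically turned into gait features in studies like those surveyed above (not the method of any particular study), the sketch below estimates step lengths and cadence from exported ankle-joint trajectories. The file names, array layout, and 30 Hz frame rate are assumptions.

```python
# Illustrative sketch (not from any specific study surveyed above): simple gait
# features from Kinect skeleton data. Assumes ankle positions were exported per
# frame; the array layout and the 30 Hz frame rate are assumptions.
import numpy as np

def step_lengths(left_ankle, right_ankle):
    """Ankle-to-ankle distances at the frames where that distance peaks,
    a rough proxy for individual step lengths (positions in meters, shape (T, 3))."""
    gap = np.linalg.norm(left_ankle - right_ankle, axis=1)
    # local maxima of the ankle-to-ankle distance ~ double-support instants
    peaks = np.where((gap[1:-1] > gap[:-2]) & (gap[1:-1] > gap[2:]))[0] + 1
    return gap[peaks]

def cadence(num_steps, num_frames, fps=30.0):
    """Steps per minute from the number of detected steps in the clip."""
    return num_steps / (num_frames / fps) * 60.0

# left, right = np.load("left_ankle.npy"), np.load("right_ankle.npy")  # hypothetical files
# steps = step_lengths(left, right)
# print(steps.mean(), cadence(len(steps), len(left)))
```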

    Analysis of 3D human gait reconstructed with a depth camera and mirrors

    The problem of assessing human gait has received great attention in the literature, since gait analysis is a key component of healthcare. Marker-based, multi-camera systems are widely employed for this problem, but they usually require specific equipment with a high price and/or a high computational cost. In order to reduce the cost of such devices, we focus on a gait analysis system that employs only one depth sensor. The principle of our work is similar to multi-camera systems, but the collection of cameras is replaced by one depth sensor and mirrors. Each mirror in our setup plays the role of a camera that captures the scene from a different viewpoint. Since we use only one camera, the synchronization step can be avoided and the cost of the devices is also reduced. Our studies fall into two categories, 3D reconstruction and gait analysis, where the output of the former is the input of the latter. Our system for 3D reconstruction is built with a depth camera and two mirrors. Two types of depth sensor, distinguished by their depth-estimation scheme, have been employed in our work. With the structured light (SL) technique integrated into the Kinect 1, we perform the 3D reconstruction based on geometrical optics. In order to increase the level of detail of the 3D reconstructed model, the Kinect 2, which measures depth by time of flight (ToF), is then used for image acquisition instead of the previous generation. However, due to multiple reflections on the mirrors, depth distortion occurs in our setup. We therefore propose a simple approach for reducing this distortion before applying geometrical optics to reconstruct a point cloud of the 3D object. For the task of gait analysis, we propose several alternative approaches focusing on gait normality/symmetry measurement. They are expected to be useful for clinical treatment, for example monitoring a patient's recovery after surgery. These methods comprise model-free and model-based approaches with different pros and cons. In this dissertation, we present three methods that directly process the point clouds reconstructed in the previous part. The first uses cross-correlation of the left and right half-bodies to assess gait symmetry, while the other two employ deep auto-encoders to measure gait normality.
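    The geometric core of the mirror setup described above is that each mirror acts as a virtual camera: a point observed in the mirror region of the depth image is the reflection of a real point, so reflecting it back across the (calibrated) mirror plane recovers that extra viewpoint. The sketch below shows only this reflection step under assumed plane parameters; the thesis's calibration procedure and ToF depth-distortion correction are not reproduced.

```python
# Minimal sketch of the mirror idea above: points seen "inside" a mirror are the
# reflections of real points, so reflecting them back across the mirror plane
# recovers the view of that mirror's virtual camera. Plane calibration and the
# ToF depth-distortion correction from the thesis are not shown here.
import numpy as np

def reflect_across_plane(points, plane_point, plane_normal):
    """Reflect an (N, 3) point cloud across the plane through plane_point
    with normal plane_normal (a Householder reflection)."""
    n = plane_normal / np.linalg.norm(plane_normal)
    d = (points - plane_point) @ n          # signed distance of each point to the plane
    return points - 2.0 * d[:, None] * n    # move each point to its mirror image

# points_in_mirror: the part of the depth cloud falling on one mirror region
points_in_mirror = np.random.rand(500, 3)                 # placeholder data
plane_point = np.array([1.0, 0.0, 0.0])                   # assumed calibrated mirror plane
plane_normal = np.array([1.0, 0.0, 0.2])
recovered = reflect_across_plane(points_in_mirror, plane_point, plane_normal)
```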

    Spatial and Temporal Modeling for Human Activity Recognition from Multimodal Sequential Data

    Human Activity Recognition (HAR) has been an intense research area for more than a decade. Different sensors, ranging from 2D and 3D cameras to accelerometers, gyroscopes, and magnetometers, have been employed to generate multimodal signals for detecting various human activities. With the advancement of sensing technology and the popularity of mobile devices, depth cameras and wearable devices such as the Microsoft Kinect and smart wristbands open an unprecedented opportunity to solve the challenging HAR problem by learning expressive representations from multimodal signals recording huge amounts of daily activities that span a rich set of categories. Although competitive performance has been reported, existing methods focus on the statistical or spatial representation of the human activity sequence, while its internal temporal dynamics are not sufficiently exploited. As a result, they often struggle to recognize visually similar activities composed of dynamic patterns in different temporal orders. In addition, many model-driven methods based on sophisticated features and carefully designed classifiers are computationally demanding and unable to scale to large datasets. In this dissertation, we address these challenges from three perspectives: 3D spatial relationship modeling, dynamic temporal quantization, and temporal order encoding. We propose a novel octree-based algorithm for computing the 3D spatial relationships between objects in a 3D point cloud captured by a Kinect sensor. A set of 26 3D spatial directions is defined to describe the spatial relationship of an object with respect to a reference object. These directions are implemented as spatial operators, such as AboveSouthEast and BelowNorthWest, in an event query language used to query human activities in an indoor environment, for example "A person walks in the hallway from north to south." The performance is quantitatively evaluated on a public RGBD object dataset and qualitatively investigated in a live video computing platform. To address the challenge of temporal modeling in human action recognition, we introduce dynamic temporal quantization, a clustering-like algorithm that quantizes human action sequences of varied lengths into fixed-size quantized vectors. A two-step optimization algorithm is proposed to jointly optimize the quantization of the original sequence. In the aggregation step, frames falling into the same segment are aggregated by max-pooling to produce the quantized representation of that segment. In the assignment step, the frame-segment assignment is updated using dynamic time warping while the temporal order of the entire sequence is preserved. The proposed technique is evaluated on three public 3D human action datasets and achieves state-of-the-art performance. Finally, we propose a novel temporal order encoding approach that models the temporal dynamics of sequential data for human activity recognition. The algorithm encodes the temporal order of the latent patterns extracted by subspace projection and generates a highly compact First-Take-All (FTA) feature vector representing the entire sequence. An optimization algorithm is further introduced to learn projections that increase the discriminative power of the FTA feature. The compactness of the FTA feature makes it extremely efficient for human activity recognition with nearest-neighbor search based on Hamming distance. Experimental results on two public human activity datasets demonstrate the advantages of the FTA feature over state-of-the-art methods in both accuracy and efficiency.
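    The dynamic temporal quantization step described above can be sketched in a few lines: a variable-length sequence of per-frame features is split into a fixed number of temporal segments and each segment is max-pooled, giving a fixed-size, order-preserving representation. The uniform segment assignment below is a simplification; the dissertation's DTW-based refinement of the assignment is omitted.

```python
# Minimal sketch of quantizing a variable-length sequence of frame features into
# a fixed-size representation by per-segment max-pooling, as described above.
# The uniform segment split is a simplification; the dissertation refines the
# frame-segment assignment with dynamic time warping, which is omitted here.
import numpy as np

def quantize_sequence(frames, num_segments=8):
    """frames: (T, D) array of per-frame features, with T >= num_segments.
    Returns (num_segments, D): the max-pooled feature of each temporal segment."""
    T = frames.shape[0]
    bounds = np.linspace(0, T, num_segments + 1).astype(int)   # uniform segment split
    return np.stack([frames[a:b].max(axis=0)
                     for a, b in zip(bounds[:-1], bounds[1:])])

# Sequences of different lengths map to the same fixed-size, order-preserving code:
short = quantize_sequence(np.random.rand(37, 16))    # (8, 16)
long = quantize_sequence(np.random.rand(122, 16))    # (8, 16)
```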

    BEHAVE - Behavioral analysis of visual events for assisted living scenarios

    This paper proposes BEHAVE, a person-centered pipeline for probabilistic event recognition. The pipeline first detects the set of people in a video frame, then searches for correspondences between people in the current and previous frames (i.e., people tracking). Finally, event recognition is carried out for each person using probabilistic logic models (PLMs, ProbLog2 language). PLMs represent interactions among people, home appliances, and semantic regions. They also make it possible to assess the probability of an event given noisy observations of the real world. BEHAVE was evaluated on the task of online (non-clipped videos) and open-set event recognition (i.e., target events plus a none class) on video recordings of seniors carrying out daily tasks. The results show that BEHAVE improves event recognition accuracy by handling missed and partially satisfied logic models. Future work will investigate how to extend PLMs to represent temporal relations among events.
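    The tracking stage of a person-centered pipeline like the one above (associating people detected in the current frame with those from previous frames) can be sketched with a minimal nearest-centroid matcher. This is an illustrative stand-in under an assumed pixel threshold, not BEHAVE's tracker or its ProbLog2 event models.

```python
# Minimal sketch of the frame-to-frame correspondence (tracking) step of a
# person-centered pipeline like the one above: detections in the current frame
# are matched to existing tracks by nearest centroid. Illustrative only; not
# BEHAVE's tracker or its ProbLog2 event models.
import numpy as np

def match_detections(tracks, detections, max_dist=75.0):
    """tracks: {track_id: (x, y)} last known centroids in pixels.
    detections: list of (x, y) centroids in the current frame.
    Returns {track_id: detection_index} for matches closer than max_dist."""
    matches, used = {}, set()
    for tid, pos in tracks.items():
        dists = [np.hypot(d[0] - pos[0], d[1] - pos[1]) for d in detections]
        if not dists:
            continue
        j = int(np.argmin(dists))                 # greedily take the nearest detection
        if dists[j] <= max_dist and j not in used:
            matches[tid] = j
            used.add(j)
    return matches

# Unmatched detections would start new tracks; matched ones update existing
# tracks, and each person's track then feeds the per-person event recognition.
print(match_detections({1: (100.0, 200.0)}, [(110.0, 205.0), (400.0, 50.0)]))
```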