13 research outputs found

    An adaptive motion model for person tracking with instantaneous head-pose features

    Human behavior analysis in video surveillance: A Social Signal Processing perspective

    The analysis of human activities is one of the most intriguing and important open issues for the automated video surveillance community. Until recently it was handled from a pure Computer Vision and Pattern Recognition perspective, in which an activity corresponds to a temporal sequence of explicit actions (run, stop, sit, walk, etc.). Even under this simplistic assumption the problem is hard, owing to the strong diversity of people's appearance, the number of individuals considered (single individuals, groups, or crowds may be monitored), the variability of environmental conditions (indoor/outdoor, different weather conditions), and the kinds of sensors employed. More recently, the automated surveillance of human activities has been approached from a new perspective, called Social Signal Processing (SSP), which brings in notions and principles from the social, affective, and psychological literature. SSP relies primarily on nonverbal cues, most of which lie outside conscious awareness, such as facial expressions and gaze, body posture and gestures, vocal characteristics, and relative distances in space. This paper is the first review of this new trend, offering a structured snapshot of the state of the art and identifying novel challenges in the surveillance domain, where the cross-pollination of Computer Science technologies and Sociology theories may offer valid investigation strategies.

    Calibration-free Pedestrian Partial Pose Estimation Using a High-mounted Kinect

    The application of human behavior analysis has undergone rapid development during recent decades, from entertainment systems to professional ones such as Human Robot Interaction (HRI), Advanced Driver Assistance Systems (ADAS), and Pedestrian Protection Systems (PPS). This thesis addresses the problem of recognizing pedestrians and estimating their body orientation in 3D, building on the fact that knowing a person's orientation helps in analyzing and predicting their behavior. A new method is proposed for detection and orientation estimation, in which a pedestrian detection module and an orientation estimation module are integrated sequentially. For pedestrian detection, a cascade classifier is designed that draws a bounding box around each detected pedestrian. Regions extracted from the 3D point cloud are then given to a discrete orientation classifier that estimates the orientation of the pedestrian's torso. This classification is based on a coarse, rasterized depth image simulating a virtual camera placed directly above the detected pedestrian, and uses a support vector machine trained to distinguish 10 discrete orientations (30-degree increments). To evaluate the approach, a new benchmark database containing 764 point clouds was captured: a Microsoft Kinect recorded the point clouds of 30 participants, and a marker-based motion capture system (Vicon) provided the ground truth on their orientation. Finally, we demonstrate the improvements brought by our system: it detects pedestrians with an accuracy of 95.29% and estimates body orientation (within a 30-degree interval) with an accuracy of 88.88%. We hope these results can serve as a foundation for future research.
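    The abstract's rasterization step (projecting a pedestrian's point cloud onto a coarse top-view image seen by a virtual overhead camera) can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline; the function name `topview_depth_image`, the grid size, and the cell size are assumptions chosen for the example, and the cloud is assumed to use metres with height on the z axis.

    ```python
    import numpy as np

    def topview_depth_image(points, grid=(32, 32), cell=0.05):
        """Rasterize a 3D point cloud (N x 3: x, y, height in metres) into a
        coarse top-view image: each cell keeps the highest point that falls
        into it, as a virtual camera above the person would see.
        Grid size and cell size (5 cm) are illustrative assumptions."""
        img = np.zeros(grid, dtype=np.float32)
        # Centre the cloud on its horizontal mean so the person sits mid-image.
        xy = points[:, :2] - points[:, :2].mean(axis=0)
        cols = np.clip((xy[:, 0] / cell + grid[1] / 2).astype(int), 0, grid[1] - 1)
        rows = np.clip((xy[:, 1] / cell + grid[0] / 2).astype(int), 0, grid[0] - 1)
        # Unbuffered in-place maximum, so repeated cells keep the tallest point.
        np.maximum.at(img, (rows, cols), points[:, 2])
        return img
    ```

    The flattened image could then be fed to a multi-class SVM, as the abstract describes; keeping the maximum height per cell makes the head and shoulders, which carry most of the torso-orientation signal from above, dominate the image.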

    A Joint Estimation of Head and Body Orientation Cues in Surveillance Video

    The automatic analysis and understanding of behavior and interactions is a crucial task in the design of socially intelligent video surveillance systems. Such an analysis often relies on the extraction of people's behavioral cues, among which body pose and head pose are probably the most important. In this paper, we propose an approach that jointly estimates these two cues from surveillance video. Given a human track, our algorithm works in two steps. First, a per-frame analysis is conducted, in which the head is localized, head and body features are extracted, and their likelihoods under different poses are evaluated. These likelihoods are then fused within a temporal filtering framework that jointly estimates body position, body pose, and head pose by taking advantage of the soft couplings between body position (movement direction), body pose, and head pose. Quantitative as well as qualitative experiments show the benefit of several aspects of our approach, in particular the benefit of the joint estimation framework for tracking the behavioral cues. Further analysis of behavior and interaction could then be conducted based on the output of our system.
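    The idea of softly coupling body pose and head pose when fusing per-frame likelihoods can be sketched for a single frame as below. This is a minimal illustration under assumed names (`couple_and_fuse`, `kappa`), not the paper's actual temporal filter: it fuses discrete body- and head-orientation likelihoods with a von Mises-style prior that favours head poses close to the body pose, since people rarely turn their head far past their torso.

    ```python
    import numpy as np

    def couple_and_fuse(body_lik, head_lik, kappa=2.0):
        """Fuse per-frame body- and head-pose likelihoods over n discrete
        orientations with a soft coupling prior on their angular difference.
        kappa controls coupling strength (illustrative assumption)."""
        n = len(body_lik)
        ang = 2 * np.pi * np.arange(n) / n
        # Pairwise angular difference between every (body, head) orientation.
        diff = ang[:, None] - ang[None, :]
        coupling = np.exp(kappa * np.cos(diff))  # von Mises-style soft prior
        joint = body_lik[:, None] * head_lik[None, :] * coupling
        return joint / joint.sum()  # normalized posterior over (body, head)
    ```

    In a full filter this joint posterior would additionally be propagated over time together with body position; here the point is only that an ambiguous head-pose likelihood gets disambiguated by a confident body-pose estimate, and vice versa.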