8 research outputs found

    A generic framework for video understanding applied to group behavior recognition

    This paper presents an approach to detect and track groups of people in video-surveillance applications, and to automatically recognize their behavior. The method keeps track of individuals moving together by maintaining spatial and temporal group coherence. First, people are individually detected and tracked. Second, their trajectories are analyzed over a temporal window and clustered using the Mean-Shift algorithm. A coherence value describes how well a set of people can be described as a group. Furthermore, we propose a formal event description language. The group event recognition approach is successfully validated on 4 camera views from 3 datasets: an airport, a subway, a shopping center corridor and an entrance hall.
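    The trajectory-clustering step described above can be sketched with a minimal flat-kernel mean-shift over 2D track positions. This is an illustrative toy, not the paper's implementation: the bandwidth, the merge threshold, and the example coordinates are all assumptions.

    ```python
    import numpy as np

    def mean_shift(points, bandwidth=2.0, n_iter=50):
        """Minimal flat-kernel mean-shift: each point drifts toward the mean
        of its neighbours within `bandwidth`; points that converge to the
        same mode are assigned the same cluster label."""
        modes = points.astype(float).copy()
        for _ in range(n_iter):
            for i, p in enumerate(modes):
                neighbours = points[np.linalg.norm(points - p, axis=1) < bandwidth]
                modes[i] = neighbours.mean(axis=0)
        # merge modes that converged to (almost) the same location
        labels = -np.ones(len(points), dtype=int)
        centers = []
        for i, m in enumerate(modes):
            for j, c in enumerate(centers):
                if np.linalg.norm(m - c) < bandwidth / 2:
                    labels[i] = j
                    break
            else:
                centers.append(m)
                labels[i] = len(centers) - 1
        return labels, centers

    # hypothetical snapshot: two pedestrians walking together, one apart
    tracks = np.array([[0.0, 0.0], [0.5, 0.3], [10.0, 10.0]])
    labels, centers = mean_shift(tracks, bandwidth=2.0)
    ```

    In the paper the clustered features are trajectories over a temporal window rather than single positions, but the grouping principle is the same: people whose tracks fall under one mode form one candidate group.
    
    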

    Combining Multiple Sensors for Event Detection of Older People

    We herein present a hierarchical model-based framework for event detection using multiple sensors. Event models combine a priori knowledge of the scene (3D geometric and semantic information, such as contextual zones and equipment) with moving objects (e.g., a Person) detected by a video monitoring system. The event models follow a generic ontology based on natural language, which allows domain experts to easily adapt them. The framework novelty lies in combining multiple sensors at decision (event) level, and handling their conflict using a probabilistic approach. The event conflict handling consists of computing the reliability of each sensor before their fusion using an alternative combination rule for Dempster-Shafer Theory. The framework evaluation is performed on multisensor recordings of instrumental activities of daily living (e.g., watching TV, writing a check, preparing tea, organizing the week's intake of prescribed medication) of participants of a clinical trial for an Alzheimer's disease study. Two fusion cases are presented: the combination of events (or activities) from heterogeneous sensors (an RGB ambient camera and a wearable inertial sensor) in a deterministic fashion, and the combination of conflicting events from video cameras with partially overlapping fields of view (an RGB and an RGB-D camera, Kinect). Results show that the framework improves the event detection rate in both cases.
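    The fusion step above builds on Dempster-Shafer Theory. As a point of reference, here is the classic Dempster rule of combination for two mass functions over a tiny frame of discernment; the paper uses an alternative combination rule and real sensor reliabilities, so the rule, masses, and event names below are purely illustrative.

    ```python
    from itertools import product

    def dempster_combine(m1, m2):
        """Dempster's rule of combination: multiply masses of every pair of
        focal elements, accumulate on their intersection, and renormalise
        by the non-conflicting mass (1 - K)."""
        combined = {}
        conflict = 0.0
        for (a, wa), (b, wb) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb          # mass assigned to the empty set
        if conflict >= 1.0:
            raise ValueError("total conflict: sources fully disagree")
        return {k: v / (1.0 - conflict) for k, v in combined.items()}

    SIT, STAND = frozenset({"sitting"}), frozenset({"standing"})
    EITHER = SIT | STAND                      # total ignorance
    camera = {SIT: 0.7, EITHER: 0.3}          # hypothetical video evidence
    inertial = {SIT: 0.5, STAND: 0.3, EITHER: 0.2}
    fused = dempster_combine(camera, inertial)
    ```

    The two sources partially conflict (camera never supports standing), yet the fused belief still concentrates on "sitting" after renormalisation, which is exactly the behaviour a decision-level fusion layer exploits.
    
    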

    Event Recognition System for Older People Monitoring Using an RGB-D Camera

    In many domains such as health monitoring, the semantic information provided by automatic monitoring systems has become essential. These systems should be as robust, as easy to deploy and as affordable as possible. This paper presents a monitoring system for mid- to long-term event recognition based on standard RGB-D (Red Green Blue + Depth) algorithms, complemented by additional algorithms to address a real-world application. Using a hierarchical model-based approach, the robustness of this system is evaluated on the recognition of physical tasks (e.g., balance test) undertaken by older people (N = 30) during a clinical protocol devoted to a dementia study. The performance of the system is demonstrated at recognizing, first, human postures, and second, complex events based on posture and 3D contextual information of the scene.
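    A posture-recognition stage like the one evaluated above can be caricatured as a few rules over skeleton joint heights extracted from the depth stream. The thresholds and joint choices below are invented for illustration; the paper's posture models are richer than this.

    ```python
    def classify_posture(head_y, hip_y, knee_y, standing_height):
        """Toy rule-based posture classifier from joint heights in metres
        above the floor. Thresholds are illustrative assumptions, not the
        values used by the paper's system."""
        h = head_y / standing_height          # head height, normalised
        if h > 0.85:
            return "standing"
        if h > 0.55 and hip_y < 0.6 * standing_height:
            return "sitting"                  # head lowered, hips dropped
        return "bending"
    ```

    Feeding such per-frame posture labels, together with 3D contextual zones (e.g., "near the chair"), into hierarchical event models is what lets the system recognize composite events such as a balance test.
    
    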

    A Multi-Sensor Approach for Activity Recognition in Older Patients

    Existing surveillance systems for older people's activity analysis focus on video and sensor analysis (e.g., accelerometers, pressure, infrared) applied to frailty assessment, fall detection, and the automatic identification of self-maintenance activities (e.g., dressing, self-feeding) at home. This paper proposes a multi-sensor surveillance system (accelerometers and video camera) for the automatic detection of instrumental activities of daily living (IADL, e.g., preparing coffee, making a phone call) in a lab-based clinical protocol. IADLs are more complex than self-maintenance activities, and a decline in their performance has been highlighted as an indicator of early symptoms of dementia. Ambient video analysis is used to describe older people's activity in the scene, and a wearable accelerometer device is used to complement visual information in body posture identification (e.g., standing, sitting). A generic constraint-based ontology language is used to model IADL events using sensor readings and semantic information of the scene (e.g., presence in goal-oriented zones of the environment, temporal relationships between events, estimated postures). The proposed surveillance system is tested with 9 participants (healthy: 4, MCI: 5) in an observation room equipped with home appliances at the Memory Center of Nice Hospital. Experiments are recorded using a 2D video camera (8 fps) and an accelerometer device (MotionPod®). The multi-sensor approach presents an average sensitivity of 93.51% and an average precision of 63.61%, while the vision-based approach has a sensitivity of 77.23% and a precision of 57.65%. The results show an improvement of the multi-sensor approach over the vision-based one at IADL detection. Future work will focus on using the system to evaluate the differences between the activity profiles of healthy participants and early- to mild-stage Alzheimer's patients.
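    For readers unused to the metric pair reported above: sensitivity (recall) and precision are computed from true positives, false positives and false negatives per event class. The counts in the example are hypothetical, not the study's actual confusion-matrix entries.

    ```python
    def sensitivity_precision(tp, fp, fn):
        """Detection metrics as typically used to score event recognition:
        sensitivity (recall) = TP / (TP + FN) — share of real events found;
        precision            = TP / (TP + FP) — share of detections that
        were real. Counts are per event class, summed over sequences."""
        return tp / (tp + fn), tp / (tp + fp)

    # hypothetical counts for one IADL class (not the paper's data)
    sens, prec = sensitivity_precision(tp=8, fp=4, fn=2)
    ```

    A high sensitivity with lower precision, as reported for the multi-sensor approach, means few missed activities at the cost of some spurious detections.
    
    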

    Towards Unsupervised Sudden Group Movement Discovery for Video Surveillance

    This paper presents a novel and unsupervised approach for discovering "sudden" movements in video-surveillance footage. The proposed approach automatically detects quick motions in a video, corresponding to any action. A set of possible actions is not required, and the proposed method successfully detects potentially alarm-raising actions without training or camera calibration. Moreover, the system uses a group detection and event recognition framework to relate detected sudden movements to groups of people, and to provide a semantic interpretation of the scene. We have tested our approach on a dataset of nearly 8 hours of videos recorded from two cameras in the Parisian subway for a European project. For evaluation, we annotated 1 hour of sequences containing 50 sudden movements.
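    An unsupervised "sudden movement" test in the spirit of the paper can be sketched as an outlier test on the frame-to-frame jump of a global motion-energy signal. The robust-statistics threshold below is an assumption for illustration; the paper's actual statistic may differ.

    ```python
    import numpy as np

    def sudden_movements(motion_energy, k=3.0):
        """Flag frames whose motion-energy jump is an outlier among the
        per-frame changes. Uses median/MAD so the spike itself does not
        inflate the spread estimate — no training data required."""
        diffs = np.diff(np.asarray(motion_energy, dtype=float))
        med = np.median(diffs)
        mad = np.median(np.abs(diffs - med))   # robust spread of the changes
        threshold = med + k * 1.4826 * mad     # ≈ k·σ under Gaussian noise
        return [i + 1 for i, d in enumerate(diffs) if d > threshold]

    # hypothetical per-frame motion energy with one abrupt burst
    energy = [1.0, 1.1, 0.9, 1.0, 9.0, 1.2, 1.0, 1.1]
    ```

    Linking each flagged frame back to the detected groups present in it is what yields the semantic interpretation ("a group moved suddenly near the platform") rather than a bare anomaly score.
    
    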

    Combining Multiple Sensors for Event Recognition of Older People

    MIRRH, held in conjunction with ACM MM 2013. We herein present a hierarchical model-based framework for event recognition using multiple sensors. Event models combine a priori knowledge of the scene (3D geometric and semantic information, such as contextual zones and equipment) with moving objects (e.g., a Person) detected by a monitoring system. The event models follow a generic ontology based on natural language, which allows domain experts to easily adapt them. The framework novelty lies in combining multiple sensors (heterogeneous and homogeneous) at decision level, explicitly or implicitly, by handling their conflict using a probabilistic approach. The implicit event conflict handling works by computing the event reliabilities for each sensor, and then combining them using Dempster-Shafer Theory. The multi-sensor system is evaluated using multi-modal recordings of instrumental activities of daily living (e.g., watching TV, writing a check, preparing tea, organizing the week's intake of prescribed medication) of participants of a clinical study of Alzheimer's disease. The evaluation presents the preliminary results of this approach on two cases: the combination of events from heterogeneous sensors (an RGB camera and a wearable inertial sensor), and the combination of conflicting events from video cameras with partially overlapping fields of view (an RGB and an RGB-D camera). The results show the framework improves the event recognition rate in both cases.
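    The "computing event reliabilities before combination" step above corresponds to what Dempster-Shafer literature calls discounting: scale a source's masses by its reliability and move the remainder to total ignorance. The reliability value and event names below are illustrative assumptions.

    ```python
    def discount(mass, reliability, frame):
        """Shafer discounting: multiply every focal mass by the source's
        reliability and transfer the discounted remainder to the whole
        frame of discernment (i.e., to 'don't know')."""
        out = {A: reliability * w for A, w in mass.items() if A != frame}
        out[frame] = 1.0 - reliability + reliability * mass.get(frame, 0.0)
        return out

    SIT, STAND = frozenset({"sitting"}), frozenset({"standing"})
    FRAME = SIT | STAND
    # a confident but only 60%-reliable sensor (hypothetical numbers)
    m = discount({SIT: 0.9, FRAME: 0.1}, reliability=0.6, frame=FRAME)
    ```

    After discounting, an unreliable sensor can no longer dominate the fused result: most of its weight sits on the frame, where it cannot contradict a more reliable source.
    
    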

    Human Action Detection from Depth Information Using Convolutional Neural Networks

    The main objective of this work is the implementation of a human action detection system for security and video-surveillance applications, based on the depth information provided by RGB-D sensors. The system uses 3D convolutional neural networks (3D-CNN), which perform automatic feature extraction and action classification from the spatial and temporal information of depth sequences. The proposal has been exhaustively evaluated, achieving an experimental accuracy of 94% in action detection.
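    The distinguishing operation of a 3D-CNN is that its kernels convolve over time as well as space, so a single filter can respond to motion patterns in a depth clip. Below is a minimal NumPy sketch of one valid-mode 3D convolution with a hand-made temporal-gradient kernel; real 3D-CNNs learn many such kernels and stack them with nonlinearities, and the clip here is a synthetic assumption.

    ```python
    import numpy as np

    def conv3d(volume, kernel):
        """Valid-mode 3D cross-correlation over a depth-video volume of
        shape (T, H, W) — the spatio-temporal operation at the heart of
        a single 3D-CNN layer (one input channel, one filter, no stride)."""
        T, H, W = volume.shape
        t, h, w = kernel.shape
        out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                for k in range(out.shape[2]):
                    out[i, j, k] = np.sum(volume[i:i+t, j:j+h, k:k+w] * kernel)
        return out

    # synthetic clip: a depth point appears at frame t=2 and stays
    clip = np.zeros((4, 5, 5)); clip[2:, 2, 2] = 1.0
    # temporal-gradient kernel: fires where depth changes between frames
    kernel = np.zeros((2, 3, 3)); kernel[0, 1, 1], kernel[1, 1, 1] = -1.0, 1.0
    response = conv3d(clip, kernel)
    ```

    The filter's peak response marks the onset of the motion, which is exactly the kind of temporal cue a learned 3D-CNN exploits for action detection from depth.
    
    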