
    People detection based on appearance and motion models

    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    A. Garcia-Martin, A. Hauptmann, and J. M. Martínez, "People detection based on appearance and motion models", in 8th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2011, pp. 256-260.

    The main contribution of this paper is a new people detection algorithm based on motion information. The algorithm builds a people motion model on the Implicit Shape Model (ISM) framework and the MoSIFT descriptor. We also propose a detection system that integrates appearance, motion and tracking information. Experimental results over sequences extracted from the TRECVID dataset show that our new people motion detector produces results comparable to the state of the art, and that the proposed multimodal fusion system improves on those results by combining the three information sources.

    This work has been partially supported by the Cátedra UAM-Infoglobal ("Nuevas tecnologías de vídeo aplicadas a sistemas de video-seguridad") and by the Universidad Autónoma de Madrid ("FPI-UAM: Programa propio de ayudas para la Formación de Personal Investigador").
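    The ISM framework lets each local feature matched to a codebook entry cast votes for a hypothesised object centre, and detections emerge as vote maxima. A toy Hough-style voting sketch under that assumption (the codebook and features here are hypothetical, not the paper's learned MoSIFT codebook):

```python
from collections import Counter

def ism_vote(features, codebook, grid=1):
    """Each feature matched to a codebook visual word casts votes for the
    hypothesised person centre (feature position + learned centre offset)."""
    votes = Counter()
    for pos, word in features:
        for offset in codebook.get(word, []):
            centre = (round((pos[0] + offset[0]) / grid),
                      round((pos[1] + offset[1]) / grid))
            votes[centre] += 1
    return votes

# Toy codebook: visual word -> centre offsets observed in training.
codebook = {"head": [(0, 20)], "foot": [(0, -40)]}
features = [((50, 10), "head"), ((50, 70), "foot"), ((90, 15), "head")]
# The head at (50, 10) and the foot at (50, 70) agree on centre (50, 30).
print(ism_vote(features, codebook).most_common(1))  # [((50, 30), 2)]
```

A real ISM implementation would weight votes by match quality and search vote space at a coarser grid; this sketch only illustrates the voting idea.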

    Robust real time moving people detection in surveillance scenarios

    A. García Martín and J. M. Martínez, "Robust real time moving people detection in surveillance scenarios", in 2010 Seventh IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2010, pp. 241-247.

    In this paper an improved real-time algorithm for detecting pedestrians in surveillance video is proposed. The algorithm is based on people appearance and defines a person model as the union of four body-part models. Firstly, motion segmentation is performed to detect moving pixels. Then, moving regions are extracted and tracked. Finally, the detected moving objects are classified as human or non-human. In order to test and validate the algorithm, we have developed a dataset containing annotated surveillance sequences of different complexity levels, focused on pedestrian detection. Experimental results over this dataset show that our approach performs considerably well in real time, and even better than other real-time and non-real-time approaches from the state of the art.

    This work has been partially supported by the Cátedra UAM-Infoglobal ("Nuevas tecnologías de vídeo aplicadas a sistemas de video-seguridad") and by the Spanish Government (TEC2007-65400 SemanticVideo).
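    The three-stage pipeline above (motion segmentation, region extraction, human/non-human classification) can be sketched minimally. This is not the paper's method: simple frame differencing stands in for its motion segmentation, a single bounding box stands in for region extraction and tracking, and a crude aspect-ratio test stands in for the four-part person model:

```python
def segment_motion(prev, curr, thresh=30):
    """Mark pixels whose intensity changed by more than `thresh`."""
    h, w = len(curr), len(curr[0])
    return [[abs(curr[y][x] - prev[y][x]) > thresh for x in range(w)]
            for y in range(h)]

def bounding_box(mask):
    """One bounding box around all moving pixels (a real system would
    label connected components and track each region over time)."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, on in enumerate(row) if on]
    if not pts:
        return None
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return min(xs), min(ys), max(xs), max(ys)

def looks_human(box, min_ratio=1.5):
    """Crude stand-in classifier: people are taller than they are wide."""
    x0, y0, x1, y1 = box
    return (y1 - y0 + 1) / (x1 - x0 + 1) >= min_ratio

prev = [[0] * 6 for _ in range(8)]
curr = [row[:] for row in prev]
for y in range(1, 7):          # a tall 2x6 "person" appears
    for x in (2, 3):
        curr[y][x] = 200
box = bounding_box(segment_motion(prev, curr))
print(box, looks_human(box))   # (2, 1, 3, 6) True
```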

    Egocentric Hand Detection Via Dynamic Region Growing

    Egocentric videos, which mainly record the activities carried out by the users of wearable cameras, have drawn much research attention in recent years. Due to their lengthy content, a large number of ego-related applications have been developed to abstract the captured videos. Since users are accustomed to interacting with target objects using their own hands, which usually appear within their visual field during the interaction, an egocentric hand detection step is involved in tasks like gesture recognition, action recognition and social interaction understanding. In this work, we propose a dynamic region growing approach for hand region detection in egocentric videos that jointly considers hand-related motion and egocentric cues. We first determine seed regions that most likely belong to the hand by analyzing the motion patterns across successive frames. The hand regions can then be located by extending from the seed regions, according to scores computed for the adjacent superpixels. These scores are derived from four egocentric cues: contrast, location, position consistency and appearance continuity. We discuss how to apply the proposed method in real-life scenarios, where multiple hands irregularly appear in and disappear from the videos. Experimental results on public datasets show that the proposed method achieves superior performance compared with state-of-the-art methods, especially in complicated scenarios.
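    The grow-from-seeds step can be pictured as a breadth-first expansion over a superpixel adjacency graph, absorbing neighbours whose cue score passes a threshold. A minimal sketch under that assumption (the adjacency graph and the single combined score are hypothetical; the paper derives scores from four separate cues):

```python
from collections import deque

def grow_region(seeds, neighbours, score, threshold=0.5):
    """Grow hand regions outward from seed superpixels, absorbing any
    adjacent superpixel whose cue score reaches `threshold`."""
    region, queue = set(seeds), deque(seeds)
    while queue:
        sp = queue.popleft()
        for nb in neighbours.get(sp, []):
            if nb not in region and score[nb] >= threshold:
                region.add(nb)
                queue.append(nb)
    return region

# Toy adjacency graph of superpixels; `score` stands in for the fused
# contrast / location / position-consistency / appearance-continuity cues.
neighbours = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 4], 4: [3]}
score = {0: 0.9, 1: 0.7, 2: 0.2, 3: 0.6, 4: 0.1}
print(sorted(grow_region({0}, neighbours, score)))  # [0, 1, 3]
```

Superpixel 2 is adjacent to the seed but scores too low to be absorbed, while 3 is reached transitively through 1; this is the dynamic part of the growth.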

    Tracking by Prediction: A Deep Generative Model for Multi-Person Localisation and Tracking

    Current multi-person localisation and tracking systems rely heavily on appearance models for target re-identification, and almost no approaches employ a complete deep learning solution for both objectives. We present a novel, complete deep learning framework for multi-person localisation and tracking. In this context we first introduce a lightweight sequential Generative Adversarial Network architecture for person localisation, which overcomes issues related to occlusions and noisy detections typically found in a multi-person environment. In the proposed tracking framework we build upon recent advances in pedestrian trajectory prediction and propose a novel data association scheme based on predicted trajectories. This removes the need for computationally expensive person re-identification systems based on appearance features and generates human-like trajectories with minimal fragmentation. The proposed method is evaluated on multiple public benchmarks, including both static and dynamic cameras, and achieves outstanding performance, especially among other recently proposed deep neural network based approaches.

    Comment: To appear in IEEE Winter Conference on Applications of Computer Vision (WACV), 201
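    The key idea, associating detections to tracks by distance to each track's predicted next position rather than by appearance, can be sketched as a greedy gated nearest-neighbour matcher. This is a simplification under stated assumptions: the paper uses a learned trajectory predictor, whereas here the predicted positions are given, and a production system would use Hungarian assignment rather than greedy matching:

```python
import math

def associate(predicted, detections, gate=2.0):
    """Greedily match each track's predicted next position to the nearest
    unclaimed detection within a gating distance; no appearance features."""
    pairs = sorted(
        (math.dist(p, d), ti, di)
        for ti, p in enumerate(predicted)
        for di, d in enumerate(detections)
    )
    matches, used_t, used_d = {}, set(), set()
    for dist, ti, di in pairs:
        if dist <= gate and ti not in used_t and di not in used_d:
            matches[ti] = di
            used_t.add(ti)
            used_d.add(di)
    return matches  # track index -> detection index

predicted = [(10.0, 10.0), (30.0, 5.0)]    # from a trajectory model
detections = [(29.5, 5.5), (10.5, 9.5)]    # current-frame detections
print(associate(predicted, detections))    # {0: 1, 1: 0}
```

Tracks whose prediction finds no detection within the gate stay unmatched, which is where fragmentation would normally occur; the paper argues that accurate trajectory prediction keeps such misses rare.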