
    Coarse-to-Fine Adaptive People Detection for Video Sequences by Maximizing Mutual Information

    Applying people detectors to unseen data is challenging, since pattern distributions, such as viewpoints, motion, poses, backgrounds, occlusions, and people sizes, may differ significantly from those of the training dataset. In this paper, we propose a coarse-to-fine framework to adapt people detectors frame by frame at run time, without requiring any additional manually labeled ground truth beyond the offline training of the detection model. The adaptation makes use of the mutual information of multiple detectors, i.e., the similarities and dissimilarities among detectors estimated by pair-wise correlation of their outputs. Globally, the proposed adaptation discriminates between relevant instants in a video sequence, i.e., it identifies the frames that are representative enough to adapt the system. Locally, it identifies the best configuration of each detector under analysis, i.e., the detection threshold that maximizes the mutual information. The proposed coarse-to-fine approach does not require training the detectors for each new scenario and uses standard people detector outputs, i.e., bounding boxes. The experimental results demonstrate that the proposed approach outperforms state-of-the-art detectors whose optimal threshold configurations are determined and fixed beforehand from offline training data.

    This work has been partially supported by the Spanish government under the project TEC2014-53176-R (HAVideo).
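    A minimal sketch of the local, threshold-selection step (not the authors' code): two detectors are run over a grid of thresholds and the pair whose bounding boxes agree most is kept. The IoU-based matching, the 0.5 match cutoff, and the detector interfaces are illustrative assumptions.

```python
import itertools

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter + 1e-9)

def agreement(boxes_a, boxes_b, match_iou=0.5):
    """Fraction of boxes from either detector matched in the other one."""
    if not boxes_a and not boxes_b:
        return 1.0  # both silent: full agreement
    hits = sum(any(iou(a, b) >= match_iou for b in boxes_b) for a in boxes_a)
    hits += sum(any(iou(a, b) >= match_iou for a in boxes_a) for b in boxes_b)
    return hits / float(len(boxes_a) + len(boxes_b))

def best_thresholds(detector_a, detector_b, frame, grid):
    """Grid-search the threshold pair maximizing inter-detector agreement."""
    # detector_x(frame, t) -> list of boxes; this interface is an assumption
    scores = ((agreement(detector_a(frame, ta), detector_b(frame, tb)), ta, tb)
              for ta, tb in itertools.product(grid, grid))
    _, ta, tb = max(scores)
    return ta, tb
```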

    Toward an ethics of affinity: posthumanism and the question of the animal in two SF narratives of catastrophe

    This article reads two narratives of catastrophe, Octavia Butler’s “Speech Sounds” (1983) and Ted Chiang’s “The Great Silence” (2015), in an attempt to explore how their concern with disaster destabilizes the human/nonhuman binary. Conjuring up visions of transformation and extinction before and after catastrophe, the stories interrogate humanist accounts of subjectivity through their focus on language and consciousness, prompting us to rethink the ontological divide between the human and the animal. This interrogation is carried out not only at the level of thematics, but also at a formal level, through the techniques of defamiliarization and extrapolation as well as through the choice of narrative voice and focalization. Thus, the two stories engage with some of the key issues addressed by the discourses originating from the fields of animal studies and critical posthumanism, which are currently gaining momentum in philosophy and literary criticism in the context of the posthuman turn. As will be contended, the stories send a powerful message about the boundary between self and other, highlighting the necessity of a shift toward a posthumanist ethics of affinity.

    A semantic-based probabilistic approach for real-time video event recognition

    This is the author’s version of a work that was accepted for publication in Computer Vision and Image Understanding. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Vision and Image Understanding, 116(9), 2012, DOI: 10.1016/j.cviu.2012.04.005.

    This paper presents an approach for real-time video event recognition that combines the accuracy and descriptive capabilities of probabilistic and semantic approaches, respectively. Based on a state-of-the-art knowledge representation, we define a methodology for building recognition strategies from event descriptions that considers the uncertainty of the low-level analysis. We then organize such strategies efficiently, performing the recognition according to the temporal characteristics of the events. In particular, we use Bayesian Networks for recognizing simple events and probabilistically-extended Petri Nets for complex events. To demonstrate the proposed approach, a framework has been implemented for recognizing human-object interactions in the video monitoring domain. The experimental results show that our approach improves event recognition performance compared to the widely used deterministic approach.

    This work has been partially supported by the Spanish Administration agency CDTI (CENIT-VISION 2007-1007), by the Spanish Government (TEC2011-25995 EventVideo), by the Consejería de Educación of the Comunidad de Madrid, and by the European Social Fund.
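    To make the probabilistic side concrete, below is a toy sketch of how a small Bayesian network can turn uncertain low-level cues into a posterior for a simple event. The event, cues, and probability tables are invented for illustration; they are not taken from the paper.

```python
# Two-node Bayesian network with a naive-Bayes factorization: noisy
# low-level cues are fused into a posterior for one simple event.
# All names and numbers below are hypothetical.
P_EVENT = 0.1
P_CUE_GIVEN = {"static": (0.9, 0.2),        # (P(cue|event), P(cue|not event))
               "near_counter": (0.8, 0.3)}

def posterior(observed):
    """P(event | observed cues) via Bayes' rule."""
    like_e, like_not = P_EVENT, 1.0 - P_EVENT
    for cue, present in observed.items():
        p_e, p_n = P_CUE_GIVEN[cue]
        like_e *= p_e if present else (1.0 - p_e)
        like_not *= p_n if present else (1.0 - p_n)
    return like_e / (like_e + like_not)

print(posterior({"static": True, "near_counter": True}))  # ~0.57
```

    Complex events would then be recognized by feeding such posteriors into a probabilistically-extended Petri Net that tracks their temporal ordering.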

    On the effect of motion segmentation techniques in description-based adaptive video transmission

    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. J. C. San Miguel and J. M. Martínez, "On the effect of motion segmentation techniques in description based adaptive video transmission", in AVSS '07: Proceedings of the 2007 IEEE Conference on Advanced Video and Signal Based Surveillance, 2007, pp. 359-364.

    This paper presents the results of analyzing the effect of different motion segmentation techniques in a system that transmits the information captured by a static surveillance camera in an adaptive way, based on the on-line generation of descriptions and their transmission at different levels of detail. The video sequences are analyzed to detect the regions of activity (motion analysis) and to differentiate them from the background, and the corresponding descriptions (mainly MPEG-7 moving regions) are generated together with the textures of the moving regions and the associated background image. Depending on the available bandwidth, different levels of transmission are specified, ranging from sending just the generated descriptions to a transmission including all the images associated with the moving objects and the background. We study the effect of three motion segmentation algorithms on several aspects: segmentation accuracy, size of the generated descriptions, computational efficiency, and reconstructed-data quality.

    This work is partially supported by Cátedra Infoglobal-UAM para Nuevas Tecnologías de video aplicadas a la seguridad, by the Ministerio de Ciencia y Tecnología of the Spanish Government under project TIN2004-07860 (MEDUSA), and by the Comunidad de Madrid under project P-TIC-0223-0505 (PROMULTIDIS).
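    As a rough sketch of the motion-analysis stage, the snippet below uses OpenCV's MOG2 background subtractor as a stand-in for the three algorithms compared in the paper; the history length and the 500-pixel area cutoff are arbitrary assumptions.

```python
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=True)

def moving_regions(frame, min_area=500):
    """Return bounding boxes (x, y, w, h) of regions of activity in a frame."""
    mask = subtractor.apply(frame)
    # Drop shadow pixels (MOG2 marks them 127) and keep confident foreground.
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]
```

    Each returned box would then back an MPEG-7 moving-region description, with its texture transmitted or omitted depending on the available bandwidth.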

    Shadow detection in video surveillance by maximizing agreement between independent detectors

    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. J. C. SanMiguel and J. M. Martínez, "Shadow detection in video surveillance by maximizing agreement between independent detectors", in 16th IEEE International Conference on Image Processing, ICIP 2009, pp. 1141-1144.

    This paper starts from the idea of automatically choosing the appropriate thresholds for a shadow detection algorithm, based on maximizing the agreement between two independent shadow detectors without training data. Firstly, the shadow detection algorithm is described; then, it is adapted to analyze video surveillance sequences. Some modifications are introduced to increase its robustness in generic surveillance scenarios and to reduce its overall computational cost (critical in some video surveillance applications). Experimental results show that the proposed modifications increase detection reliability compared to previous shadow detection algorithms and that the detector performs considerably well across a variety of surveillance scenarios.

    Work supported by the Spanish Government (TEC2007-65400 SemanticVideo), by Cátedra Infoglobal-UAM for “Nuevas Tecnologías de video aplicadas a la seguridad”, by the Spanish Administration agency CDTI (CENIT-VISION 2007-1007), by the Comunidad de Madrid (S-050/TIC-0223 - ProMultiDis), by the Consejería de Educación of the Comunidad de Madrid, and by the European Social Fund.
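    A hedged sketch of the agreement-maximization idea: given two independent pixel-wise shadow detectors, each driven by a single threshold, grid-search the threshold pair whose binary masks coincide most often. The detector interfaces and the threshold grid are assumptions for illustration.

```python
import numpy as np

def mask_agreement(mask_a, mask_b):
    """Fraction of pixels on which two binary shadow masks coincide."""
    return float(np.mean(mask_a == mask_b))

def tune_thresholds(detector_a, detector_b, frames,
                    grid=np.linspace(0.1, 0.9, 9)):
    """Pick the threshold pair whose shadow masks agree most often."""
    # detector_x(frame, t) -> binary mask; this interface is an assumption
    scores = ((np.mean([mask_agreement(detector_a(f, ta), detector_b(f, tb))
                        for f in frames]), ta, tb)
              for ta in grid for tb in grid)
    _, ta, tb = max(scores)
    return ta, tb
```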

    On the evaluation of background subtraction algorithms without ground-truth

    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. J. C. San Miguel and J. M. Martínez, "On the evaluation of background subtraction algorithms without ground-truth", in 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance, 2013, pp. 180-187.

    In video surveillance systems, the moving object segmentation stage (commonly based on background subtraction) has to deal with several issues, such as noise, shadows, and multimodal backgrounds; hence, occasional failures are inevitable, and automatic evaluation is a desirable requirement for online analysis. In this paper, we propose a hierarchy of existing performance measures for video object segmentation that are not based on ground truth. Then, four measures based on color and motion are selected and examined in detail with different segmentation algorithms and standard test sequences for video object segmentation. Experimental results show that color-based measures perform better than motion-based measures and that background multimodality heavily reduces the accuracy of all the evaluated measures.

    This work is partially supported by the Spanish Government (TEC2007-65400 SemanticVideo), by Cátedra Infoglobal-UAM for “Nuevas Tecnologías de video aplicadas a la seguridad”, by the Consejería de Educación of the Comunidad de Madrid, and by the European Social Fund.
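    For illustration, here is one plausible color-based, ground-truth-free measure in the spirit of those examined: a well-segmented object should differ in color from its immediate surroundings, so a mask is scored by the distance between foreground and local-background mean colors. This is an illustrative stand-in, not one of the paper's exact measures.

```python
import cv2
import numpy as np

def color_contrast_score(frame, fg_mask, ring=5):
    """Mean-color distance between a segment and a thin surrounding ring."""
    kernel = np.ones((2 * ring + 1, 2 * ring + 1), np.uint8)
    ring_mask = cv2.dilate(fg_mask, kernel) & cv2.bitwise_not(fg_mask)
    if not fg_mask.any() or not ring_mask.any():
        return 0.0  # nothing segmented, or the mask fills the frame
    fg_mean = frame[fg_mask > 0].mean(axis=0)    # mean BGR inside the mask
    bg_mean = frame[ring_mask > 0].mean(axis=0)  # mean BGR just outside it
    return float(np.linalg.norm(fg_mean - bg_mean))
```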

    A semantic-guided and self-configurable framework for video analysis

    The final publication is available at Springer via http://dx.doi.org/10.1007/s00138-011-0397-x.

    This paper presents a distributed and scalable framework for video analysis that automatically estimates the optimal workflow required for the analysis of different application domains. It integrates several technologies related to data acquisition, visual analysis tools, communication protocols, and data storage. Moreover, hierarchical semantic representations are included in the framework to describe the application domain, the analysis capabilities, and the user preferences. The analysis workflow is determined automatically by selecting, among those available in the framework, the most appropriate tools for each domain, exploiting the relations between the semantic descriptions. The experimental results in the video surveillance domain demonstrate that the proposed approach successfully composes optimal workflows for video analysis applications.

    This work has been partially supported by the Spanish Government (TEC2011-25995), by the Consejería de Educación of the Comunidad de Madrid, and by the European Social Fund.
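    The selection idea can be sketched as simple capability matching: each analysis task required by the domain description is bound to a registered tool advertising that capability. The tool names, capabilities, and cost-based tie-breaking below are invented for illustration.

```python
# Hypothetical tool registry: name -> advertised capability and cost.
TOOLS = {
    "mog_bgsub":  {"capability": "motion_segmentation", "cost": 1},
    "hog_people": {"capability": "people_detection",    "cost": 3},
    "kalman_trk": {"capability": "tracking",            "cost": 2},
}

def compose_workflow(required_capabilities):
    """Pick the cheapest registered tool for every required capability."""
    workflow = []
    for cap in required_capabilities:
        candidates = [(t["cost"], name) for name, t in TOOLS.items()
                      if t["capability"] == cap]
        if not candidates:
            raise LookupError(f"no tool provides capability: {cap}")
        workflow.append(min(candidates)[1])
    return workflow

print(compose_workflow(["motion_segmentation", "people_detection"]))
# ['mog_bgsub', 'hog_people']
```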

    On-line adaptive video sequence transmission based on generation and transmission of descriptions

    Proceedings of the 26th Picture Coding Symposium, PCS 2007, Lisbon, Portugal, November 2007.

    This paper presents a system to transmit the information from a static surveillance camera in an adaptive way, from low to high bit-rates, based on the on-line generation of descriptions. The proposed system follows a server/client model: the server is placed in the surveillance area and the client on the user side. The server analyzes the video sequence to detect the regions of activity (motion analysis), and the corresponding descriptions (mainly MPEG-7 moving regions) are generated together with the textures of the moving regions and the associated background image. Depending on the available bandwidth, different levels of transmission are specified, ranging from sending just the generated descriptions to a transmission including all the images associated with the moving objects and the background.

    This work is partially supported by Cátedra Infoglobal-UAM para Nuevas Tecnologías de video aplicadas a la seguridad, by the Ministerio de Ciencia y Tecnología of the Spanish Government under project TIN2004-07860 (MEDUSA), and by the Comunidad de Madrid under project P-TIC-0223-0505 (PROMULTIDIS).
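    A small sketch of the bandwidth-driven level selection described above; the level contents and byte budgets are illustrative assumptions, not figures from the paper.

```python
LEVELS = [  # (minimum available bytes per frame, payload to send)
    (200_000, ["descriptions", "object_textures", "background_image"]),
    (50_000,  ["descriptions", "object_textures"]),
    (0,       ["descriptions"]),  # always at least the MPEG-7 descriptions
]

def select_payload(available_bytes):
    """Pick the richest payload that fits the current bandwidth estimate."""
    for budget, payload in LEVELS:
        if available_bytes >= budget:
            return payload
    return LEVELS[-1][1]

print(select_payload(60_000))  # ['descriptions', 'object_textures']
```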

    Robust unattended and stolen object detection by fusing simple algorithms

    Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. J. C. San Miguel and J. M. Martínez, "Robust unattended and stolen object detection by fusing simple algorithms", in IEEE Fifth International Conference on Advanced Video and Signal Based Surveillance, AVSS '08, 2008, pp. 18-25.

    In this paper, a new approach for detecting unattended or stolen objects in surveillance video is proposed, based on the fusion of evidence provided by three simple detectors. As a first step, the moving regions in the scene are detected and tracked. These regions are then classified as static or dynamic and as human or nonhuman objects. Finally, objects detected as static and nonhuman are analyzed by each detector, and the data from the detectors are fused to select the best detection hypothesis. Experimental results show that the fusion-based approach increases detection reliability compared to the individual detectors and performs considerably well across a variety of scenarios while operating in real time.

    This work is supported by Cátedra Infoglobal-UAM for “Nuevas Tecnologías de video aplicadas a la seguridad”, by the Spanish Government (TEC2007-65400 SemanticVideo), by the Comunidad de Madrid (S-050/TIC-0223 - ProMultiDis-CM), by the Consejería de Educación of the Comunidad de Madrid, and by the European Social Fund.
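    The fusion step might look like the following sketch: each of the three detectors returns a confidence for the unattended and stolen hypotheses, the confidences are averaged, and the best hypothesis is selected. The detector scores and the 0.5 acceptance cutoff are assumptions for illustration.

```python
def fuse(scores_unattended, scores_stolen, accept=0.5):
    """Average per-detector confidences and select the best hypothesis."""
    u = sum(scores_unattended) / len(scores_unattended)
    s = sum(scores_stolen) / len(scores_stolen)
    if max(u, s) < accept:
        return "reject", max(u, s)  # evidence too weak for either class
    return ("unattended", u) if u >= s else ("stolen", s)

# e.g. three detectors voting on one static, nonhuman region:
print(fuse([0.8, 0.7, 0.6], [0.2, 0.4, 0.3]))  # ('unattended', ~0.7)
```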

    Dynamic video surveillance systems guided by domain ontologies

    This paper is a postprint of a paper submitted to and accepted for publication in the 3rd International Conference on Imaging for Crime Detection and Prevention (ICDP 2009), and is subject to Institution of Engineering and Technology Copyright. The copy of record is available at the IET Digital Library and IEEE Xplore.

    In this paper we describe how the knowledge related to a specific domain and the available visual analysis tools can be used to create dynamic visual analysis systems for video surveillance. Firstly, the knowledge is described in terms of the application domain (the types of objects, events, etc. that can appear in the domain) and the system capabilities (algorithms, detection procedures, etc.) using an existing ontology. Secondly, the ontology is integrated into a framework that creates the visual analysis system for each domain by inspecting the relations between the entities defined in the domain and system knowledge. Additionally, analysis tools can be added or removed on-line when necessary. Experiments applying the framework show that the proposed approach for creating dynamic visual analysis systems is suitable for analyzing different video surveillance domains without decreasing overall performance in terms of computational time or detection accuracy.

    This work was partially supported by the Spanish Administration agency CDTI (CENIT-VISION 2007-1007), by the Spanish Government (TEC2007-65400 SemanticVideo), by the Comunidad de Madrid (S-050/TIC-0223 - ProMultiDis), by Cátedra Infoglobal-UAM for “Nuevas Tecnologías de video aplicadas a la seguridad”, by the Consejería de Educación of the Comunidad de Madrid, and by the European Social Fund.
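    The on-line reconfiguration mentioned above could be sketched as a pipeline whose stages are added or removed while frames keep flowing; the stage registry below is a hypothetical simplification, not the framework's actual API.

```python
class DynamicPipeline:
    """Video-analysis pipeline whose stages can change at run time."""

    def __init__(self):
        self.stages = {}  # name -> callable(frame, results_so_far)

    def add(self, name, stage):
        self.stages[name] = stage  # e.g. attach a shadow remover on-line

    def remove(self, name):
        self.stages.pop(name, None)  # detach a tool no longer needed

    def process(self, frame):
        results = {}
        for name, stage in list(self.stages.items()):
            results[name] = stage(frame, results)
        return results
```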