15 research outputs found

    Spatio-temporal Video Parsing for Abnormality Detection

    Abnormality detection in video poses particular challenges due to the infinite size of the class of all irregular objects and behaviors. Thus no (or far too few) abnormal training samples are available, and we need to find abnormalities in test data without actually knowing what they are. Nevertheless, the prevailing approach in the field is to directly search for individual abnormal local patches or image regions independently of one another. To address this problem, we propose a method for joint detection of abnormalities in videos by spatio-temporal video parsing. The goal of video parsing is to find a set of indispensable normal spatio-temporal object hypotheses that jointly explain all the foreground of a video while, at the same time, being supported by normal training samples. Consequently, we avoid a direct detection of abnormalities and discover them indirectly as those hypotheses which are needed for covering the foreground but find no explanation for themselves in the normal samples. Abnormalities are localized by MAP inference in a graphical model, which we solve efficiently by formulating it as a convex optimization problem. We experimentally evaluate our approach on several challenging benchmark sets, improving over the state of the art on all standard benchmarks in terms of both abnormality classification and localization. Comment: 15 pages, 12 figures, 3 tables
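The "cover the foreground with normal hypotheses" idea is closely related to weighted set cover. A minimal sketch of its convex (LP) relaxation with SciPy, on toy data — the coverage matrix, the hypothesis costs, and the abnormality heuristic below are all illustrative assumptions, not the paper's actual model:

```python
import numpy as np
from scipy.optimize import linprog

# Toy foreground of 6 elements; each hypothesis covers a subset of them.
# A[i, j] = 1 if hypothesis j covers foreground element i (illustrative data).
A = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
])
# cost[j]: how poorly hypothesis j is explained by normal training samples.
# A high-cost hypothesis that must nevertheless be selected to cover the
# foreground is a candidate abnormality.
cost = np.array([0.2, 0.3, 2.5])

# LP relaxation of set cover: minimize cost @ x  s.t.  A x >= 1,  0 <= x <= 1.
res = linprog(c=cost, A_ub=-A, b_ub=-np.ones(A.shape[0]), bounds=[(0, 1)] * 3)
selected = res.x > 0.5
print(selected)              # all three hypotheses are needed for coverage
print(cost[selected].max())  # hypothesis 2 is expensive -> abnormal candidate
```

Here every hypothesis is forced into the cover, and the expensive third one would be flagged; the paper's actual formulation is a MAP objective over a graphical model, of which this is only a loose analogue.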

    Detection of emergency events in crowded scenes

    Full text link

    Towards Unsupervised Sudden Group Movement Discovery for Video Surveillance

    This paper presents a novel, unsupervised approach for discovering "sudden" movements in video surveillance footage. The proposed approach automatically detects quick motions in a video, regardless of the action they correspond to. A predefined set of possible actions is not required, and the method successfully detects potentially alarm-raising actions without training or camera calibration. Moreover, the system uses a group detection and event recognition framework to relate detected sudden movements to groups of people and provide a semantic interpretation of the scene. We have tested our approach on a dataset of nearly 8 hours of video recorded from two cameras in the Parisian subway for a European project. For evaluation, we annotated 1 hour of sequences containing 50 sudden movements.
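One simple, training-free way to flag "quick motion" of the kind described above is to threshold per-frame motion energy against its own statistics. A minimal sketch on synthetic frames — the z-score rule and all parameters are illustrative assumptions, not the paper's detector:

```python
import numpy as np

def sudden_movement_frames(frames, z_thresh=3.0):
    """Flag frames whose inter-frame motion energy spikes well above the
    clip's norm. `frames` is a (T, H, W) array of grayscale images.
    A purely illustrative stand-in for an unsupervised sudden-motion detector."""
    diffs = np.abs(np.diff(frames.astype(float), axis=0))
    energy = diffs.reshape(diffs.shape[0], -1).mean(axis=1)  # motion per frame
    mu, sigma = energy.mean(), energy.std() + 1e-9
    return np.flatnonzero((energy - mu) / sigma > z_thresh) + 1  # frame indices

# Synthetic clip: mostly static noise, with an abrupt change at frame 30.
rng = np.random.default_rng(0)
frames = rng.normal(100, 1, size=(60, 32, 32))
frames[30:] += 50  # sudden global movement
print(sudden_movement_frames(frames))  # → [30]
```

No action vocabulary or calibration is needed, matching the spirit (though not the mechanics) of the approach in the paper.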

    Data Mining in a Video Data Base

    In this chapter, we first present the state of the art in the domain. We then discuss how we pre-process the data. Next, activity analysis and automatic classification are presented, and finally we provide some results and evaluations.

    Activity discovery from video employing soft computing relations

    This work presents a novel approach for activity extraction and knowledge discovery from video. Spatial and temporal properties of detected mobile objects are modeled with fuzzy relations, which can then be aggregated using standard soft-computing algebra. A clustering algorithm based on computing the transitive closure of the fuzzy relations finds spatio-temporal patterns of activity. We employ trajectory-based analysis of the mobile objects in the video to discover their points of entry and exit and ultimately deduce the different areas of activity in the scene. These areas can be reported as activity maps at different granularities thanks to the analysis of the transitive closure matrix of the fuzzy spatial relations. Discovered activity zones and spatio-temporal patterns of activity can be labeled in a human-like language. We present results obtained on real videos of apron monitoring at Toulouse airport in France.
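The transitive-closure clustering step can be illustrated with the standard max-min closure of a fuzzy similarity relation; thresholding the closure at a level α yields a partition. A small sketch with illustrative similarity values (not the paper's learned relations):

```python
import numpy as np

def transitive_closure(R, max_iter=100):
    """Max-min transitive closure of a fuzzy similarity relation R.
    Iterates R <- max(R, R o R) with (max, min) composition to a fixed point."""
    R = R.copy()
    for _ in range(max_iter):
        # (max-min) composition: C[i, j] = max_k min(R[i, k], R[k, j])
        C = np.max(np.minimum(R[:, :, None], R[None, :, :]), axis=1)
        T = np.maximum(R, C)
        if np.allclose(T, R):
            break
        R = T
    return R

def alpha_clusters(T, alpha):
    """Cut the closure at level alpha; its equivalence classes are clusters."""
    n = len(T)
    labels, current = -np.ones(n, dtype=int), 0
    for i in range(n):
        if labels[i] < 0:
            labels[T[i] >= alpha] = current
            current += 1
    return labels

# Fuzzy similarity between 4 mobile objects (illustrative values).
R = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
])
T = transitive_closure(R)
print(alpha_clusters(T, alpha=0.7))  # → [0 0 1 1]
```

Because the max-min closure is min-transitive, each α-cut is a genuine equivalence relation, which is what makes the simple row-wise labeling above a valid partition.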

    Deep labeller: automatic bounding box generation for synthetic violence detection datasets

    Manually labelling datasets for training violence detection systems is time-consuming, expensive, and labour-intensive. Mind wandering, boredom, and short attention spans can also cause labelling errors. Moreover, collecting and distributing sensitive images containing violence has ethical implications. Automation is the future for labelling sensitive image datasets. Deep labeller is a two-stage Deep Learning (DL) method that uses DL object detection methods pre-trained on MS-COCO for automatic labelling. The Deep labeller method labels violent and nonviolent images in the WVD and USI datasets. In stage 1, weak labels are generated on the synthetic images of WVD. In stage 2, the Deep labeller method is retrained on these weak labels. The USI dataset is used to test the method on real-world violence. On WVD, Deep labeller generated weak labels with an IoU of 0.80036 in stage 1 and strong labels with an IoU of 0.95 in stage 2. To test the method's generalisation power, automatically generated labels for violent and nonviolent images in the USI dataset achieved a mean IoU of 0.7450.
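The IoU scores reported above can be computed with the standard box intersection-over-union. A minimal sketch, assuming the common `[x1, y1, x2, y2]` corner format (the paper does not specify its box representation):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping on their right/left halves.
print(iou([0, 0, 10, 10], [5, 0, 15, 10]))  # → 0.3333... (50 / 150)
```

Averaging this quantity between generated and ground-truth boxes over a dataset gives the mean-IoU figures quoted in the abstract.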

    Extraction of activity patterns on large video recordings

    Extracting the hidden, useful knowledge embedded within video sequences, and thereby discovering relations between the various elements to support efficient decision-making, is a challenging task. Knowledge discovery and information analysis are possible thanks to recent advances in object detection and tracking. The authors present how video information is processed with the ultimate aim of discovering knowledge about people's activity, and also extract the relationships between people and contextual objects in the scene. First, the objects of interest and their semantic characteristics are derived in real time. The semantic information related to the objects is represented in a format suitable for knowledge discovery. Next, two clustering processes are applied to derive knowledge from the video data: agglomerative hierarchical clustering is used to find the main trajectory patterns of people, and relational analysis clustering is employed to extract the relationships between people, contextual objects, and events. Finally, the authors evaluate the proposed activity extraction model on real video sequences from underground metro networks (CARETAKER) and a building hall (CAVIAR).
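The agglomerative clustering of trajectory patterns can be sketched with SciPy's hierarchical clustering. The entry/exit-point summary of each trajectory and the toy coordinates below are simplifying assumptions; the chapter clusters full tracked trajectories:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Each trajectory summarised by its entry and exit points (x, y, x, y).
trajs = np.array([
    [0, 0, 10, 10],   # route A: enters bottom-left, exits top-right
    [1, 0, 10, 9],    # route A again, slightly perturbed
    [0, 10, 10, 0],   # route B: enters top-left, exits bottom-right
    [1, 9, 10, 1],    # route B again
], dtype=float)

Z = linkage(pdist(trajs), method="average")      # agglomerative clustering
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree at 2 clusters
print(labels)  # trajectories 0,1 and 2,3 fall into two route patterns
```

The dendrogram `Z` also supports reporting patterns at different granularities by cutting at different levels, in the spirit of the activity maps described above.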

    Dropped object detection in crowded scenes

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 83-85).
    In the last decade, the topic of automated surveillance has become very important in the computer vision community. Especially important is the protection of critical transportation sites and infrastructure such as airports and railway stations. As a step in that direction, we consider the problem of detecting abandoned objects in a crowded scene. Assuming that the scene is captured by a mid-field static camera, our approach consists of segmenting the foreground from the background and then using a change analyzer to detect any objects that meet certain criteria. In this thesis, we describe a background model and a method of bootstrapping that model in the presence of foreign objects in the foreground. We then use a Markov Random Field formulation to segment the foreground in image frames sampled periodically from the video camera. We use a change analyzer to detect foreground blobs that remain static throughout the scene and, based on certain rules, decide if a blob could be a potentially abandoned object.
    by Deepti Bhatnagar. S.M.
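The change analyzer's core rule — a foreground region that stays put for long enough is a candidate abandoned object — can be illustrated at the pixel level. A simplified sketch on toy masks, assuming per-frame foreground segmentation is already available (the thesis obtains it via an MRF); the persistence threshold is an illustrative parameter:

```python
import numpy as np

def static_foreground_mask(fg_masks, min_frames=3):
    """Given a (T, H, W) boolean stack of per-frame foreground masks,
    return pixels that stay foreground for at least `min_frames`
    consecutive frames; these mark candidate dropped/abandoned objects.
    A simplified stand-in for the thesis's change analyzer."""
    run = np.zeros(fg_masks.shape[1:], dtype=int)
    static = np.zeros_like(run, dtype=bool)
    for mask in fg_masks:
        run = np.where(mask, run + 1, 0)  # length of the current static run
        static |= run >= min_frames
    return static

# Toy sequence: a person walks through (transient), a bag stays put (static).
T, H, W = 5, 4, 6
fg = np.zeros((T, H, W), dtype=bool)
fg[:, 1, 4] = True        # bag: foreground at (1, 4) in all 5 frames
for t in range(T):
    fg[t, 2, t] = True    # person: a moving one-pixel blob
print(static_foreground_mask(fg).nonzero())  # only the bag pixel (1, 4)
```

In practice the same persistence test would be applied to connected blobs rather than single pixels, combined with the rule-based filtering the abstract mentions.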