250 research outputs found

    Uncertainty-aware video visual analytics of tracked moving objects

    Vast amounts of video data render manual video analysis infeasible, while recent automatic video analytics techniques suffer from insufficient performance. To alleviate these issues, we present a scalable and reliable approach that exploits the visual analytics methodology. This involves the user in the iterative process of exploration, hypothesis generation, and verification. Scalability is achieved through interactive filter definitions on trajectory features extracted by the automatic computer vision stage. We establish the interface between user and machine by adopting the VideoPerpetuoGram (VPG) for visualization, and enable users to provide filter-based relevance feedback. Additionally, users are supported in deriving hypotheses by context-sensitive statistical graphics. To allow for reliable decision making, we gather the uncertainties introduced by the computer vision step, communicate this information to users through uncertainty visualization, and enable fuzzy hypothesis formulation for interaction with the machine. Finally, we demonstrate the effectiveness of our approach on the video analysis mini challenge that was part of the IEEE Symposium on Visual Analytics Science and Technology 2009.
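    To illustrate the filter-based, uncertainty-aware interaction described above, here is a minimal Python sketch of fuzzy relevance scoring over trajectory features. The `Trajectory` structure, the speed band, and the trapezoidal membership function are hypothetical choices for illustration, not the paper's VPG implementation.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    speeds: list[float]       # per-frame speed estimates from the vision stage
    confidences: list[float]  # per-frame detection confidence in [0, 1]

def fuzzy_speed_filter(traj: Trajectory, low: float, high: float) -> float:
    """Fuzzy membership in [0, 1] for 'speed roughly within [low, high]',
    attenuated by the mean tracking confidence (the propagated uncertainty)."""
    mean_speed = sum(traj.speeds) / len(traj.speeds)
    mean_conf = sum(traj.confidences) / len(traj.confidences)
    # Trapezoidal membership: full inside the band, linear falloff over a margin.
    margin = 0.25 * (high - low)
    if low <= mean_speed <= high:
        membership = 1.0
    elif mean_speed < low:
        membership = max(0.0, 1.0 - (low - mean_speed) / margin)
    else:
        membership = max(0.0, 1.0 - (mean_speed - high) / margin)
    return membership * mean_conf

# Rank trajectories by fuzzy relevance instead of a hard yes/no cut.
trajs = [Trajectory([1.2, 1.4, 1.3], [0.9, 0.8, 0.95]),
         Trajectory([4.0, 4.2, 3.9], [0.5, 0.6, 0.4])]
ranked = sorted(trajs, key=lambda t: fuzzy_speed_filter(t, 1.0, 2.0), reverse=True)
```

    The key design point, in this reading, is that the filter returns a graded score rather than a boolean, so uncertain detections are de-emphasized instead of silently discarded.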

    Towards interactive, intelligent, and integrated multimedia analytics


    EmotionCues: Emotion-oriented visual summarization of classroom videos


    Searching surveillance video contents using convolutional neural network

    Manual video inspection, searching, and analysis are exhausting and inefficient. This paper presents an intelligent system for searching surveillance video contents using deep learning. The proposed system reduces the amount of work needed to perform video searching and improves speed and accuracy. A pre-trained VGG-16 CNN model is used for dataset training. In addition, key frames of videos are extracted in order to save space, reduce the amount of work, and reduce the execution time. The extracted key frames are processed using the Sobel edge detector and max-pooling in order to eliminate redundancy; this increases compaction and avoids similarity between extracted frames. A text file containing the key frame index, time of occurrence, and the VGG-16 classification is produced, which enables humans to easily search for objects of interest. The VIRAT and IVY LAB datasets were used in the experiments, in which 128 different classes representing objects important for surveillance systems were identified; users can also define other classes and apply the proposed methodology. Experiments and evaluation showed that the proposed system outperformed existing methods by an order of magnitude, achieving the best results in speed while providing high classification accuracy.
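    The Sobel-plus-max-pooling redundancy check lends itself to a compact sketch. The following Python/OpenCV code, with a hypothetical pooling size and correlation threshold, shows one plausible reading of that step; the paper's exact procedure may differ.

```python
import cv2
import numpy as np

def frame_signature(gray: np.ndarray, pool: int = 8) -> np.ndarray:
    """Sobel edge magnitude followed by max-pooling: a compact,
    shift-tolerant signature for comparing candidate key frames."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    mag = cv2.magnitude(gx, gy)
    # Crop to a multiple of the pool size, then take the max of each block.
    h, w = (mag.shape[0] // pool) * pool, (mag.shape[1] // pool) * pool
    blocks = mag[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return blocks.max(axis=(1, 3))

def is_redundant(sig_a: np.ndarray, sig_b: np.ndarray, thresh: float = 0.9) -> bool:
    """Drop a candidate whose pooled edge map correlates strongly
    with the last kept key frame."""
    corr = np.corrcoef(sig_a.ravel(), sig_b.ravel())[0, 1]
    return corr > thresh
```

    Only frames that pass this check would then be sent through the VGG-16 classifier and logged to the index file.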

    A comprehensive survey of multi-view video summarization

    There has been an exponential daily growth in the amount of visual data acquired from single- or multi-view surveillance camera networks. This massive amount of data requires efficient mechanisms such as video summarization to ensure that only significant data are reported and redundancy is reduced. Multi-view video summarization (MVS) is a less redundant and more concise way of providing the information from the video content of all cameras in the form of either keyframes or video segments. This paper presents an overview of the existing strategies proposed for MVS, including their advantages and drawbacks. Our survey covers the generic steps in MVS, such as the pre-processing of video data, feature extraction, and post-processing, followed by summary generation. We also describe the datasets that are available for the evaluation of MVS. Finally, we examine the major current issues related to MVS and put forward recommendations for future research.
    This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1A2B5B01070067).
    Hussain, T.; Muhammad, K.; Ding, W.; Lloret, J.; Baik, SW.; De Albuquerque, VHC. (2021). A comprehensive survey of multi-view video summarization. Pattern Recognition. 109:1-15. https://doi.org/10.1016/j.patcog.2020.107567
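    As a rough illustration of the generic MVS pipeline the survey outlines (pre-processing, feature extraction, cross-view redundancy removal, post-processing), here is a Python sketch. The mean-colour descriptor and farthest-point selection are simple stand-ins for the much richer features and selection strategies the surveyed methods actually use.

```python
import numpy as np

def summarize_multiview(views: list[list[np.ndarray]], n_keyframes: int = 10):
    """views: one list of frames (H x W x 3 uint8 arrays) per camera.
    Returns (camera, frame_index) pairs ordered by camera and time."""
    # 1) Pre-processing + feature extraction: a cheap per-frame descriptor.
    candidates = [(cam, idx, frame.reshape(-1, 3).mean(axis=0))
                  for cam, frames in enumerate(views)
                  for idx, frame in enumerate(frames)]
    # 2) Cross-view redundancy removal: farthest-point selection keeps the
    #    frames that differ most from everything already in the summary.
    selected = [0]
    remaining = list(range(1, len(candidates)))
    while remaining and len(selected) < n_keyframes:
        best = max(remaining,
                   key=lambda i: min(np.linalg.norm(candidates[i][2] - candidates[j][2])
                                     for j in selected))
        selected.append(best)
        remaining.remove(best)
    # 3) Post-processing: order the summary by camera and time.
    return sorted((candidates[i][0], candidates[i][1]) for i in selected)
```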

    A combined multiple action recognition and summarization for surveillance video sequences

    Human action recognition and video summarization are challenging tasks in several computer vision applications, including video surveillance, criminal investigation, and sports. For long videos, it is difficult to search within a video for a specific action and/or person. Human action recognition approaches in the literature usually deal with videos that contain only a single person, whose action they are able to recognize. This paper proposes an effective approach to multiple human action detection, recognition, and summarization. The multiple action detection extracts the silhouettes of human bodies, then generates a specific sequence for each of them using a motion detection and tracking method. Each extracted sequence is then divided into shots that represent homogeneous actions, based on the similarity between each pair of frames. Using the histogram of oriented gradients (HOG) of the Temporal Difference Map (TDMap) of each shot's frames, the action is recognized by comparing the generated HOG against the HOGs computed in the training phase from a set of training videos covering many actions. The action is also recognized from the TDMap images using a proposed CNN model. Action summarization is performed for each detected person. The efficiency of the proposed approach is shown through the results obtained, mainly for multi-action detection and recognition.
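    A minimal Python/OpenCV sketch of the TDMap-plus-HOG matching step described above follows; the HOG window geometry, the min-max normalization, and the nearest-neighbour comparison are assumptions for illustration, not the authors' exact configuration.

```python
import cv2
import numpy as np

def tdmap(frames: list[np.ndarray]) -> np.ndarray:
    """Temporal Difference Map: accumulated absolute differences between
    consecutive grayscale frames of one shot, rescaled to uint8."""
    acc = np.zeros(frames[0].shape, dtype=np.float32)
    for prev, cur in zip(frames, frames[1:]):
        acc += cv2.absdiff(cur, prev).astype(np.float32)
    return cv2.normalize(acc, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

# 64x64 window, 16x16 blocks, 8x8 stride and cells, 9 orientation bins.
hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)

def classify_shot(frames: list[np.ndarray],
                  train_hogs: dict[str, np.ndarray]) -> str:
    """Nearest-neighbour match of the shot's TDMap HOG against per-action
    training HOGs (label -> descriptor)."""
    desc = hog.compute(cv2.resize(tdmap(frames), (64, 64)))
    return min(train_hogs, key=lambda lbl: np.linalg.norm(desc - train_hogs[lbl]))
```

    The CNN variant mentioned in the abstract would consume the same TDMap images as input in place of the handcrafted HOG comparison.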

    Asynchronous Visualization of Spatiotemporal Information for Multiple Moving Targets

    In the modern information age, the quantity and complexity of spatiotemporal data are increasing both rapidly and continuously. Sensor systems with multiple feeds that gather multidimensional spatiotemporal data result in information clusters and overload, as well as a high cognitive load for users of these systems. To meet future safety-critical situations, enhance time-critical decision-making in dynamic environments, and support the easy and effective managing, browsing, and searching of spatiotemporal data, we propose an asynchronous, scalable, and comprehensive method for organizing, displaying, and interacting with spatiotemporal data. It allows operators to navigate through the spatiotemporal information rather than through the environments being examined, while maintaining all necessary global and local situation awareness. To empirically prove the viability of our approach, we developed the Event-Lens system, which generates asynchronous prioritized images to provide the operator with a manageable, comprehensive view of the information collected by multiple sensors. A user study and interaction-mode experiments were designed and conducted. The Event-Lens system showed a consistent advantage across multiple moving-target marking-task performance measures. Participants' attentional control, spatial ability, and action video gaming experience were also found to affect their overall performance.
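    The asynchronous prioritized feed at the heart of Event-Lens can be pictured as a priority queue that sensors push into and the operator drains at their own pace. The following Python sketch is hypothetical: the `EventLensQueue` name, the importance scores, and the snapshot payloads are illustrative, intended only to show the interaction pattern.

```python
import heapq
import itertools
from dataclasses import dataclass, field

@dataclass(order=True)
class Event:
    priority: float                          # lower value = shown first
    seq: int                                 # tie-breaker preserving arrival order
    sensor_id: str = field(compare=False)
    snapshot: object = field(compare=False)  # image chip of the moving target

class EventLensQueue:
    """Asynchronous prioritized event feed: sensors push events as they occur;
    the operator pops the most important pending event when ready."""
    def __init__(self):
        self._heap = []
        self._count = itertools.count()

    def push(self, sensor_id: str, snapshot: object, importance: float) -> None:
        # Negate importance so the highest-importance event pops first.
        heapq.heappush(self._heap,
                       Event(-importance, next(self._count), sensor_id, snapshot))

    def pop_next(self):
        return heapq.heappop(self._heap) if self._heap else None
```

    Decoupling event arrival from event viewing in this way is what lets the operator keep pace with many simultaneous feeds without tracking each one live.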

    Simplified Video Surveillance Framework for Dynamic Object Detection under Challenging Environment

    An effective video surveillance system is essential to building better video analytics. Existing literature on video analytics tends to apply algorithms directly on top of the video file, with little emphasis on the following problems: i) dynamic orientation of the subject, ii) poor illumination conditions, iii) identification and classification of subjects, and iv) faster response time. The proposed system therefore implements an analytical concept that uses the depth image of the video feed along with the original colour video feed, applying an algorithm to extract significant information about the motion blobs of dynamic subjects. Implemented in MATLAB, the study outcome shows that the system addresses all of the above-mentioned problems in existing video analytics research using a very simple, non-iterative implementation, thereby demonstrating its applicability in the real world.
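    A small Python/OpenCV sketch of depth-assisted motion-blob extraction in the spirit of the approach above (the original is implemented in MATLAB); the depth threshold, morphological opening, and minimum blob area are illustrative assumptions, not the paper's parameters.

```python
import cv2
import numpy as np

def motion_blobs(color: np.ndarray, depth: np.ndarray,
                 bg_depth: np.ndarray, min_area: int = 200):
    """Foreground from a depth-difference mask (robust to poor illumination),
    then bounding boxes and colour crops of the moving blobs."""
    # Foreground where the scene is measurably closer than the depth background.
    fg = (bg_depth.astype(np.int32) - depth.astype(np.int32)) > 30
    # Morphological opening removes isolated depth-noise pixels.
    mask = cv2.morphologyEx(fg.astype(np.uint8) * 255, cv2.MORPH_OPEN,
                            np.ones((5, 5), np.uint8))
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    boxes = [tuple(stats[i, :4]) for i in range(1, n)
             if stats[i, cv2.CC_STAT_AREA] >= min_area]
    # Crops from the colour feed feed downstream identification/classification.
    crops = [color[y:y + h, x:x + w] for x, y, w, h in boxes]
    return boxes, crops
```

    Thresholding depth rather than colour intensity is what sidesteps the poor-illumination problem the paper highlights: depth sensing does not depend on scene lighting.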