
    Video copy detection using multiple visual cues and MPEG-7 descriptors

    We propose a video copy detection framework that detects copy segments by fusing the results of three different techniques: facial shot matching, activity subsequence matching, and non-facial shot matching using low-level features. In the facial shot matching part, a high-level face detector identifies facial frames/shots in a video clip. Matching faces with extended body regions gives the flexibility to discriminate the same person (e.g., an anchorman or a political leader) in different events or scenes. In the activity subsequence matching part, a spatio-temporal sequence matching technique is employed to match video clips/segments that are similar in terms of activity. Lastly, the non-facial shots are matched using low-level MPEG-7 descriptors and dynamic-weighted feature similarity calculation. The proposed framework is tested on the query and reference datasets of the CBCD task of TRECVID 2008. Our results are compared with those of the eight most successful techniques submitted to this task. Promising results are obtained in terms of both effectiveness and efficiency. © 2010 Elsevier Inc. All rights reserved.
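
    Below is a minimal sketch of how three cues of this kind could be fused into a per-segment copy decision, assuming each matcher already yields a normalised similarity score; the score names, weights, and threshold are illustrative placeholders, not the paper's actual settings.

        # Hypothetical fusion of three copy-detection cues; the weights and
        # threshold are placeholders, not values taken from the paper.
        from dataclasses import dataclass

        @dataclass
        class SegmentScores:
            facial: float       # facial shot matching score in [0, 1]
            activity: float     # spatio-temporal activity matching score in [0, 1]
            non_facial: float   # low-level MPEG-7 descriptor matching score in [0, 1]

        def fuse_scores(scores: SegmentScores,
                        weights=(0.4, 0.3, 0.3),
                        threshold: float = 0.5) -> bool:
            """Declare a query segment a copy if the weighted cue score passes a threshold."""
            fused = (weights[0] * scores.facial
                     + weights[1] * scores.activity
                     + weights[2] * scores.non_facial)
            return fused >= threshold

        if __name__ == "__main__":
            candidate = SegmentScores(facial=0.9, activity=0.2, non_facial=0.4)
            print("copy segment:", fuse_scores(candidate))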

    A user-centered approach to rushes summarisation via highlight-detected keyframes

    We present our keyframe-based summary approach for BBC Rushes video as part of the TRECVid Summarisation benchmark evaluation carried out in 2007. We outline our approach to summarisation that uses video processing for feature extraction and is informed by human factors considerations for summary presentation. Based on the performance of our generated summaries as reported by NIST, we subsequently undertook detailed failure analysis of our approach. The findings of this investigation, as well as recommendations for alterations to our keyframe-based summary generation method and to the evaluation methodology for Rushes summaries in general, are detailed within this paper.

    A Deep Siamese Network for Scene Detection in Broadcast Videos

    We present a model that automatically divides broadcast videos into coherent scenes by learning a distance measure between shots. Experiments are performed to demonstrate the effectiveness of our approach by comparing our algorithm against recent proposals for automatic scene segmentation. We also propose an improved performance measure that aims to reduce the gap between numerical evaluation and expected results, and we release a new benchmark dataset.
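
    As a rough illustration of learning a shot-to-shot distance with a Siamese model, the following PyTorch sketch embeds precomputed shot feature vectors and trains with a contrastive loss; the layer sizes, margin, and feature dimensionality are assumptions, not the architecture described in the paper.

        # Minimal Siamese distance model over precomputed shot features
        # (illustrative only; not the authors' architecture or training setup).
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ShotSiamese(nn.Module):
            def __init__(self, in_dim: int = 512, embed_dim: int = 128):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Linear(in_dim, 256), nn.ReLU(),
                    nn.Linear(256, embed_dim),
                )

            def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
                # Distance between two shots; small distance -> likely same scene.
                return F.pairwise_distance(self.encoder(a), self.encoder(b))

        def contrastive_loss(dist: torch.Tensor, same_scene: torch.Tensor,
                             margin: float = 1.0) -> torch.Tensor:
            # Pull same-scene pairs together, push different-scene pairs apart.
            pos = same_scene * dist.pow(2)
            neg = (1 - same_scene) * F.relu(margin - dist).pow(2)
            return (pos + neg).mean()

        if __name__ == "__main__":
            model = ShotSiamese()
            a, b = torch.randn(4, 512), torch.randn(4, 512)
            labels = torch.tensor([1.0, 0.0, 1.0, 0.0])  # 1 = same scene
            loss = contrastive_loss(model(a, b), labels)
            loss.backward()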

    An empirical study of inter-concept similarities in multimedia ontologies

    Generic concept detection has been a widely studied topic in recent research on multimedia analysis and retrieval, but the issue of how to exploit the structure of a multimedia ontology, as well as the different inter-concept relations, has not received similar attention. In this paper, we present results from our empirical analysis of different types of similarity among semantic concepts in two multimedia ontologies, LSCOM-Lite and CDVP-206. The results suggest that the proposed methods can provide insight into the existing inter-concept relations within an ontology and help in selecting the most useful set of concepts and hierarchical relations. Such an analysis can be utilized in various tasks such as building more reliable concept detectors and designing large-scale ontologies.
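
    One simple instance of an inter-concept similarity of the general kind studied here is co-occurrence similarity computed from binary concept annotations; the Jaccard measure and the synthetic data in the sketch below are illustrative only and are not claimed to be the measures used in the paper.

        # Illustrative co-occurrence-based inter-concept similarity on a binary
        # (shots x concepts) annotation matrix; Jaccard is one possible choice.
        import numpy as np

        def jaccard_similarity(annotations: np.ndarray) -> np.ndarray:
            """annotations: (n_shots, n_concepts) binary matrix; returns (n_concepts, n_concepts)."""
            a = annotations.astype(bool)
            inter = a.T.astype(int) @ a.astype(int)      # pairwise co-occurrence counts
            counts = a.sum(axis=0)
            union = counts[:, None] + counts[None, :] - inter
            with np.errstate(divide="ignore", invalid="ignore"):
                sim = np.where(union > 0, inter / union, 0.0)
            return sim

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            ann = (rng.random((1000, 5)) < 0.1).astype(int)   # 1000 shots, 5 concepts
            print(np.round(jaccard_similarity(ann), 3))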

    Towards responsive Sensitive Artificial Listeners

    This paper describes work in the recently started project SEMAINE, which aims to build a set of Sensitive Artificial Listeners – conversational agents designed to sustain an interaction with a human user despite limited verbal skills, through robust recognition and generation of non-verbal behaviour in real time, both when the agent is speaking and when it is listening. We report on data collection and on the design of a system architecture in view of real-time responsiveness.

    Graph Based Video Sequence Matching & BoF Method for Video Copy Detection

    In this paper we propose a video copy detection method that uses Bag-of-Features (BoF) matching and presents the matched frames of two videos as an acyclic graph. The method uses both local features (line, texture, color) and global features (Scale Invariant Feature Transform, i.e. SIFT). The video is first divided into frames, and a dual-threshold method eliminates redundant frames and selects unique key frames. Binary features, known as a Bag of Features (BoF), are then extracted from each key frame and stored in the database as a matrix. When a query video is uploaded, the same features are extracted and compared against the stored database to detect whether the video is a copy. If the video is detected as a copy, the actual matched sequence of key frames is displayed as an acyclic graph using the graph-based sequence matching method. DOI: 10.17762/ijritcc2321-8169.15067
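
    The sketch below illustrates the matching stage only, assuming BoF histograms have already been extracted for the query and reference key frames: frames whose histograms are sufficiently similar are paired, and temporally consistent pairs are linked into a simple acyclic graph. The similarity measure, threshold, and data are placeholders rather than the paper's actual choices.

        # Illustrative BoF matching plus a simple acyclic match graph; the
        # histogram-intersection measure and threshold are assumptions.
        import numpy as np

        def histogram_intersection(q: np.ndarray, r: np.ndarray) -> float:
            return float(np.minimum(q, r).sum())

        def match_keyframes(query_bof: np.ndarray, ref_bof: np.ndarray,
                            threshold: float = 0.6):
            """Return (query_idx, ref_idx) pairs whose BoF similarity exceeds the threshold."""
            matches = []
            for qi, q in enumerate(query_bof):
                for ri, r in enumerate(ref_bof):
                    if histogram_intersection(q, r) >= threshold:
                        matches.append((qi, ri))
            return matches

        def build_match_graph(matches):
            """Acyclic graph: an edge links a match to any later match that advances
            in both query time and reference time."""
            edges = {m: [] for m in matches}
            for i, (qa, ra) in enumerate(matches):
                for qb, rb in matches[i + 1:]:
                    if qb > qa and rb > ra:
                        edges[(qa, ra)].append((qb, rb))
            return edges

        if __name__ == "__main__":
            rng = np.random.default_rng(1)
            q = rng.dirichlet(np.ones(64), size=5)                        # 5 query keyframes
            r = np.vstack([q[1:4], rng.dirichlet(np.ones(64), size=3)])   # copied span + noise
            print(build_match_graph(match_keyframes(q, r)))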

    MyPlaces: detecting important settings in a visual diary

    We describe a novel approach to identifying specific settings in large collections of passively captured images corresponding to a visual diary. An algorithm developed for setting detection should be capable of detecting images captured at the same real-world locations (e.g. in the dining room at home, in front of the computer in the office, in the park, etc.). This requires the selection and implementation of suitable methods to identify visually similar backgrounds in images using their visual features. We use a Bag of Keypoints approach. This method is based on the sampling and subsequent vector quantization of multiple image patches. The image patches are sampled and described using Scale Invariant Feature Transform (SIFT) features. We compare two different classifiers, K Nearest Neighbour and Multiclass Linear Perceptron, and present results for classifying ten different settings across one week’s worth of images. Our results demonstrate that the method produces good classification accuracy even without exploiting geometric or context-based information. We also describe an early prototype of a visual diary browser that integrates the classification results.
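
    A compact sketch of a Bag of Keypoints pipeline of this general kind is shown below: local descriptors are quantised against a learned vocabulary and the resulting histograms are classified with k-NN. The synthetic descriptors stand in for real SIFT descriptors, and the vocabulary size, number of neighbours, and class setup are assumptions rather than the settings used in the paper.

        # Minimal Bag-of-Keypoints pipeline sketch: quantise local descriptors
        # with a learned vocabulary, then classify the histograms with k-NN.
        # Real SIFT descriptors (e.g. via OpenCV) would replace the fakes below.
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.neighbors import KNeighborsClassifier

        rng = np.random.default_rng(0)

        def fake_descriptors(n: int, centre: float) -> np.ndarray:
            """Stand-in for 128-D SIFT descriptors sampled from one image."""
            return rng.normal(loc=centre, scale=1.0, size=(n, 128))

        def bof_histogram(desc: np.ndarray, vocab: KMeans) -> np.ndarray:
            words = vocab.predict(desc)
            hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
            return hist / hist.sum()

        # Two hypothetical "settings" (e.g. office vs. park), a handful of images each.
        images = [(fake_descriptors(200, c), label)
                  for label, c in [(0, 0.0), (1, 3.0)] for _ in range(6)]

        vocab = KMeans(n_clusters=32, n_init=10, random_state=0).fit(
            np.vstack([d for d, _ in images]))

        X = np.array([bof_histogram(d, vocab) for d, _ in images])
        y = np.array([label for _, label in images])

        clf = KNeighborsClassifier(n_neighbors=3).fit(X[:-2], y[:-2])
        print("predicted settings:", clf.predict(X[-2:]), "true:", y[-2:])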

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.