57,397 research outputs found

    Using video objects and relevance feedback in video retrieval

    Video retrieval is mostly based on using text from dialogue and this remains the most significant component, despite progress in other aspects. One problem with this is when a searcher wants to locate video based on what is appearing in the video rather than what is being spoken about. Alternatives such as automatically-detected features and image-based keyframe matching can be used, though these still need further improvement in quality. One other modality for video retrieval is based on segmenting objects from video and allowing end users to use these as part of querying. This uses similarity between query objects and objects from video, and in theory allows retrieval based on what is actually appearing on-screen. The main hurdles to greater use of this are the overhead of object segmentation on large amounts of video and the issue of whether we can actually achieve effective object-based retrieval. We describe a system to support object-based video retrieval where a user selects example video objects as part of the query. During a search a user builds up a set of these which are matched against objects previously segmented from a video library. This match is based on MPEG-7 Dominant Colour, Shape Compaction and Texture Browsing descriptors. We use a user-driven semi-automated segmentation process to segment the video archive which is very accurate and is faster than conventional video annotation

    A histogram-based approach for object-based query-by-shape-and-color in image and video databases

    Considering the fact that querying by low-level object features is essential in image and video data, an efficient approach for querying and retrieval by shape and color is proposed. The approach employs three specialized histograms (i.e., distance, angle, and color histograms) to store feature-based information that is extracted from objects. The objects can be extracted from images or video frames. The proposed histogram-based approach is used as a component in the query-by-feature subsystem of a video database management system. The color and shape information is handled together to enrich the querying capabilities for content-based retrieval. The evaluation of the retrieval effectiveness and the robustness of the proposed approach is presented via performance experiments. © 2005 Elsevier Ltd. All rights reserved
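
The abstract above names distance and angle histograms but does not define them. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes both histograms are computed over an object's pixels relative to the object's centroid, and it compares histograms with plain histogram intersection. The function names, the bin count, and the similarity measure are all hypothetical choices made for illustration.

```python
import math

def shape_histograms(pixels, n_bins=8):
    """Return (distance_hist, angle_hist) for a list of (x, y) object pixels.

    Assumption: both histograms are measured about the object's centroid.
    """
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    dists = [math.hypot(x - cx, y - cy) for x, y in pixels]
    max_d = max(dists) or 1.0  # avoid division by zero for a single pixel

    dist_hist = [0] * n_bins
    angle_hist = [0] * n_bins
    for (x, y), d in zip(pixels, dists):
        # Distances are scaled by the largest one, so the result is scale-invariant.
        dist_hist[min(int(d / max_d * n_bins), n_bins - 1)] += 1
        if d > 0:  # the angle is undefined for a pixel exactly at the centroid
            angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
            angle_hist[min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)] += 1

    total = len(pixels)
    # Both histograms are normalised by the pixel count; the angle histogram
    # sums to slightly less than 1 when a pixel coincides with the centroid.
    return [c / total for c in dist_hist], [c / total for c in angle_hist]

def histogram_intersection(h1, h2):
    """Similarity of two normalised histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# A 3x3 square of pixels as a toy object.
square = [(x, y) for x in range(3) for y in range(3)]
dist_hist, angle_hist = shape_histograms(square)
```

A color histogram could be built the same way by binning each pixel's quantised color, and the three similarities combined with weights chosen by the user.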

    Using segmented objects in ostensive video shot retrieval

    This paper presents a system for video shot retrieval in which shots are retrieved based on matching video objects using a combination of colour, shape and texture. Rather than matching on individual objects, our system supports sets of query objects which in total reflect the user’s object-based information need. Our work also adapts to a shifting user information need by initiating the partitioning of a user’s search into two or more distinct search threads, which can be followed by the user in sequence. This is an automatic process which maps neatly to the ostensive model for information retrieval in that it allows a user to place a virtual checkpoint on their search, explore one thread or aspect of their information need and then return to that checkpoint to then explore an alternative thread. Our system is fully functional and operational and in this paper we illustrate several design decisions we have made in building it

    Spott : on-the-spot e-commerce for television using deep learning-based video analysis techniques

    Spott is an innovative second-screen mobile multimedia application which offers viewers relevant information on objects (e.g., clothing, furniture, food) they see and like on their television screens. The application enables interaction between TV audiences and brands, so producers and advertisers can offer potential consumers tailored promotions, e-shop items, and/or free samples. In line with the current views on innovation management, the technological excellence of the Spott application is coupled with iterative user involvement throughout the entire development process. This article discusses both of these aspects and how they impact each other. First, we focus on the technological building blocks that facilitate the (semi-)automatic interactive tagging process of objects in the video streams. The majority of these building blocks make extensive use of novel and state-of-the-art deep learning concepts and methodologies. We show how these deep-learning-based video analysis techniques facilitate video summarization, semantic keyframe clustering, and (similar) object retrieval. Secondly, we provide insights into user tests that have been performed to evaluate and optimize the application's user experience. The lessons learned from these open field tests have already been an essential input in the technology development and will further shape the future modifications to the Spott application

    Recognition of human activities and expressions in video sequences using shape context descriptor

    The recognition of objects and classes of objects is of importance in the field of computer vision due to its applicability in areas such as video surveillance, medical imaging and retrieval of images and videos from large databases on the Internet. Effective recognition of object classes is still a challenge in vision; hence, there is much interest to improve the rate of recognition in order to keep up with the rising demands of the fields where these techniques are being applied. This thesis investigates the recognition of activities and expressions in video sequences using a new descriptor called the spatiotemporal shape context. The shape context is a well-known algorithm that describes the shape of an object based upon the mutual distribution of points in the contour of the object; however, it falls short when the distinctive property of an object is not just its shape but also its movement across frames in a video sequence. Since actions and expressions tend to have a motion component that enhances the capability of distinguishing them, the shape based information from the shape context proves insufficient. This thesis proposes new 3D and 4D spatiotemporal shape context descriptors that incorporate into the original shape context changes in motion across frames. Results of classification of actions and expressions demonstrate that the spatiotemporal shape context is better than the original shape context at enhancing recognition of classes in the activity and expression domains

    Associating low-level features with semantic concepts using video objects and relevance feedback

    The holy grail of multimedia indexing and retrieval is developing algorithms capable of imitating human abilities in distinguishing and recognising semantic concepts within the content, so that retrieval can be based on “real world” concepts that come naturally to users. In this paper, we discuss an approach to using segmented video objects as the mid-level connection between low-level features and semantic concept description. We consider a video object as a particular instance of a semantic concept and we model the semantic concept as an average representation of its instances. A system supporting object-based search through a test corpus is presented that allows matching presegmented objects based on automatically extracted low-level features. In the system, relevance feedback is employed to drive the learning of the semantic model during a regular search process
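
The idea of modelling a concept as an average of its instances, refined by relevance feedback, can be sketched as follows. This is an illustrative sketch only, not the system described in the abstract: it assumes each object is a fixed-length feature vector, that each object the user marks as relevant is folded into a running mean, and that candidates are ranked by Euclidean distance to that mean. The class name, method names, and distance measure are hypothetical.

```python
import math

class ConceptModel:
    """A semantic concept represented by the mean of its instances' features."""

    def __init__(self, dim):
        self.mean = [0.0] * dim
        self.count = 0

    def add_instance(self, features):
        """Fold one object marked relevant by the user into the running mean."""
        self.count += 1
        self.mean = [m + (f - m) / self.count
                     for m, f in zip(self.mean, features)]

    def distance(self, features):
        """Euclidean distance from a candidate object to the concept mean."""
        return math.sqrt(sum((m - f) ** 2 for m, f in zip(self.mean, features)))

def rank(model, objects):
    """Order object names from closest to farthest from the concept mean."""
    return sorted(objects, key=lambda name: model.distance(objects[name]))

# Two feedback rounds: the user marks two objects as relevant.
model = ConceptModel(dim=2)
model.add_instance([1.0, 0.0])
model.add_instance([3.0, 0.0])

# Toy library of presegmented objects with made-up feature vectors.
library = {"a": [2.0, 1.0], "b": [10.0, 10.0]}
ranking = rank(model, library)
```

The incremental update means the model can be refined during an ordinary search session without storing every instance the user has judged.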

    Video retrieval using dialogue, keyframe similarity and video objects

    There are several different approaches to video retrieval which vary in sophistication, and in the level of their deployment. Some are well-known, others are not yet within our reach for any kind of large volumes of video. In particular, object-based video retrieval, where an object from within a video is used for retrieval, is often particularly desirable from a searcher's perspective. In this paper we introduce Fischlar-Simpsons, a system providing retrieval from an archive of video using any combination of text searching, keyframe image matching, shot-level browsing, as well as object-based retrieval. The system is driven by user feedback and interaction rather than having the conventional search/browse/search metaphor and the purpose of the system is to explore how users can use detected objects in a shot as part of a retrieval task

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.