
    Interaction between high-level and low-level image analysis for semantic video object extraction


    Video object segmentation and tracking.

    Thesis (M.Sc.Eng.), University of KwaZulu-Natal, 2005. One of the more complex video processing problems currently vexing researchers is that of object segmentation: identifying semantically meaningful objects in a scene and separating them from the background. While the human visual system performs this task with minimal effort, research in machine vision has yet to yield techniques that perform it as effectively and efficiently. The problem is difficult not only because of the complexity of the mechanisms involved but also because it is ill-posed: no unique segmentation of a scene exists, since what counts as a segmented object depends heavily on the application and the scene content. In most situations a priori knowledge of the nature of the problem is required, often depending on the specific application in which the segmentation tool is to be used. This research presents an automatic method of segmenting objects from a video sequence. The intent is to extract and maintain both the shape and contour information as the object changes dynamically over time in the sequence. A priori information is incorporated by asking the user to tune a set of input parameters before the algorithm runs. Motion is used as the semantic cue for video object extraction, subject to the assumptions that there is only one moving object in the scene, that the only motion in the sequence is that of the object of interest, that illumination is constant, and that the object is never occluded. A change detection mask detects the moving object, and morphological operators refine the result. The change detection mask yields a model of the moving components; this is compared against a contour map of the frame to extract a more accurate contour of the moving object, which in turn is used to extract the object of interest itself.
    Since the video object moves as the sequence progresses, the object must be updated over time. To accomplish this, an object tracker based on the Hausdorff object-matching algorithm has been implemented. The dissertation begins with an overview of segmentation techniques and a discussion of the approach used in this research, followed by a detailed description of the algorithm covering initial segmentation, object tracking across frames, and video object extraction. Finally, the semantic object extraction results for a variety of video sequences are presented and evaluated.
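    The change-detection-plus-morphology step described above can be sketched as follows. The fixed differencing threshold and the 3x3 structuring element are illustrative assumptions, not values from the dissertation:

```python
def change_detection_mask(prev, curr, thresh=20):
    """Binary mask of pixels whose intensity changed by more than `thresh`."""
    return [[1 if abs(c - p) > thresh else 0 for p, c in zip(pr, cr)]
            for pr, cr in zip(prev, curr)]

def erode(mask):
    """3x3 erosion: a pixel survives only if its full neighbourhood is set.
    Border pixels are cleared for simplicity in this sketch."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(mask[y + dy][x + dx]
                                for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
    return out

def dilate(mask):
    """3x3 dilation: a pixel is set if any in-bounds neighbour is set."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(mask[y + dy][x + dx]
                                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                                if 0 <= y + dy < h and 0 <= x + dx < w))
    return out

def refine(mask):
    """Morphological opening: erosion removes isolated noise pixels,
    dilation restores the shape of the surviving moving region."""
    return dilate(erode(mask))
```

Opening removes isolated change-detection noise while leaving a compact moving region essentially intact, which is why it is a common refinement after frame differencing.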

    Content-based Video Retrieval by Integrating Spatio-Temporal and Stochastic Recognition of Events

    As the amount of publicly available video data grows, the need to query this data efficiently becomes significant. Consequently, content-based retrieval of video data turns out to be a challenging and important problem. We address the specific aspect of inferring semantics automatically from raw video data. In particular, we introduce a new video data model that supports the integrated use of two different approaches for mapping low-level features to high-level concepts. Firstly, the model is extended with a rule-based approach that supports spatio-temporal formalization of high-level concepts, and then with a stochastic approach. Furthermore, results on real tennis video data are presented, demonstrating the validity of both approaches, as well as the advantages of their integrated use.
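    A toy sketch of the rule-based, spatio-temporal mapping from a low-level feature to a high-level concept. The "net approach" event, the normalized court coordinate, and all parameter names here are hypothetical illustrations, not taken from the paper:

```python
def detect_net_approach(positions, net_y=0.5, min_frames=3):
    """Fire a hypothetical 'net approach' event when the tracked player's
    court position (normalized y, smaller = closer to the net) moves
    toward the net for at least `min_frames` consecutive frames and ends
    up past the net line: a spatio-temporal rule over per-frame features."""
    run = 0
    for prev, curr in zip(positions, positions[1:]):
        run = run + 1 if curr < prev else 0   # still moving toward the net
        if run >= min_frames and curr < net_y:
            return True
    return False
```

The point of such rules is that a sequence of cheap per-frame measurements (here, one tracked coordinate) is lifted into a queryable event by a temporal predicate, which is the kind of formalization the rule-based approach supports.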

    Activity-driven content adaptation for effective video summarisation

    In this paper, we present a novel method for content adaptation and video summarization fully implemented in the compressed domain. Firstly, summarization of generic videos is modeled as the process of extracting human objects under various activities/events. Accordingly, frames are classified via fuzzy decision-making into five categories, including shot changes (cut and gradual transitions), motion activities (camera motion and object motion) and others, by using two inter-frame measurements. Secondly, human objects are detected using Haar-like features. With the detected human objects and the attained frame categories, activity levels for each frame are determined to adapt to the video content. Continuous frames belonging to the same category are grouped to form one activity entry as content of interest (COI), which converts the original video into a series of activities. An overall adjustable quota is used to control the size of the generated summary for efficient streaming. Given this quota, the frames selected for the summary are determined by evenly sampling the accumulated activity levels for content adaptation. Quantitative evaluations have demonstrated the effectiveness and efficiency of our proposed approach, which provides a more flexible and general solution for this topic, since domain-specific tasks such as accurate recognition of objects can be avoided.
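    The quota-driven selection step, evenly sampling the accumulated activity levels, might look like the following sketch. The activity values, function name, and sampling targets are illustrative, and the paper's exact frame-selection rule may differ:

```python
import bisect

def summarise(activity, quota):
    """Pick `quota` frame indices by evenly sampling the cumulative
    activity curve, so stretches with high activity contribute more
    frames to the summary than quiet stretches."""
    cum, total = [], 0.0
    for a in activity:
        total += a
        cum.append(total)
    # Evenly spaced targets along the accumulated-activity axis.
    targets = [(k + 0.5) * total / quota for k in range(quota)]
    return [bisect.bisect_left(cum, t) for t in targets]
```

For example, `summarise([1, 1, 1, 10, 10, 1, 1], 5)` repeatedly selects the two high-activity frames, which is exactly the content-adaptive bias the quota mechanism is meant to produce.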

    Local Visual Microphones: Improved Sound Extraction from Silent Video

    Sound waves cause small vibrations in nearby objects. A few techniques exist in the literature that can extract sound from video. In this paper we study local vibration patterns at different image locations. We show that different locations in the image vibrate differently. We carefully aggregate the local vibrations and produce a sound quality that improves on the state of the art. We show that local vibrations can exhibit a time delay because sound waves take time to travel through the air. We use this phenomenon to estimate sound direction. We also present a novel algorithm that speeds up sound extraction by two to three orders of magnitude and reaches real-time performance on 20 kHz video.
    Comment: Accepted to BMVC 201