1,722 research outputs found

    Efficient contour-based shape representation and matching

    This paper presents an efficient method for calculating the similarity between 2D closed shape contours. The proposed algorithm is invariant to translation, scale change and rotation. It can be used for database retrieval or for detecting regions with a particular shape in video sequences, and it is suitable for real-time applications. In the first stage of the algorithm, an ordered sequence of contour points approximating each shape is extracted from the input binary images. The contours are normalized for translation and scale, and a small set of the most likely starting points is extracted for each shape. In the second stage, the starting points from both shapes are assigned into pairs and rotation alignment is performed. The dissimilarity measure is based on the geometrical distances between corresponding contour points. A fast sub-optimal method for solving the correspondence problem between contour points from two shapes is proposed. The dissimilarity measure is calculated for each pair of starting points, and the lowest value is taken as the final dissimilarity measure between the two shapes. Three different experiments are carried out using the proposed approach: letter recognition using a web camera, our own simulation of Part B of the MPEG-7 core experiment “CE-Shape1” and detection of characters in cartoon video sequences. Results indicate that the proposed dissimilarity measure is aligned with human intuition.
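The two-stage idea in this abstract (normalize, pair starting points, align by rotation, keep the lowest distance) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes both contours already have the same number of ordered points, uses plain index-wise correspondence instead of the paper's sub-optimal correspondence method, and tries every point of the second contour as a starting point by default. The names `normalize`, `rotate` and `dissimilarity` are hypothetical.

```python
import math


def normalize(contour):
    """Move the centroid to the origin and scale to unit mean radius."""
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    centered = [(x - cx, y - cy) for x, y in contour]
    mean_radius = sum(math.hypot(x, y) for x, y in centered) / n
    return [(x / mean_radius, y / mean_radius) for x, y in centered]


def rotate(contour, angle):
    """Rotate every point counter-clockwise by `angle` radians about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in contour]


def dissimilarity(a, b, starts_a=(0,), starts_b=None):
    """Lowest mean point-to-point distance over all starting-point pairs.

    Assumption: correspondence is index-wise (k-th point after each start),
    which stands in for the correspondence method described in the paper.
    """
    a, b = normalize(a), normalize(b)
    n = len(a)
    if len(b) != n:
        raise ValueError("both contours must have the same number of points")
    if starts_b is None:
        starts_b = range(n)  # assumption: try every point as a start
    best = math.inf
    for i in starts_a:
        for j in starts_b:
            # Rotate b so that its starting point has the same polar angle as a's.
            angle = math.atan2(a[i][1], a[i][0]) - math.atan2(b[j][1], b[j][0])
            rb = rotate(b, angle)
            total = sum(math.dist(a[(i + k) % n], rb[(j + k) % n]) for k in range(n))
            best = min(best, total / n)
    return best
```

Passing a short list of candidate indices as `starts_a` and `starts_b` mirrors the abstract's "small sets of the most likely starting points" and cuts the number of alignments that have to be evaluated.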

    Learning midlevel image features for natural scene and texture classification

    This paper deals with coding of natural scenes in order to extract semantic information. We present a new scheme to project natural scenes onto a basis in which each dimension encodes statistically independent information. Basis extraction is performed by independent component analysis (ICA) applied to image patches culled from natural scenes. The study of the resulting coding units (coding filters) extracted from well-chosen categories of images shows that they adapt and respond selectively to discriminant features in natural scenes. Given this basis, we define global and local image signatures relying on the maximal activity of filters on the input image. Locally, the construction of the signature takes into account the spatial distribution of the maximal responses within the image. We propose a criterion to reduce the size of the space of representation for faster computation. The proposed approach is tested in the context of texture classification (111 classes), as well as natural scenes classification (11 categories, 2037 images). Using a common protocol, the other commonly used descriptors have at most 47.7% accuracy on average while our method obtains performances of up to 63.8%. We show that this advantage does not depend on the size of the signature and demonstrate the efficiency of the proposed criterion to select ICA filters and reduce the dimensio
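As a rough illustration of a global signature built from "the maximal activity of filters", the sketch below counts, for each filter, how often it gives the strongest absolute response over a set of image patches. This is only one plausible reading of the abstract, not the authors' definition. The function name `global_signature` is hypothetical, and the filters are assumed to be given as flat lists of weights (in the paper they would come from ICA, which is not performed here).

```python
def global_signature(patches, filters):
    """Fraction of patches for which each filter has the largest |response|.

    Assumption: a filter's response to a patch is their dot product, and
    both are flat lists of equal length.
    """
    counts = [0] * len(filters)
    for patch in patches:
        responses = [abs(sum(w * v for w, v in zip(f, patch))) for f in filters]
        # Index of the filter with the maximal activity on this patch.
        winner = max(range(len(filters)), key=responses.__getitem__)
        counts[winner] += 1
    return [c / len(patches) for c in counts]
```

A local variant along the lines the abstract describes would compute the same histogram separately for each region of the image, so that the spatial distribution of the maximal responses is retained.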

    Using video objects and relevance feedback in video retrieval

    Video retrieval is mostly based on using text from dialogue and this remains the most significant component, despite progress in other aspects. One problem with this is when a searcher wants to locate video based on what is appearing in the video rather than what is being spoken about. Alternatives such as automatically-detected features and image-based keyframe matching can be used, though these still need further improvement in quality. One other modality for video retrieval is based on segmenting objects from video and allowing end users to use these as part of querying. This uses similarity between query objects and objects from video, and in theory allows retrieval based on what is actually appearing on-screen. The main hurdles to greater use of this are the overhead of object segmentation on large amounts of video and the issue of whether we can actually achieve effective object-based retrieval. We describe a system to support object-based video retrieval where a user selects example video objects as part of the query. During a search a user builds up a set of these which are matched against objects previously segmented from a video library. This match is based on MPEG-7 Dominant Colour, Shape Compaction and Texture Browsing descriptors. We use a user-driven semi-automated segmentation process to segment the video archive which is very accurate and is faster than conventional video annotation.

    2D shape classification and retrieval

    We present a novel correspondence-based technique for efficient shape classification and retrieval. Shape boundaries are described by a set of (ad hoc) equally spaced points – avoiding the need to extract “landmark points”. By formulating the correspondence problem in terms of a simple generative model, we are able to efficiently compute matches that incorporate scale, translation, rotation and reflection invariance. A hierarchical scheme with likelihood cut-off provides additional speed-up. In contrast to many shape descriptors, the concept of a mean (prototype) shape follows naturally in this setting. This enables model-based classification, greatly reducing the cost of the testing phase. Equal spacing of points can be defined in terms of either perimeter distance or radial angle. It is shown that combining the two leads to improved classification/retrieval performance.
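Equal spacing by perimeter distance, as mentioned in this abstract, can be sketched as resampling a closed polygon at constant arc-length steps. This is a minimal illustration under the assumption that the boundary is given as an ordered list of polygon vertices. The function name `resample_by_perimeter` is hypothetical, and the radial-angle variant is not shown.

```python
import math


def resample_by_perimeter(polygon, n_points):
    """Return n_points spaced at equal arc length along a closed polygon."""
    m = len(polygon)
    # Closed boundary: the last vertex connects back to the first.
    edges = [(polygon[i], polygon[(i + 1) % m]) for i in range(m)]
    lengths = [math.dist(p, q) for p, q in edges]
    step = sum(lengths) / n_points
    points = []
    edge_idx, walked = 0, 0.0  # walked = arc length at the start of edge_idx
    for k in range(n_points):
        target = k * step
        # Advance to the edge that contains the target arc length.
        while edge_idx < m - 1 and walked + lengths[edge_idx] < target:
            walked += lengths[edge_idx]
            edge_idx += 1
        (x0, y0), (x1, y1) = edges[edge_idx]
        t = (target - walked) / lengths[edge_idx] if lengths[edge_idx] else 0.0
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points
```

For example, resampling the unit square `[(0, 0), (1, 0), (1, 1), (0, 1)]` with `n_points=8` yields its four corners and four edge midpoints, starting at the first vertex.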

    Human action recognition with MPEG-7 descriptors and architectures

    Modern video surveillance requires addressing high-level concepts such as humans' actions and activities. In addition, surveillance applications need to be portable over a variety of platforms, from servers to mobile devices. In this paper, we explore the potential of the MPEG-7 standard to provide interfaces, descriptors, and architectures for human action recognition from surveillance cameras. Two novel MPEG-7 descriptors, symbolic and feature-based, are presented alongside two different architectures, server-intensive and client-intensive. The descriptors and architectures are evaluated by way of a scenario analysis.