
    Video Data Visualization System: Semantic Classification And Personalization

    We present in this paper an intelligent video data visualization tool, based on semantic classification, for retrieving and exploring a large-scale corpus of videos. Our work builds on the semantic classes obtained from semantic analysis of the videos; these classes are projected into the visualization space as a graph whose nodes are the keyframes of the video documents and whose edges represent the relations between documents and document classes. Finally, we construct a user profile, based on the user's interaction with the system, to adapt the system to the user's preferences.
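
    The abstract does not detail the graph construction; as a rough illustration, a keyframe/class graph of this kind could be assembled as in the following Python sketch (using the networkx library; the keyframe names and class labels are hypothetical, not from the paper):

        import networkx as nx

        # Hypothetical output of the semantic classification stage:
        # each video keyframe is assigned a semantic class.
        keyframe_classes = {
            "video1_kf03.jpg": "sports",
            "video1_kf12.jpg": "sports",
            "video2_kf01.jpg": "news",
        }

        G = nx.Graph()
        for keyframe, cls in keyframe_classes.items():
            G.add_node(keyframe, kind="keyframe")  # nodes: keyframes of video documents
            G.add_node(cls, kind="class")          # class nodes anchor the layout
            G.add_edge(keyframe, cls)              # edges: document-to-class relations

        # A force-directed layout projects the classes into the
        # visualization space, grouping keyframes of the same class.
        positions = nx.spring_layout(G, seed=42)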

    Deformable Prototypes for Encoding Shape Categories in Image Databases

    We describe a method for shape-based image database search that uses deformable prototypes to represent categories. Rather than directly comparing a candidate shape with all shape entries in the database, shapes are compared in terms of the types of nonrigid deformations (differences) that relate them to a small subset of representative prototypes. To solve the shape correspondence and alignment problem, we employ the technique of modal matching, an information-preserving shape decomposition for matching, describing, and comparing shapes despite sensor variations and nonrigid deformations. In modal matching, shape is decomposed into an ordered basis of orthogonal principal components. We demonstrate the utility of this approach for shape comparison in 2-D image databases. (Office of Naval Research Young Investigator Award N00014-06-1-0661.)
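
    As a toy illustration of describing shape through an ordered orthogonal decomposition, the sketch below uses Fourier coefficients of the contour; note this is a simplified stand-in for the idea, not the paper's finite-element modal matching:

        import numpy as np

        def shape_descriptor(boundary, k=10):
            """Describe a closed 2-D contour (N x 2 array of boundary points)
            by the magnitudes of its low-order Fourier coefficients: an
            ordered orthogonal basis, loosely analogous to (but much simpler
            than) the paper's modal decomposition."""
            z = boundary[:, 0] + 1j * boundary[:, 1]  # contour as a complex signal
            coeffs = np.fft.fft(z - z.mean())         # subtracting the mean removes translation
            mags = np.abs(coeffs[1:k + 1])            # keep the low-order modes
            return mags / mags[0]                     # normalize out scale

        def shape_distance(a, b, k=10):
            # Shapes are compared through differences in their coefficient
            # vectors rather than by direct point-to-point comparison.
            return np.linalg.norm(shape_descriptor(a, k) - shape_descriptor(b, k))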

    Analysis of Using Metric Access Methods for Visual Search of Objects in Video Databases

    This article presents an approach to object retrieval that searches for and localizes all occurrences of an object in a video database, given a query image of the object. Our proposal is based on text-retrieval methods in which video key frames are represented by a dense set of viewpoint-invariant region descriptors that enable recognition to proceed successfully despite changes in camera viewpoint, lighting, and partial occlusions. Vector-quantizing these region descriptors provides a visual analogue of a word: a visual word. These words are grouped into a visual vocabulary which is used to index all key frames from the video database. Efficient retrieval is then achieved by employing methods from statistical text retrieval, including inverted file systems and term- and document-frequency weightings. Whereas previous works in the literature have adopted only a simple sequential scan during search, we investigate the use of different metric access methods (MAMs): the M-tree, Slim-tree, and D-index, in order to accelerate the processing of similarity queries. In addition, a ranking strategy based on the spatial layout of the regions (spatial consistency) is fully described and evaluated. Experimental results have shown that the adoption of MAMs not only improved search performance but also reduced the influence of the vocabulary size on test results, which may improve the scalability of our proposal. Finally, the application of spatial consistency produced a very significant improvement in the results.
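
    The inverted-file/tf-idf stage of such a pipeline can be sketched in a few lines of Python (the visual-word ids and frame names below are hypothetical; the article's metric access methods and spatial-consistency re-ranking are not shown):

        import math
        from collections import Counter, defaultdict

        # Key frames as bags of quantized visual-word ids (hypothetical data).
        frames = {
            "kf1": [3, 7, 7, 42],
            "kf2": [7, 19, 42, 42],
            "kf3": [3, 3, 5],
        }

        # Inverted file: visual word -> postings of (frame id, term frequency).
        index = defaultdict(list)
        for frame_id, words in frames.items():
            for word, tf in Counter(words).items():
                index[word].append((frame_id, tf))

        def search(query_words):
            """Rank key frames by accumulated tf-idf score for the query."""
            scores = Counter()
            for word, q_tf in Counter(query_words).items():
                postings = index.get(word, [])
                if not postings:
                    continue
                idf = math.log(len(frames) / len(postings))
                for frame_id, tf in postings:
                    scores[frame_id] += q_tf * tf * idf * idf
            return scores.most_common()

        print(search([7, 42, 42]))  # frames ranked by visual-word overlap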

    MapSnapper: Engineering an Efficient Algorithm for Matching Images of Maps from Mobile Phones

    The MapSnapper project aimed to develop a system for robust matching of low-quality images of a paper map, taken with a mobile phone, against a high-quality digital raster representation of the same map. The paper presents a novel methodology for performing content-based image retrieval and object recognition from query images that have been degraded by noise and subjected to transformations by the imaging system. In addition, the paper provides insight into the evaluation-driven development process that was used to incrementally improve matching performance until the design specifications were met.

    Saliency for Image Description and Retrieval

    We live in a world where we are surrounded by ever-increasing numbers of images. More often than not, these images have very little metadata by which they can be indexed and searched. To avoid information overload, techniques need to be developed that enable these image collections to be searched by their content. Much of the previous work on image retrieval has used global features such as colour and texture to describe the content of the image. However, these global features are insufficient to accurately describe image content when different parts of the image have different characteristics. This thesis initially discusses how this problem can be circumvented by using salient interest regions to select the most interesting areas of the image and generating local descriptors to describe the image characteristics in those regions. It presents a number of saliency detectors suitable for robust retrieval purposes and compares several of these region detectors. The thesis then discusses how salient regions can be used for image retrieval through a number of techniques, most importantly two techniques inspired by the field of textual information retrieval. Using these robust retrieval techniques, a new paradigm in image retrieval is discussed, whereby retrieval takes place on a mobile device using a query image captured by a built-in camera. This paradigm is demonstrated in the context of an art gallery, in which the device can be used to find more information about particular images. The final chapter of the thesis discusses approaches to bridging the semantic gap in image retrieval, exploring ways in which un-annotated image collections can be searched by keyword. Two techniques are discussed: the first explicitly attempts to annotate the un-annotated images automatically, so that the applied annotations can be used for searching; the second does not try to annotate images explicitly but instead, through the use of linear algebra, attempts to create a semantic space in which images and keywords are positioned such that images lie close to the keywords that represent them within the space.
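
    The abstract leaves the linear-algebra construction unspecified; a common technique of this kind is a latent-semantic-analysis-style truncated SVD, sketched below under that assumption (the keyword/image data are invented for illustration):

        import numpy as np

        # Hypothetical keyword-by-image occurrence matrix: X[i, j] = 1 when
        # annotated training image j carries keyword i.
        keywords = ["beach", "sunset", "city"]
        X = np.array([
            [1, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 0, 1, 1],
        ], dtype=float)

        # Truncated SVD places keywords and images in one low-dimensional
        # "semantic space", so images sit close to the keywords that
        # represent them.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        k = 2
        keyword_vecs = U[:, :k] * s[:k]     # keyword coordinates
        image_vecs = Vt[:k, :].T * s[:k]    # image coordinates

        def rank_images(keyword):
            q = keyword_vecs[keywords.index(keyword)]
            sims = image_vecs @ q / (
                np.linalg.norm(image_vecs, axis=1) * np.linalg.norm(q))
            return np.argsort(-sims)        # image indices, best match first

        print(rank_images("beach"))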

    A Compact SIFT-Based Strategy for Visual Information Retrieval in Large Image Databases

    This paper applies the Standard Scale Invariant Feature Transform (S-SIFT) algorithm to compute image descriptors of the eye region for a set of human-eye images from the UBIRIS database, despite photometric transformations. The core assumption is that textured regions are locally planar and stationary. A descriptor with this type of invariance is sufficient to discern and describe a textured area regardless of the viewpoint and lighting in a perspective image, and it permits the identification of similar types of texture in a figure, such as the iris texture of an eye. It also makes it possible to establish correspondences between texture regions from distinct images acquired at different viewpoints (for example, two views of the front of a house) and scales, and/or subjected to linear transformations such as translation. Experiments have confirmed that the S-SIFT algorithm is a potent tool for a variety of image-identification problems.
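
    A minimal SIFT matching loop of this flavour can be written with OpenCV (a sketch only: the file names are hypothetical, and OpenCV's stock SIFT stands in for whatever S-SIFT configuration the paper used):

        import cv2

        # Two views of the same eye region, e.g. from the UBIRIS database.
        img1 = cv2.imread("eye_view1.png", cv2.IMREAD_GRAYSCALE)
        img2 = cv2.imread("eye_view2.png", cv2.IMREAD_GRAYSCALE)

        # SIFT keypoints/descriptors are invariant to scale and rotation and
        # robust to moderate photometric change.
        sift = cv2.SIFT_create()
        kp1, des1 = sift.detectAndCompute(img1, None)
        kp2, des2 = sift.detectAndCompute(img2, None)

        # Lowe's ratio test filters ambiguous correspondences between views.
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        matches = matcher.knnMatch(des1, des2, k=2)
        good = [m for m, n in matches if m.distance < 0.75 * n.distance]
        print(f"{len(good)} putative texture correspondences")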