14,533 research outputs found

    Structured Knowledge Representation for Image Retrieval

    We propose a structured approach to the problem of retrieval of images by content and present a description logic that has been devised for the semantic indexing and retrieval of images containing complex objects. As other approaches do, we start from low-level features extracted with image analysis to detect and characterize regions in an image. However, in contrast with feature-based approaches, we provide a syntax to describe segmented regions as basic objects and complex objects as compositions of basic ones. Then we introduce a companion extensional semantics for defining reasoning services, such as retrieval, classification, and subsumption. These services can be used for both exact and approximate matching, using similarity measures. Using our logical approach as a formal specification, we implemented a complete client-server image retrieval system, which allows a user to pose both queries by sketch and queries by example. A set of experiments has been carried out on a testbed of images to assess the retrieval capabilities of the system in comparison with expert users' rankings. Results are presented adopting a well-established measure of quality borrowed from textual information retrieval.

    Advanced content-based semantic scene analysis and information retrieval: the SCHEMA project

    The aim of the SCHEMA Network of Excellence is to bring together a critical mass of universities, research centers, industrial partners and end users, in order to design a reference system for content-based semantic scene analysis, interpretation and understanding. Relevant research areas include: content-based multimedia analysis and automatic annotation of semantic multimedia content, combined textual and multimedia information retrieval, the semantic web, the MPEG-7 and MPEG-21 standards, user interfaces and human factors. In this paper, recent advances in content-based analysis, indexing and retrieval of digital media within the SCHEMA Network are presented. These advances will be integrated in the SCHEMA module-based, expandable reference system.

    Enhanced image annotations based on spatial information extraction and ontologies

    Current research on image annotation often represents images in terms of labelled regions or objects, but pays little attention to the spatial positions of, or relationships between, those regions or objects. To be effective, general-purpose image retrieval systems require images with comprehensive annotations that fully describe the content of the image. Much research is being done on automatic image annotation schemes, but few authors address the issue of spatial annotations directly. This paper begins with a brief analysis of real picture queries submitted to librarians, showing how spatial terms are used to formulate queries. The paper is then concerned with the development of an enhanced automatic image annotation system, which extracts spatial information about objects in the image. The approach uses region boundaries and region labels to generate annotations describing absolute object positions and also relative positions between pairs of objects. A domain ontology and a spatial information ontology are also used to extract more complex information about the relative closeness of objects to the viewer.
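
    The idea of deriving absolute and pairwise relative position annotations from labelled regions can be illustrated with a minimal sketch. This is not the system described in the abstract: the `Region` class, the bounding-box representation, the division of the image into thirds, and the centroid-based comparison are all assumptions chosen for illustration, and the ontology-based reasoning about closeness to the viewer is not covered.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Region:
    """A labelled region, approximated here by its bounding box (hypothetical)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def centroid(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def _third(value, extent, names):
    """Map a coordinate to one of three equal bands along an axis."""
    if value < extent / 3:
        return names[0]
    if value < 2 * extent / 3:
        return names[1]
    return names[2]


def absolute_position(region, width, height):
    """Describe where the region's centroid falls in a 3x3 grid over the image."""
    cx, cy = region.centroid
    vertical = _third(cy, height, ("top", "middle", "bottom"))
    horizontal = _third(cx, width, ("left", "centre", "right"))
    return f"{region.label} is in the {vertical} {horizontal} of the image"


def relative_position(a, b):
    """Describe a relative to b along the axis with the larger centroid offset.

    Image coordinates are assumed: y grows downward.
    """
    (ax, ay), (bx, by) = a.centroid, b.centroid
    dx, dy = bx - ax, by - ay
    if abs(dx) >= abs(dy):
        relation = "left of" if dx > 0 else "right of"
    else:
        relation = "above" if dy > 0 else "below"
    return f"{a.label} is {relation} {b.label}"


def annotate(regions, width, height):
    """Absolute annotations for every region, then one per unordered pair."""
    notes = [absolute_position(r, width, height) for r in regions]
    notes += [relative_position(a, b) for a, b in combinations(regions, 2)]
    return notes


if __name__ == "__main__":
    sky = Region("sky", 0, 0, 300, 100)
    tree = Region("tree", 200, 100, 300, 300)
    for note in annotate([sky, tree], width=300, height=300):
        print(note)
```

    Comparing centroids keeps the sketch short, but it ignores overlap and containment; a fuller treatment would work from the region boundaries themselves, as the abstract indicates.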

    Visual Landmark Recognition from Internet Photo Collections: A Large-Scale Evaluation

    The task of a visual landmark recognition system is to identify photographed buildings or objects in query photos and to provide the user with relevant information on them. With their increasing coverage of the world's landmark buildings and objects, Internet photo collections are now being used as a source for building such systems in a fully automatic fashion. This process typically consists of three steps: clustering large amounts of images by the objects they depict; determining object names from user-provided tags; and building a robust, compact, and efficient recognition index. To date, however, there is little empirical information on how well current approaches for those steps perform in a large-scale open-set mining and recognition task. Furthermore, there is little empirical information on how recognition performance varies for different types of landmark objects and where there is still potential for improvement. With this paper, we intend to fill these gaps. Using a dataset of 500k images from Paris, we analyze each component of the landmark recognition pipeline in order to answer the following questions: How many and what kinds of objects can be discovered automatically? How can we best use the resulting image clusters to recognize the object in a query? How can the object be efficiently represented in memory for recognition? How reliably can semantic information be extracted? And finally: What are the limiting factors in the resulting pipeline from query to semantics? We evaluate how different choices of methods and parameters for the individual pipeline steps affect overall system performance and examine their effects for different query categories such as buildings, paintings or sculptures.

    Webly Supervised Learning of Convolutional Networks

    We present an approach to utilize large amounts of web data for learning CNNs. Specifically inspired by curriculum learning, we present a two-step approach for CNN training. First, we use easy images to train an initial visual representation. We then use this initial CNN and adapt it to harder, more realistic images by leveraging the structure of data and categories. We demonstrate that our two-stage CNN outperforms a fine-tuned CNN trained on ImageNet on Pascal VOC 2012. We also demonstrate the strength of webly supervised learning by localizing objects in web images and training an R-CNN-style detector. It achieves the best performance on VOC 2007 when no VOC training data is used. Finally, we show our approach is quite robust to noise and performs comparably even when we use image search results from March 2013 (pre-CNN image search era).

    A study of spatial data models and their application to selecting information from pictorial databases

    People have always used visual techniques to locate information in the space surrounding them. However, with the advent of powerful computer systems and user-friendly interfaces, it has become possible to extend such techniques to stored pictorial information. Pictorial database systems have in the past primarily used mathematical or textual search techniques to locate specific pictures contained within such databases. However, these techniques have largely relied upon complex combinations of numeric and textual queries in order to find the required pictures. Such techniques restrict users of pictorial databases to expressing what is in essence a visual query in a numeric or character-based form. What is required is the ability to express such queries in a form that more closely matches the user's visual memory or perception of the picture required. It is suggested in this thesis that spatial techniques of search are important and that two of the most important attributes of a picture are the spatial positions and the spatial relationships of objects contained within such pictures. It is further suggested that a database management system which allows users to indicate the nature of their query by visually placing iconic representations of objects on an interface in spatially appropriate positions is a feasible method by which pictures might be found from a pictorial database. This thesis undertakes a detailed study of spatial techniques using a combination of historical evidence, psychological conclusions and practical examples to demonstrate that the spatial metaphor is an important concept and that pictures can be readily found by visually specifying the spatial positions and relationships between objects contained within them.