3,688 research outputs found

    An adaptive technique for content-based image retrieval

    We discuss an adaptive approach to Content-Based Image Retrieval. It is based on the Ostensive Model of developing information needs, a special kind of relevance feedback model that learns from implicit user feedback and adds a temporal notion to relevance. The ostensive approach supports content-assisted browsing by visualising the interaction: user-selected images are added to a browsing path, which ends with a set of system recommendations. The suggestions are based on an adaptive query learning scheme, in which the query is learnt from previously selected images. Our approach adapts the original Ostensive Model, which was based on textual features only, to include content-based features that characterise images. In the proposed scheme, textual and colour features are combined using the Dempster-Shafer theory of evidence combination. Results from a user-centred, work-task oriented evaluation show that the ostensive interface is preferred over a traditional interface with manual query facilities. This is due to its ability to adapt to the user's need, its intuitiveness and the fluid way in which it operates. A study and comparison of the nature of the underlying information need shows that our approach elicits changes in the user's need based on the interaction, and is successful in adapting the retrieval to match the changes. In addition, a preliminary study of the retrieval performance of the ostensive relevance feedback scheme shows that it can outperform a standard relevance feedback strategy in terms of image recall in category search.
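
The evidence combination step mentioned in this abstract can be illustrated with Dempster's rule of combination. The sketch below is not the authors' implementation: the two-hypothesis frame ("relevant" / "not_relevant"), the mass values and the function name `combine` are all assumptions made for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contradict each other; track the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise so the remaining masses sum to one.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical frame of discernment: is an image relevant or not?
REL = frozenset({"relevant"})
NOT = frozenset({"not_relevant"})
EITHER = REL | NOT  # mass on the whole frame expresses ignorance

# Made-up masses for one image from a textual and a colour source.
text_evidence = {REL: 0.6, EITHER: 0.4}
colour_evidence = {REL: 0.5, NOT: 0.2, EITHER: 0.3}

fused = combine(text_evidence, colour_evidence)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 3))
```

Leaving some mass on the whole frame lets a weak source say "I do not know" instead of being forced to split its belief between the two outcomes.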

    Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

    Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems, i.e., image tag assignment, refinement, and tag-based image retrieval, is presented. While existing works vary in terms of their targeted tasks and methodology, they rely on the key functionality of tag relevance, i.e., estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how such information is exploited, this paper introduces a taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison between the state-of-the-art methods, a new experimental protocol is presented, with training sets containing 10k, 100k and 1m images and an evaluation on three test sets, contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future. Comment: to appear in ACM Computing Surveys

    A Data-Driven Approach for Tag Refinement and Localization in Web Videos

    Tagging of visual content is becoming more and more widespread as web-based services and social networks have popularized tagging functionalities among their users. These user-generated tags are used to ease browsing and exploration of media collections, e.g. using tag clouds, or to retrieve multimedia content. However, not all media are equally tagged by users. With current systems it is easy to tag a single photo, and even tagging a part of a photo, like a face, has become common on sites like Flickr and Facebook. On the other hand, tagging a video sequence is more complicated and time-consuming, so users just tag the overall content of a video. In this paper we present a method for automatic video annotation that increases the number of tags originally provided by users, and localizes them temporally, associating tags to keyframes. Our approach exploits collective knowledge embedded in user-generated tags and web sources, and visual similarity of keyframes and images uploaded to social sites like YouTube and Flickr, as well as web sources like Google and Bing. Given a keyframe, our method is able to select on the fly from these visual sources the training exemplars that should be the most relevant for this test sample, and proceeds to transfer labels across similar images. Compared to existing video tagging approaches that require training classifiers for each tag, our system has few parameters, is easy to implement and can deal with an open vocabulary scenario. We demonstrate the approach on tag refinement and localization on DUT-WEBV, a large dataset of web videos, and show state-of-the-art results. Comment: Preprint submitted to Computer Vision and Image Understanding (CVIU)
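
The idea of transferring labels from visually similar exemplars to a keyframe can be sketched as a nearest-neighbour vote. This is only an illustration under stated assumptions, not the paper's method: the three-dimensional feature vectors, the cosine similarity, the parameters `k` and `min_votes`, and the function names are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def transfer_tags(keyframe, exemplars, k=3, min_votes=2):
    """Return tags shared by at least `min_votes` of the k nearest exemplars."""
    nearest = sorted(exemplars, key=lambda e: cosine(keyframe, e[0]), reverse=True)[:k]
    # Each exemplar votes once per distinct tag it carries.
    votes = Counter(tag for _, tags in nearest for tag in set(tags))
    return sorted(tag for tag, n in votes.items() if n >= min_votes)


# Made-up exemplars: (visual feature vector, user-provided tags).
exemplars = [
    ([0.9, 0.1, 0.0], ["beach", "sea"]),
    ([0.8, 0.2, 0.1], ["beach", "sunset"]),
    ([0.7, 0.1, 0.2], ["sea", "beach"]),
    ([0.0, 0.9, 0.4], ["city", "night"]),
    ([0.1, 0.8, 0.5], ["city", "street"]),
]
keyframe_feature = [0.85, 0.15, 0.05]

tags = transfer_tags(keyframe_feature, exemplars)
print(tags)  # ['beach', 'sea']
```

Because no per-tag classifier is trained, any tag that appears on the retrieved neighbours can be proposed, which is what makes an open vocabulary possible in this kind of scheme.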

    An explorative study of interface support for image searching

    In this paper we study interfaces for image retrieval systems. Current image retrieval interfaces are limited to providing query facilities and result presentation. The user can inspect the results and possibly provide feedback on their relevance for the current query. Our approach, in contrast, encourages the user to group and organise their search results and thus provide more fine-grained feedback for the system. It combines the search and management process, which, according to our hypothesis, helps the user to conceptualise their search tasks and to overcome the query formulation problem. An evaluation, involving young design professionals and different types of information seeking scenarios, shows that the proposed approach succeeds in encouraging the user to conceptualise their tasks and that it leads to increased user satisfaction. However, it could not be shown to increase performance. We identify the problems in the current setup, which when eliminated should lead to more effective searching overall.

    A hybrid probabilistic approach to content-based image retrieval with feature weighting

    Over the past decade, enormous quantities of visual documents (images and videos) have been produced every day by scientists, journalists, amateurs and others. This volume quickly exposed the limits of keyword-based image search systems, giving rise to the paradigm known as Content-Based Image Retrieval (CBIR). These systems aim to locate images similar to a query made up of one or more images, using visual features such as colour, shape and texture. These features are called low-level because they do not reflect the semantics of the image. In other words, two semantically different images can produce similar low-level features. One of the main challenges of this new view of retrieval systems is organising the image collection so that search time remains acceptable. To meet this challenge, techniques developed for indexing textual databases, such as trees, are widely used. These trees are not suited to high-dimensional data, as is the case for low-level image features. In this thesis, we address this challenge. We introduce a new hybrid probabilistic approach for organising image collections. On an image collection organised hierarchically into nodes according to image semantics, we use a generative approach to estimate the probability mixtures that represent the visual appearance of each node in the collection. We then apply a discriminative approach to estimate the weights of the visual features. The idea in our work is to restrict the search to the nodes that best represent the semantics of the query, which gives the search a semantic property and narrows the semantic gap caused by low-level features.

    Boosting Image Database Retrieval

    We present an approach for image database retrieval using a very large number of highly selective features and simple on-line learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual "causes" and that images which are visually similar share causes. We propose a mechanism for generating a large number of complex features which capture some aspects of this causal structure. Boosting is used to learn simple and efficient classifiers in this complex feature space. Finally, we describe a practical implementation of our retrieval system on a database of 3000 images.
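
As a rough illustration of how boosting can build a classifier from many simple features, the sketch below runs discrete AdaBoost over one-feature threshold stumps. It is an assumption-laden toy, not the system described in the abstract: the stump weak learner, the two-dimensional feature vectors, the labels and the function names are all hypothetical.

```python
import math


def train_adaboost(X, y, rounds=3):
    """Discrete AdaBoost over one-feature threshold stumps; labels are -1 or +1."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []  # entries are (alpha, feature index, threshold, polarity)
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for thr in sorted({row[f] for row in X}):
                for pol in (1, -1):
                    preds = [pol if row[f] >= thr else -pol for row in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, preds)
        err, f, thr, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, thr, pol))
        # Increase the weight of misclassified examples, then renormalise.
        weights = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if row[f] >= thr else -pol) for a, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Made-up feature vectors; +1 marks images a user flagged as matching the query.
X = [[0.9, 0.2], [0.8, 0.4], [0.7, 0.1], [0.2, 0.3], [0.1, 0.9], [0.3, 0.5]]
y = [1, 1, 1, -1, -1, -1]

model = train_adaboost(X, y)
print([predict(model, row) for row in X])
```

Each stump looks at a single feature, so the final classifier only evaluates the handful of features that boosting selected, which is one reason such classifiers can stay cheap even when the feature pool is very large.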