1,031 research outputs found

    Automatic tagging and geotagging in video collections and communities

    Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We overview three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario, its definition, and the data set released. For each task, a reference algorithm that was used within MediaEval 2010 is presented, and comments are included on lessons learned. The Tagging Task (Professional) involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task (Wild Wild Web) involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.
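
    The Placing Task lends itself to a simple metadata-based baseline: propagate coordinates from the most similar already-geotagged videos. The Python sketch below illustrates that idea with a hypothetical training set and plain tag overlap as the similarity measure; it is not the reference algorithm used in MediaEval 2010.

        def tag_overlap(a, b):
            """Similarity between two videos: number of shared metadata tags."""
            return len(set(a) & set(b))

        def place_by_nearest_neighbour(query_tags, geotagged_videos):
            """Assign the coordinates of the most metadata-similar geotagged video.

            geotagged_videos: list of (tags, (lat, lon)) pairs -- an assumed
            training set of videos that already carry geo-coordinates.
            Returns None when no video shares a tag with the query.
            """
            best, best_score = None, 0
            for tags, coords in geotagged_videos:
                score = tag_overlap(query_tags, tags)
                if score > best_score:
                    best, best_score = coords, score
            return best

        # Usage with made-up data: the query shares "boat" and "canal" with the
        # Amsterdam video, so its coordinates are returned.
        training = [({"amsterdam", "canal", "boat"}, (52.37, 4.90)),
                    ({"paris", "tower"}, (48.86, 2.35))]
        print(place_by_nearest_neighbour({"boat", "canal", "holiday"}, training))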

    Tools for image annotation using context-awareness, NFC and image clustering

    Annotation of images is crucial for enabling keyword-based image search. However, the enormous amount of available digital photos makes manual annotation impractical and requires methods for automatic image annotation. This paper describes two complementary approaches to automatic annotation of images depicting some public attraction. The LoTagr system provides annotation information for already captured, geo-positioned images by selecting nearby, previously tagged images from a source image collection and subsequently collecting the most frequently used tags from these images. The NfcAnnotate system enables annotation at image capture time by using NFC (Near Field Communication) and NFC information tags provided at the site of the attraction. NfcAnnotate enables clustering of topically related images, which makes it possible to annotate a set of images in one annotation operation. In cases when NFC information tags are not available, NfcAnnotate image clustering can be combined with LoTagr to conveniently annotate every image in the cluster in a single operation.
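
    As described above, LoTagr gathers the most frequent tags of nearby, previously tagged images. The following Python sketch illustrates that idea under assumed data structures; the actual LoTagr distance measure, search radius, and implementation are not specified here.

        from collections import Counter
        from math import radians, sin, cos, asin, sqrt

        def haversine_km(lat1, lon1, lat2, lon2):
            """Great-circle distance in kilometres between two lat/lon points."""
            lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
            a = (sin((lat2 - lat1) / 2) ** 2
                 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
            return 2 * 6371.0 * asin(sqrt(a))

        def suggest_tags(query_pos, source_collection, radius_km=0.5, top_n=5):
            """Return the most frequent tags of source images near query_pos.

            source_collection: iterable of ((lat, lon), tags) pairs -- an assumed
            format for the previously tagged source image collection.
            """
            counts = Counter()
            for (lat, lon), tags in source_collection:
                if haversine_km(query_pos[0], query_pos[1], lat, lon) <= radius_km:
                    counts.update(tags)
            return [tag for tag, _ in counts.most_common(top_n)]

        # Made-up collection: two photos near the Oslo opera house, one in Paris.
        collection = [((59.9075, 10.7528), ["opera", "oslo", "architecture"]),
                      ((59.9074, 10.7530), ["opera house", "oslo"]),
                      ((48.8584, 2.2945), ["eiffel", "paris"])]
        print(suggest_tags((59.9076, 10.7529), collection))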

    Focused image search in the social Web.

    Recently, social multimedia-sharing websites, which allow users to upload, annotate, and share online photo or video collections, have become increasingly popular. The user tags or annotations constitute the new multimedia metadata. We present an image search system that exploits both image textual and visual information. First, we use focused crawling and DOM-tree-based web data extraction methods to extract image textual features from social networking image collections. Second, we propose the concept of visual words to handle the image's visual content for fast indexing and searching. We also develop several user-friendly search options to allow users to query the index using words and image feature descriptions (visual words). The developed image search system tries to bridge the gap between scalable industrial image search engines, which are based on keyword search, and the slower content-based image retrieval systems developed mostly in the academic field and designed to search based on image content only. We have implemented a working prototype by crawling and indexing over 16,056 images from flickr.com, one of the most popular image sharing websites. Our experimental results on the working prototype confirm the efficiency and effectiveness of the methods that we proposed.
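
    The "visual words" idea can be sketched as a standard bag-of-visual-words pipeline: cluster local descriptors into a visual vocabulary and represent each image as a word histogram for fast indexing. The Python sketch below, using scikit-learn's KMeans and random stand-in descriptors, is an assumed illustration of the concept rather than the paper's implementation.

        import numpy as np
        from sklearn.cluster import KMeans

        def build_vocabulary(descriptor_sample, n_words=256, seed=0):
            """Cluster a sample of local descriptors into a visual-word vocabulary."""
            return KMeans(n_clusters=n_words, random_state=seed, n_init=10).fit(descriptor_sample)

        def bag_of_visual_words(image_descriptors, vocabulary):
            """Quantise an image's local descriptors into a normalised word histogram."""
            words = vocabulary.predict(image_descriptors)
            hist = np.bincount(words, minlength=vocabulary.n_clusters).astype(float)
            return hist / max(hist.sum(), 1.0)

        # Toy usage: random vectors stand in for SIFT-like local descriptors.
        rng = np.random.default_rng(0)
        vocab = build_vocabulary(rng.normal(size=(1000, 64)), n_words=32)
        print(bag_of_visual_words(rng.normal(size=(120, 64)), vocab).shape)  # (32,)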

    Joint Intermodal and Intramodal Label Transfers for Extremely Rare or Unseen Classes

    In this paper, we present a label transfer model from texts to images for image classification tasks. The problem of image classification is often much more challenging than text classification. On one hand, labeled text data is more widely available than labeled images for classification tasks. On the other hand, text data tends to have natural semantic interpretability and is often more directly related to class labels. On the contrary, image features are not directly related to the concepts inherent in class labels. One of our goals in this paper is to develop a model for revealing the functional relationships between text and image features so as to directly transfer intermodal and intramodal labels to annotate images. This is implemented by learning a transfer function as a bridge to propagate labels between the two multimodal spaces. However, intermodal label transfer can be undermined by blindly transferring the labels of noisy texts to annotate images. To mitigate this problem, we present an intramodal label transfer process, which complements the intermodal label transfer by transferring image labels instead when relevant text is absent from the source corpus. In addition, we generalize intermodal label transfer to the zero-shot learning scenario, where only text examples are available to label unseen classes of images, without any positive image examples. We evaluate our algorithm on an image classification task and show its effectiveness with respect to the other compared algorithms.
    Comment: The paper has been accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence. It will appear in a future issue.
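
    The bridge between modalities can be pictured, in a much simplified form, as a learned mapping from image features into the text feature space, after which labels are copied from the nearest labeled text. The Python sketch below uses a closed-form ridge regression as a stand-in for the learned transfer function and toy random data; it illustrates the general idea, not the model proposed in the paper.

        import numpy as np

        def learn_transfer(X_img, X_txt, lam=1.0):
            """Ridge-regression transfer function W mapping image features to text features.

            Solves min_W ||X_img W - X_txt||^2 + lam ||W||^2 over paired examples.
            """
            d = X_img.shape[1]
            return np.linalg.solve(X_img.T @ X_img + lam * np.eye(d), X_img.T @ X_txt)

        def transfer_labels(img_feats, W, labeled_txt_feats, txt_labels):
            """Project images into the text space and copy the nearest text's label."""
            projected = img_feats @ W
            dists = np.linalg.norm(projected[:, None, :] - labeled_txt_feats[None, :, :], axis=2)
            return [txt_labels[i] for i in dists.argmin(axis=1)]

        # Toy paired data: 50 image/text feature pairs, then 3 unlabeled images.
        rng = np.random.default_rng(1)
        X_img, X_txt = rng.normal(size=(50, 20)), rng.normal(size=(50, 30))
        W = learn_transfer(X_img, X_txt)
        print(transfer_labels(rng.normal(size=(3, 20)), W,
                              X_txt[:10], ["class%d" % (i % 2) for i in range(10)]))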

    Evaluating tag-based information access in image collections

    The availability of social tags has greatly enhanced access to information. Tag clouds have emerged as a new "social" way to find and visualize information, providing both one-click access to information and a snapshot of the "aboutness" of a tagged collection. A range of research projects has explored and compared different tag artifacts for information access, ranging from regular tag clouds to tag hierarchies. At the same time, there is a lack of user studies that compare the effectiveness of different types of tag-based browsing interfaces from the users' point of view. This paper contributes to the research on tag-based information access by presenting a controlled user study that compared three types of tag-based interfaces on two recognized types of search tasks: lookup and exploratory search. Our results demonstrate that tag-based browsing interfaces significantly outperform traditional search interfaces in both performance and user satisfaction. At the same time, the differences between the two types of tag-based browsing interfaces explored in our study are not as clear. Copyright 2012 ACM.
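
    For readers unfamiliar with how a tag cloud turns tag statistics into a visual snapshot, the sketch below shows one common heuristic: log-scale tag frequencies onto a font-size range. This is a generic illustration, not the weighting used by the interfaces in the study.

        import math
        from collections import Counter

        def tag_cloud_sizes(tag_assignments, min_px=12, max_px=36):
            """Map tag frequencies to font sizes on a log scale."""
            counts = Counter(tag_assignments)
            lo, hi = math.log(min(counts.values())), math.log(max(counts.values()))
            span = (hi - lo) or 1.0  # all counts equal -> every tag gets min_px
            return {tag: round(min_px + (math.log(c) - lo) / span * (max_px - min_px))
                    for tag, c in counts.items()}

        print(tag_cloud_sizes(["museum", "museum", "museum", "art", "art", "statue"]))
        # {'museum': 36, 'art': 27, 'statue': 12}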

    Finding cultural heritage images through a Dual-Perspective Navigation Framework

    With the increasing volume of digital images, improving techniques for image findability is receiving heightened attention. The cultural heritage sector, with its vast resource of images, has realized the value of social tags and started using tags in parallel with controlled vocabularies to increase the odds of users finding images of interest. The research presented in this paper develops the Dual-Perspective Navigation Framework (DPNF), which integrates controlled vocabularies and social tags to represent the aboutness of an item more comprehensively, so that the information scent can be maximized to facilitate resource findability. DPNF utilizes the mechanisms of faceted browsing and tag-based navigation to offer seamless interaction between experts' subject headings and public tags during image search. In a controlled user study, participants effectively completed more exploratory tasks with the DPNF interface than with the tag-only interface. DPNF is also more efficient than both single-descriptor interfaces (the subject-heading-only and tag-only interfaces): participants spent significantly less time, made fewer interface interactions, and needed less backtracking to complete an exploratory task, without an extra workload. In addition, participants were more satisfied with the DPNF interface than with the others. The findings of this study can assist interface designers in deciding what information is most helpful to users and can facilitate search tasks. The framework also maximizes end users' chances of finding target images by engaging image information from two sources: the professionals' descriptions of items in a collection and the crowd's assignment of social tags.
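
    To make the dual-perspective idea concrete, the sketch below filters an image collection by an expert subject heading, a social tag, or both, the way a DPNF-style faceted session might narrow results. The record format and function are assumptions for illustration only, not the DPNF implementation.

        def find_images(collection, heading=None, tag=None):
            """Filter images by a controlled subject heading, a social tag, or both.

            collection: iterable of dicts with 'id', 'headings' (controlled
            vocabulary) and 'tags' (crowd-assigned) keys -- an assumed format.
            """
            results = []
            for image in collection:
                if heading and heading not in image["headings"]:
                    continue
                if tag and tag not in image["tags"]:
                    continue
                results.append(image["id"])
            return results

        images = [
            {"id": "img1", "headings": {"Bridges"}, "tags": {"sunset", "river"}},
            {"id": "img2", "headings": {"Bridges", "Railroads"}, "tags": {"steel"}},
            {"id": "img3", "headings": {"Churches"}, "tags": {"sunset"}},
        ]
        print(find_images(images, heading="Bridges"))                # ['img1', 'img2']
        print(find_images(images, heading="Bridges", tag="sunset"))  # ['img1']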