
    Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

    Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems, i.e., image tag assignment, tag refinement, and tag-based image retrieval, is presented. While existing works vary in their targeted tasks and methodology, they all rely on the key functionality of tag relevance, i.e., estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how that information is exploited, this paper introduces a taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison of the state of the art, a new experimental protocol is presented, with training sets containing 10k, 100k and 1m images and an evaluation on three test sets contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future. Comment: to appear in ACM Computing Surveys.

    Content-based image retrieval: reading one's mind and helping people share.

    Sia Ka Cheung.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 2003.
    Includes bibliographical references (leaves 85-91).
    Abstracts in English and Chinese.

    Abstract
    Acknowledgement
    Chapter 1: Introduction
        1.1 Problem Statement
        1.2 Contributions
        1.3 Thesis Organization
    Chapter 2: Background
        2.1 Content-Based Image Retrieval
            2.1.1 Feature Extraction
            2.1.2 Indexing and Retrieval
        2.2 Relevance Feedback
            2.2.1 Weight Updating
            2.2.2 Bayesian Formulation
            2.2.3 Statistical Approaches
            2.2.4 Inter-query Feedback
        2.3 Peer-to-Peer Information Retrieval
            2.3.1 Distributed Hash Table Techniques
            2.3.2 Routing Indices and Shortcuts
            2.3.3 Content-Based Retrieval in P2P Systems
    Chapter 3: Parameter Estimation-Based Relevance Feedback
        3.1 Parameter Estimation of Target Distribution
            3.1.1 Motivation
            3.1.2 Model
            3.1.3 Relevance Feedback
            3.1.4 Maximum Entropy Display
        3.2 Self-Organizing Map Based Inter-Query Feedback
            3.2.1 Motivation
            3.2.2 Initialization and Replication of SOM
            3.2.3 SOM Training for Inter-query Feedback
            3.2.4 Target Estimation and Display Set Selection for Intra-query Feedback
        3.3 Experiment
            3.3.1 Study of Parameter Estimation Method Using Synthetic Data
            3.3.2 Performance Study in Intra- and Inter-Query Feedback
        3.4 Conclusion
    Chapter 4: Distributed COntent-based Visual Information Retrieval
        4.1 Introduction
        4.2 Peer Clustering
            4.2.1 Basic Version
            4.2.2 Single Cluster Version
            4.2.3 Multiple Clusters Version
        4.3 Firework Query Model
        4.4 Implementation and System Architecture
            4.4.1 Gnutella Message Modification
            4.4.2 Architecture of DISCOVIR
            4.4.3 Flow of Operations
        4.5 Experiments
            4.5.1 Simulation Model of the Peer-to-Peer Network
            4.5.2 Number of Peers
            4.5.3 TTL of Query Message
            4.5.4 Effects of Data Resolution on Query Efficiency
            4.5.5 Discussion
        4.6 Conclusion
    Chapter 5: Future Works and Conclusion
    Chapter A: Derivation of Update Equation
    Chapter B: An Efficient Discovery of Signatures
    Bibliography

    Automatic Query Image Disambiguation for Content-Based Image Retrieval

    Query images presented to content-based image retrieval systems often admit several different interpretations, making it difficult to identify the search objective pursued by the user. We propose a technique for overcoming this ambiguity while keeping the amount of required user interaction to a minimum. To achieve this, the neighborhood of the query image is divided into coherent clusters from which the user may choose the relevant ones. A novel feedback integration technique is then employed to re-rank the entire database with regard to both the user feedback and the original query. We evaluate our approach on the publicly available MIRFLICKR-25K dataset, where it yields a 23% relative improvement in average precision over the baseline retrieval, which does not distinguish between different image senses. Comment: VISAPP 2018 paper, 8 pages, 5 figures. Source code: https://github.com/cvjena/ai
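
    The cluster-then-re-rank workflow described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the clustering algorithm or the feedback integration technique, so the sketch assumes plain k-means with a fixed initialization and a hypothetical score that linearly mixes distance to the query with distance to the nearest user-selected cluster centroid. The names `kmeans`, `rerank`, and `alpha` are illustrative only.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def kmeans(points, k, iters=10):
    """Tiny k-means; centroids start at the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(g) if g else centroids[j] for j, g in enumerate(groups)]
    return centroids


def rerank(database, query, chosen_centroids, alpha=0.5):
    """Rank database indices by a mix of query distance and feedback distance.

    The linear mix weighted by alpha is a stand-in for the paper's
    unspecified feedback integration step.
    """
    def score(p):
        feedback = min(dist(p, c) for c in chosen_centroids)
        return alpha * dist(p, query) + (1 - alpha) * feedback

    return sorted(range(len(database)), key=lambda i: score(database[i]))


if __name__ == "__main__":
    database = [[0, 0], [0, 1], [5, 5], [5, 6]]
    query = [2.5, 2.5]
    centroids = kmeans(database, k=2)   # the "senses" around the query
    chosen = [centroids[1]]             # pretend the user picked cluster 1
    print(rerank(database, query, chosen))
```

    In a real system the clustering would run only on the nearest neighbors of the query rather than the whole database, and the user's choice would come from an interface instead of being hard-coded.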

    Query generation from multiple media examples

    This paper exploits a unified media document representation called feature terms for query generation from multiple media examples, e.g. images. A feature term refers to a value interval of a media feature. A media document is therefore represented by a frequency vector of feature term occurrences. This approach (1) facilitates feature accumulation from multiple examples and (2) enables the exploration of text-based retrieval models for multimedia retrieval. Three statistical criteria, minimised chi-squared, minimised AC/DC rate and maximised entropy, are proposed to extract feature terms from a given media document collection. Two textual ranking functions, KL divergence and a BM25-like retrieval model, are adapted to estimate media document relevance. Experiments on the Corel photo collection and the TRECVid 2006 collection show the effectiveness of feature-term-based queries in image and video retrieval.
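
    The feature-term idea (a term names the value interval a feature falls into, and a query accumulates term counts over several examples) can be sketched as follows. This is a simplified illustration, not the paper's procedure: it assumes equal-width intervals over a known value range instead of the chi-squared, AC/DC rate, or entropy criteria named in the abstract, and the function names `feature_terms` and `query_vector` are hypothetical.

```python
from collections import Counter


def feature_terms(features, n_bins=4, lo=0.0, hi=1.0):
    """Turn a feature vector into one term per feature.

    Each term names the feature index and the equal-width interval
    (an assumption of this sketch) that the value falls into.
    """
    width = (hi - lo) / n_bins
    terms = []
    for i, value in enumerate(features):
        b = int((value - lo) / width)
        b = max(0, min(b, n_bins - 1))  # clamp out-of-range values and value == hi
        terms.append(f"f{i}_b{b}")
    return terms


def query_vector(examples, n_bins=4):
    """Accumulate feature-term frequencies over several example documents."""
    counts = Counter()
    for example in examples:
        counts.update(feature_terms(example, n_bins=n_bins))
    return counts


if __name__ == "__main__":
    examples = [[0.1, 0.9], [0.2, 0.3]]
    print(query_vector(examples))
```

    Because the result is an ordinary term-frequency vector, it could be passed to a text ranking function such as a BM25-style scorer, which is the connection to text retrieval that the abstract points to.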

    The University of Glasgow at ImageClefPhoto 2009

    In this paper we describe the approaches adopted to generate the five runs submitted to ImageClefPhoto 2009 by the University of Glasgow. The aim of our methods is to exploit document diversity in the rankings. All our runs used text statistics extracted from the captions associated with each image in the collection, except one run, which combines the textual statistics with visual features extracted from the provided images. The results suggest that our methods based on text captions significantly improve on the performance of the respective baselines, while the approach that combines visual features with text statistics shows smaller improvements.

    Multitraining support vector machine for image retrieval

    Relevance feedback (RF) schemes based on support vector machines (SVMs) have been widely used in content-based image retrieval (CBIR). However, the performance of SVM-based RF approaches is often poor when the number of labeled feedback samples is small. This is mainly due to 1) the SVM classifier being unstable for small training sets because its optimal hyperplane is too sensitive to the training examples; and 2) the kernel method being ineffective because the feature dimension is much greater than the number of training samples. In this paper, we develop a new machine learning technique, multitraining SVM (MTSVM), which combines the merits of the cotraining technique and a random sampling method in the feature space. With the proposed MTSVM algorithm, the above two problems can be mitigated. Experiments are carried out on a large image set of some 20 000 images, and the preliminary results demonstrate that the developed method consistently improves on conventional SVM-based RFs in terms of precision and standard deviation, which are used to evaluate the effectiveness and robustness of an RF algorithm, respectively.

    Video browsing interfaces and applications: a review

    We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data (which, if presented in its raw format, is rather unwieldy and costly) have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other.