Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

Author: Ballan Lamberto
Bertini Marco
Del Bimbo Alberto
Li Xirong
Snoek Cees G. M.
Uricchio Tiberio
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems, i.e., image tag assignment, refinement, and tag-based image retrieval, is presented. While existing works vary in terms of their targeted tasks and methodology, they rely on the key functionality of tag relevance, i.e., estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how such information is exploited, this paper introduces a taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison of the state of the art, a new experimental protocol is presented, with training sets containing 10k, 100k and 1m images and an evaluation on three test sets, contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future.

Comment: to appear in ACM Computing Surveys

arXiv.org e-Print Archive

Crossref

Florence Research

Archivio istituzionale della ricerca - Università di Macerata

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Archivio istituzionale della ricerca - Università di Padova

Content-based image retrieval: reading one's mind and helping people share.

Author: Sia Ka Cheung
Publication date: 01/01/2003

Sia Ka Cheung. Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. Includes bibliographical references (leaves 85-91). Abstracts in English and Chinese.

Abstract
Acknowledgement
Chapter 1. Introduction
  1.1 Problem Statement
  1.2 Contributions
  1.3 Thesis Organization
Chapter 2. Background
  2.1 Content-Based Image Retrieval
    2.1.1 Feature Extraction
    2.1.2 Indexing and Retrieval
  2.2 Relevance Feedback
    2.2.1 Weight Updating
    2.2.2 Bayesian Formulation
    2.2.3 Statistical Approaches
    2.2.4 Inter-query Feedback
  2.3 Peer-to-Peer Information Retrieval
    2.3.1 Distributed Hash Table Techniques
    2.3.2 Routing Indices and Shortcuts
    2.3.3 Content-Based Retrieval in P2P Systems
Chapter 3. Parameter Estimation-Based Relevance Feedback
  3.1 Parameter Estimation of Target Distribution
    3.1.1 Motivation
    3.1.2 Model
    3.1.3 Relevance Feedback
    3.1.4 Maximum Entropy Display
  3.2 Self-Organizing Map Based Inter-Query Feedback
    3.2.1 Motivation
    3.2.2 Initialization and Replication of SOM
    3.2.3 SOM Training for Inter-query Feedback
    3.2.4 Target Estimation and Display Set Selection for Intra-query Feedback
  3.3 Experiment
    3.3.1 Study of Parameter Estimation Method Using Synthetic Data
    3.3.2 Performance Study in Intra- and Inter-Query Feedback
  3.4 Conclusion
Chapter 4. Distributed COntent-based Visual Information Retrieval
  4.1 Introduction
  4.2 Peer Clustering
    4.2.1 Basic Version
    4.2.2 Single Cluster Version
    4.2.3 Multiple Clusters Version
  4.3 Firework Query Model
  4.4 Implementation and System Architecture
    4.4.1 Gnutella Message Modification
    4.4.2 Architecture of DISCOVIR
    4.4.3 Flow of Operations
  4.5 Experiments
    4.5.1 Simulation Model of the Peer-to-Peer Network
    4.5.2 Number of Peers
    4.5.3 TTL of Query Message
    4.5.4 Effects of Data Resolution on Query Efficiency
    4.5.5 Discussion
  4.6 Conclusion
Chapter 5. Future Works and Conclusion
Chapter A. Derivation of Update Equation
Chapter B. An Efficient Discovery of Signatures
Bibliography

CUHK Digital Repository

Automatic Query Image Disambiguation for Content-Based Image Retrieval

Author: Barz Björn
Denzler Joachim
Publication date: 02/11/2017

Query images presented to content-based image retrieval systems often have several different interpretations, making it difficult to identify the search objective pursued by the user. We propose a technique for overcoming this ambiguity while keeping the amount of required user interaction to a minimum. To achieve this, the neighborhood of the query image is divided into coherent clusters from which the user may choose the relevant ones. A novel feedback integration technique is then employed to re-rank the entire database with regard to both the user feedback and the original query. We evaluate our approach on the publicly available MIRFLICKR-25K dataset, where it leads to a relative improvement of average precision by 23% over the baseline retrieval, which does not distinguish between different image senses.

Comment: VISAPP 2018 paper, 8 pages, 5 figures. Source code: https://github.com/cvjena/ai
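The cluster-then-re-rank flow described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the clustering algorithm or the feedback integration technique, so plain k-means and a weighted sum of distances are stand-in assumptions, and the names `disambiguate`, `rerank`, and `weight` are hypothetical. Images are assumed to be represented as small feature tuples.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


def disambiguate(query, database, n_neighbors, k):
    """Cluster the query's nearest neighbors; each center stands for one 'sense'."""
    neighbors = sorted(database, key=lambda x: math.dist(x, query))[:n_neighbors]
    return kmeans(neighbors, k)


def rerank(query, database, chosen_centers, weight=0.5):
    """Re-rank the whole database using both the query and the chosen clusters.

    Assumption: the score is a weighted sum of the distance to the query and
    the distance to the closest user-selected cluster center.
    """
    def score(x):
        to_feedback = min(math.dist(x, c) for c in chosen_centers)
        return (1 - weight) * math.dist(x, query) + weight * to_feedback

    return sorted(database, key=score)
```

In use, the centers returned by `disambiguate` would be shown to the user (for example through representative images), and the selected ones passed to `rerank`. Setting `weight` to 0 reduces the ranking to the baseline retrieval by distance to the query alone.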

arXiv.org e-Print Archive

Crossref

Query generation from multiple media examples

Author: Jose J.M.
Ren R.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2009

This paper exploits a unified media document representation called feature terms for query generation from multiple media examples, e.g., images. A feature term refers to a value interval of a media feature. A media document is therefore represented by a frequency vector of feature term occurrences. This approach (1) facilitates feature accumulation from multiple examples and (2) enables the exploration of text-based retrieval models for multimedia retrieval. Three statistical criteria, minimised chi-squared, minimised AC/DC rate and maximised entropy, are proposed to extract feature terms from a given media document collection. Two textual ranking functions, KL divergence and a BM25-like retrieval model, are adapted to estimate media document relevance. Experiments on the Corel photo collection and the TRECVid 2006 collection show the effectiveness of feature-term-based queries in image and video retrieval.
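The feature-term idea can be made concrete with a small sketch, offered as an illustration rather than the paper's implementation. The paper selects value intervals with statistical criteria; the equal-width binning below is a simplifying assumption, as are the names `feature_terms`, `term_vector`, and `kl_score`, the `name#index` term format, and the additive smoothing constant `eps`.

```python
import math
from collections import Counter


def feature_terms(features, n_bins=4, lo=0.0, hi=1.0):
    """Map (feature_name, value) pairs to feature terms.

    A term is the feature name plus the index of the value interval the
    value falls into. Assumption: equal-width intervals over [lo, hi].
    """
    width = (hi - lo) / n_bins
    terms = []
    for name, value in features:
        idx = int((value - lo) / width)
        idx = max(0, min(idx, n_bins - 1))  # clamp values on or outside the range
        terms.append(f"{name}#{idx}")
    return terms


def term_vector(examples, **binning):
    """Accumulate feature-term frequencies over one or more media examples."""
    counts = Counter()
    for example in examples:
        counts.update(feature_terms(example, **binning))
    return counts


def kl_score(query, doc, vocab, eps=0.5):
    """Negative KL divergence KL(query || doc); higher means more relevant.

    Additive smoothing (eps) keeps unseen terms from producing log(0).
    """
    q_total = sum(query.values()) + eps * len(vocab)
    d_total = sum(doc.values()) + eps * len(vocab)
    score = 0.0
    for term in vocab:
        p = (query[term] + eps) / q_total
        q = (doc[term] + eps) / d_total
        score -= p * math.log(p / q)
    return score
```

Because a query built from several examples is just the sum of their term counts, `term_vector` accepts a list of examples, which mirrors the feature accumulation the abstract mentions. Ranking a collection then amounts to sorting documents by `kl_score` against the query vector.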

CiteSeerX

Crossref

Enlighten

The University of Glasgow at ImageClefPhoto 2009

Author: Goyal A.
Halvey M.
Jose J.M.
Leelanupab T.
Punitha P.
Zuccon G.
Publication date: 01/01/2009

In this paper we describe the approaches adopted to generate the five runs submitted to ImageClefPhoto 2009 by the University of Glasgow. The aim of our methods is to exploit document diversity in the rankings. All our runs used text statistics extracted from the captions associated with each image in the collection, except one run, which combines the textual statistics with visual features extracted from the provided images. The results suggest that our methods based on text captions significantly improve the performance of the respective baselines, while the approach that combines visual features with text statistics shows smaller improvements.

CiteSeerX

Queensland University of Technology ePrints Archive

Enlighten

Multitraining support vector machine for image retrieval

Author: Allinson N.
Li J.
Li Xuelong
Tao D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

Relevance feedback (RF) schemes based on support vector machines (SVMs) have been widely used in content-based image retrieval (CBIR). However, the performance of SVM-based RF approaches is often poor when the number of labeled feedback samples is small. This is mainly due to 1) the SVM classifier being unstable for small training sets because its optimal hyperplane is too sensitive to the training examples; and 2) the kernel method being ineffective because the feature dimension is much greater than the number of training samples. In this paper, we develop a new machine learning technique, multitraining SVM (MTSVM), which combines the merits of the cotraining technique and a random sampling method in the feature space. With the proposed MTSVM algorithm, the above two problems can be mitigated. Experiments are carried out on a large image set of some 20 000 images, and the preliminary results demonstrate that the developed method consistently improves the performance over conventional SVM-based RFs in terms of precision and standard deviation, which are used to evaluate the effectiveness and robustness of an RF algorithm, respectively.

University of Lincoln Institutional Repository

Crossref

OPUS - University of Technology Sydney

Birkbeck Institutional Research Online

Video browsing interfaces and applications: a review

Author: Boeszoermenyi L.
Hopfgartner F.
Jose J.
Marques O.
Schoeffmann K.
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 01/02/2010

We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data, which, if presented in its raw format, is rather unwieldy and costly, have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other.

Enlighten

White Rose Research Online