A large scale system for searching and browsing images from the World Wide Web
This paper outlines the technical details of a prototype system for searching and browsing over a million images from the World Wide Web using their visual contents. The system relies on two modalities for accessing images: automated image annotation and NNk image network browsing. The user supplies the initial query in the form of one or more keywords and is then able to locate the desired images more precisely using a browsing interface.
NNk networks and automated annotation for browsing large image collections from the world wide web
This paper outlines a system for searching and browsing 1.14 million images from the World Wide Web (WWW) based on their visual content. At the heart of the system lies an automatically constructed network of images that can be navigated quickly by following its edges. The browsing experience is enhanced in a number of ways including multidimensional scaling of the graph neighbourhood for display purposes, Markov clustering of the image network to provide summaries of its content, and automated annotation of the images to allow users to access the network through text queries
Interactive context-aware user-driven metadata correction in digital libraries
Personal name variants are a common problem in digital libraries, reducing the precision of searches and complicating browsing-based interaction. The book-centric approach of name authority control has not scaled to match the growth and diversity of digital repositories. In this paper, we present a novel system for user-driven integration of name variants when interacting with web-based information systems, in particular digital library systems. We approach these issues via a client-side JavaScript browser extension that can reorganize web content and also integrate remote data sources. The extension is designed to be agnostic towards the web sites it is applied to, and we illustrate the proof-of-concept system through worked examples using three different digital libraries. We discuss the extensibility of the approach in the context of other user-driven information systems and the growth of the Semantic Web.
Unsupervised Generative Adversarial Cross-modal Hashing
Cross-modal hashing aims to map heterogeneous multimedia data into a common
Hamming space, which can realize fast and flexible retrieval across different
modalities. Unsupervised cross-modal hashing is more flexible and applicable
than supervised methods, since no intensive labeling work is involved. However,
existing unsupervised methods learn hashing functions by preserving inter and
intra correlations, while ignoring the underlying manifold structure across
different modalities, which is extremely helpful to capture meaningful nearest
neighbors of different modalities for cross-modal retrieval. To address the
above problem, in this paper we propose an Unsupervised Generative Adversarial
Cross-modal Hashing approach (UGACH), which makes full use of GAN's ability for
unsupervised representation learning to exploit the underlying manifold
structure of cross-modal data. The main contributions can be summarized as
follows: (1) We propose a generative adversarial network to model cross-modal
hashing in an unsupervised fashion. In the proposed UGACH, given data of one
modality, the generative model tries to fit the distribution over the manifold
structure, and selects informative data of another modality to challenge the
discriminative model. The discriminative model learns to distinguish the
generated data from the true positive data sampled from a correlation graph to
achieve better retrieval accuracy. These two models are trained in an
adversarial way to improve each other and promote hashing function learning.
(2) We propose a correlation graph based approach to capture the underlying
manifold structure across different modalities, so that data of different
modalities but within the same manifold can have smaller Hamming distance and
promote retrieval accuracy. Extensive experiments compared with 6
state-of-the-art methods verify the effectiveness of our proposed approach.
Comment: 8 pages, accepted by the 32nd AAAI Conference on Artificial Intelligence
(AAAI), 201
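The retrieval step that cross-modal hashing relies on, ranking items of one modality by Hamming distance to a query code from another, can be illustrated with a minimal sketch. This is not the UGACH model: the hash codes below are hypothetical hand-written 8-bit values, and the function names `hamming_distance` and `rank_by_hamming` are assumptions introduced only for illustration.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two equal-length binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database: dict[str, int]) -> list[tuple[str, int]]:
    """Return (item_id, distance) pairs sorted by increasing Hamming distance.

    Ties are broken by item id so the ordering is deterministic.
    """
    scored = [(item_id, hamming_distance(query_code, code))
              for item_id, code in database.items()]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Hypothetical 8-bit codes: one text query and three image items that are
# assumed to have been mapped into the same Hamming space.
text_query = 0b10110010
image_codes = {
    "img_a": 0b10110011,  # differs from the query in 1 bit
    "img_b": 0b01001101,  # differs from the query in all 8 bits
    "img_c": 0b10100010,  # differs from the query in 1 bit
}
ranking = rank_by_hamming(text_query, image_codes)
```

Because the comparison is an XOR followed by a bit count, ranking is cheap even for large collections, which is the "fast and flexible retrieval" property the abstract refers to.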
Interactive retrieval of video using pre-computed shot-shot similarities
A probabilistic framework for content-based interactive video retrieval is described. The developed indexing of video fragments originates from the probability of the user's positive judgment about key-frames of video shots. Initial estimates of the probabilities are obtained from a low-level feature representation. Only statistically significant estimates are retained; the rest are replaced by an appropriate constant, which allows efficient access at search time without loss of search quality and leads to improvement in most experiments. Over time, these probability estimates are updated from the relevance judgments of users performing searches, resulting in further substantial increases in mean average precision.
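The idea of keeping only selected shot-to-shot estimates and replacing all others with a constant can be sketched as a sparse lookup table. This is a rough illustration under stated assumptions, not the paper's implementation: the class name `SparseShotIndex`, the default value, and the stored probabilities are all hypothetical, and the significance test that decides which estimates to keep is left to the caller.

```python
class SparseShotIndex:
    """Stores only selected pairwise estimates; every other pair shares one constant."""

    def __init__(self, default_probability: float) -> None:
        # Constant returned for any pair without a stored estimate (assumed value).
        self.default_probability = default_probability
        self._stored: dict[tuple[str, str], float] = {}

    def store(self, query_shot: str, candidate_shot: str, probability: float) -> None:
        """Keep an estimate that the caller has judged worth storing."""
        self._stored[(query_shot, candidate_shot)] = probability

    def lookup(self, query_shot: str, candidate_shot: str) -> float:
        """Return the stored estimate, or the shared constant if none is stored."""
        return self._stored.get((query_shot, candidate_shot), self.default_probability)

    def rank(self, query_shot: str, candidates: list[str]) -> list[str]:
        """Order candidate shots by decreasing estimated probability."""
        return sorted(candidates, key=lambda c: self.lookup(query_shot, c), reverse=True)


# Hypothetical data: two stored estimates for shot "s1"; "s4" falls back to the constant.
index = SparseShotIndex(default_probability=0.1)
index.store("s1", "s2", 0.8)
index.store("s1", "s3", 0.02)
ordered = index.rank("s1", ["s3", "s4", "s2"])
```

Storing only a subset of pairs keeps the index small, and the shared constant means a lookup never fails at search time.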