355 research outputs found
Weakly-Supervised Online Hashing
With the rapid development of social websites, recent years have witnessed an
explosive growth of social images with user-provided tags which continuously
arrive in a streaming fashion. Due to the fast query speed and low storage
cost, hashing-based methods for image search have attracted increasing
attention. However, existing hashing methods for social image retrieval are
based on batch mode which violates the nature of social images, i.e., social
images are usually generated periodically or collected in a stream fashion.
Although there exist many online image hashing methods, they either adopt
unsupervised learning which ignore the relevant tags, or are designed in the
supervised manner which needs high-quality labels. In this paper, to overcome
the above limitations, we propose a new method named Weakly-supervised Online
Hashing (WOH). In order to learn high-quality hash codes, WOH exploits the weak
supervision by considering the semantics of tags and removing the noise.
Besides, We develop a discrete online optimization algorithm for WOH, which is
efficient and scalable. Extensive experiments conducted on two real-world
datasets demonstrate the superiority of WOH compared with several
state-of-the-art hashing baselines.Comment: Accepted by ICME 202
Dual-level Semantic Transfer Deep Hashing for Efficient Social Image Retrieval
Social network stores and disseminates a tremendous amount of user shared
images. Deep hashing is an efficient indexing technique to support large-scale
social image retrieval, due to its deep representation capability, fast
retrieval speed and low storage cost. Particularly, unsupervised deep hashing
has well scalability as it does not require any manually labelled data for
training. However, owing to the lacking of label guidance, existing methods
suffer from severe semantic shortage when optimizing a large amount of deep
neural network parameters. Differently, in this paper, we propose a Dual-level
Semantic Transfer Deep Hashing (DSTDH) method to alleviate this problem with a
unified deep hash learning framework. Our model targets at learning the
semantically enhanced deep hash codes by specially exploiting the
user-generated tags associated with the social images. Specifically, we design
a complementary dual-level semantic transfer mechanism to efficiently discover
the potential semantics of tags and seamlessly transfer them into binary hash
codes. On the one hand, instance-level semantics are directly preserved into
hash codes from the associated tags with adverse noise removing. Besides, an
image-concept hypergraph is constructed for indirectly transferring the latent
high-order semantic correlations of images and tags into hash codes. Moreover,
the hash codes are obtained simultaneously with the deep representation
learning by the discrete hash optimization strategy. Extensive experiments on
two public social image retrieval datasets validate the superior performance of
our method compared with state-of-the-art hashing methods. The source codes of
our method can be obtained at https://github.com/research2020-1/DSTDHComment: Accepted by IEEE TCSV
A Decade Survey of Content Based Image Retrieval using Deep Learning
The content based image retrieval aims to find the similar images from a
large scale dataset against a query image. Generally, the similarity between
the representative features of the query image and dataset images is used to
rank the images for retrieval. In early days, various hand designed feature
descriptors have been investigated based on the visual cues such as color,
texture, shape, etc. that represent the images. However, the deep learning has
emerged as a dominating alternative of hand-designed feature engineering from a
decade. It learns the features automatically from the data. This paper presents
a comprehensive survey of deep learning based developments in the past decade
for content based image retrieval. The categorization of existing
state-of-the-art methods from different perspectives is also performed for
greater understanding of the progress. The taxonomy used in this survey covers
different supervision, different networks, different descriptor type and
different retrieval type. A performance analysis is also performed using the
state-of-the-art methods. The insights are also presented for the benefit of
the researchers to observe the progress and to make the best choices. The
survey presented in this paper will help in further research progress in image
retrieval using deep learning
A Survey on Learning to Hash
Nearest neighbor search is a problem of finding the data points from the
database such that the distances from them to the query point are the smallest.
Learning to hash is one of the major solutions to this problem and has been
widely studied recently. In this paper, we present a comprehensive survey of
the learning to hash algorithms, categorize them according to the manners of
preserving the similarities into: pairwise similarity preserving, multiwise
similarity preserving, implicit similarity preserving, as well as quantization,
and discuss their relations. We separate quantization from pairwise similarity
preserving as the objective function is very different though quantization, as
we show, can be derived from preserving the pairwise similarities. In addition,
we present the evaluation protocols, and the general performance analysis, and
point out that the quantization algorithms perform superiorly in terms of
search accuracy, search time cost, and space cost. Finally, we introduce a few
emerging topics.Comment: To appear in IEEE Transactions On Pattern Analysis and Machine
Intelligence (TPAMI
Social Anchor-Unit Graph Regularized Tensor Completion for Large-Scale Image Retagging
Image retagging aims to improve tag quality of social images by refining
their original tags or assigning new high-quality tags. Recent approaches
simultaneously explore visual, user and tag information to improve the
performance of image retagging by constructing and exploring an image-tag-user
graph. However, such methods will become computationally infeasible with the
rapidly increasing number of images, tags and users. It has been proven that
Anchor Graph Regularization (AGR) can significantly accelerate large-scale
graph learning model by exploring only a small number of anchor points.
Inspired by this, we propose a novel Social anchor-Unit GrAph Regularized
Tensor Completion (SUGAR-TC) method to effectively refine the tags of social
images, which is insensitive to the scale of the applied data. First, we
construct an anchor-unit graph across multiple domains (e.g., image and user
domains) rather than traditional anchor graph in a single domain. Second, a
tensor completion based on SUGAR is implemented on the original image-tag-user
tensor to refine the tags of the anchor images. Third, we efficiently assign
tags to non-anchor images by leveraging the relationship between the non-anchor
images and the anchor units. Experimental results on a real-world social image
database well demonstrate the effectiveness of SUGAR-TC, outperforming several
related methods
Learning to Hash for Indexing Big Data - A Survey
The explosive growth in big data has attracted much attention in designing
efficient indexing and search methods recently. In many critical applications
such as large-scale search and pattern matching, finding the nearest neighbors
to a query is a fundamental research problem. However, the straightforward
solution using exhaustive comparison is infeasible due to the prohibitive
computational complexity and memory requirement. In response, Approximate
Nearest Neighbor (ANN) search based on hashing techniques has become popular
due to its promising performance in both efficiency and accuracy. Prior
randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore
data-independent hash functions with random projections or permutations.
Although having elegant theoretic guarantees on the search quality in certain
metric spaces, performance of randomized hashing has been shown insufficient in
many real-world applications. As a remedy, new approaches incorporating
data-driven learning methods in development of advanced hash functions have
emerged. Such learning to hash methods exploit information such as data
distributions or class labels when optimizing the hash codes or functions.
Importantly, the learned hash codes are able to preserve the proximity of
neighboring data in the original feature spaces in the hash code spaces. The
goal of this paper is to provide readers with systematic understanding of
insights, pros and cons of the emerging techniques. We provide a comprehensive
survey of the learning to hash framework and representative techniques of
various types, including unsupervised, semi-supervised, and supervised. In
addition, we also summarize recent hashing approaches utilizing the deep
learning models. Finally, we discuss the future direction and trends of
research in this area
Learning to Learn from Web Data through Deep Semantic Embeddings
In this paper we propose to learn a multimodal image and text embedding from
Web and Social Media data, aiming to leverage the semantic knowledge learnt in
the text domain and transfer it to a visual model for semantic image retrieval.
We demonstrate that the pipeline can learn from images with associated text
without supervision and perform a thourough analysis of five different text
embeddings in three different benchmarks. We show that the embeddings learnt
with Web and Social Media data have competitive performances over supervised
methods in the text based image retrieval task, and we clearly outperform state
of the art in the MIRFlickr dataset when training in the target data. Further
we demonstrate how semantic multimodal image retrieval can be performed using
the learnt embeddings, going beyond classical instance-level retrieval
problems. Finally, we present a new dataset, InstaCities1M, composed by
Instagram images and their associated texts that can be used for fair
comparison of image-text embeddings.Comment: ECCV MULA Workshop 201
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading the ALL 602 conference papers
presented at the CVPR2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we are proposing
"DeepSurvey" as a mechanism embodying the entire process from the reading
through all the papers, the generation of ideas, and to the writing of paper.Comment: Survey Pape
Shared Predictive Cross-Modal Deep Quantization
With explosive growth of data volume and ever-increasing diversity of data
modalities, cross-modal similarity search, which conducts nearest neighbor
search across different modalities, has been attracting increasing interest.
This paper presents a deep compact code learning solution for efficient
cross-modal similarity search. Many recent studies have proven that
quantization-based approaches perform generally better than hashing-based
approaches on single-modal similarity search. In this paper, we propose a deep
quantization approach, which is among the early attempts of leveraging deep
neural networks into quantization-based cross-modal similarity search. Our
approach, dubbed shared predictive deep quantization (SPDQ), explicitly
formulates a shared subspace across different modalities and two private
subspaces for individual modalities, and representations in the shared subspace
and the private subspaces are learned simultaneously by embedding them to a
reproducing kernel Hilbert space, where the mean embedding of different
modality distributions can be explicitly compared. In addition, in the shared
subspace, a quantizer is learned to produce the semantics preserving compact
codes with the help of label alignment. Thanks to this novel network
architecture in cooperation with supervised quantization training, SPDQ can
preserve intramodal and intermodal similarities as much as possible and greatly
reduce quantization error. Experiments on two popular benchmarks corroborate
that our approach outperforms state-of-the-art methods
Weakly Supervised Deep Image Hashing through Tag Embeddings
Many approaches to semantic image hashing have been formulated as supervised
learning problems that utilize images and label information to learn the binary
hash codes. However, large-scale labeled image data is expensive to obtain,
thus imposing a restriction on the usage of such algorithms. On the other hand,
unlabelled image data is abundant due to the existence of many Web image
repositories. Such Web images may often come with images tags that contain
useful information, although raw tags, in general, do not readily lead to
semantic labels. Motivated by this scenario, we formulate the problem of
semantic image hashing as a weakly-supervised learning problem. We utilize the
information contained in the user-generated tags associated with the images to
learn the hash codes. More specifically, we extract the word2vec semantic
embeddings of the tags and use the information contained in them for
constraining the learning. Accordingly, we name our model Weakly Supervised
Deep Hashing using Tag Embeddings (WDHT). WDHT is tested for the task of
semantic image retrieval and is compared against several state-of-art models.
Results show that our approach sets a new state-of-art in the area of weekly
supervised image hashing.Comment: 11 pages, 4 figure
- …