1,690 research outputs found
Zero-Shot Hashing via Transferring Supervised Knowledge
Hashing has shown its efficiency and effectiveness in facilitating
large-scale multimedia applications. Supervised knowledge e.g. semantic labels
or pair-wise relationship) associated to data is capable of significantly
improving the quality of hash codes and hash functions. However, confronted
with the rapid growth of newly-emerging concepts and multimedia data on the
Web, existing supervised hashing approaches may easily suffer from the scarcity
and validity of supervised information due to the expensive cost of manual
labelling. In this paper, we propose a novel hashing scheme, termed
\emph{zero-shot hashing} (ZSH), which compresses images of "unseen" categories
to binary codes with hash functions learned from limited training data of
"seen" categories. Specifically, we project independent data labels i.e.
0/1-form label vectors) into semantic embedding space, where semantic
relationships among all the labels can be precisely characterized and thus seen
supervised knowledge can be transferred to unseen classes. Moreover, in order
to cope with the semantic shift problem, we rotate the embedded space to more
suitably align the embedded semantics with the low-level visual feature space,
thereby alleviating the influence of semantic gap. In the meantime, to exert
positive effects on learning high-quality hash functions, we further propose to
preserve local structural property and discrete nature in binary codes.
Besides, we develop an efficient alternating algorithm to solve the ZSH model.
Extensive experiments conducted on various real-life datasets show the superior
zero-shot image retrieval performance of ZSH as compared to several
state-of-the-art hashing methods.Comment: 11 page
LiveSketch: Query Perturbations for Guided Sketch-based Visual Search
LiveSketch is a novel algorithm for searching large image collections using
hand-sketched queries. LiveSketch tackles the inherent ambiguity of sketch
search by creating visual suggestions that augment the query as it is drawn,
making query specification an iterative rather than one-shot process that helps
disambiguate users' search intent. Our technical contributions are: a triplet
convnet architecture that incorporates an RNN based variational autoencoder to
search for images using vector (stroke-based) queries; real-time clustering to
identify likely search intents (and so, targets within the search embedding);
and the use of backpropagation from those targets to perturb the input stroke
sequence, so suggesting alterations to the query in order to guide the search.
We show improvements in accuracy and time-to-task over contemporary baselines
using a 67M image corpus.Comment: Accepted to CVPR 201
Structure fusion based on graph convolutional networks for semi-supervised classification
Suffering from the multi-view data diversity and complexity for
semi-supervised classification, most of existing graph convolutional networks
focus on the networks architecture construction or the salient graph structure
preservation, and ignore the the complete graph structure for semi-supervised
classification contribution. To mine the more complete distribution structure
from multi-view data with the consideration of the specificity and the
commonality, we propose structure fusion based on graph convolutional networks
(SF-GCN) for improving the performance of semi-supervised classification.
SF-GCN can not only retain the special characteristic of each view data by
spectral embedding, but also capture the common style of multi-view data by
distance metric between multi-graph structures. Suppose the linear relationship
between multi-graph structures, we can construct the optimization function of
structure fusion model by balancing the specificity loss and the commonality
loss. By solving this function, we can simultaneously obtain the fusion
spectral embedding from the multi-view data and the fusion structure as
adjacent matrix to input graph convolutional networks for semi-supervised
classification. Experiments demonstrate that the performance of SF-GCN
outperforms that of the state of the arts on three challenging datasets, which
are Cora,Citeseer and Pubmed in citation networks
Deep Lesion Graphs in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database
Radiologists in their daily work routinely find and annotate significant
abnormalities on a large number of radiology images. Such abnormalities, or
lesions, have collected over years and stored in hospitals' picture archiving
and communication systems. However, they are basically unsorted and lack
semantic annotations like type and location. In this paper, we aim to organize
and explore them by learning a deep feature representation for each lesion. A
large-scale and comprehensive dataset, DeepLesion, is introduced for this task.
DeepLesion contains bounding boxes and size measurements of over 32K lesions.
To model their similarity relationship, we leverage multiple supervision
information including types, self-supervised location coordinates and sizes.
They require little manual annotation effort but describe useful attributes of
the lesions. Then, a triplet network is utilized to learn lesion embeddings
with a sequential sampling strategy to depict their hierarchical similarity
structure. Experiments show promising qualitative and quantitative results on
lesion retrieval, clustering, and classification. The learned embeddings can be
further employed to build a lesion graph for various clinically useful
applications. We propose algorithms for intra-patient lesion matching and
missing annotation mining. Experimental results validate their effectiveness.Comment: Accepted by CVPR2018. DeepLesion url adde
An Efficient Approximate kNN Graph Method for Diffusion on Image Retrieval
The application of the diffusion in many computer vision and artificial
intelligence projects has been shown to give excellent improvements in
performance. One of the main bottlenecks of this technique is the quadratic
growth of the kNN graph size due to the high-quantity of new connections
between nodes in the graph, resulting in long computation times. Several
strategies have been proposed to address this, but none are effective and
efficient. Our novel technique, based on LSH projections, obtains the same
performance as the exact kNN graph after diffusion, but in less time
(approximately 18 times faster on a dataset of a hundred thousand images). The
proposed method was validated and compared with other state-of-the-art on
several public image datasets, including Oxford5k, Paris6k, and Oxford105k
- …