223 research outputs found
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
Zero-Shot Sketch-Image Hashing
Recent studies show that large-scale sketch-based image retrieval (SBIR) can
be efficiently tackled by cross-modal binary representation learning methods,
where Hamming distance matching significantly speeds up the process of
similarity search. Providing training and test data subjected to a fixed set of
pre-defined categories, the cutting-edge SBIR and cross-modal hashing works
obtain acceptable retrieval performance. However, most of the existing methods
fail when the categories of query sketches have never been seen during
training. In this paper, the above problem is briefed as a novel but realistic
zero-shot SBIR hashing task. We elaborate the challenges of this special task
and accordingly propose a zero-shot sketch-image hashing (ZSIH) model. An
end-to-end three-network architecture is built, two of which are treated as the
binary encoders. The third network mitigates the sketch-image heterogeneity and
enhances the semantic relations among data by utilizing the Kronecker fusion
layer and graph convolution, respectively. As an important part of ZSIH, we
formulate a generative hashing scheme in reconstructing semantic knowledge
representations for zero-shot retrieval. To the best of our knowledge, ZSIH is
the first zero-shot hashing work suitable for SBIR and cross-modal search.
Comprehensive experiments are conducted on two extended datasets, i.e., Sketchy
and TU-Berlin with a novel zero-shot train-test split. The proposed model
remarkably outperforms related works.Comment: Accepted as spotlight at CVPR 201
Texture Synthesis Guided Deep Hashing for Texture Image Retrieval
With the large-scale explosion of images and videos over the internet,
efficient hashing methods have been developed to facilitate memory and time
efficient retrieval of similar images. However, none of the existing works uses
hashing to address texture image retrieval mostly because of the lack of
sufficiently large texture image databases. Our work addresses this problem by
developing a novel deep learning architecture that generates binary hash codes
for input texture images. For this, we first pre-train a Texture Synthesis
Network (TSN) which takes a texture patch as input and outputs an enlarged view
of the texture by injecting newer texture content. Thus it signifies that the
TSN encodes the learnt texture specific information in its intermediate layers.
In the next stage, a second network gathers the multi-scale feature
representations from the TSN's intermediate layers using channel-wise
attention, combines them in a progressive manner to a dense continuous
representation which is finally converted into a binary hash code with the help
of individual and pairwise label information. The new enlarged texture patches
also help in data augmentation to alleviate the problem of insufficient texture
data and are used to train the second stage of the network. Experiments on
three public texture image retrieval datasets indicate the superiority of our
texture synthesis guided hashing approach over current state-of-the-art
methods.Comment: IEEE Winter Conference on Applications of Computer Vision (WACV),
2019 Video Presentation: https://www.youtube.com/watch?v=tXaXTGhzaJ
Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval
This is the author accepted manuscript. The final version is available from IEEE via the DOI in this recordZero-shot sketch-based image retrieval (SBIR) is an emerging task in computer vision, allowing to retrieve natural images relevant to sketch queries that might not been seen in the training phase. Existing works either require aligned sketch-image pairs or inefficient memory fusion layer for mapping the visual information to a semantic space. In this work, we propose a semantically aligned paired cycle-consistent generative (SEM-PCYC) model for zero-shot SBIR, where each branch maps the visual information to a common semantic space via an adversarial training. Each of these branches maintains a cycle consistency that only requires supervision at category levels, and avoids the need of highly-priced aligned sketch-image pairs. A classification criteria on the generators' outputs ensures the visual to semantic space mapping to be discriminating. Furthermore, we propose to combine textual and hierarchical side information via a feature selection auto-encoder that selects discriminating side information within a same end-to-end model. Our results demonstrate a significant boost in zero-shot SBIR performance over the state-of-the-art on the challenging Sketchy and TU-Berlin datasets.European Union Horizon 202
- …