638 research outputs found
Texture Synthesis Guided Deep Hashing for Texture Image Retrieval
With the large-scale explosion of images and videos over the internet,
efficient hashing methods have been developed to facilitate memory and time
efficient retrieval of similar images. However, none of the existing works uses
hashing to address texture image retrieval mostly because of the lack of
sufficiently large texture image databases. Our work addresses this problem by
developing a novel deep learning architecture that generates binary hash codes
for input texture images. For this, we first pre-train a Texture Synthesis
Network (TSN) which takes a texture patch as input and outputs an enlarged view
of the texture by injecting newer texture content. Thus it signifies that the
TSN encodes the learnt texture specific information in its intermediate layers.
In the next stage, a second network gathers the multi-scale feature
representations from the TSN's intermediate layers using channel-wise
attention, combines them in a progressive manner to a dense continuous
representation which is finally converted into a binary hash code with the help
of individual and pairwise label information. The new enlarged texture patches
also help in data augmentation to alleviate the problem of insufficient texture
data and are used to train the second stage of the network. Experiments on
three public texture image retrieval datasets indicate the superiority of our
texture synthesis guided hashing approach over current state-of-the-art
methods.Comment: IEEE Winter Conference on Applications of Computer Vision (WACV),
2019 Video Presentation: https://www.youtube.com/watch?v=tXaXTGhzaJ
Progressive Domain-Independent Feature Decomposition Network for Zero-Shot Sketch-Based Image Retrieval
Zero-shot sketch-based image retrieval (ZS-SBIR) is a specific cross-modal
retrieval task for searching natural images given free-hand sketches under the
zero-shot scenario. Most existing methods solve this problem by simultaneously
projecting visual features and semantic supervision into a low-dimensional
common space for efficient retrieval. However, such low-dimensional projection
destroys the completeness of semantic knowledge in original semantic space, so
that it is unable to transfer useful knowledge well when learning semantic from
different modalities. Moreover, the domain information and semantic information
are entangled in visual features, which is not conducive for cross-modal
matching since it will hinder the reduction of domain gap between sketch and
image. In this paper, we propose a Progressive Domain-independent Feature
Decomposition (PDFD) network for ZS-SBIR. Specifically, with the supervision of
original semantic knowledge, PDFD decomposes visual features into domain
features and semantic ones, and then the semantic features are projected into
common space as retrieval features for ZS-SBIR. The progressive projection
strategy maintains strong semantic supervision. Besides, to guarantee the
retrieval features to capture clean and complete semantic information, the
cross-reconstruction loss is introduced to encourage that any combinations of
retrieval features and domain features can reconstruct the visual features.
Extensive experiments demonstrate the superiority of our PDFD over
state-of-the-art competitors
Novel hybrid generative adversarial network for synthesizing image from sketch
In the area of sketch-based image retrieval process, there is a potential difference between retrieving the match images from defined dataset and constructing the synthesized image. The former process is quite easier while the latter process requires more faster, accurate, and intellectual decision making by the processor. After reviewing open-end research problems from existing approaches, the proposed scheme introduces a computational framework of hybrid generative adversarial network (GAN) as a solution to address the identified research problem. The model takes the input of query image which is processed by generator module running 3 different deep learning modes of ResNet, MobileNet, and U-Net. The discriminator module processes the input of real images as well as output from generator. With a novel interactive communication between generator and discriminator, the proposed model offers optimal retrieval performance along with an inclusion of optimizer. The study outcome shows significant performance improvement
- …