
    Texture Synthesis Guided Deep Hashing for Texture Image Retrieval

    With the large-scale explosion of images and videos over the internet, efficient hashing methods have been developed to facilitate memory- and time-efficient retrieval of similar images. However, none of the existing works uses hashing to address texture image retrieval, mostly because of the lack of sufficiently large texture image databases. Our work addresses this problem by developing a novel deep learning architecture that generates binary hash codes for input texture images. For this, we first pre-train a Texture Synthesis Network (TSN), which takes a texture patch as input and outputs an enlarged view of the texture by injecting newer texture content; this indicates that the TSN encodes the learnt texture-specific information in its intermediate layers. In the next stage, a second network gathers the multi-scale feature representations from the TSN's intermediate layers using channel-wise attention and combines them progressively into a dense continuous representation, which is finally converted into a binary hash code with the help of individual and pairwise label information. The new enlarged texture patches also serve as data augmentation, alleviating the problem of insufficient texture data, and are used to train the second stage of the network. Experiments on three public texture image retrieval datasets indicate the superiority of our texture synthesis guided hashing approach over current state-of-the-art methods.
    Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 2019. Video presentation: https://www.youtube.com/watch?v=tXaXTGhzaJ
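    To make the second-stage design more concrete, below is a minimal PyTorch sketch of gathering multi-scale TSN feature maps with channel-wise attention and binarizing the fused representation into a hash code. The class names, dimensions, squeeze-and-excitation-style attention, concatenation-based fusion (standing in for the paper's progressive combination), and sign-based binarization are all illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the second-stage hashing network: channel attention over
# intermediate TSN feature maps, fusion into a dense code, then binarization.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel-wise attention over one feature map."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x).unsqueeze(-1).unsqueeze(-1)  # per-channel weights in [0, 1]
        return x * w

class HashingHead(nn.Module):
    """Fuses multi-scale TSN features into a dense code, then binarizes it."""
    def __init__(self, channel_list, hash_bits=64):
        super().__init__()
        self.attn = nn.ModuleList(ChannelAttention(c) for c in channel_list)
        self.proj = nn.Linear(sum(channel_list), hash_bits)

    def forward(self, tsn_feats):  # list of (B, C_i, H_i, W_i) intermediate maps
        pooled = [self.attn[i](f).mean(dim=(2, 3)) for i, f in enumerate(tsn_feats)]
        dense = torch.tanh(self.proj(torch.cat(pooled, dim=1)))  # continuous code
        return torch.sign(dense)  # binary hash code used at retrieval time

# Usage with two dummy intermediate feature maps of different scales:
head = HashingHead([64, 128])
codes = head([torch.randn(2, 64, 32, 32), torch.randn(2, 128, 16, 16)])
```

    In training, the dense `tanh` output would be supervised with the individual and pairwise label losses the abstract mentions; the hard `sign` is typically applied only at retrieval time.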

    Progressive Domain-Independent Feature Decomposition Network for Zero-Shot Sketch-Based Image Retrieval

    Zero-shot sketch-based image retrieval (ZS-SBIR) is a specific cross-modal retrieval task for searching natural images given free-hand sketches under the zero-shot scenario. Most existing methods solve this problem by simultaneously projecting visual features and semantic supervision into a low-dimensional common space for efficient retrieval. However, such low-dimensional projection destroys the completeness of semantic knowledge in the original semantic space, so useful knowledge cannot be transferred well when learning semantics from different modalities. Moreover, domain information and semantic information are entangled in the visual features, which is not conducive to cross-modal matching since it hinders the reduction of the domain gap between sketch and image. In this paper, we propose a Progressive Domain-independent Feature Decomposition (PDFD) network for ZS-SBIR. Specifically, with the supervision of the original semantic knowledge, PDFD decomposes visual features into domain features and semantic ones, and the semantic features are then projected into the common space as retrieval features for ZS-SBIR. The progressive projection strategy maintains strong semantic supervision. Besides, to guarantee that the retrieval features capture clean and complete semantic information, a cross-reconstruction loss is introduced to encourage any combination of retrieval features and domain features to reconstruct the visual features. Extensive experiments demonstrate the superiority of our PDFD over state-of-the-art competitors.
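    The cross-reconstruction idea can be sketched compactly: decompose each visual feature into a semantic part and a domain part, then require every pairing of semantic and domain features across the sketch and image modalities to reconstruct the visual feature that matches the domain part. The linear encoders/decoder, the feature dimensions, and the L2 reconstruction penalty below are assumptions for illustration, not the paper's exact architecture.

```python
# Hedged sketch of feature decomposition with a cross-reconstruction loss.
import torch
import torch.nn as nn

class PDFDSketch(nn.Module):
    def __init__(self, feat_dim=512, sem_dim=300, dom_dim=64):
        super().__init__()
        self.sem_enc = nn.Linear(feat_dim, sem_dim)   # semantic (retrieval) features
        self.dom_enc = nn.Linear(feat_dim, dom_dim)   # domain features
        self.decoder = nn.Linear(sem_dim + dom_dim, feat_dim)

    def cross_reconstruction_loss(self, v_sketch, v_image):
        s_sk, d_sk = self.sem_enc(v_sketch), self.dom_enc(v_sketch)
        s_im, d_im = self.sem_enc(v_image), self.dom_enc(v_image)
        loss = 0.0
        # Any semantic + domain combination should rebuild the visual feature
        # belonging to the *domain* part of the pair, pushing domain-specific
        # information out of the semantic (retrieval) features.
        for sem, dom, target in [(s_sk, d_sk, v_sketch), (s_im, d_sk, v_sketch),
                                 (s_im, d_im, v_image), (s_sk, d_im, v_image)]:
            recon = self.decoder(torch.cat([sem, dom], dim=1))
            loss = loss + (recon - target).pow(2).mean()
        return loss

# Usage with dummy paired sketch/image features:
model = PDFDSketch()
loss = model.cross_reconstruction_loss(torch.randn(8, 512), torch.randn(8, 512))
```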

    Novel hybrid generative adversarial network for synthesizing image from sketch

    In sketch-based image retrieval, there is a clear difference between retrieving matching images from a defined dataset and constructing a synthesized image: the former is considerably easier, while the latter demands faster, more accurate, and more intelligent decision making by the processor. After reviewing open research problems in existing approaches, the proposed scheme introduces a computational framework of a hybrid generative adversarial network (GAN) as a solution to the identified research problem. The model takes a query image as input, which is processed by a generator module running three different deep learning models: ResNet, MobileNet, and U-Net. The discriminator module processes real images as well as the output from the generator. Through a novel interactive communication between the generator and discriminator, and with the inclusion of an optimizer, the proposed model delivers strong retrieval performance. The study outcome shows significant performance improvement.
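    A minimal sketch of such a hybrid GAN training loop is given below. The three generator branches are small conv-stack stand-ins for the ResNet, MobileNet, and U-Net models named in the abstract, and the fusion-by-averaging, PatchGAN-style discriminator, and BCE adversarial loss are illustrative assumptions rather than the paper's specification.

```python
# Hedged sketch of a hybrid GAN: a generator with three parallel branches
# whose outputs are fused, trained adversarially against a discriminator.
import torch
import torch.nn as nn

def branch():  # placeholder for one backbone (ResNet / MobileNet / U-Net)
    return nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())

class HybridGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.branches = nn.ModuleList(branch() for _ in range(3))

    def forward(self, sketch):
        # Run all three branches on the sketch and fuse by averaging.
        return torch.stack([b(sketch) for b in self.branches]).mean(dim=0)

# PatchGAN-style discriminator producing a grid of real/fake logits.
D = nn.Sequential(nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                  nn.Conv2d(16, 1, 4, stride=2, padding=1))
G = HybridGenerator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(sketch, real_image):
    fake = G(sketch)
    # Discriminator: real images should score 1, generated images 0.
    d_real, d_fake = D(real_image), D(fake.detach())
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the discriminator into scoring fakes as real.
    g_logits = D(fake)
    g_loss = bce(g_logits, torch.ones_like(g_logits))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```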