
    Texture Synthesis Guided Deep Hashing for Texture Image Retrieval

    With the large-scale explosion of images and videos on the internet, efficient hashing methods have been developed to enable memory- and time-efficient retrieval of similar images. However, none of the existing works uses hashing for texture image retrieval, mostly because sufficiently large texture image databases are lacking. Our work addresses this problem by developing a novel deep learning architecture that generates binary hash codes for input texture images. We first pre-train a Texture Synthesis Network (TSN), which takes a texture patch as input and outputs an enlarged view of the texture by injecting new texture content. The TSN therefore encodes the learnt texture-specific information in its intermediate layers. In the next stage, a second network gathers the multi-scale feature representations from the TSN's intermediate layers using channel-wise attention and combines them progressively into a dense continuous representation, which is finally converted into a binary hash code with the help of individual and pairwise label information. The enlarged texture patches also serve as data augmentation, alleviating the shortage of texture data, and are used to train the second stage of the network. Experiments on three public texture image retrieval datasets indicate the superiority of our texture synthesis guided hashing approach over current state-of-the-art methods. Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 2019. Video presentation: https://www.youtube.com/watch?v=tXaXTGhzaJ
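The final step of the abstract above, turning a continuous representation into a binary hash code and retrieving by code similarity, can be illustrated with a minimal sketch. It is not the authors' network: thresholding at zero, the names `binarize`, `hamming` and `retrieve`, and the toy feature vectors are all assumptions chosen for illustration.

```python
def binarize(features):
    """Threshold a continuous feature vector at zero (assumed rule) to get bits."""
    return [1 if x > 0 else 0 for x in features]


def hamming(a, b):
    """Count the positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def retrieve(query, database, k=2):
    """Return the indices of the k database items closest to the query in Hamming distance."""
    qcode = binarize(query)
    dists = [(hamming(qcode, binarize(f)), i) for i, f in enumerate(database)]
    return [i for _, i in sorted(dists)[:k]]


# Toy continuous features standing in for the learned dense representation.
database = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.8, 0.3, -0.5, 0.6],
    [0.7, -0.1, -0.3, -0.9],
]
query = [0.8, -0.4, 0.5, -0.6]
top = retrieve(query, database)
print(top)
```

Comparing short binary codes with Hamming distance is what makes hashing-based retrieval cheap in memory and time compared with comparing the continuous features directly.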

    Dynamic Facial Expression Generation on Hilbert Hypersphere with Conditional Wasserstein Generative Adversarial Nets

    In this work, we propose a novel approach for generating videos of the six basic facial expressions given a neutral face image. We exploit the face geometry by modeling the motion of facial landmarks as curves encoded as points on a hypersphere. By proposing a conditional version of the manifold-valued Wasserstein generative adversarial network (GAN) for motion generation on the hypersphere, we learn the distribution of facial expression dynamics of different classes, from which we synthesize new facial expression motions. The resulting motions can be transformed to sequences of landmarks and then to image sequences by editing the texture information using another conditional GAN. To the best of our knowledge, this is the first work that explores manifold-valued representations with GANs to address the problem of dynamic facial expression generation. We evaluate our approach both quantitatively and qualitatively on two public datasets: Oulu-CASIA and MUG Facial Expression. Our experimental results demonstrate the effectiveness of our approach in generating realistic videos with continuous motion, realistic appearance and identity preservation. We also show the efficiency of our framework for dynamic facial expression generation, dynamic facial expression transfer and data augmentation for training improved emotion recognition models.
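One common way to encode a sampled curve as a point on a unit hypersphere is the square-root velocity function (SRVF) followed by normalization. The abstract does not say which encoding is used, so the sketch below is an assumption for illustration only; the function name `srvf_on_sphere`, the unit time step, and the toy curve are hypothetical.

```python
import math


def srvf_on_sphere(curve):
    """Map a sampled curve (list of points) to a unit-norm vector.

    Assumed encoding: each finite-difference velocity v is replaced by
    v / sqrt(|v|), and the stacked result is scaled to unit L2 norm.
    """
    q = []
    for p0, p1 in zip(curve, curve[1:]):
        v = [b - a for a, b in zip(p0, p1)]
        speed = math.sqrt(sum(c * c for c in v))
        scale = 1.0 / math.sqrt(speed) if speed > 0 else 0.0
        q.extend(c * scale for c in v)
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0:
        raise ValueError("a constant curve has no direction to encode")
    return [c / norm for c in q]


# Toy 2-D trajectory of a single landmark over three frames.
curve = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
point = srvf_on_sphere(curve)
print(point)
```

Because every encoded curve has unit norm, a generator operating on such points has to respect the sphere's geometry, which is the setting a manifold-valued GAN is meant to handle.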

    Motion tubes for the representation of images sequences

    In this paper, we introduce a novel way to represent an image sequence, which naturally exhibits the temporal persistence of textures. Standardized representations have been thoroughly optimized, and significant improvements have become increasingly difficult to obtain. As an alternative, Analysis-Synthesis (AS) coders have focused on the use of texture within a video coder. We introduce here a new AS representation of image sequences that remains close to the classic block-based representation. By tracking textures throughout the sequence, we propose to reconstruct it from a set of moving textures, which we call motion tubes. A new motion model is then proposed, which allows for both continuities and discontinuities in the motion field by hybridizing Block Matching and a low-complexity mesh-based representation. Finally, we propose a bi-predictional framework for managing motion tubes.
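The Block Matching component mentioned in the abstract can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). This shows only plain block matching, not the hybrid mesh-based model or the motion tubes themselves; the function names, the SAD cost, and the toy frames are assumptions for illustration.

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a block of ref and a block of cur."""
    return sum(
        abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
        for j in range(size)
        for i in range(size)
    )


def block_match(ref, cur, cx, cy, size, radius):
    """Find the displacement (dx, dy) into ref that best matches the cur block at (cx, cy)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that would fall outside the reference frame.
            if 0 <= rx <= w - size and 0 <= ry <= h - size:
                cost = sad(ref, cur, rx, ry, cx, cy, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]


def make_frame(x0, y0):
    """Build a 6x6 zero frame containing a 2x2 textured patch at (x0, y0)."""
    frame = [[0] * 6 for _ in range(6)]
    for j, row in enumerate([[5, 6], [7, 8]]):
        for i, val in enumerate(row):
            frame[y0 + j][x0 + i] = val
    return frame


ref = make_frame(1, 1)  # patch position in the reference frame
cur = make_frame(2, 3)  # same patch after moving by (+1, +2)
motion = block_match(ref, cur, cx=2, cy=3, size=2, radius=2)
print(motion)
```

Following the same patch across successive frames with such displacements is the basic idea behind tracking a texture through the sequence.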