355 research outputs found

    Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search

    Full text link
    Mobile landmark search (MLS) recently receives increasing attention for its great practical values. However, it still remains unsolved due to two important challenges. One is high bandwidth consumption of query transmission, and the other is the huge visual variations of query images sent from mobile devices. In this paper, we propose a novel hashing scheme, named as canonical view based discrete multi-modal hashing (CV-DMH), to handle these problems via a novel three-stage learning procedure. First, a submodular function is designed to measure visual representativeness and redundancy of a view set. With it, canonical views, which capture key visual appearances of landmark with limited redundancy, are efficiently discovered with an iterative mining strategy. Second, multi-modal sparse coding is applied to transform visual features from multiple modalities into an intermediate representation. It can robustly and adaptively characterize visual contents of varied landmark images with certain canonical views. Finally, compact binary codes are learned on intermediate representation within a tailored discrete binary embedding model which preserves visual relations of images measured with canonical views and removes the involved noises. In this part, we develop a new augmented Lagrangian multiplier (ALM) based optimization method to directly solve the discrete binary codes. We can not only explicitly deal with the discrete constraint, but also consider the bit-uncorrelated constraint and balance constraint together. Experiments on real world landmark datasets demonstrate the superior performance of CV-DMH over several state-of-the-art methods

    Enhanced Discrete Multi-modal Hashing: More Constraints yet Less Time to Learn (Extended Abstract)

    Get PDF
    This paper proposes a novel method, Enhanced Discrete Multi-modal Hashing (EDMH), which learns binary codes and hash functions simultaneously from the pairwise similarity matrix of data for large-scale cross-view retrieval. EDMH distinguishes itself from existing methods by considering not just the binarization constraint but also the balance and decorrelation constraints. Although those additional discrete constraints make the optimization problem of EDMH look a lot more complicated, we are actually able to develop a fast iterative learning algorithm in the alternating optimization framework for it, as after introducing a couple of auxiliary variables each subproblem of optimization turns out to have closed-form solutions. It has been confirmed by extensive experiments that EDMH can consistently deliver better retrieval performances than state-of-the-art MH methods at lower computational costs

    MESH : a flexible manifold-embedded semantic hashing for cross-modal retrieval

    Get PDF
    Hashing based methods for cross-modal retrieval has been widely explored in recent years. However, most of them mainly focus on the preservation of neighborhood relationship and label consistency, while ignore the proximity of neighbors and proximity of classes, which degrades the discrimination of hash codes. And most of them learn hash codes and hashing functions simultaneously, which limits the flexibility of algorithms. To address these issues, in this article, we propose a two-step cross-modal retrieval method named Manifold-Embedded Semantic Hashing (MESH). It exploits Local Linear Embedding to model the neighborhood proximity and uses class semantic embeddings to consider the proximity of classes. By so doing, MESH can not only extract the manifold structure in different modalities, but also can embed the class semantic information into hash codes to further improve the discrimination of learned hash codes. Moreover, the two-step scheme makes MESH flexible to various hashing functions. Extensive experimental results on three datasets show that MESH is superior to 10 state-of-the-art cross-modal hashing methods. Moreover, MESH also demonstrates superiority on deep features compared with the deep cross-modal hashing method. © 2013 IEEE
    • …
    corecore