3,616 research outputs found

    Dynamic Multi-view Hashing for Online Image Retrieval

    Advanced hashing techniques are essential for effective large-scale online image organization and retrieval, where image contents can change frequently. Traditional multi-view hashing methods are based on batch learning, which makes updates very expensive. Meanwhile, existing online hashing methods mainly focus on single-view data and thus cannot achieve promising performance when searching real online images, which are multi-view data. Further, both types of hashing methods can only produce hash codes of fixed length. Consequently, they have limited capability to comprehensively characterize streaming image data in the real world. In this paper, we propose dynamic multi-view hashing (DMVH), which can adaptively augment hash codes according to dynamic changes of images. DMVH leverages online learning to generate hash codes, and it can increase the code length when the current code is not able to represent new images effectively. Moreover, to further improve overall performance, each view is assigned a weight, which can be efficiently updated during the online learning process. To avoid frequent updating of code length and view weights, an intelligent buffering scheme is also designed to preserve significant data and maintain the effectiveness of DMVH. Experimental results on two real-world image datasets demonstrate superior performance of DMVH over several state-of-the-art hashing methods.

    Online Product Quantization

    Approximate nearest neighbor (ANN) search has achieved great success in many tasks. However, existing popular methods for ANN search, such as hashing and quantization methods, are designed for static databases only. They cannot adequately handle a database whose data distribution evolves dynamically, due to the high computational effort of retraining the model on the new database. In this paper, we address the problem by developing an online product quantization (online PQ) model and incrementally updating the quantization codebook to accommodate the incoming streaming data. Moreover, to further alleviate the issue of large-scale computation for the online PQ update, we design two budget constraints under which the model updates only part of the PQ codebook instead of all of it. We derive a loss bound which guarantees the performance of our online PQ model. Furthermore, we develop an online PQ model over a sliding window with both data insertion and deletion supported, to reflect the real-time behaviour of the data. The experiments demonstrate that our online PQ model is both time-efficient and effective for ANN search in dynamic large-scale databases compared with baseline methods, and that the idea of partial PQ codebook update further reduces the update cost. Comment: To appear in IEEE Transactions on Knowledge and Data Engineering (DOI: 10.1109/TKDE.2018.2817526)
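
    The idea of incrementally updating a product quantization codebook from streaming data can be illustrated with a minimal sketch. This is not the update rule, budget constraint, or sliding-window scheme of the paper: it assumes a simple running-mean (sequential k-means style) update of the nearest codeword in each subspace, and the names `OnlinePQSketch`, `split`, and `sq_dist` are hypothetical.

```python
def split(vector, num_subspaces):
    """Cut a vector into equally sized sub-vectors, one per subspace."""
    size = len(vector) // num_subspaces
    return [vector[i * size:(i + 1) * size] for i in range(num_subspaces)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class OnlinePQSketch:
    """Illustrative only: running-mean codeword updates, an assumed rule."""

    def __init__(self, codebooks):
        # codebooks[m][k] is the k-th codeword of subspace m.
        self.codebooks = [[list(word) for word in book] for book in codebooks]
        # Each initial codeword is treated as one prior observation.
        self.counts = [[1] * len(book) for book in codebooks]

    def encode(self, vector):
        """Return the index of the nearest codeword in every subspace."""
        parts = split(vector, len(self.codebooks))
        return [
            min(range(len(book)), key=lambda k: sq_dist(part, book[k]))
            for part, book in zip(parts, self.codebooks)
        ]

    def update(self, vector):
        """Encode a streaming vector, then move only the codewords it hit."""
        code = self.encode(vector)
        parts = split(vector, len(self.codebooks))
        for m, (k, part) in enumerate(zip(code, parts)):
            self.counts[m][k] += 1
            rate = 1.0 / self.counts[m][k]
            word = self.codebooks[m][k]
            for d in range(len(word)):
                word[d] += rate * (part[d] - word[d])
        return code


pq = OnlinePQSketch([
    [[0.0, 0.0], [1.0, 1.0]],  # subspace 0
    [[0.0, 0.0], [1.0, 1.0]],  # subspace 1
])
code = pq.update([0.2, 0.2, 1.0, 1.0])
```

    Only the codewords selected by the incoming vector are touched, so the cost of one update is independent of the database size; codewords that were not selected stay unchanged.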

    Zero-Shot Hashing via Transferring Supervised Knowledge

    Hashing has shown its efficiency and effectiveness in facilitating large-scale multimedia applications. Supervised knowledge (e.g., semantic labels or pair-wise relationships) associated with data is capable of significantly improving the quality of hash codes and hash functions. However, confronted with the rapid growth of newly emerging concepts and multimedia data on the Web, existing supervised hashing approaches may easily suffer from the scarcity and questionable validity of supervised information due to the expensive cost of manual labelling. In this paper, we propose a novel hashing scheme, termed zero-shot hashing (ZSH), which compresses images of "unseen" categories to binary codes with hash functions learned from limited training data of "seen" categories. Specifically, we project independent data labels (i.e., 0/1-form label vectors) into a semantic embedding space, where semantic relationships among all the labels can be precisely characterized and thus seen supervised knowledge can be transferred to unseen classes. Moreover, in order to cope with the semantic shift problem, we rotate the embedded space to more suitably align the embedded semantics with the low-level visual feature space, thereby alleviating the influence of the semantic gap. In the meantime, to exert positive effects on learning high-quality hash functions, we further propose to preserve the local structural property and discrete nature of binary codes. Besides, we develop an efficient alternating algorithm to solve the ZSH model. Extensive experiments conducted on various real-life datasets show the superior zero-shot image retrieval performance of ZSH as compared to several state-of-the-art hashing methods. Comment: 11 pages

    Hashing for Similarity Search: A Survey

    Similarity search (nearest neighbor search) is the problem of finding, in a large database, the data items whose distances to a query item are the smallest. Various methods have been developed to address this problem, and recently a lot of effort has been devoted to approximate search. In this paper, we present a survey on one of the main solutions, hashing, which has been widely studied since the pioneering work on locality sensitive hashing. We divide the hashing algorithms into two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution, and learning to hash, which learns hash functions according to the data distribution. We review them from various aspects, including hash function design, distance measure, and search scheme in the hash coding space.
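
    As a concrete instance of the first category, the sketch below shows random-hyperplane hashing, one well-known data-independent locality sensitive hashing scheme for angular similarity. It is an illustration rather than code from the survey; the function names, the 16-bit code length, and the toy vectors are assumptions chosen for the example.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane normal per hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """Each bit records on which side of a hyperplane the vector lies."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


planes = make_hyperplanes(num_bits=16, dim=3)
query = [1.0, 0.2, 0.0]
near = [0.9, 0.25, 0.05]    # small angle to the query
far = [-1.0, -0.2, 0.0]     # points in the opposite direction

d_same = hamming(hash_code(query, planes), hash_code(query, planes))
d_near = hamming(hash_code(query, planes), hash_code(near, planes))
d_far = hamming(hash_code(query, planes), hash_code(far, planes))
```

    The hyperplanes are drawn without looking at the data, which is what distinguishes this family from learning to hash. Vectors separated by a small angle disagree on few bits in expectation, so Hamming distance between codes serves as a proxy for angular distance.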

    Incremental hashing with sample selection using dominant sets

    In the world of big data, large numbers of images are available in social media, corporate, and even personal collections. A collection may grow quickly as new images are generated at high rates. The new images may cause changes in the distribution of existing classes or the emergence of new classes, resulting in the collection being dynamic and exhibiting concept drift. For efficient image retrieval from an image collection using a query, a hash table consisting of a set of hash functions is needed to transform images into binary hash codes, which are used as the basis for finding images similar to the query. If the image collection is dynamic, the hash table built at one time step may not work well at the next, due to changes in the collection as new images are added. Therefore, the hash table needs to be rebuilt or updated at successive time steps. Incremental hashing (ICH) is the first effective method to deal with the concept drift problem in image retrieval from dynamic collections. In ICH, a new hash table is learned based only on newly emerging images, which represent the data distribution of the current data environment. The new hash table is used to generate hash codes for all images, including old and new ones. Due to the dynamic nature of the collection, new images of one class may not be similar to old images of the same class. In order to learn a new hash table that preserves within-class similarity in both old and new images, incremental hashing with sample selection using dominant sets (ICHDS) is proposed in this paper, which selects representative samples from each class for training the new hash table. Experimental results show that ICHDS yields better retrieval performance than existing dynamic and static hashing methods.