
    Enhanced Discrete Multi-modal Hashing: More Constraints yet Less Time to Learn (Extended Abstract)

    This paper proposes a novel method, Enhanced Discrete Multi-modal Hashing (EDMH), which learns binary codes and hash functions simultaneously from the pairwise similarity matrix of data for large-scale cross-view retrieval. EDMH distinguishes itself from existing methods by considering not just the binarization constraint but also the balance and decorrelation constraints. Although these additional discrete constraints make the optimization problem of EDMH look considerably more complicated, we are able to develop a fast iterative learning algorithm for it in the alternating-optimization framework, because, after introducing a couple of auxiliary variables, each optimization subproblem turns out to have a closed-form solution. Extensive experiments confirm that EDMH consistently delivers better retrieval performance than state-of-the-art multi-modal hashing (MH) methods at lower computational cost.
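    The three discrete constraints named above can be stated concretely. A minimal sketch (illustrative toy data, not the paper's implementation): a code matrix B of n samples by r bits is binarized if every entry is in {-1, +1}, balanced if each bit column sums to zero, and decorrelated if B^T B = nI. A stacked Hadamard block happens to satisfy all three:

    ```python
    import numpy as np

    # A 4-bit Hadamard block: mutually orthogonal +/-1 rows and columns.
    H = np.array([[ 1,  1,  1,  1],
                  [ 1, -1,  1, -1],
                  [ 1,  1, -1, -1],
                  [ 1, -1, -1,  1]], dtype=float)
    B = np.vstack([H, -H])          # n=8 codes, r=4 bits each
    n, r = B.shape

    binarized    = bool(np.all(np.abs(B) == 1))               # B in {-1,+1}^(n x r)
    balanced     = bool(np.allclose(B.sum(axis=0), 0))        # each bit half +1, half -1
    decorrelated = bool(np.allclose(B.T @ B, n * np.eye(r)))  # B^T B = n I
    print(binarized, balanced, decorrelated)  # -> True True True
    ```

    Balance maximizes the information carried by each bit, while decorrelation prevents redundant bits; enforcing both exactly over a discrete domain is what makes the optimization hard.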

    Enhanced Discrete Multi-modal Hashing: more constraints yet less time to learn

    Due to the exponential growth of multimedia data, multi-modal hashing, as a promising technique for making cross-view retrieval scalable, is attracting more and more attention. However, most existing multi-modal hashing methods either divide the learning process unnaturally into two separate stages or treat the discrete optimization problem simplistically as a continuous one, which leads to suboptimal results. Recently, a few discrete multi-modal hashing methods that try to address such issues have emerged, but they still ignore several important discrete constraints (such as the balance and decorrelation of hash bits). In this paper, we overcome those limitations by proposing a novel method named "Enhanced Discrete Multi-modal Hashing (EDMH)", which learns binary codes and hash functions simultaneously from the pairwise similarity matrix of data, under the aforementioned discrete constraints. Although the model of EDMH looks considerably more complex than other multi-modal hashing models, we are able to develop a fast iterative learning algorithm for it, since the subproblems of its optimization all have closed-form solutions after introducing two auxiliary variables. Our experimental results on three real-world datasets reveal the usefulness of those previously ignored discrete constraints and demonstrate that EDMH not only performs much better than state-of-the-art competitors according to several retrieval metrics but also runs much faster than most of them.
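    The alternating-optimization idea with closed-form subproblems can be sketched on a stripped-down version of the problem. The toy below (assumed setup, not the paper's actual updates) keeps only the binarization constraint and greedily improves tr(B^T S B) by iterating B <- sign(S B); for a linear objective over {-1,+1}^(n x r), the sign function is the closed-form maximizer of each step. EDMH's real subproblems, which add auxiliary variables for the balance and decorrelation constraints, are richer but follow the same fix-and-solve pattern:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n, r = 20, 8
    X = rng.standard_normal((n, 5))
    S = np.sign(X @ X.T)                      # toy pairwise similarity in {-1,+1}

    B = np.sign(rng.standard_normal((n, r)))  # random init of binary codes
    for _ in range(10):
        # With everything else fixed, the objective is linear in B, so
        # sign(.) gives the exact (closed-form) maximizer over {-1,+1}.
        B_new = np.sign(S @ B)
        B_new[B_new == 0] = 1                 # break ties toward +1
        if np.array_equal(B_new, B):          # converged
            break
        B = B_new
    ```

    Because every step is a closed-form update rather than a gradient descent in a continuous relaxation, each iteration is cheap, which is the source of the "less time to learn" claim in the title.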

    New ideas and trends in deep multimodal content understanding: a review

    The focus of this survey is on the analysis of two modalities in multimodal deep learning: image and text. Unlike classic reviews of deep learning, where monomodal image classifiers such as VGG, ResNet and the Inception module are central topics, this paper examines recent multimodal deep models and structures, including auto-encoders, generative adversarial nets and their variants. These models go beyond simple image classifiers in that they can perform uni-directional (e.g. image captioning, image generation) and bi-directional (e.g. cross-modal retrieval, visual question answering) multimodal tasks. We also analyze two key challenges for better content understanding in deep multimodal applications. We then introduce current ideas and trends in deep multimodal feature learning, such as feature embedding approaches and objective function design, which are crucial in overcoming the aforementioned challenges. Finally, we include several promising directions for future research.

    Generalized Semantic Preserving Hashing for Cross-Modal Retrieval

    Cross-modal retrieval is gaining importance due to the availability of large amounts of multimedia data. Hashing-based techniques provide an attractive solution to this problem when the data size is large. For cross-modal retrieval, data from the two modalities may be associated with a single label or multiple labels, and in addition, may or may not have a one-to-one correspondence. This work proposes a simple hashing framework which has the capability to work with different scenarios while effectively capturing the semantic relationship between the data items. The work proceeds in two stages: the first stage learns the optimum hash codes by factorizing an affinity matrix constructed using the label information. In the second stage, ridge regression and kernel logistic regression are used to learn the hash functions for mapping the input data to the bit domain. We also propose a novel iterative solution for cases where the training data is very large, or when the whole training data is not available at once. Extensive experiments on the single-label dataset Wiki and multi-label datasets such as MirFlickr, NUS-WIDE, Pascal, and LabelMe, and comparisons with the state of the art, show the usefulness of the proposed approach.
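    The two-stage pipeline described above can be sketched as follows (an illustrative toy, with assumed names and a simple eigenvector relaxation standing in for the authors' factorization; only the ridge-regression branch of stage two is shown):

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    n, d, r = 30, 10, 4
    labels = rng.integers(0, 3, size=n)
    X = rng.standard_normal((n, d)) + labels[:, None]   # toy single-label features

    # Stage 1: label affinity A_ij = +1 if same label else -1; binary codes
    # from the top-r eigenvectors of A, sign-thresholded (a common relaxation).
    A = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
    vals, vecs = np.linalg.eigh(A)          # eigenvalues in ascending order
    B = np.sign(vecs[:, -r:])               # n x r target codes
    B[B == 0] = 1

    # Stage 2: ridge regression W = (X^T X + lam*I)^-1 X^T B maps features
    # to the bit domain; the hash function is h(x) = sign(x @ W).
    lam = 1.0
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ B)
    codes = np.sign(X @ W)
    ```

    Decoupling the stages is what lets the framework handle mismatched scenarios: stage one needs only labels (single or multiple), while a separate stage-two regressor can be fit per modality even without one-to-one correspondence.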