6 research outputs found
Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search
Mobile landmark search (MLS) recently receives increasing attention for its
great practical values. However, it still remains unsolved due to two important
challenges. One is high bandwidth consumption of query transmission, and the
other is the huge visual variations of query images sent from mobile devices.
In this paper, we propose a novel hashing scheme, named as canonical view based
discrete multi-modal hashing (CV-DMH), to handle these problems via a novel
three-stage learning procedure. First, a submodular function is designed to
measure visual representativeness and redundancy of a view set. With it,
canonical views, which capture key visual appearances of landmark with limited
redundancy, are efficiently discovered with an iterative mining strategy.
Second, multi-modal sparse coding is applied to transform visual features from
multiple modalities into an intermediate representation. It can robustly and
adaptively characterize visual contents of varied landmark images with certain
canonical views. Finally, compact binary codes are learned on intermediate
representation within a tailored discrete binary embedding model which
preserves visual relations of images measured with canonical views and removes
the involved noises. In this part, we develop a new augmented Lagrangian
multiplier (ALM) based optimization method to directly solve the discrete
binary codes. We can not only explicitly deal with the discrete constraint, but
also consider the bit-uncorrelated constraint and balance constraint together.
Experiments on real world landmark datasets demonstrate the superior
performance of CV-DMH over several state-of-the-art methods
Mobile multi-view object image search
High user interaction capability of mobile devices can help improve the accuracy of mobile visual search systems. At query time, it is possible to capture multiple views of an object from different viewing angles and at different scales with the mobile device camera to obtain richer information about the object compared to a single view and hence return more accurate results. Motivated by this, we propose a new multi-view visual query model on multi-view object image databases for mobile visual search. Multi-view images of objects acquired by the mobile clients are processed and local features are sent to a server, which combines the query image representations with early/late fusion methods and returns the query results. We performed a comprehensive analysis of early and late fusion approaches using various similarity functions, on an existing single view and a new multi-view object image database. The experimental results show that multi-view search provides significantly better retrieval accuracy compared to traditional single view search. © 2016, Springer Science+Business Media New York
Mobile landmark search with 3D models
Landmark search is crucial to improve the quality of travel experience. Smart phones make it possible to search landmarks anytime and anywhere. Most of the existing work computes image features on smart phones locally after taking a landmark image. Compared with sending original image to the remote server, sending computed features saves network bandwidth and consequently makes sending process fast. However, this scheme would be restricted by the limitations of phone battery power and computational ability. In this paper, we propose to send compressed (low resolution) images to remote server instead of computing image features locally for landmark recognition and search. To this end, a robust 3D model based method is proposed to recognize query images with corresponding landmarks. Using the proposed method, images with low resolution can be recognized accurately, even though images only contain a small part of the landmark or are taken under various conditions of lighting, zoom, occlusions and different viewpoints. In order to provide an attractive landmark search result, a 3D texture model is generated to respond to a landmark query. The proposed search approach, which opens up a new direction, starts from a 2D compressed image query input and ends with a 3D model search result. © 2014 IEEE