10 research outputs found
A machine learning approach for image retrieval tasks
Several methods based on visual features (BoVW, VLAD, ...) or on recent deep learning techniques try to solve the content-based image retrieval (CBIR) problem. The bag of visual words (BoVW) model is one of the most widely used approaches for both classification and image recognition. However, even with the high performance of BoVW, retrieving images by content remains a challenge in computer vision. In this paper, we propose an improvement to the bag of visual words model that increases the accuracy of the retrieved candidates. In addition, we reduce the signature construction time by exploiting the power of approximate nearest neighbor (ANN) algorithms. Experiments are conducted on widely used datasets (UKB, Wang, Corel 10K) and with different descriptors (CMI, SURF).
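As a rough illustration of the signature construction step mentioned in this abstract, the sketch below builds a plain BoVW histogram by assigning each local descriptor to its nearest visual word. It is not the authors' method: the function name `bovw_signature`, the toy vocabulary, and the exact (brute-force) nearest-neighbor search are assumptions made for illustration. An ANN index would replace the brute-force distance computation to reduce construction time.

```python
import numpy as np

def bovw_signature(descriptors, vocabulary):
    """Return an L1-normalized histogram of visual-word assignments.

    descriptors: (n, d) array of local descriptors (hypothetical input).
    vocabulary:  (k, d) array of visual words, e.g. k-means centroids.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    vocabulary = np.asarray(vocabulary, dtype=float)
    # Squared Euclidean distance from every descriptor to every word.
    # This exact search is the step an ANN index would approximate.
    diff = descriptors[:, None, :] - vocabulary[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    words = dist2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    total = hist.sum()
    # Guard against an image with no descriptors.
    return hist / total if total > 0 else hist

# Toy usage with a two-word vocabulary in 2D.
vocab = np.array([[0.0, 0.0], [10.0, 10.0]])
desc = np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [0.0, 0.0]])
signature = bovw_signature(desc, vocab)
```

Two images can then be compared by a distance between their signatures; the choice of distance is left open here because the abstract does not specify one.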
Leveraging semantic segmentation for hybrid image retrieval methods
Content Based Image Retrieval (CBIR) is the task of finding the images in a database that are most similar to an input query, based on its visual characteristics. Several state-of-the-art methods based on visual features (bag of visual words, VLAD, ...) or on recent deep learning techniques try to solve the CBIR problem. In particular, deep learning is a recent field that is used in several vision applications, including CBIR. However, even with the increasing performance of deep learning algorithms, this problem remains a challenge in computer vision. In this work, we propose three different methodologies that combine deep learning based semantic segmentation with visual features. We show experimentally that exploiting semantic information in the CBIR context increases retrieval accuracy. We study the performance of the proposed approach on eight different datasets (Wang, Corel-10k, Corel-5k, GHIM-10K, MSRC V1, MSRC V2, Linnaeus, NUS-WIDE).
Geometric-visual descriptor for improved image based localization
This paper addresses the problem of image based localization. The goal is to quickly and accurately find the pose of a query taken with a stereo camera relative to a map obtained using visual SLAM, which contains poses and 3D points associated with descriptors. We introduce a new method that leverages stereo vision by adding geometric information to visual descriptors. This method can be used when the vertical direction of the camera is known (for example, on a wheeled robot). The new geometric-visual descriptor can be used with several image based localization algorithms that rely on visual words. We test the approach on different datasets (indoor, outdoor) and show experimentally that the new geometric-visual descriptor improves standard image based localization approaches.
Leveraging Semantic Segmentation For Improved Image Based Localization
In this paper we address the problem of localizing a query image in a 3D map obtained using a Structure From Motion (SfM) or a visual SLAM algorithm. Many situations, such as lighting or viewpoint changes, make the estimation process very difficult. We try to increase the pose accuracy by integrating semantic information into the matching step. The classical output of semantic segmentation is a single label l for each pixel. We instead propose to assign more than one label to each keypoint, in two different ways. We compare the proposed methods with the state of the art on two public datasets (Dubrovnik, Rome). We show that by combining visual and semantic information, pose estimation can be improved in terms of both time and precision.
A New CBIR Model Using Semantic Segmentation and Fast Spatial Binary Encoding
Content Based Image Retrieval (CBIR) is the task of finding images similar to a query image. Since the term similar here means "with the same semantic content", we explore in this paper a framework that uses deep neural network based semantic segmentation, coupled with a binary spatial encoding. This simple representation has several relevant properties: 1) it takes advantage of state-of-the-art semantic segmentation networks, and 2) the proposed binary encoding can be compared with a Hamming distance, which requires a very low computation budget and results in a fast CBIR method. Several experiments on public datasets show that our binary semantic signature increases CBIR accuracy and reduces execution time. We study the performance of the proposed approach on six different public datasets.
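To make the low cost of Hamming-based comparison concrete, here is a minimal sketch that ranks database images by the Hamming distance between packed binary signatures. The helper names (`pack_signature`, `hamming_distance`, `rank_by_hamming`) and the 10-bit toy signatures are assumptions for illustration; the abstract does not describe how the signature itself is built.

```python
import numpy as np

def pack_signature(bits):
    """Pack a sequence of 0/1 values into a compact uint8 array."""
    return np.packbits(np.asarray(bits, dtype=np.uint8))

def hamming_distance(a, b):
    """Count the differing bits between two packed signatures."""
    # XOR marks the differing bits; padding bits are zero in both inputs.
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def rank_by_hamming(query, database):
    """Return database indices sorted from closest to farthest."""
    dists = [hamming_distance(query, sig) for sig in database]
    return sorted(range(len(database)), key=lambda i: dists[i])

# Toy usage: three hypothetical 10-bit signatures.
query_bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
flipped_all = [1 - b for b in query_bits]
flipped_one = query_bits.copy()
flipped_one[3] = 1 - flipped_one[3]

query = pack_signature(query_bits)
database = [pack_signature(s) for s in (flipped_all, query_bits, flipped_one)]
ranking = rank_by_hamming(query, database)
```

Each comparison reduces to an XOR and a bit count, which is why this kind of signature is cheap to match against a large database.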
An efficient ir approach based semantic segmentation
Content Based Image Retrieval (CBIR) is the task of finding images similar to a query image. The state of the art mentions two main families of methods for solving the retrieval problem: (1) methods that depend on visual description, for example the bag of visual words model (BoVW) and the Vector of Locally Aggregated Descriptors (VLAD); and (2) methods that depend on deep learning approaches, in particular convolutional neural networks (CNN). In this article, we attempt to improve CBIR algorithms by proposing two image signatures based on deep learning. In the first, we build a fast binary signature using CNN based semantic segmentation. In the second, we combine visual information with semantic information to obtain a discriminative image signature, denoted semantic bag of visual phrase. We study the performance of the proposed approach on six different public datasets: Wang, Corel 10k, GHIM-10K, MSRC-V1, MSRC-V2, Linnaeus. We significantly improve the mean average precision (MAP) scores, by between 10% and 25% on almost all the datasets compared to state-of-the-art methods. Several experiments on public datasets show that our proposal increases CBIR accuracy.
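For readers unfamiliar with the MAP metric cited in this abstract, the following sketch computes average precision per query and then averages over queries. It is one common formulation, not necessarily the evaluation protocol used by the authors: here average precision is normalized by the number of relevant items found in the ranked list, and the function names and toy relevance lists are assumptions.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of 0/1 flags, one per retrieved image,
    in ranked order (1 means the image is relevant to the query).
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision measured at the rank of each relevant image.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(all_rankings):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in all_rankings) / len(all_rankings)

# Toy usage: two hypothetical queries.
rankings = [[1, 0, 1], [0, 1]]
map_score = mean_average_precision(rankings)
```

In the toy example, the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so the MAP is their mean.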
A robust CBIR framework in between bags of visual words and phrases models for specific image datasets