Approximate Image Matching using Strings of Bag-of-Visual Words Representation
The Spatial Pyramid Matching approach has become very popular for modeling images as sets of local bags-of-words. The image comparison is then done region-by-region with an intersection kernel. Despite its success, this model presents some limitations: the grid partitioning is predefined and identical for all images, and the matching is sensitive to intra- and inter-class variations. In this paper, we propose a novel approach based on approximate string matching to overcome these limitations and improve the results. First, we introduce a new image representation as strings of ordered bags-of-words. Second, we present a new edit distance specifically adapted to strings of histograms in the context of image comparison. This distance identifies local alignments between subregions and allows removing sequences of similar subregions to better match two images. Experiments on 15 Scenes and Caltech 101 show that the proposed approach outperforms the classical spatial pyramid representation and most competing classification methods presented in recent years.
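The abstract does not give the exact cost model, but the core idea of an edit distance between strings of histograms can be sketched as a standard dynamic program in which the substitution cost is derived from histogram intersection. The function names, the insertion/deletion cost, and the use of L1-normalized histograms below are illustrative assumptions, not the paper's actual definitions:

```python
import numpy as np

def hist_intersection(h1, h2):
    # Similarity in [0, 1] between two L1-normalized histograms.
    return np.minimum(h1, h2).sum()

def histogram_string_edit_distance(s1, s2, indel_cost=1.0):
    # Dynamic-programming edit distance between two sequences of
    # bag-of-words histograms; substituting one subregion for another
    # costs 1 minus their histogram intersection (assumed cost model).
    n, m = len(s1), len(s2)
    D = np.zeros((n + 1, m + 1))
    D[:, 0] = np.arange(n + 1) * indel_cost
    D[0, :] = np.arange(m + 1) * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 1.0 - hist_intersection(s1[i - 1], s2[j - 1])
            D[i, j] = min(D[i - 1, j] + indel_cost,
                          D[i, j - 1] + indel_cost,
                          D[i - 1, j - 1] + sub)
    return D[n, m]
```

With this cost model, two images whose subregion histograms match exactly have distance zero, and runs of similar subregions contribute little to the total cost.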
Improving face gender classification by adding deliberately misaligned faces to the training data
A novel method of face gender classifier construction is proposed and evaluated. Previously, researchers have assumed that a computationally expensive face alignment step (in which the face image is transformed so that facial landmarks such as the eyes, nose, chin, etc., are in uniform locations in the image) is required in order to maximize the accuracy of predictions on new face images. We, however, argue that this step is not necessary, and that machine learning classifiers can be made robust to face misalignments by automatically expanding the training data with examples of faces that have been deliberately misaligned (for example, translated or rotated). To test our hypothesis, we evaluate this automatic training dataset expansion method with two types of image classifier: the first based on weak features such as Local Binary Pattern histograms, and the second based on SIFT keypoints. Using a benchmark face gender classification dataset recently proposed in the literature, we obtain a state-of-the-art accuracy of 92.5%, thus validating our approach.
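The dataset-expansion idea above can be sketched as a small augmentation routine. This is a minimal illustration, not the authors' implementation: it applies only random translations (the paper also mentions rotations), and the function names and the shift range are assumptions:

```python
import numpy as np

def misalign(img, max_shift=4, rng=None):
    # Return a deliberately misaligned copy of a face image by
    # translating it by a random (dy, dx) offset; rotations could be
    # added in the same way with an image-rotation routine.
    rng = rng if rng is not None else np.random.default_rng()
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(img, shift=(int(dy), int(dx)), axis=(0, 1))

def expand_training_set(images, labels, copies=4, rng=None):
    # Augment the training data with `copies` misaligned versions of
    # each face, keeping the original label for every copy.
    rng = rng if rng is not None else np.random.default_rng(0)
    out_imgs, out_labels = list(images), list(labels)
    for img, y in zip(images, labels):
        for _ in range(copies):
            out_imgs.append(misalign(img, rng=rng))
            out_labels.append(y)
    return out_imgs, out_labels
```

A classifier trained on the expanded set then sees each face at several offsets, which is what makes an explicit alignment step unnecessary under the paper's hypothesis.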
Improving Bag-of-Words model with spatial information
Bag-of-Words (BOW) models have recently become popular for the task of object recognition, owing to their good performance and simplicity. Much work has been proposed over the years to improve the BOW model, of which the Spatial Pyramid Matching technique is the most notable. In this work, we propose three novel techniques to capture more refined spatial information between image features than that provided by Spatial Pyramids. Our techniques demonstrate a performance gain over the Spatial Pyramid representation of the BOW model.
Combining multiscale features for classification of hyperspectral images: a sequence based kernel approach
Nowadays, hyperspectral image classification widely exploits spatial information to improve accuracy. One of the most popular ways to integrate such information is to extract hierarchical features from a multiscale segmentation. In the classification context, the extracted features are commonly concatenated into a long vector (also called a stacked vector), to which a conventional vector-based machine learning technique (e.g. an SVM with a Gaussian kernel) is applied. In this paper, we instead propose to use a sequence-structured kernel: the spectrum kernel. We show that the conventional stacked-vector-based kernel is actually a special case of this kernel. Experiments conducted on various publicly available hyperspectral datasets illustrate the improvement of the proposed kernel over conventional ones using the same hierarchical spatial features.
Comment: 8th IEEE GRSS Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS 2016), UCLA in Los Angeles, California, U.
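The classic p-spectrum kernel the abstract refers to compares two sequences through the inner product of their p-mer count vectors. As a hedged sketch (the paper applies it to sequences of multiscale features; here the sequence elements are plain discrete symbols for illustration):

```python
from collections import Counter

def spectrum_kernel(s, t, p=2):
    # p-spectrum kernel: inner product of the p-mer (length-p
    # contiguous subsequence) count vectors of two sequences.
    def pmers(seq):
        return Counter(tuple(seq[i:i + p]) for i in range(len(seq) - p + 1))
    cs, ct = pmers(s), pmers(t)
    return sum(cs[u] * ct[u] for u in cs if u in ct)
```

With p equal to the full sequence length and distinct symbols, each sequence has a single p-mer, so the kernel degenerates to an exact-match comparison; this is the sense in which a stacked-vector kernel can be seen as a special case of a sequence kernel.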