19,922 research outputs found
A Review of Codebook Models in Patch-Based Visual Object Recognition
The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performances on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution and it follows that the resulting codebook need not have discriminant properties. This is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook, to constructing a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade with their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and datasets that were used in evaluating the proposed methods
Weakly supervised segment annotation via expectation kernel density estimation
Since the labelling for the positive images/videos is ambiguous in weakly
supervised segment annotation, negative mining based methods that only use the
intra-class information emerge. In these methods, negative instances are
utilized to penalize unknown instances to rank their likelihood of being an
object, which can be considered as a voting in terms of similarity. However,
these methods 1) ignore the information contained in positive bags, 2) only
rank the likelihood but cannot generate an explicit decision function. In this
paper, we propose a voting scheme involving not only the definite negative
instances but also the ambiguous positive instances to make use of the extra
useful information in the weakly labelled positive bags. In the scheme, each
instance votes for its label with a magnitude arising from the similarity, and
the ambiguous positive instances are assigned soft labels that are iteratively
updated during the voting. It overcomes the limitations of voting using only
the negative bags. We also propose an expectation kernel density estimation
(eKDE) algorithm to gain further insight into the voting mechanism.
Experimental results demonstrate the superiority of our scheme beyond the
baselines.Comment: 9 pages, 2 figure
Place recognition: An Overview of Vision Perspective
Place recognition is one of the most fundamental topics in computer vision
and robotics communities, where the task is to accurately and efficiently
recognize the location of a given query image. Despite years of wisdom
accumulated in this field, place recognition still remains an open problem due
to the various ways in which the appearance of real-world places may differ.
This paper presents an overview of the place recognition literature. Since
condition invariant and viewpoint invariant features are essential factors to
long-term robust visual place recognition system, We start with traditional
image description methodology developed in the past, which exploit techniques
from image retrieval field. Recently, the rapid advances of related fields such
as object detection and image classification have inspired a new technique to
improve visual place recognition system, i.e., convolutional neural networks
(CNNs). Thus we then introduce recent progress of visual place recognition
system based on CNNs to automatically learn better image representations for
places. Eventually, we close with discussions and future work of place
recognition.Comment: Applied Sciences (2018
Enhanced spatial pyramid matching using log-polar-based image subdivision and representation
This paper presents a new model for capturing spatial information for object categorization with bag-of-words (BOW). BOW models have recently become popular for the task of object recognition, owing to their good performance and simplicity. Much work has been proposed over the years to improve the BOW model, where the Spatial Pyramid Matching (SPM) technique is the most notable. We propose a new method to exploit spatial relationships between image features, based on binned log-polar grids. Our model works by partitioning the image into grids of different scales and orientations and computing histogram of local features within each grid. Experimental results show that our approach improves the results on three diverse datasets over the SPM technique
- âŠ