2,294 research outputs found
A Review of Codebook Models in Patch-Based Visual Object Recognition
The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performances on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution and it follows that the resulting codebook need not have discriminant properties. This is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook, to constructing a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade with their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and datasets that were used in evaluating the proposed methods
Bags of Affine Subspaces for Robust Object Tracking
We propose an adaptive tracking algorithm where the object is modelled as a
continuously updated bag of affine subspaces, with each subspace constructed
from the object's appearance over several consecutive frames. In contrast to
linear subspaces, affine subspaces explicitly model the origin of subspaces.
Furthermore, instead of using a brittle point-to-subspace distance during the
search for the object in a new frame, we propose to use a subspace-to-subspace
distance by representing candidate image areas also as affine subspaces.
Distances between subspaces are then obtained by exploiting the non-Euclidean
geometry of Grassmann manifolds. Experiments on challenging videos (containing
object occlusions, deformations, as well as variations in pose and
illumination) indicate that the proposed method achieves higher tracking
accuracy than several recent discriminative trackers.Comment: in International Conference on Digital Image Computing: Techniques
and Applications, 201
DCTNet : A Simple Learning-free Approach for Face Recognition
PCANet was proposed as a lightweight deep learning network that mainly
leverages Principal Component Analysis (PCA) to learn multistage filter banks
followed by binarization and block-wise histograming. PCANet was shown worked
surprisingly well in various image classification tasks. However, PCANet is
data-dependence hence inflexible. In this paper, we proposed a
data-independence network, dubbed DCTNet for face recognition in which we adopt
Discrete Cosine Transform (DCT) as filter banks in place of PCA. This is
motivated by the fact that 2D DCT basis is indeed a good approximation for high
ranked eigenvectors of PCA. Both 2D DCT and PCA resemble a kind of modulated
sine-wave patterns, which can be perceived as a bandpass filter bank. DCTNet is
free from learning as 2D DCT bases can be computed in advance. Besides that, we
also proposed an effective method to regulate the block-wise histogram feature
vector of DCTNet for robustness. It is shown to provide surprising performance
boost when the probe image is considerably different in appearance from the
gallery image. We evaluate the performance of DCTNet extensively on a number of
benchmark face databases and being able to achieve on par with or often better
accuracy performance than PCANet.Comment: APSIPA ASC 201
- …