A Review of Codebook Models in Patch-Based Visual Object Recognition

Author: Niranjan Mahesan
Ramanan Amirthalingam
Publication date: 22/09/2011

The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performance on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step, which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution, and it follows that the resulting codebook need not have discriminant properties. Clustering is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook to construct a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review, we survey several approaches that have been proposed over the last decade, covering their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and the datasets that were used in evaluating the proposed methods.
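As a rough illustration of the mapping described in this abstract, the sketch below assigns each local descriptor to its nearest codeword and counts the assignments, which turns a variable-size set of descriptors into a fixed-length histogram. The function name `encode_histogram`, the two-dimensional toy descriptors, and the hand-picked codebook are all assumptions made for the example; they are not taken from the paper, and real descriptors such as SIFT would have many more dimensions.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def encode_histogram(descriptors, codebook):
    """Map a variable-size set of descriptors to a fixed-length histogram.

    Each descriptor votes for its nearest codeword (hard assignment);
    the counts are normalised so that the histogram sums to 1.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:  # an image with no descriptors gets an all-zero vector
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Hypothetical codebook with three codewords (normally found by clustering).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Two toy "images" with different numbers of descriptors.
image_a = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
image_b = [(5.2, 4.9), (0.0, 0.2)]

hist_a = encode_histogram(image_a, codebook)
hist_b = encode_histogram(image_b, codebook)
```

Both histograms have length `len(codebook)` regardless of how many descriptors each image contains, which is what allows a standard classifier to be applied directly.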

Southampton (e-Prints Soton)

Crossref

Recommended from our members

Improving "bag-of-keypoints" image categorisation: Generative Models and PDF-Kernels

Author: Farquhar J
Meng H
Shawe-Taylor J
Szedmak S
Publication date: 01/01/2005

In this paper we propose two distinct enhancements to the basic "bag-of-keypoints" image categorisation scheme proposed in [4]. In this approach images are represented as a variable-sized set of local image features (keypoints). Thus, we require machine learning tools which can operate on sets of vectors. In [4] this is achieved by representing the set as a histogram over bins found by k-means. We show how this approach can be improved and generalised using Gaussian Mixture Models (GMMs). Alternatively, the set of keypoints can be represented directly as a probability density function, over which a kernel can be defined. This approach is shown to give state-of-the-art categorisation performance.
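One plausible reading of the GMM generalisation is that each keypoint contributes fractionally to every bin, in proportion to the posterior probability of each mixture component, instead of voting for a single k-means bin. The sketch below shows that idea only. It assumes isotropic Gaussian components whose parameters are fixed by hand (no EM fitting), and the names `gaussian_pdf` and `soft_histogram` are hypothetical; the abstract does not specify the paper's exact formulation.

```python
from math import exp, pi


def gaussian_pdf(x, mean, var):
    """Density of an isotropic Gaussian with variance `var` in every dimension."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return exp(-0.5 * sq_dist / var) / ((2 * pi * var) ** (d / 2))


def soft_histogram(keypoints, weights, means, variances):
    """Average the component posteriors (responsibilities) over all keypoints."""
    hist = [0.0] * len(weights)
    for x in keypoints:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)  # assumed non-zero: keypoints lie near some component
        for k, j in enumerate(joint):
            hist[k] += j / total
    return [h / len(keypoints) for h in hist]


# Hand-picked two-component mixture (an assumption for illustration only).
weights = [0.5, 0.5]
means = [(0.0, 0.0), (3.0, 3.0)]
variances = [1.0, 1.0]

keypoints = [(0.2, 0.1), (-0.3, 0.4), (2.9, 3.2)]
soft_hist = soft_histogram(keypoints, weights, means, variances)
```

Because every keypoint spreads a total weight of 1 across the components, the resulting vector still sums to 1 and has a fixed length equal to the number of components.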

Brunel University Research Archive

Sparse Coding on Symmetric Positive Definite Manifolds using Bregman Divergences

Author: Harandi Mehrtash
Hartley Richard
Lovell Brian
Sanderson Conrad
Publication date: 30/08/2014

This paper introduces sparse coding and dictionary learning for Symmetric Positive Definite (SPD) matrices, which are often used in machine learning, computer vision and related areas. Unlike traditional sparse coding schemes that work in vector spaces, in this paper we discuss how SPD matrices can be described by a sparse combination of dictionary atoms, where the atoms are also SPD matrices. We propose to seek sparse coding by embedding the space of SPD matrices into Hilbert spaces through two types of Bregman matrix divergences. This not only leads to an efficient way of performing sparse coding, but also to an online and iterative scheme for dictionary learning. We apply the proposed methods to several computer vision tasks where images are represented by region covariance matrices. Our proposed algorithms outperform state-of-the-art methods on a wide range of classification tasks, including face recognition, action recognition, material classification and texture categorization.
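The abstract does not name the two Bregman matrix divergences it relies on. As an assumed example of a symmetrised Bregman divergence between SPD matrices, the sketch below computes the Stein (Jensen-Bregman LogDet) divergence, S(X, Y) = log det((X + Y)/2) - (1/2) log det(XY), for 2x2 matrices. The helper names are hypothetical, and nothing here reproduces the paper's sparse coding or dictionary learning steps.

```python
from math import log


def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c


def stein_divergence(x, y):
    """Stein (Jensen-Bregman LogDet) divergence between two 2x2 SPD matrices."""
    # Element-wise midpoint (X + Y) / 2, which is again SPD.
    mid = tuple(tuple((xi + yi) / 2 for xi, yi in zip(rx, ry))
                for rx, ry in zip(x, y))
    # log det(XY) = log det(X) + log det(Y) for square matrices.
    return log(det2(mid)) - 0.5 * (log(det2(x)) + log(det2(y)))


# Two toy SPD matrices, e.g. stand-ins for region covariance descriptors.
X = ((2.0, 0.5), (0.5, 1.0))
Y = ((1.0, 0.0), (0.0, 3.0))

d_xy = stein_divergence(X, Y)
d_yx = stein_divergence(Y, X)
d_xx = stein_divergence(X, X)
```

The divergence is symmetric in its arguments and vanishes when both matrices are equal, which makes it usable as a dissimilarity between covariance descriptors without flattening them into vectors.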

arXiv.org e-Print Archive

CiteSeerX

Kernel Cross-Correlator

Author: Wang Chen
Xie Lihua
Yuan Junsong
Zhang Le
Publication date: 26/02/2018

The cross-correlator plays a significant role in many visual perception tasks, such as object detection and tracking. Beyond the linear cross-correlator, this paper proposes a kernel cross-correlator (KCC) that breaks traditional limitations. First, by introducing the kernel trick, the KCC extends the linear cross-correlation to non-linear space, which is more robust to signal noise and distortions. Second, the connection to existing work shows that KCC provides a unified solution for correlation filters. Third, KCC is applicable to any kernel function and is not limited to circulant structure on training data, so it is able to predict affine transformations with customized properties. Last, by leveraging the fast Fourier transform (FFT), KCC eliminates direct calculation of kernel vectors, thus achieving better performance at a reasonable computational cost. Comprehensive experiments on visual tracking and human activity recognition using wearable devices demonstrate its robustness, flexibility, and efficiency. The source code of both experiments is released at https://github.com/wang-chen/KCC. Comment: The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18).
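The FFT step mentioned in this abstract rests on the correlation theorem: a circular cross-correlation in the signal domain becomes an element-wise product in the frequency domain. The sketch below demonstrates only that identity for the plain linear cross-correlator on 1-D signals; it is not the KCC itself and is unrelated to the released source code. The small radix-2 `fft` helper is written out so the example needs nothing beyond the standard library, and it assumes the signal length is a power of two.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    sign = 1 if inverse else -1
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def circular_xcorr_fft(x, z):
    """c[s] = sum_i x[i] * z[(i + s) % n], computed as IFFT(conj(FFT(x)) * FFT(z))."""
    n = len(x)
    fx, fz = fft(x), fft(z)
    prod = [a.conjugate() * b for a, b in zip(fx, fz)]
    return [v.real / n for v in fft(prod, inverse=True)]


def circular_xcorr_direct(x, z):
    """Same quantity computed directly in O(n^2), used here as a reference."""
    n = len(x)
    return [sum(x[i] * z[(i + s) % n] for i in range(n)) for s in range(n)]


# A toy template and a copy of it circularly shifted by two samples.
x = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
z = x[-2:] + x[:-2]

fast = circular_xcorr_fft(x, z)
slow = circular_xcorr_direct(x, z)
peak_shift = max(range(len(fast)), key=lambda s: fast[s])
```

The position of the correlation peak recovers the shift between the two signals, which is the basic mechanism that correlation-filter trackers exploit.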

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications