347 research outputs found
A Review of Codebook Models in Patch-Based Visual Object Recognition
The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performances on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution and it follows that the resulting codebook need not have discriminant properties. This is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook, to constructing a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade with their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and datasets that were used in evaluating the proposed methods
Recommended from our members
Image features and learning algorithms for biological, generic and social object recognition
Automated recognition of object categories in images is a critical step for many real-world computer vision applications. Interest region detectors and region descriptors have been widely employed to tackle the variability of objects in pose, scale, lighting, texture, color, and so on. Different types of object recognition problems usually require different image features and corresponding learning algorithms. This dissertation focuses on the design, evaluation and application of new image features and learning algorithms for the recognition of biological, generic and social objects. The first part of the dissertation introduces a new structure-based interest region detector called the principal curvature-based region detector (PCBR) which detects stable watershed regions that are robust to local intensity perturbations. This detector is specifically designed for region detection for biological objects. Several recognition architectures are then developed that fuse visual information from disparate types of image features for the categorization of complex objects. The described image features and learning algorithms achieve excellent performance on the difficult stonefly larvae dataset. The second part of the dissertation presents studies of methods for visual codebook learning and their application to object recognition. The dissertation first introduces the methodology and application of generative visual codebooks for stonefly recognition and introduces a discriminative evaluation methodology based on a maximum mutual information criterion. Then a new generative/discriminative visual codebook learning algorithm, called iterative discriminative clustering (IDC), is presented that refines the centers and the shapes of the generative codewords for improved discriminative power. It is followed by a novel codebook learning algorithm that builds multiple codebooks that are non-redundant in discriminative power. All these visual codebook learning algorithms achieve high performance on both biological and generic object recognition tasks. The final part of the dissertation describes a socially-driven clothes recognition system for an intelligent fitting-room system. The dissertation presents the results of a user study to identify the key factors for clothes recognition. It then describes learning algorithms for recognizing these key factors from clothes images using various image features. The clothes recognition system successfully enables automated social fashion information retrieval for an enhanced clothes shopping experience
Learning Sparse Adversarial Dictionaries For Multi-Class Audio Classification
Audio events are quite often overlapping in nature, and more prone to noise
than visual signals. There has been increasing evidence for the superior
performance of representations learned using sparse dictionaries for
applications like audio denoising and speech enhancement. This paper
concentrates on modifying the traditional reconstructive dictionary learning
algorithms, by incorporating a discriminative term into the objective function
in order to learn class-specific adversarial dictionaries that are good at
representing samples of their own class at the same time poor at representing
samples belonging to any other class. We quantitatively demonstrate the
effectiveness of our learned dictionaries as a stand-alone solution for both
binary as well as multi-class audio classification problems.Comment: Accepted in Asian Conference of Pattern Recognition (ACPR-2017
Recommended from our members
Redundancy reduction in motor control
Research in machine learning and neuroscience has made remarkable progress by investigating statistical redundancy in representations of natural environments, but to date much of this work has focused on sensory information like images and sounds. This dissertation explores the notions of redundancy and efficiency in the motor domain, where several different forms of independence exist. The dissertation begins by discussing redundancy at a conceptual level and presents relevant background material. Next, three main branches of original research are described. The first branch consists of a novel control framework for integrating low-bandwidth sensory updates with model uncertainty and action selection for navigating complex, multi-task environments. The second branch of research applies existing machine learning techniques to movement information and explores the mismatch between these methods for extracting independent components and the forms of redundancy that exist in the motor domain. The third branch of work analyzes full-body, goal-directed reaching movements gathered in a novel laboratory experiment, using explicitly measured information about the goal of each movement to uncover patterns in the movement dynamics. Each branch of research explores redundancy reduction in movement from a different perspective, building up a sort of catalog of the types of information present in movements. Redundancy is discussed throughout as an an important aspect of movement in the natural world. The dissertation concludes by summarizing the contributions of these three branches of work, and discussing promising areas for future work spurred by these investigations. More detailed models of voluntary movements hold promise not only for better treatments, improved prosthetics, smoother animations, and more fluid robots, but also as an avenue for scientific insight into the very foundations of cognition.Computer Science
Deep Decision Trees for Discriminative Dictionary Learning with Adversarial Multi-Agent Trajectories
With the explosion in the availability of spatio-temporal tracking data in
modern sports, there is an enormous opportunity to better analyse, learn and
predict important events in adversarial group environments. In this paper, we
propose a deep decision tree architecture for discriminative dictionary
learning from adversarial multi-agent trajectories. We first build up a
hierarchy for the tree structure by adding each layer and performing feature
weight based clustering in the forward pass. We then fine tune the player role
weights using back propagation. The hierarchical architecture ensures the
interpretability and the integrity of the group representation. The resulting
architecture is a decision tree, with leaf-nodes capturing a dictionary of
multi-agent group interactions. Due to the ample volume of data available, we
focus on soccer tracking data, although our approach can be used in any
adversarial multi-agent domain. We present applications of proposed method for
simulating soccer games as well as evaluating and quantifying team strategies.Comment: To appear in 4th International Workshop on Computer Vision in Sports
(CVsports) at CVPR 201
- …