5,562 research outputs found
Semi-Supervised Sparse Coding
Sparse coding approximates the data sample as a sparse linear combination of
some basic codewords and uses the sparse codes as new presentations. In this
paper, we investigate learning discriminative sparse codes by sparse coding in
a semi-supervised manner, where only a few training samples are labeled. By
using the manifold structure spanned by the data set of both labeled and
unlabeled samples and the constraints provided by the labels of the labeled
samples, we learn the variable class labels for all the samples. Furthermore,
to improve the discriminative ability of the learned sparse codes, we assume
that the class labels could be predicted from the sparse codes directly using a
linear classifier. By solving the codebook, sparse codes, class labels and
classifier parameters simultaneously in a unified objective function, we
develop a semi-supervised sparse coding algorithm. Experiments on two
real-world pattern recognition problems demonstrate the advantage of the
proposed methods over supervised sparse coding methods on partially labeled
data sets
A Discriminative Representation of Convolutional Features for Indoor Scene Recognition
Indoor scene recognition is a multi-faceted and challenging problem due to
the diverse intra-class variations and the confusing inter-class similarities.
This paper presents a novel approach which exploits rich mid-level
convolutional features to categorize indoor scenes. Traditionally used
convolutional features preserve the global spatial structure, which is a
desirable property for general object recognition. However, we argue that this
structuredness is not much helpful when we have large variations in scene
layouts, e.g., in indoor scenes. We propose to transform the structured
convolutional activations to another highly discriminative feature space. The
representation in the transformed space not only incorporates the
discriminative aspects of the target dataset, but it also encodes the features
in terms of the general object categories that are present in indoor scenes. To
this end, we introduce a new large-scale dataset of 1300 object categories
which are commonly present in indoor scenes. Our proposed approach achieves a
significant performance boost over previous state of the art approaches on five
major scene classification datasets
Feature and Region Selection for Visual Learning
Visual learning problems such as object classification and action recognition
are typically approached using extensions of the popular bag-of-words (BoW)
model. Despite its great success, it is unclear what visual features the BoW
model is learning: Which regions in the image or video are used to discriminate
among classes? Which are the most discriminative visual words? Answering these
questions is fundamental for understanding existing BoW models and inspiring
better models for visual recognition.
To answer these questions, this paper presents a method for feature selection
and region selection in the visual BoW model. This allows for an intermediate
visualization of the features and regions that are important for visual
learning. The main idea is to assign latent weights to the features or regions,
and jointly optimize these latent variables with the parameters of a classifier
(e.g., support vector machine). There are four main benefits of our approach:
(1) Our approach accommodates non-linear additive kernels such as the popular
and intersection kernel; (2) our approach is able to handle both
regions in images and spatio-temporal regions in videos in a unified way; (3)
the feature selection problem is convex, and both problems can be solved using
a scalable reduced gradient method; (4) we point out strong connections with
multiple kernel learning and multiple instance learning approaches.
Experimental results in the PASCAL VOC 2007, MSR Action Dataset II and YouTube
illustrate the benefits of our approach
- …