Robust Image Analysis by L1-Norm Semi-supervised Learning
This paper presents a novel L1-norm semi-supervised learning algorithm for
robust image analysis, based on a new L1-norm formulation of Laplacian
regularization, the key step of graph-based semi-supervised learning.
Since our L1-norm Laplacian regularization is defined directly over the
eigenvectors of the normalized Laplacian matrix, we successfully formulate
semi-supervised learning as an L1-norm linear reconstruction problem which can
be effectively solved with sparse coding. By working with only a small subset
of eigenvectors, we further develop a fast sparse coding algorithm for our
L1-norm semi-supervised learning. Due to the sparsity induced by sparse coding,
the proposed algorithm can deal with the noise in the data to some extent and
thus has important applications to robust image analysis, such as noise-robust
image classification and noise reduction for visual and textual bag-of-words
(BOW) models. In particular, this paper is the first attempt to obtain robust
image representation by sparse co-refinement of visual and textual BOW models.
The experimental results show the promising performance of the proposed
algorithm.
Comment: This is an extension of our long paper in ACM MM 201
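As a rough illustration of the pipeline described above (smooth eigenvectors of the normalized Laplacian used as a basis, labels reconstructed by sparse coding), here is a minimal numpy sketch. It uses a generic ISTA solver for a standard squared-error Lasso rather than the paper's L1-norm Laplacian formulation, and the function names and toy graph are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def normalized_laplacian_eigvecs(W, k):
    """Smallest-k eigenvectors of the symmetric normalized Laplacian of W."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)          # ascending eigenvalues
    return vecs[:, :k]                   # n x k basis of smooth graph signals

def ista_lasso(A, y, lam, n_iter=500):
    """Sparse coding: min_a 0.5*||y - A a||^2 + lam*||a||_1 via ISTA."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    a = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = a - step * A.T @ (A @ a - y)                   # gradient step
        a = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # shrink
    return a

# toy graph: two 4-node clusters joined by one weak edge (3-4)
W = np.zeros((8, 8))
for i, j in [(0, 1), (0, 2), (1, 3), (2, 3),
             (4, 5), (4, 6), (5, 7), (6, 7), (3, 4)]:
    W[i, j] = W[j, i] = 1.0
U = normalized_laplacian_eigvecs(W, k=4)

labeled = [0, 7]                  # one labeled node per cluster
y = np.array([+1.0, -1.0])
a = ista_lasso(U[labeled], y, lam=0.01)   # fit code on labeled rows only
scores = U @ a                    # propagate to every node via the basis
pred = np.sign(scores)
print(pred)
```

The sparse code `a` is fit using only the labeled rows of the eigenvector basis, so prediction on unlabeled nodes comes for free from the smoothness of the basis.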
Compositional Model based Fisher Vector Coding for Image Classification
Deriving from the gradient vector of a generative model of local features,
Fisher vector coding (FVC) has been identified as an effective coding method
for image classification. Most, if not all, FVC implementations employ the
Gaussian mixture model (GMM) to depict the generation process of local
features. However, the representative power of the GMM could be limited because
it essentially assumes that local features can be characterized by a fixed
number of feature prototypes and the number of prototypes is usually small in
FVC. To address this limitation, in this paper we break the convention that
assumes a local feature is drawn from one of a few Gaussian distributions.
Instead, we adopt a compositional mechanism which assumes that a local feature
is drawn from a Gaussian distribution whose mean vector is composed as the
linear combination of multiple key components and the combination weight is a
latent random variable. In this way, we can greatly enhance the representative
power of the generative model of FVC. To implement our idea, we designed two
particular generative models with such a compositional mechanism.
Comment: Fixed typos. 16 pages. Appearing in IEEE T. Pattern Analysis and
Machine Intelligence (TPAMI)
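The baseline this paper builds on, the gradient of a GMM log-likelihood with respect to the component means, can be sketched as follows. This is the standard Fisher vector mean part (power and L2 normalization omitted), not the compositional model proposed here; the helper name and toy data are illustrative.

```python
import numpy as np

def fisher_vector_means(X, pi, mu, sigma):
    """Fisher vector coding, mean-gradient part only.

    X: (N, D) local features; pi: (K,) mixture weights;
    mu: (K, D) means; sigma: (K, D) per-dimension std devs.
    Returns a length K*D vector (normalizations omitted)."""
    N, _ = X.shape
    # responsibilities gamma_{nk}, computed stably via a max-shift
    diff = (X[:, None, :] - mu[None, :, :]) / sigma[None, :, :]   # (N, K, D)
    log_p = (np.log(pi)[None, :]
             - 0.5 * np.sum(np.log(2 * np.pi * sigma ** 2), axis=1)[None, :]
             - 0.5 * np.sum(diff ** 2, axis=2))                   # (N, K)
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # accumulate the normalized mean gradient per component
    G = np.einsum('nk,nkd->kd', gamma, diff)                      # (K, D)
    return (G / (N * np.sqrt(pi)[:, None])).ravel()

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                 # toy local features
pi = np.array([0.5, 0.5])
mu = np.array([[-1.0, 0.0], [1.0, 0.0]])
sigma = np.ones((2, 2))
fv = fisher_vector_means(X, pi, mu, sigma)
print(fv.shape)   # (4,)
```

In a full FVC pipeline the gradients with respect to the weights and variances are concatenated as well, and the GMM parameters are learned offline from training features.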
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. The corresponding tools have since
been widely adopted by scientific communities such as neuroscience,
bioinformatics, and computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.
Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics
and Vision
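A minimal sketch of the sparse coding and dictionary learning loop that such monographs survey, assuming a MOD-style least-squares dictionary update alternated with an ISTA sparse-coding step; the function names, hyperparameters, and toy data are our own illustrative choices.

```python
import numpy as np

def soft_threshold(Z, t):
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def sparse_codes(D, X, lam, n_iter=200):
    """ISTA for min_A 0.5*||X - D A||_F^2 + lam*||A||_1."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        A = soft_threshold(A - step * D.T @ (D @ A - X), step * lam)
    return A

def dict_learn(X, k, lam=0.05, n_outer=15, seed=0):
    """Alternate sparse coding (ISTA) and a least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.normal(size=(X.shape[0], k))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_outer):
        A = sparse_codes(D, X, lam)
        D = X @ A.T @ np.linalg.pinv(A @ A.T + 1e-8 * np.eye(k))
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)   # unit-norm atoms
    return D, sparse_codes(D, X, lam)

# toy data: 50 signals in R^8, each a sparse mix of 3 hidden atoms
rng = np.random.default_rng(1)
hidden = rng.normal(size=(8, 3))
codes = rng.normal(size=(3, 50)) * (rng.random((3, 50)) < 0.5)
X = hidden @ codes
D, A = dict_learn(X, k=3)
rel_err = np.linalg.norm(X - D @ A) / np.linalg.norm(X)
print(round(rel_err, 3))
```

Because the dictionary is learned from the data rather than fixed (e.g., a wavelet basis), the representation adapts to the signal class, which is the "compact representation" point made above.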
Unsupervised Visual Feature Learning with Spike-timing-dependent Plasticity: How Far are we from Traditional Feature Learning Approaches?
Spiking neural networks (SNNs) equipped with latency coding and
spike-timing-dependent plasticity rules offer an alternative for solving the
data and energy
bottlenecks of standard computer vision approaches: they can learn visual
features without supervision and can be implemented by ultra-low power hardware
architectures. However, their performance in image classification has never
been evaluated on recent image datasets. In this paper, we compare SNNs to
auto-encoders on three visual recognition datasets, and extend the use of SNNs
to color images. The analysis of the results helps us identify some bottlenecks
of SNNs: the limits of on-center/off-center coding, especially for color
images, and the ineffectiveness of current inhibition mechanisms. These issues
should be addressed to build effective SNNs for image recognition.
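A pair-based STDP weight update of the kind such SNNs rely on can be sketched as follows; the learning rates, time constants, and function name are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.05, a_minus=0.04,
                tau_plus=20.0, tau_minus=20.0, w_min=0.0, w_max=1.0):
    """Pair-based STDP: potentiate when pre fires before post, else depress.

    Times in ms; the update decays exponentially with the spike-time gap."""
    dt = t_post - t_pre
    if dt >= 0:   # causal pairing -> long-term potentiation (LTP)
        w = w + a_plus * np.exp(-dt / tau_plus)
    else:         # anti-causal pairing -> long-term depression (LTD)
        w = w - a_minus * np.exp(dt / tau_minus)
    return float(np.clip(w, w_min, w_max))

w = 0.5
w_ltp = stdp_update(w, t_pre=10.0, t_post=15.0)  # pre 5 ms before post
w_ltd = stdp_update(w, t_pre=15.0, t_post=10.0)  # post 5 ms before pre
print(w_ltp > 0.5, w_ltd < 0.5)  # True True
```

With latency coding, stronger input features fire earlier, so repeated causal pairings let a neuron's weights converge toward the visual pattern that reliably precedes its spikes, with no labels involved.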
Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds
Sparsity-based representations have recently led to notable results in
various visual recognition tasks. In a separate line of research, Riemannian
manifolds have been shown to be useful for dealing with features and models that do
not lie in Euclidean spaces. With the aim of building a bridge between the two
realms, we address the problem of sparse coding and dictionary learning over
the space of linear subspaces, which form Riemannian structures known as
Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into
the space of symmetric matrices by an isometric mapping. This in turn enables
us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we
propose closed-form solutions for learning a Grassmann dictionary, atom by
atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann
sparse coding and dictionary learning algorithms through embedding into Hilbert
spaces.
Experiments on several classification tasks (gender recognition, gesture
classification, scene analysis, face recognition, action recognition and
dynamic texture classification) show that the proposed approaches achieve
considerable improvements in discrimination accuracy, in comparison to
state-of-the-art methods such as kernelized Affine Hull Method and
graph-embedding Grassmann discriminant analysis.
Comment: Appearing in International Journal of Computer Vision
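The embedding described above can be sketched as follows, assuming the common projection embedding that maps a subspace span(X) to the symmetric projector X Xᵀ, with the Frobenius norm of projector differences as the induced (chordal) distance; the nearest-atom toy example and function names are ours, not the paper's dictionary learning algorithm.

```python
import numpy as np

def grassmann_embed(X):
    """Embed the subspace span(X) into symmetric matrices as its projector.

    X: (n, p) matrix with orthonormal columns; returns the n x n matrix X X^T,
    which is invariant to the choice of orthonormal basis for the subspace."""
    return X @ X.T

def chordal_dist(X1, X2):
    """Distance induced by the embedding: ||X1 X1^T - X2 X2^T||_F."""
    return np.linalg.norm(grassmann_embed(X1) - grassmann_embed(X2), 'fro')

def orth(A):
    """Orthonormal basis for the column span of A (reduced QR)."""
    Q, _ = np.linalg.qr(A)
    return Q

rng = np.random.default_rng(0)
n, p = 6, 2
X = orth(rng.normal(size=(n, p)))              # query subspace, a point on G(6, 2)
atoms = [orth(rng.normal(size=(n, p))) for _ in range(5)]
atoms.append(X)                                # plant the query as the last atom

d = [chordal_dist(X, A) for A in atoms]
print(int(np.argmin(d)))  # 5: the planted atom is closest
```

Once subspaces live in the flat space of symmetric matrices, standard sparse coding and atom-by-atom dictionary updates can operate on the embedded projectors instead of the curved manifold itself.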