Improving Sparse Representation-Based Classification Using Local Principal Component Analysis
Sparse representation-based classification (SRC), proposed by Wright et al.,
seeks the sparsest decomposition of a test sample over the dictionary of
training samples, with classification to the most-contributing class. Because
it assumes test samples can be written as linear combinations of their
same-class training samples, the success of SRC depends on the size and
representativeness of the training set. Our proposed classification algorithm
enlarges the training set by using local principal component analysis to
approximate the basis vectors of the tangent hyperplane of the class manifold
at each training sample. The dictionary in SRC is replaced by a local
dictionary that adapts to the test sample and includes training samples and
their corresponding tangent basis vectors. We use a synthetic data set and
three face databases to demonstrate that this method can achieve higher
classification accuracy than SRC in cases of sparse sampling, nonlinear class
manifolds, and stringent dimension reduction. Comment: Published in "Computational Intelligence for Pattern Recognition,"
editors Shyi-Ming Chen and Witold Pedrycz. The original publication is
available at http://www.springerlink.co
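The local PCA step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `local_tangent_basis`, the neighborhood size `k`, and the toy circle data are all assumptions chosen for illustration. The idea shown is only that PCA on a small neighborhood of a training sample approximates the tangent directions of the class manifold at that sample.

```python
import numpy as np

def local_tangent_basis(X, i, k, d):
    """Estimate d tangent directions at X[i] via PCA on its k nearest neighbors.

    Hypothetical helper: X is an (n_samples, n_features) array of
    same-class training samples.
    """
    dists = np.linalg.norm(X - X[i], axis=1)
    nbrs = np.argsort(dists)[:k]                 # neighborhood includes X[i] itself
    centered = X[nbrs] - X[nbrs].mean(axis=0)    # center the neighborhood
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:d]

# Toy class manifold: 100 samples on the unit circle in the plane.
angles = 2 * np.pi * np.arange(100) / 100
X = np.column_stack([np.cos(angles), np.sin(angles)])

tangent = local_tangent_basis(X, i=0, k=7, d=1)[0]
radial = X[0]  # on a circle, the tangent should be orthogonal to the radius
```

In this toy case the estimated tangent at the point (1, 0) is orthogonal to the radial direction, which is what a tangent to the circle should be. In the setting of the abstract, such tangent vectors would be added to the dictionary alongside the training samples.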
KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization
We consider the image classification problem via kernel collaborative
representation classification with locality constrained dictionary (KCRC-LCD).
Specifically, we propose a kernel collaborative representation classification
(KCRC) approach in which a kernel method is used to improve the discrimination
ability of collaborative representation classification (CRC). We then measure
the similarities between the query and atoms in the global dictionary in order
to construct a locality constrained dictionary (LCD) for KCRC. In addition, we
discuss several similarity measure approaches in LCD and further present a
simple yet effective unified similarity measure whose superiority is validated
in experiments. There are several appealing aspects associated with LCD. First,
LCD can be nicely incorporated under the framework of KCRC. The LCD similarity
measure can be kernelized under KCRC, which theoretically links CRC and LCD
under the kernel method. Second, KCRC-LCD becomes more scalable to both the
training set size and the feature dimension. An example shows that KCRC is able to
perfectly classify data with a certain distribution, while conventional CRC fails
completely. Comprehensive experiments on many public datasets also show that
KCRC-LCD is a robust discriminative classifier with both excellent performance
and good scalability, comparable to or outperforming many other
state-of-the-art approaches.
Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on n-Spheres
Many computer vision challenges require continuous outputs, but tend to be
solved by discrete classification. The reason is classification's natural
containment within a probability n-simplex, as defined by the popular softmax
activation function. Regular regression lacks such a closed geometry, leading
to unstable training and convergence to suboptimal local minima. Starting from
this insight we revisit regression in convolutional neural networks. We observe
many continuous output problems in computer vision are naturally contained in
closed geometrical manifolds, like the Euler angles in viewpoint estimation or
the normals in surface normal estimation. A natural framework for posing such
continuous output problems is the n-sphere, a naturally closed
geometric manifold defined in the R^(n+1) space. By introducing a
spherical exponential mapping on n-spheres at the regression output, we
obtain well-behaved gradients, leading to stable training. We show how our
spherical regression can be utilized for several computer vision challenges,
specifically viewpoint estimation, surface normal estimation and 3D rotation
estimation. For all these problems our experiments demonstrate the benefit of
spherical regression. All paper resources are available at
https://github.com/leoshine/Spherical_Regression. Comment: CVPR 2019 camera read
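The abstract does not state the formula for the spherical exponential mapping, so the following is only a sketch under an assumption: that the mapping exponentiates the raw outputs and then L2-normalizes them, by analogy with how softmax L1-normalizes exponentials. The function name `spherical_exp` is hypothetical.

```python
import math

def spherical_exp(o):
    """Map raw outputs to the unit sphere: exp(o_i) / sqrt(sum_k exp(2 * o_k)).

    Assumed form, not confirmed by the abstract.
    """
    m = max(o)                           # shifting by the max cancels in the ratio
    e = [math.exp(v - m) for v in o]     # and avoids overflow for large outputs
    norm = math.sqrt(sum(x * x for x in e))
    return [x / norm for x in e]

p = spherical_exp([0.5, -1.0, 2.0])
squared_norm = sum(x * x for x in p)     # equals 1 up to rounding error
```

Under this assumption the output always has unit L2 norm, so it lies on a sphere by construction. Every component is strictly positive, so the signs of the target coordinates would have to be handled separately.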
Deep Grassmann Manifold Optimization for Computer Vision
In this work, we propose methods that advance four areas in the field of computer vision: dimensionality reduction, deep feature embeddings, visual domain adaptation, and deep neural network compression. We combine concepts from the fields of manifold geometry and deep learning to develop cutting-edge methods in each of these areas. Each of the methods proposed in this work achieves state-of-the-art results in our experiments. We propose the Proxy Matrix Optimization (PMO) method for optimization over orthogonal matrix manifolds, such as the Grassmann manifold. This optimization technique is designed to be highly flexible, enabling it to be leveraged in many situations where traditional manifold optimization methods cannot be used.
We first use PMO in the field of dimensionality reduction, where we propose an iterative optimization approach to Principal Component Analysis (PCA) in a framework called Proxy Matrix optimization based PCA (PM-PCA). We also demonstrate how PM-PCA can be used to solve the general Lp-PCA problem, a variant of PCA that uses arbitrary fractional norms, which can be more robust to outliers. We then present Cascaded Projection (CaP), a method which uses tensor compression based on PMO to reduce the number of filters in deep neural networks. This, in turn, reduces the number of computational operations required to process each image with the network. Cascaded Projection is the first end-to-end trainable method for network compression that uses standard backpropagation to learn the optimal tensor compression. In the area of deep feature embeddings, we introduce Deep Euclidean Feature Representations through Adaptation on the Grassmann manifold (DEFRAG), which leverages PMO. The DEFRAG method improves the feature embeddings learned by deep neural networks through the use of auxiliary loss functions and Grassmann manifold optimization. Lastly, in the area of visual domain adaptation, we propose the Manifold-Aligned Label Transfer for Domain Adaptation (MALT-DA) to transfer knowledge from samples in a known domain to an unknown domain based on cross-domain cluster correspondences.
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning.
Positive/Negative Emotion Detection from RGB-D upper Body Images
The ability to identify users' mental states represents a valuable asset for improving human-computer interaction. Considering that spontaneous emotions are conveyed mostly through facial expressions and upper body movements, we propose to use these modalities together for the purpose of negative/positive emotion classification. A method that allows the recognition of mental states from videos is proposed. Based on a dataset composed of RGB-D movies, a set of indicators of positive and negative emotion is extracted from 2D (RGB) information. In addition, a geometric framework to model the depth flows and capture human body dynamics from depth data is proposed. Due to the temporal changes in pixel and depth intensity that characterize spontaneous emotion datasets, the depth features are used to define the relation between changes in upper body movements and the affect. We describe a space of depth and texture information to detect the mood of people using upper body postures and their evolution across time. The experimentation has been performed on the Cam3D dataset and has shown promising results.
Undersampled Phase Retrieval with Outliers
We propose a general framework for reconstructing transform-sparse images
from undersampled (squared)-magnitude data corrupted with outliers. This
framework is implemented using a multi-layered approach, combining multiple
initializations (to address the nonconvexity of the phase retrieval problem),
repeated minimization of a convex majorizer (surrogate for a nonconvex
objective function), and iterative optimization using the alternating
directions method of multipliers. Exploiting the generality of this framework,
we investigate using a Laplace measurement noise model better adapted to
outliers present in the data than the conventional Gaussian noise model. Using
simulations, we explore the sensitivity of the method to both the
regularization and penalty parameters. We include 1D Monte Carlo and 2D image
reconstruction comparisons with alternative phase retrieval algorithms. The
results suggest the proposed method, with the Laplace noise model, both
increases the likelihood of correct support recovery and reduces the mean
squared error from measurements containing outliers. We also describe exciting
extensions made possible by the generality of the proposed framework, including
regularization using analysis-form sparsity priors that are incompatible with
many existing approaches. Comment: 11 pages, 9 figure
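The preference for a Laplace noise model over a Gaussian one when outliers are present can be illustrated with a toy scalar example. This is not the phase retrieval framework of the abstract; the data values and the helper `l1_cost` are made up for illustration. The fact used is standard: for a constant signal, the Gaussian maximum-likelihood estimate is the mean (least squares), while the Laplace maximum-likelihood estimate is the median (least absolute deviations).

```python
import statistics

# Hypothetical measurements of a constant near 1.0, with one gross outlier.
measurements = [1.0, 1.1, 0.9, 1.0, 25.0]

# Gaussian noise model: minimizes the sum of squared residuals.
gaussian_estimate = statistics.mean(measurements)

# Laplace noise model: minimizes the sum of absolute residuals.
laplace_estimate = statistics.median(measurements)

def l1_cost(x):
    """Sum of absolute residuals for a candidate estimate x."""
    return sum(abs(m - x) for m in measurements)
```

The single outlier pulls the mean far from the bulk of the data, while the median stays at 1.0. The same robustness argument motivates an absolute-value data-fit term in larger reconstruction problems.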