2,330 research outputs found
Kernel Cross-Correlator
Cross-correlator plays a significant role in many visual perception tasks,
such as object detection and tracking. Beyond the linear cross-correlator, this
paper proposes a kernel cross-correlator (KCC) that breaks traditional
limitations. First, by introducing the kernel trick, the KCC extends the linear
cross-correlation to non-linear space, which is more robust to signal noises
and distortions. Second, the connection to the existing works shows that KCC
provides a unified solution for correlation filters. Third, KCC is applicable
to any kernel function and is not limited to circulant structure on training
data, thus it is able to predict affine transformations with customized
properties. Last, by leveraging the fast Fourier transform (FFT), KCC
eliminates direct calculation of kernel vectors, thus achieves better
performance yet still with a reasonable computational cost. Comprehensive
experiments on visual tracking and human activity recognition using wearable
devices demonstrate its robustness, flexibility, and efficiency. The source
codes of both experiments are released at https://github.com/wang-chen/KCCComment: The Thirty-Second AAAI Conference on Artificial Intelligence
(AAAI-18
Learning midlevel image features for natural scene and texture classification
This paper deals with coding of natural scenes in order to extract semantic information. We present a new scheme to project natural scenes onto a basis in which each dimension encodes statistically independent information. Basis extraction is performed by independent component analysis (ICA) applied to image patches culled from natural scenes. The study of the resulting coding units (coding filters) extracted from well-chosen categories of images shows that they adapt and respond selectively to discriminant features in natural scenes. Given this basis, we define global and local image signatures relying on the maximal activity of filters on the input image. Locally, the construction of the signature takes into account the spatial distribution of the maximal responses within the image. We propose a criterion to reduce the size of the space of representation for faster computation. The proposed approach is tested in the context of texture classification (111 classes), as well as natural scenes classification (11 categories, 2037 images). Using a common protocol, the other commonly used descriptors have at most 47.7% accuracy on average while our method obtains performances of up to 63.8%. We show that this advantage does not depend on the size of the signature and demonstrate the efficiency of the proposed criterion to select ICA filters and reduce the dimensio
Support matrix machine: A review
Support vector machine (SVM) is one of the most studied paradigms in the
realm of machine learning for classification and regression problems. It relies
on vectorized input data. However, a significant portion of the real-world data
exists in matrix format, which is given as input to SVM by reshaping the
matrices into vectors. The process of reshaping disrupts the spatial
correlations inherent in the matrix data. Also, converting matrices into
vectors results in input data with a high dimensionality, which introduces
significant computational complexity. To overcome these issues in classifying
matrix input data, support matrix machine (SMM) is proposed. It represents one
of the emerging methodologies tailored for handling matrix input data. The SMM
method preserves the structural information of the matrix data by using the
spectral elastic net property which is a combination of the nuclear norm and
Frobenius norm. This article provides the first in-depth analysis of the
development of the SMM model, which can be used as a thorough summary by both
novices and experts. We discuss numerous SMM variants, such as robust, sparse,
class imbalance, and multi-class classification models. We also analyze the
applications of the SMM model and conclude the article by outlining potential
future research avenues and possibilities that may motivate academics to
advance the SMM algorithm
Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (eg. onset and offset phase for "smile", running and jumping for
"highjump"). The proposed model is inspired by the recent works on Multiple
Instance Learning and latent SVM/HCRF -- it extends such frameworks to model
the ordinal aspect in the videos, approximately. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
Exemplar Based Deep Discriminative and Shareable Feature Learning for Scene Image Classification
In order to encode the class correlation and class specific information in
image representation, we propose a new local feature learning approach named
Deep Discriminative and Shareable Feature Learning (DDSFL). DDSFL aims to
hierarchically learn feature transformation filter banks to transform raw pixel
image patches to features. The learned filter banks are expected to: (1) encode
common visual patterns of a flexible number of categories; (2) encode
discriminative information; and (3) hierarchically extract patterns at
different visual levels. Particularly, in each single layer of DDSFL, shareable
filters are jointly learned for classes which share the similar patterns.
Discriminative power of the filters is achieved by enforcing the features from
the same category to be close, while features from different categories to be
far away from each other. Furthermore, we also propose two exemplar selection
methods to iteratively select training data for more efficient and effective
learning. Based on the experimental results, DDSFL can achieve very promising
performance, and it also shows great complementary effect to the
state-of-the-art Caffe features.Comment: Pattern Recognition, Elsevier, 201
- âŠ