Compact Bilinear Pooling
Bilinear models have been shown to achieve impressive performance on a wide
range of visual tasks, such as semantic segmentation, fine-grained recognition,
and face recognition. However, bilinear features are high dimensional,
typically on the order of hundreds of thousands to a few million, which makes
them impractical for subsequent analysis. We propose two compact bilinear
representations with the same discriminative power as the full bilinear
representation but with only a few thousand dimensions. Our compact
representations allow back-propagation of classification errors enabling an
end-to-end optimization of the visual recognition system. The compact bilinear
representations are derived through a novel kernelized analysis of bilinear
pooling, which provides insights into the discriminative power of bilinear
pooling and a platform for further research in compact pooling methods.
Experiments illustrate the utility of the proposed representations for
image classification and few-shot learning across several datasets.
Comment: Camera ready version for CVP
A weighted subspace exponential kernel for support tensor machines
High-dimensional data in the form of tensors are challenging for kernel
classification methods. To both reduce the computational complexity and extract
informative features, kernels based on low-rank tensor decompositions have been
proposed. However, what decisive features of the tensors are exploited by these
kernels is often unclear. In this paper we propose a novel kernel that is based
on the Tucker decomposition. For this kernel the Tucker factors are computed
based on re-weighting of the Tucker matrices with tuneable powers of singular
values from the HOSVD decomposition. This provides a mechanism to balance the
contribution of the Tucker core and factors of the data. We benchmark support
tensor machines with this new kernel on several datasets. First we generate
synthetic data where two classes differ in either Tucker factors or core, and
compare our novel and previously existing kernels. We show robustness of the
new kernel with respect to both classification scenarios. We further test the
new method on real-world datasets. The proposed kernel has demonstrated a
higher test accuracy than the state-of-the-art tensor train multi-way
multi-level kernel, and a significantly lower computational time.
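The abstract does not give the kernel's exact formula, so the following is only a hypothetical sketch of the general recipe it describes: compute the HOSVD factor matrices per mode, reweight their columns by singular values raised to a tunable power (balancing core versus factor information), and compare two tensors through an exponential kernel on the resulting subspaces. The distance choice and all names here are illustrative assumptions:

```python
import numpy as np

def unfold(T, mode):
    # Mode-m unfolding: bring mode to the front and flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def weighted_factors(T, alpha):
    # HOSVD factors: left singular vectors of each mode unfolding,
    # with columns reweighted by singular values to the power alpha.
    factors = []
    for m in range(T.ndim):
        U, s, _ = np.linalg.svd(unfold(T, m), full_matrices=False)
        factors.append(U * s**alpha)
    return factors

def subspace_exp_kernel(A, B, alpha=0.5, gamma=1.0):
    # Exponential kernel on per-mode distances between the weighted
    # factor subspaces of two same-shaped tensors (illustrative distance).
    fa, fb = weighted_factors(A, alpha), weighted_factors(B, alpha)
    d = sum(np.linalg.norm(Ua @ Ua.T - Ub @ Ub.T)**2
            for Ua, Ub in zip(fa, fb))
    return np.exp(-gamma * d)
```

With `alpha = 0` the singular-value weights vanish and only the subspaces (factors) matter; larger `alpha` lets the spectrum, and hence core-like information, dominate.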
When Kernel Methods meet Feature Learning: Log-Covariance Network for Action Recognition from Skeletal Data
Human action recognition from skeletal data is a hot research topic, important
in many open-domain applications of computer vision thanks to recently
introduced 3D sensors. In the literature, naive methods simply
transfer off-the-shelf techniques from video to the skeletal representation.
However, the current state of the art is contested between two different
paradigms: kernel-based methods and feature learning with (recurrent) neural
networks. Both approaches show strong performance, yet they exhibit heavy, but
complementary, drawbacks. Motivated by this fact, our work aims at combining
together the best of the two paradigms, by proposing an approach where a
shallow network is fed with a covariance representation. Our intuition is that,
as long as the dynamics is effectively modeled, there is no need for the
classification network to be deep nor recurrent in order to score favorably. We
validate this hypothesis in a broad experimental analysis over 6 publicly
available datasets.
Comment: 2017 IEEE Computer Vision and Pattern Recognition (CVPR) Workshop
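A standard way to build the kind of covariance representation the shallow network is fed with is the log-Euclidean map: take the covariance of the joint trajectories over time, then apply the matrix logarithm so the SPD matrix lives in a flat space an ordinary classifier can consume. A minimal sketch (the sequence layout and the `eps` regularizer are assumptions, not details from the paper):

```python
import numpy as np

def log_covariance(seq, eps=1e-5):
    # seq: (T, D) array of D stacked joint coordinates over T frames.
    # The covariance over time captures second-order joint dynamics;
    # eps * I keeps the matrix strictly positive definite.
    C = np.cov(seq, rowvar=False) + eps * np.eye(seq.shape[1])
    # Matrix logarithm via eigendecomposition: V diag(log w) V^T.
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T
```

The upper triangle of the resulting symmetric matrix can then be flattened into a fixed-length feature vector and passed to a shallow (non-recurrent) classification network.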