12,984 research outputs found
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
On Robust Face Recognition via Sparse Encoding: the Good, the Bad, and the Ugly
In the field of face recognition, Sparse Representation (SR) has received
considerable attention during the past few years. Most of the relevant
literature focuses on holistic descriptors in closed-set identification
applications. The underlying assumption in SR-based methods is that each class
in the gallery has sufficient samples and the query lies on the subspace
spanned by the gallery of the same class. Unfortunately, such assumption is
easily violated in the more challenging face verification scenario, where an
algorithm is required to determine if two faces (where one or both have not
been seen before) belong to the same person. In this paper, we first discuss
why previous attempts with SR might not be applicable to verification problems.
We then propose an alternative approach to face verification via SR.
Specifically, we propose to use explicit SR encoding on local image patches
rather than the entire face. The obtained sparse signals are pooled via
averaging to form multiple region descriptors, which are then concatenated to
form an overall face descriptor. Due to the deliberate loss spatial relations
within each region (caused by averaging), the resulting descriptor is robust to
misalignment & various image deformations. Within the proposed framework, we
evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder
Neural Network (SANN), and an implicit probabilistic technique based on
Gaussian Mixture Models. Thorough experiments on AR, FERET, exYaleB, BANCA and
ChokePoint datasets show that the proposed local SR approach obtains
considerably better and more robust performance than several previous
state-of-the-art holistic SR methods, in both verification and closed-set
identification problems. The experiments also show that l1-minimisation based
encoding has a considerably higher computational than the other techniques, but
leads to higher recognition rates
Non-negative mixtures
This is the author's accepted pre-print of the article, first published as M. D. Plumbley, A. Cichocki and R. Bro. Non-negative mixtures. In P. Comon and C. Jutten (Ed), Handbook of Blind Source Separation: Independent Component Analysis and Applications. Chapter 13, pp. 515-547. Academic Press, Feb 2010. ISBN 978-0-12-374726-6 DOI: 10.1016/B978-0-12-374726-6.00018-7file: Proof:p\PlumbleyCichockiBro10-non-negative.pdf:PDF owner: markp timestamp: 2011.04.26file: Proof:p\PlumbleyCichockiBro10-non-negative.pdf:PDF owner: markp timestamp: 2011.04.2
Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis evolved from earlier schemes that are often limited to controlled
environments to nowadays advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are achieved more rapidly, eventually leading
to the demise of what used to be good in a short time. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then, navigate into the realm of deep
learning based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable fallbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader
Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution
Convolutional neural networks have recently demonstrated high-quality
reconstruction for single-image super-resolution. In this paper, we propose the
Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively
reconstruct the sub-band residuals of high-resolution images. At each pyramid
level, our model takes coarse-resolution feature maps as input, predicts the
high-frequency residuals, and uses transposed convolutions for upsampling to
the finer level. Our method does not require the bicubic interpolation as the
pre-processing step and thus dramatically reduces the computational complexity.
We train the proposed LapSRN with deep supervision using a robust Charbonnier
loss function and achieve high-quality reconstruction. Furthermore, our network
generates multi-scale predictions in one feed-forward pass through the
progressive reconstruction, thereby facilitates resource-aware applications.
Extensive quantitative and qualitative evaluations on benchmark datasets show
that the proposed algorithm performs favorably against the state-of-the-art
methods in terms of speed and accuracy.Comment: This work is accepted in CVPR 2017. The code and datasets are
available on http://vllab.ucmerced.edu/wlai24/LapSRN
- …