27,897 research outputs found
Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling
We frame the task of predicting a semantic labeling as a sparse
reconstruction procedure that applies a target-specific learned transfer
function to a generic deep sparse code representation of an image. This
strategy partitions training into two distinct stages. First, in an
unsupervised manner, we learn a set of generic dictionaries optimized for
sparse coding of image patches. We train a multilayer representation via
recursive sparse dictionary learning on pooled codes output by earlier layers.
Second, we encode all training images with the generic dictionaries and learn a
transfer function that optimizes reconstruction of patches extracted from
annotated ground-truth given the sparse codes of their corresponding image
patches. At test time, we encode a novel image using the generic dictionaries
and then reconstruct using the transfer function. The output reconstruction is
a semantic labeling of the test image.
Applying this strategy to the task of contour detection, we demonstrate
performance competitive with state-of-the-art systems. Unlike almost all prior
work, our approach obviates the need for any form of hand-designed features or
filters. To illustrate general applicability, we also show initial results on
semantic part labeling of human faces.
The effectiveness of our approach opens new avenues for research on deep
sparse representations. Our classifiers utilize this representation in a novel
manner. Rather than acting on nodes in the deepest layer, they attach to nodes
along a slice through multiple layers of the network in order to make
predictions about local patches. Our flexible combination of a generatively
learned sparse representation with discriminatively trained transfer
classifiers extends the notion of sparse reconstruction to encompass arbitrary
semantic labeling tasks.Comment: to appear in Asian Conference on Computer Vision (ACCV), 201
Low rank tensor recovery via iterative hard thresholding
We study extensions of compressive sensing and low rank matrix recovery
(matrix completion) to the recovery of low rank tensors of higher order from a
small number of linear measurements. While the theoretical understanding of low
rank matrix recovery is already well-developed, only few contributions on the
low rank tensor recovery problem are available so far. In this paper, we
introduce versions of the iterative hard thresholding algorithm for several
tensor decompositions, namely the higher order singular value decomposition
(HOSVD), the tensor train format (TT), and the general hierarchical Tucker
decomposition (HT). We provide a partial convergence result for these
algorithms which is based on a variant of the restricted isometry property of
the measurement operator adapted to the tensor decomposition at hand that
induces a corresponding notion of tensor rank. We show that subgaussian
measurement ensembles satisfy the tensor restricted isometry property with high
probability under a certain almost optimal bound on the number of measurements
which depends on the corresponding tensor format. These bounds are extended to
partial Fourier maps combined with random sign flips of the tensor entries.
Finally, we illustrate the performance of iterative hard thresholding methods
for tensor recovery via numerical experiments where we consider recovery from
Gaussian random measurements, tensor completion (recovery of missing entries),
and Fourier measurements for third order tensors.Comment: 34 page
Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis evolved from earlier schemes that are often limited to controlled
environments to nowadays advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are achieved more rapidly, eventually leading
to the demise of what used to be good in a short time. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then, navigate into the realm of deep
learning based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable fallbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.Comment: Revised survey includes expanded discussion section and reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 201
- …