1,805 research outputs found
Classification of Occluded Objects using Fast Recurrent Processing
Recurrent neural networks are powerful tools for handling incomplete data
problems in computer vision, thanks to their significant generative
capabilities. However, the computational demand for these algorithms is too
high to work in real time, without specialized hardware or software solutions.
In this paper, we propose a framework for augmenting recurrent processing
capabilities into a feedforward network without sacrificing much from
computational efficiency. We assume a mixture model and generate samples of the
last hidden layer according to the class decisions of the output layer, modify
the hidden layer activity using the samples, and propagate to lower layers. For
visual occlusion problem, the iterative procedure emulates feedforward-feedback
loop, filling-in the missing hidden layer activity with meaningful
representations. The proposed algorithm is tested on a widely used dataset, and
shown to achieve 2 improvement in classification accuracy for occluded
objects. When compared to Restricted Boltzmann Machines, our algorithm shows
superior performance for occluded object classification.Comment: arXiv admin note: text overlap with arXiv:1409.8576 by other author
Learning Generative Models with Visual Attention
Attention has long been proposed by psychologists as important for
effectively dealing with the enormous sensory stimulus available in the
neocortex. Inspired by the visual attention models in computational
neuroscience and the need of object-centric data for generative models, we
describe for generative learning framework using attentional mechanisms.
Attentional mechanisms can propagate signals from region of interest in a scene
to an aligned canonical representation, where generative modeling takes place.
By ignoring background clutter, generative models can concentrate their
resources on the object of interest. Our model is a proper graphical model
where the 2D Similarity transformation is a part of the top-down process. A
ConvNet is employed to provide good initializations during posterior inference
which is based on Hamiltonian Monte Carlo. Upon learning images of faces, our
model can robustly attend to face regions of novel test subjects. More
importantly, our model can learn generative models of new faces from a novel
dataset of large images where the face locations are not known.Comment: In the proceedings of Neural Information Processing Systems, 201
MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction
In this work we propose a novel model-based deep convolutional autoencoder
that addresses the highly challenging problem of reconstructing a 3D human face
from a single in-the-wild color image. To this end, we combine a convolutional
encoder network with an expert-designed generative model that serves as
decoder. The core innovation is our new differentiable parametric decoder that
encapsulates image formation analytically based on a generative model. Our
decoder takes as input a code vector with exactly defined semantic meaning that
encodes detailed face pose, shape, expression, skin reflectance and scene
illumination. Due to this new way of combining CNN-based with model-based face
reconstruction, the CNN-based encoder learns to extract semantically meaningful
parameters from a single monocular input image. For the first time, a CNN
encoder and an expert-designed generative model can be trained end-to-end in an
unsupervised manner, which renders training on very large (unlabeled) real
world data feasible. The obtained reconstructions compare favorably to current
state-of-the-art approaches in terms of quality and richness of representation.Comment: International Conference on Computer Vision (ICCV) 2017 (Oral), 13
page
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
- …