Delving Deeper into Convolutional Networks for Learning Video Representations
We propose an approach to learn spatio-temporal features in videos from
intermediate visual representations we call "percepts" using
Gated-Recurrent-Unit Recurrent Networks (GRUs). Our method relies on percepts
that are extracted from all levels of a deep convolutional network trained on
the large ImageNet dataset. While high-level percepts contain highly
discriminative information, they tend to have a low-spatial resolution.
Low-level percepts, on the other hand, preserve a higher spatial resolution
from which we can model finer motion patterns. Using low-level percepts can
lead to high-dimensional video representations. To mitigate this effect and
control the number of model parameters, we introduce a variant of the GRU model
that leverages convolution operations to enforce sparse connectivity of the
model units and share parameters across the input spatial locations.
We empirically validate our approach on both Human Action Recognition and
Video Captioning tasks. In particular, we achieve results equivalent to
the state of the art on the YouTube2Text dataset using a simpler text-decoder
model and without extra 3D CNN features.
Comment: ICLR 201
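The convolutional GRU variant described in the abstract above can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: it assumes the standard GRU gate equations with every matrix product replaced by a zero-padded 2D convolution, omits biases, and uses hypothetical names (`conv2d_same`, `conv_gru_step`, `init_params`, `fc_gru_params`) and toy sizes chosen only for the example.

```python
import math
import random


def conv2d_same(x, w):
    """Zero-padded 2D cross-correlation.
    x: [C_in][H][W], w: [C_out][C_in][k][k] -> [C_out][H][W]."""
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    pad = k // 2
    out = [[[0.0] * wd for _ in range(h)] for _ in range(len(w))]
    for o, w_o in enumerate(w):
        for i in range(h):
            for j in range(wd):
                acc = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            ii, jj = i + di - pad, j + dj - pad
                            if 0 <= ii < h and 0 <= jj < wd:
                                acc += w_o[c][di][dj] * x[c][ii][jj]
                out[o][i][j] = acc
    return out


def zip_map(f, *tensors):
    """Apply f elementwise over equally shaped [C][H][W] nested lists."""
    return [[[f(*vals) for vals in zip(*rows)] for rows in zip(*chans)]
            for chans in zip(*tensors)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_params(c_in, c_hid, k=3, seed=0):
    """Six small random kernels: W* act on the input, U* on the hidden state."""
    rng = random.Random(seed)

    def kernel(c_out, c_from):
        return [[[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
                 for _ in range(c_from)] for _ in range(c_out)]

    p = {name: kernel(c_hid, c_in) for name in ("Wz", "Wr", "W")}
    p.update({name: kernel(c_hid, c_hid) for name in ("Uz", "Ur", "U")})
    return p


def conv_gru_step(x, h, p):
    """One GRU update in which all linear maps are convolutions (assumed form)."""
    z = zip_map(lambda a, b: sigmoid(a + b),
                conv2d_same(x, p["Wz"]), conv2d_same(h, p["Uz"]))  # update gate
    r = zip_map(lambda a, b: sigmoid(a + b),
                conv2d_same(x, p["Wr"]), conv2d_same(h, p["Ur"]))  # reset gate
    rh = zip_map(lambda a, b: a * b, r, h)
    cand = zip_map(lambda a, b: math.tanh(a + b),
                   conv2d_same(x, p["W"]), conv2d_same(rh, p["U"]))  # candidate
    return zip_map(lambda zv, hv, cv: (1.0 - zv) * hv + zv * cv, z, h, cand)


def count_params(p):
    """Total number of kernel weights; independent of the spatial size H x W."""
    return sum(len(w) * len(w[0]) * len(w[0][0]) * len(w[0][0][0])
               for w in p.values())


def fc_gru_params(c_in, c_hid, height, width):
    """Weight count of a fully connected GRU on the flattened maps (no biases)."""
    n_in, n_hid = c_in * height * width, c_hid * height * width
    return 3 * (n_in * n_hid + n_hid * n_hid)


# Toy run: 5 frames of 2-channel 4x4 "percepts", 3 hidden channels.
C_IN, C_HID, H, W, T = 2, 3, 4, 4, 5
params = init_params(C_IN, C_HID)
rng = random.Random(1)
frames = [[[[rng.random() for _ in range(W)] for _ in range(H)]
           for _ in range(C_IN)] for _ in range(T)]
h = [[[0.0] * W for _ in range(H)] for _ in range(C_HID)]
for x in frames:
    h = conv_gru_step(x, h, params)
```

With these toy sizes the convolutional cell has 405 weights regardless of the 4x4 spatial extent, whereas a fully connected GRU over the flattened maps would need 11,520, which illustrates the parameter control the abstract refers to. The hidden state keeps the input's spatial layout, so each unit only sees a local 3x3 neighbourhood per step.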
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin