11,733 research outputs found
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
Do Convolutional Networks need to be Deep for Text Classification ?
We study in this work the importance of depth in convolutional models for
text classification, either when character or word inputs are considered. We
show on 5 standard text classification and sentiment analysis tasks that deep
models indeed give better performances than shallow networks when the text
input is represented as a sequence of characters. However, a simple
shallow-and-wide network outperforms deep models such as DenseNet with word
inputs. Our shallow word model further establishes new state-of-the-art
performances on two datasets: Yelp Binary (95.9\%) and Yelp Full (64.9\%)
Manitest: Are classifiers really invariant?
Invariance to geometric transformations is a highly desirable property of
automatic classifiers in many image recognition tasks. Nevertheless, it is
unclear to which extent state-of-the-art classifiers are invariant to basic
transformations such as rotations and translations. This is mainly due to the
lack of general methods that properly measure such an invariance. In this
paper, we propose a rigorous and systematic approach for quantifying the
invariance to geometric transformations of any classifier. Our key idea is to
cast the problem of assessing a classifier's invariance as the computation of
geodesics along the manifold of transformed images. We propose the Manitest
method, built on the efficient Fast Marching algorithm to compute the
invariance of classifiers. Our new method quantifies in particular the
importance of data augmentation for learning invariance from data, and the
increased invariance of convolutional neural networks with depth. We foresee
that the proposed generic tool for measuring invariance to a large class of
geometric transformations and arbitrary classifiers will have many applications
for evaluating and comparing classifiers based on their invariance, and help
improving the invariance of existing classifiers.Comment: BMVC 201
Efficient Image Gallery Representations at Scale Through Multi-Task Learning
Image galleries provide a rich source of diverse information about a product
which can be leveraged across many recommendation and retrieval applications.
We study the problem of building a universal image gallery encoder through
multi-task learning (MTL) approach and demonstrate that it is indeed a
practical way to achieve generalizability of learned representations to new
downstream tasks. Additionally, we analyze the relative predictive performance
of MTL-trained solutions against optimal and substantially more expensive
solutions, and find signals that MTL can be a useful mechanism to address
sparsity in low-resource binary tasks.Comment: Proceedings of the 43rd International ACM SIGIR Conference on
Research and Development in Information Retrieva
KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization
We consider the image classification problem via kernel collaborative
representation classification with locality constrained dictionary (KCRC-LCD).
Specifically, we propose a kernel collaborative representation classification
(KCRC) approach in which kernel method is used to improve the discrimination
ability of collaborative representation classification (CRC). We then measure
the similarities between the query and atoms in the global dictionary in order
to construct a locality constrained dictionary (LCD) for KCRC. In addition, we
discuss several similarity measure approaches in LCD and further present a
simple yet effective unified similarity measure whose superiority is validated
in experiments. There are several appealing aspects associated with LCD. First,
LCD can be nicely incorporated under the framework of KCRC. The LCD similarity
measure can be kernelized under KCRC, which theoretically links CRC and LCD
under the kernel method. Second, KCRC-LCD becomes more scalable to both the
training set size and the feature dimension. Example shows that KCRC is able to
perfectly classify data with certain distribution, while conventional CRC fails
completely. Comprehensive experiments on many public datasets also show that
KCRC-LCD is a robust discriminative classifier with both excellent performance
and good scalability, being comparable or outperforming many other
state-of-the-art approaches
- …