88,962 research outputs found
Urban Land Cover Classification with Missing Data Modalities Using Deep Convolutional Neural Networks
Automatic urban land cover classification is a fundamental problem in remote
sensing, e.g. for environmental monitoring. The problem is highly challenging,
as classes generally have high inter-class and low intra-class variance.
Techniques to improve urban land cover classification performance in remote
sensing include fusion of data from different sensors with different data
modalities. However, such techniques require all modalities to be available to
the classifier in the decision-making process, i.e. at test time, as well as in
training. If a data modality is missing at test time, current state-of-the-art
approaches have in general no procedure available for exploiting information
from these modalities. This represents a waste of potentially useful
information. We propose as a remedy a convolutional neural network (CNN)
architecture for urban land cover classification which is able to embed all
available training modalities in a so-called hallucination network. The network
will in effect replace missing data modalities in the test phase, enabling
fusion capabilities even when data modalities are missing in testing. We
demonstrate the method using two datasets consisting of optical and digital
surface model (DSM) images. We simulate missing modalities by assuming that DSM
images are missing during testing. Our method outperforms both standard CNNs
trained only on optical images as well as an ensemble of two standard CNNs. We
further evaluate the potential of our method to handle situations where only
some DSM images are missing during testing. Overall, we show that we can
clearly exploit training time information of the missing modality during
testing
Learning Multimodal Latent Attributes
Abstract—The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity via transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multi-modal content and complex and unstructured nature relative to the density of annotations. To solve this problem, we (1) introduce a concept of semi-latent attribute space, expressing user-defined and latent attributes in a unified framework, and (2) propose a novel scalable probabilistic topic model for learning multi-modal semi-latent attributes, which dramatically reduces requirements for an exhaustive accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches for addressing a variety of realistic multimedia sparse data learning tasks including: multi-task learning, learning with label noise, N-shot transfer learning and importantly zero-shot learning
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
- …