Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification
Designing discriminative powerful texture features robust to realistic
imaging conditions is a challenging computer vision problem with many
applications, including material recognition and analysis of satellite or
aerial imagery. In the past, most texture description approaches were based on
dense orderless statistical distribution of local features. However, most
recent approaches to texture recognition and remote sensing scene
classification are based on Convolutional Neural Networks (CNNs). The de facto
practice when learning these CNN models is to use RGB patches as input with
training performed on large amounts of labeled data (ImageNet). In this paper,
we show that Binary Patterns encoded CNN models, codenamed TEX-Nets, trained
using mapped coded images with explicit texture information provide
complementary information to the standard RGB deep models. Additionally, two
deep architectures, namely early and late fusion, are investigated to combine
the texture and color information. To the best of our knowledge, we are the
first to investigate Binary Patterns encoded CNNs and different deep network
fusion architectures for texture recognition and remote sensing scene
classification. We perform comprehensive experiments on four texture
recognition datasets and four remote sensing scene classification benchmarks:
UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with
7 categories and the recently introduced large scale aerial image dataset (AID)
with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary
information to the standard RGB deep model of the same network architecture. Our
late fusion TEX-Net architecture always improves the overall performance
compared to the standard RGB network on both recognition problems. Our final
combination outperforms the state-of-the-art without employing fine-tuning or
ensemble of RGB network architectures.
Comment: To appear in ISPRS Journal of Photogrammetry and Remote Sensing
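
As an illustration of the "Binary Patterns" that the abstract above refers to, here is a minimal sketch of the basic 8-neighbour Local Binary Pattern (LBP) code in NumPy. It is not the authors' TEX-Net pipeline: the function name `lbp_codes`, the neighbour ordering, and the `>=` comparison are assumptions chosen for the example, and the later step of mapping the codes into images that a CNN can consume is not shown.

```python
import numpy as np

def lbp_codes(gray):
    """Return basic 8-neighbour LBP codes for the interior pixels of a 2-D image.

    Hypothetical helper for illustration; the output has shape (H-2, W-2).
    """
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left neighbour.
    # The ordering is an assumption; each neighbour contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set the bit where the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes
```

A histogram of these codes gives a classical orderless texture descriptor. The abstract instead describes feeding mapped coded images to a CNN, which this sketch does not attempt.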
Transfer Learning from Deep Features for Remote Sensing and Poverty Mapping
The lack of reliable data in developing countries is a major obstacle to
sustainable development, food security, and disaster relief. Poverty data, for
example, is typically scarce, sparse in coverage, and labor-intensive to
obtain. Remote sensing data such as high-resolution satellite imagery, on the
other hand, is becoming increasingly available and inexpensive. Unfortunately,
such data is highly unstructured and currently no techniques exist to
automatically extract useful insights to inform policy decisions and help
direct humanitarian efforts. We propose a novel machine learning approach to
extract large-scale socioeconomic indicators from high-resolution satellite
imagery. The main challenge is that training data is very scarce, making it
difficult to apply modern techniques such as Convolutional Neural Networks
(CNN). We therefore propose a transfer learning approach where nighttime light
intensities are used as a data-rich proxy. We train a fully convolutional CNN
model to predict nighttime lights from daytime imagery, simultaneously learning
features that are useful for poverty prediction. The model learns filters
identifying different terrains and man-made structures, including roads,
buildings, and farmlands, without any supervision beyond nighttime lights. We
demonstrate that these learned features are highly informative for poverty
mapping, even approaching the predictive performance of survey data collected
in the field.
Comment: In Proc. 30th AAAI Conference on Artificial Intelligence
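
The second stage of the transfer learning idea described above, predicting a scarce target from features learned on a data-rich proxy, can be sketched as a small regression on frozen features. This is only an illustration: the abstract does not say which regressor is used, so the closed-form ridge regression below, the names `fit_ridge` and `predict`, and the `alpha` parameter are all assumptions, and the CNN that would produce the features is not shown.

```python
import numpy as np

def fit_ridge(features, targets, alpha=1.0):
    """Fit ridge regression on fixed (pre-extracted) features.

    Hypothetical helper: `features` is (n_samples, n_dims), `targets` is (n_samples,).
    Returns the weight vector and intercept.
    """
    X = np.asarray(features, dtype=np.float64)
    y = np.asarray(targets, dtype=np.float64)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean  # centre the features so the intercept is not penalised
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(A, Xc.T @ (y - y_mean))
    b = y_mean - x_mean @ w
    return w, b

def predict(features, w, b):
    """Apply the fitted linear model to new feature vectors."""
    return np.asarray(features, dtype=np.float64) @ w + b
```

With few labelled survey points, a regularised linear model on frozen features is a common low-variance choice, which is why it is used for this sketch.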
Aggregated Deep Local Features for Remote Sensing Image Retrieval
Remote Sensing Image Retrieval remains a challenging topic due to the special
nature of Remote Sensing Imagery. Such images contain various different
semantic objects, which clearly complicates the retrieval task. In this paper,
we present an image retrieval pipeline that uses attentive, local convolutional
features and aggregates them using the Vector of Locally Aggregated Descriptors
(VLAD) to produce a global descriptor. We study various system parameters such
as the multiplicative and additive attention mechanisms and descriptor
dimensionality. We propose a query expansion method that requires no external
inputs. Experiments demonstrate that even without training, the local
convolutional features and global representation outperform other systems.
After system tuning, we can achieve state-of-the-art or competitive results.
Furthermore, we observe that our query expansion method increases overall
system performance by about 3%, using only the top-three retrieved images.
Finally, we show how dimensionality reduction produces compact descriptors with
increased retrieval performance and fast retrieval computation times, e.g. 50%
faster than the current systems.
Comment: Published in Remote Sensing. The first two authors have equal contribution
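
The VLAD aggregation named in the abstract above can be sketched in a few lines of NumPy. This is a minimal version under stated assumptions: hard assignment to the nearest centroid, a single global L2 normalisation, and no attention weighting. The `expand_query` function is one plausible form of query expansion from the top-ranked results (summing the query with its top three matches); the abstract does not specify the authors' exact method, and both function names are hypothetical.

```python
import numpy as np

def vlad(descriptors, centroids):
    """Aggregate local descriptors (n, d) into one VLAD vector of length k * d."""
    X = np.asarray(descriptors, dtype=np.float64)
    C = np.asarray(centroids, dtype=np.float64)
    # Squared Euclidean distance from every descriptor to every centroid: (n, k).
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    v = np.zeros_like(C)
    for k in range(C.shape[0]):
        members = X[assign == k]
        if len(members):
            # Sum of residuals between the centroid and its assigned descriptors.
            v[k] = (members - C[k]).sum(axis=0)
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def expand_query(query_vec, database, top_k=3):
    """Assumed expansion: add the top_k most similar database vectors to the query."""
    sims = database @ query_vec              # dot-product similarity
    top = np.argsort(-sims)[:top_k]          # indices of the best matches
    expanded = query_vec + database[top].sum(axis=0)
    norm = np.linalg.norm(expanded)
    return expanded / norm if norm > 0 else expanded
```

In practice the centroids would come from k-means on a sample of local descriptors, and the resulting vector could be reduced with PCA to obtain the compact descriptors the abstract mentions.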
- …