Transfer Learning from Deep Features for Remote Sensing and Poverty Mapping
The lack of reliable data in developing countries is a major obstacle to
sustainable development, food security, and disaster relief. Poverty data, for
example, is typically scarce, sparse in coverage, and labor-intensive to
obtain. Remote sensing data such as high-resolution satellite imagery, on the
other hand, is becoming increasingly available and inexpensive. Unfortunately,
such data is highly unstructured and currently no techniques exist to
automatically extract useful insights to inform policy decisions and help
direct humanitarian efforts. We propose a novel machine learning approach to
extract large-scale socioeconomic indicators from high-resolution satellite
imagery. The main challenge is that training data is very scarce, making it
difficult to apply modern techniques such as Convolutional Neural Networks
(CNNs). We therefore propose a transfer learning approach where nighttime light
intensities are used as a data-rich proxy. We train a fully convolutional CNN
model to predict nighttime lights from daytime imagery, simultaneously learning
features that are useful for poverty prediction. The model learns filters
identifying different terrains and man-made structures, including roads,
buildings, and farmlands, without any supervision beyond nighttime lights. We
demonstrate that these learned features are highly informative for poverty
mapping, even approaching the predictive performance of survey data collected
in the field.

Comment: In Proc. 30th AAAI Conference on Artificial Intelligence.
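As a concrete illustration of the proxy-task setup described above, the following is a minimal PyTorch sketch: a CNN is trained to predict binned nighttime-light intensity from daytime satellite patches, and its penultimate activations are then reused as features for a downstream poverty model. The ResNet-18 backbone and the three-bin labeling are stand-in assumptions chosen for brevity; the paper itself trains a fully convolutional VGG-style model.

    # Minimal sketch of the nightlight proxy task, assuming PyTorch/torchvision.
    # ResNet-18 and the 3-bin intensity labels are illustrative stand-ins,
    # not the paper's exact architecture or protocol.
    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_LIGHT_BINS = 3  # e.g. low / medium / high nighttime-light intensity

    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, NUM_LIGHT_BINS)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()

    def train_step(daytime_batch, nightlight_bins):
        """One step of the proxy task: predict nighttime-light intensity
        bins from a batch of daytime satellite patches."""
        optimizer.zero_grad()
        loss = criterion(model(daytime_batch), nightlight_bins)
        loss.backward()
        optimizer.step()
        return loss.item()

    def extract_features(daytime_batch):
        """After proxy training, drop the classifier head and reuse the
        pooled activations as features for the poverty predictor."""
        backbone = nn.Sequential(*list(model.children())[:-1])
        backbone.eval()
        with torch.no_grad():
            return backbone(daytime_batch).flatten(1)  # (B, 512)

The key design point is that nighttime lights serve only as a data-rich training signal: it is the learned filters, not the light predictions themselves, that transfer to poverty mapping.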
Zero-Shot Learning -- A Comprehensive Evaluation of the Good, the Bad and the Ugly
Due to the importance of zero-shot learning, i.e., classifying images from classes for which no labeled training data is available, the number of proposed approaches has
recently increased steadily. We argue that it is time to take a step back and
to analyze the status quo of the area. The purpose of this paper is three-fold.
First, given that there is no agreed-upon zero-shot learning benchmark, we define a new benchmark by unifying both the evaluation protocols and the data splits of publicly available datasets used for this task. This is an important contribution, as published results are often not comparable and sometimes even flawed due to, e.g., pre-training on zero-shot test classes.
Moreover, we propose a new zero-shot learning dataset, Animals with Attributes 2 (AWA2), which we make publicly available both in terms of image features and the images themselves. Second, we compare and analyze a significant number of state-of-the-art methods in depth, both in the classic zero-shot setting and in the more realistic generalized zero-shot setting. Finally, we discuss in detail the limitations of the current status of
the area, which can be taken as a basis for advancing it.

Comment: Accepted by TPAMI in July 2018. We introduce Proposed Split Version 2.0 (please download it from our project webpage). arXiv admin note: substantial text overlap with arXiv:1703.0439
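For readers unfamiliar with the protocol the benchmark standardizes, here is a minimal NumPy sketch of attribute-based zero-shot inference and the two metrics used throughout the paper: per-class averaged top-1 accuracy and, for the generalized setting, the harmonic mean of seen- and unseen-class accuracies. The linear compatibility matrix W is assumed to have already been learned on the seen classes; its training objective is method-specific and not shown.

    # Minimal sketch of generalized zero-shot inference and evaluation,
    # assuming precomputed image features and per-class attribute vectors
    # (as released with AWA2). W is a learned (d, a) compatibility matrix;
    # how it is trained depends on the method and is not shown here.
    import numpy as np

    def gzsl_predict(features, class_attributes, W):
        """features: (N, d) image features; class_attributes: (C, a) with
        seen AND unseen classes stacked; returns one class index per image."""
        scores = features @ W @ class_attributes.T  # (N, C) compatibilities
        return scores.argmax(axis=1)

    def per_class_accuracy(pred, labels):
        """Accuracy averaged over classes rather than images, so rare
        classes weigh as much as frequent ones."""
        classes = np.unique(labels)
        return float(np.mean([(pred[labels == c] == c).mean() for c in classes]))

    def harmonic_mean(acc_seen, acc_unseen):
        """Generalized zero-shot summary score: high only when the model
        does well on both seen and unseen test classes."""
        return 2 * acc_seen * acc_unseen / (acc_seen + acc_unseen)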
SEGCloud: Semantic Segmentation of 3D Point Clouds
3D semantic scene labeling is fundamental to agents operating in the real
world. In particular, labeling raw 3D point sets from sensors provides
fine-grained semantics. Recent works leverage the capabilities of Neural
Networks (NNs), but are limited to coarse voxel predictions and do not
explicitly enforce global consistency. We present SEGCloud, an end-to-end
framework to obtain 3D point-level segmentation that combines the advantages of
NNs, trilinear interpolation (TI), and fully connected Conditional Random Fields
(FC-CRF). Coarse voxel predictions from a 3D Fully Convolutional NN are
transferred back to the raw 3D points via trilinear interpolation. Then the
FC-CRF enforces global consistency and provides fine-grained semantics on the
points. We implement the latter as a differentiable Recurrent NN to allow joint
optimization. We evaluate the framework on two indoor and two outdoor 3D
datasets (NYU V2, S3DIS, KITTI, Semantic3D.net), and show performance
comparable or superior to the state of the art on all datasets.

Comment: Accepted as a spotlight at the International Conference on 3D Vision (3DV 2017).
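To make the voxel-to-point transfer concrete, below is a minimal NumPy sketch of the trilinear-interpolation step: class scores predicted on a coarse voxel grid are redistributed to the raw points from the 8 surrounding voxels. The grid origin, the voxel size, and the assumption that scores sit at voxel centers are illustrative parameters; the FC-CRF refinement that follows in SEGCloud is not shown.

    # Minimal sketch of trilinear interpolation from voxel scores to points,
    # assuming scores are located at voxel centers. Grid origin and voxel
    # size are illustrative parameters; the CRF stage is omitted.
    import numpy as np

    def voxel_scores_to_points(points, voxel_scores, origin, voxel_size):
        """points: (N, 3) raw coordinates; voxel_scores: (X, Y, Z, C)
        per-voxel class scores. Returns (N, C) per-point scores."""
        g = (points - origin) / voxel_size - 0.5  # continuous grid coords
        g0 = np.floor(g).astype(int)              # lower-corner voxel index
        frac = g - g0                             # position inside the cell

        X, Y, Z, C = voxel_scores.shape
        out = np.zeros((len(points), C))
        for dx in (0, 1):                         # visit the 8 cell corners
            for dy in (0, 1):
                for dz in (0, 1):
                    # Clamp indices so border points stay inside the grid.
                    idx = np.clip(g0 + [dx, dy, dz], 0, [X - 1, Y - 1, Z - 1])
                    # Weight is the product of per-axis linear weights.
                    w = (np.where(dx, frac[:, 0], 1 - frac[:, 0])
                         * np.where(dy, frac[:, 1], 1 - frac[:, 1])
                         * np.where(dz, frac[:, 2], 1 - frac[:, 2]))
                    out += w[:, None] * voxel_scores[idx[:, 0], idx[:, 1], idx[:, 2]]
        return out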