2,693 research outputs found
Learning to semantically segment high-resolution remote sensing images
Land cover classification is a task that requires methods capable of learning high-level features while dealing with high volume of data. Overcoming these challenges, Convolutional Networks (ConvNets) can learn specific and adaptable features depending on the data while, at the same time, learn classifiers. In this work, we propose a novel technique to automatically perform pixel-wise land cover classification. To the best of our knowledge, there is no other work in the literature that perform pixel-wise semantic segmentation based on data-driven feature descriptors for high-resolution remote sensing images. The main idea is to exploit the power of ConvNet feature representations to learn how to semantically segment remote sensing images. First, our method learns each label in a pixel-wise manner by taking into account the spatial context of each pixel. In a predicting phase, the probability of a pixel belonging to a class is also estimated according to its spatial context and the learned patterns. We conducted a systematic evaluation of the proposed algorithm using two remote sensing datasets with very distinct properties. Our results show that the proposed algorithm provides improvements when compared to traditional and state-of-the-art methods that ranges from 5 to 15% in terms of accuracy
DAugNet: Unsupervised, Multi-source, Multi-target, and Life-long Domain Adaptation for Semantic Segmentation of Satellite Images
The domain adaptation of satellite images has recently gained an increasing
attention to overcome the limited generalization abilities of machine learning
models when segmenting large-scale satellite images. Most of the existing
approaches seek for adapting the model from one domain to another. However,
such single-source and single-target setting prevents the methods from being
scalable solutions, since nowadays multiple source and target domains having
different data distributions are usually available. Besides, the continuous
proliferation of satellite images necessitates the classifiers to adapt to
continuously increasing data. We propose a novel approach, coined DAugNet, for
unsupervised, multi-source, multi-target, and life-long domain adaptation of
satellite images. It consists of a classifier and a data augmentor. The data
augmentor, which is a shallow network, is able to perform style transfer
between multiple satellite images in an unsupervised manner, even when new data
are added over the time. In each training iteration, it provides the classifier
with diversified data, which makes the classifier robust to large data
distribution difference between the domains. Our extensive experiments prove
that DAugNet significantly better generalizes to new geographic locations than
the existing approaches
Dense semantic labeling of sub-decimeter resolution images with convolutional neural networks
Semantic labeling (or pixel-level land-cover classification) in ultra-high
resolution imagery (< 10cm) requires statistical models able to learn high
level concepts from spatial data, with large appearance variations.
Convolutional Neural Networks (CNNs) achieve this goal by learning
discriminatively a hierarchy of representations of increasing abstraction.
In this paper we present a CNN-based system relying on an
downsample-then-upsample architecture. Specifically, it first learns a rough
spatial map of high-level representations by means of convolutions and then
learns to upsample them back to the original resolution by deconvolutions. By
doing so, the CNN learns to densely label every pixel at the original
resolution of the image. This results in many advantages, including i)
state-of-the-art numerical accuracy, ii) improved geometric accuracy of
predictions and iii) high efficiency at inference time.
We test the proposed system on the Vaihingen and Potsdam sub-decimeter
resolution datasets, involving semantic labeling of aerial images of 9cm and
5cm resolution, respectively. These datasets are composed by many large and
fully annotated tiles allowing an unbiased evaluation of models making use of
spatial information. We do so by comparing two standard CNN architectures to
the proposed one: standard patch classification, prediction of local label
patches by employing only convolutions and full patch labeling by employing
deconvolutions. All the systems compare favorably or outperform a
state-of-the-art baseline relying on superpixels and powerful appearance
descriptors. The proposed full patch labeling CNN outperforms these models by a
large margin, also showing a very appealing inference time.Comment: Accepted in IEEE Transactions on Geoscience and Remote Sensing, 201
Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models
Recent advancements in foundation models (FMs), such as GPT-4 and LLaMA, have
attracted significant attention due to their exceptional performance in
zero-shot learning scenarios. Similarly, in the field of visual learning,
models like Grounding DINO and the Segment Anything Model (SAM) have exhibited
remarkable progress in open-set detection and instance segmentation tasks. It
is undeniable that these FMs will profoundly impact a wide range of real-world
visual learning tasks, ushering in a new paradigm shift for developing such
models. In this study, we concentrate on the remote sensing domain, where the
images are notably dissimilar from those in conventional scenarios. We
developed a pipeline that leverages multiple FMs to facilitate remote sensing
image semantic segmentation tasks guided by text prompt, which we denote as
Text2Seg. The pipeline is benchmarked on several widely-used remote sensing
datasets, and we present preliminary results to demonstrate its effectiveness.
Through this work, we aim to provide insights into maximizing the applicability
of visual FMs in specific contexts with minimal model tuning. The code is
available at https://github.com/Douglas2Code/Text2Seg.Comment: 10 pages, 6 figure
GeoAI-enhanced Techniques to Support Geographical Knowledge Discovery from Big Geospatial Data
abstract: Big data that contain geo-referenced attributes have significantly reformed the way that I process and analyze geospatial data. Compared with the expected benefits received in the data-rich environment, more data have not always contributed to more accurate analysis. “Big but valueless” has becoming a critical concern to the community of GIScience and data-driven geography. As a highly-utilized function of GeoAI technique, deep learning models designed for processing geospatial data integrate powerful computing hardware and deep neural networks into various dimensions of geography to effectively discover the representation of data. However, limitations of these deep learning models have also been reported when People may have to spend much time on preparing training data for implementing a deep learning model. The objective of this dissertation research is to promote state-of-the-art deep learning models in discovering the representation, value and hidden knowledge of GIS and remote sensing data, through three research approaches. The first methodological framework aims to unify varied shadow into limited number of patterns, with the convolutional neural network (CNNs)-powered shape classification, multifarious shadow shapes with a limited number of representative shadow patterns for efficient shadow-based building height estimation. The second research focus integrates semantic analysis into a framework of various state-of-the-art CNNs to support human-level understanding of map content. The final research approach of this dissertation focuses on normalizing geospatial domain knowledge to promote the transferability of a CNN’s model to land-use/land-cover classification. This research reports a method designed to discover detailed land-use/land-cover types that might be challenging for a state-of-the-art CNN’s model that previously performed well on land-cover classification only.Dissertation/ThesisDoctoral Dissertation Geography 201
- …