3 research outputs found
SRDA-Net: Super-Resolution Domain Adaptation Networks for Semantic Segmentation
Recently, Unsupervised Domain Adaptation was proposed to address the domain
shift problem in the semantic segmentation task, but it may perform poorly when
the source and target domains have different resolutions. In this work, we
design a novel end-to-end semantic segmentation network, Super-Resolution
Domain Adaptation Network (SRDA-Net), which could simultaneously complete
super-resolution and domain adaptation. This capability exactly matches the
needs of semantic segmentation for remote sensing images, which usually come in
various resolutions. SRDA-Net comprises three deep neural networks: a
Super-Resolution and Segmentation (SRS) model that recovers the high-resolution
image and predicts the segmentation map; a pixel-level domain classifier (PDC)
that tries to distinguish which domain each image comes from; and an
output-space domain classifier (ODC) that discriminates which domain each
pixel-label distribution comes from. PDC and ODC serve as the discriminators,
while SRS acts as the generator. Through adversarial learning, SRS learns to
align the source and target domains in pixel-level visual appearance and in the
output space. Experiments are conducted on two remote sensing datasets with
different resolutions. SRDA-Net performs favorably against the state-of-the-art
methods in terms of accuracy and visual quality. Code and models are available
at https://github.com/tangzhenjie/SRDA-Net
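The adversarial alignment described above can be illustrated with a minimal numpy sketch. This is an assumption-laden simplification, not the paper's implementation: it assumes binary cross-entropy objectives in which the discriminators (PDC/ODC) label source-domain outputs 0 and target-domain outputs 1, while the generator (SRS) tries to make target outputs look like source ones; all function names are invented for illustration.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy averaged over all pixels."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def discriminator_loss(d_source, d_target):
    """Discriminator (PDC/ODC) objective: label source outputs 0, target outputs 1."""
    return bce(d_source, np.zeros_like(d_source)) + bce(d_target, np.ones_like(d_target))

def generator_alignment_loss(d_target):
    """Generator (SRS) objective: fool the discriminator into labelling
    target-domain outputs as source (0)."""
    return bce(d_target, np.zeros_like(d_target))
```

When the discriminator is maximally uncertain (outputs 0.5 everywhere), the generator's adversarial loss reduces to ln 2, the usual GAN equilibrium value.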
Do Game Data Generalize Well for Remote Sensing Image Segmentation?
Despite the recent progress in deep learning and remote sensing image interpretation, adapting a deep learning model between different sources of remote sensing data still remains a challenge. This paper investigates an interesting question: do synthetic data generalize well for remote sensing image applications? To answer this question, we take building segmentation as an example by training a deep learning model on the city map of the well-known video game “Grand Theft Auto V” and then adapting the model to real-world remote sensing images. We propose a generative adversarial training based segmentation framework to improve the adaptability of the segmentation model. Our model consists of a CycleGAN model and a ResNet based segmentation network, where the former is a well-known image-to-image translation framework that learns a mapping from the game domain to the remote sensing domain, and the latter learns to predict pixel-wise building masks from the translated data. All models in our method can be trained in an end-to-end fashion, and the segmentation model can be trained without any ground truth reference for the real-world images. Experimental results on a public building segmentation dataset suggest the effectiveness of our adaptation method. Our method shows superiority over other state-of-the-art semantic segmentation methods, for example, Deeplab-v3 and UNet. Another advantage of our method is that by introducing semantic information into the image-to-image translation framework, the image style conversion can be further improved.
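The two training signals in this pipeline, cycle consistency for the image translator and per-pixel cross-entropy for the segmenter, can be sketched in a few lines of numpy. This is an illustrative simplification under assumed shapes and names, not the paper's actual code:

```python
import numpy as np

def cycle_consistency_loss(x, x_reconstructed):
    """L1 penalty between an image and its round-trip translation F(G(x)),
    the core CycleGAN constraint."""
    return float(np.mean(np.abs(x - x_reconstructed)))

def pixelwise_cross_entropy(probs, mask, eps=1e-7):
    """Cross-entropy between per-pixel class probabilities of shape (H, W, C)
    and an integer ground-truth mask of shape (H, W)."""
    probs = np.clip(probs, eps, 1.0)
    h, w = mask.shape
    # Pick, at every pixel, the probability assigned to the true class.
    picked = probs[np.arange(h)[:, None], np.arange(w)[None, :], mask]
    return float(-np.mean(np.log(picked)))
```

A perfect round-trip translation drives the cycle loss to zero, and a segmenter that puts all its mass on the correct class drives the cross-entropy to zero, which is the joint objective the framework optimizes adversarially.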
Comparative analysis of the use of deep learning on satellite images
Satellite image analysis is a field of geomatics that enables numerous observations of the Earth.
An important step in any observation is identifying the content of the image.
This step is normally performed by hand, which costs time and money.
With the advent of deep neural networks, GPUs with high computing capacity, and the growing amount of annotated satellite data, learning algorithms are now the most promising tools for the automatic analysis of satellite images.
This thesis presents a preliminary study of the application of convolutional networks to satellite images, along with two new methods intended to train neural networks on sparsely annotated satellite data.
For this purpose, two datasets from the International Society for Photogrammetry and Remote Sensing were used, comprising 40 images labelled with six classes.
The two major assets of these datasets are the wide variety of channels composing their images and the different locations (and therefore contexts) where the images were acquired.
We then present empirical answers to several practical questions about the expected performance of deep neural networks applied to satellite imagery.
Toward the end of the report, we present two techniques for combining several datasets by means of hierarchical class labels.
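The hierarchical-label idea for combining datasets can be illustrated with a tiny sketch: each dataset's fine-grained labels are mapped to a shared coarse parent class so that networks can be trained across both. The class names and the two-level hierarchy below are invented for illustration; they are not the actual label sets used in the thesis.

```python
# Fine label -> shared coarse parent (illustrative assumption only).
HIERARCHY = {
    # dataset A labels
    "building": "man-made",
    "low_vegetation": "nature",
    "car": "man-made",
    # dataset B labels
    "road": "man-made",
    "tree": "nature",
    "water": "nature",
}

def to_coarse(labels):
    """Map a sequence of fine per-pixel labels to the shared coarse level,
    making masks from different datasets comparable."""
    return [HIERARCHY[label] for label in labels]
```

Training at the coarse level lets both datasets contribute supervision, while the fine level remains available wherever a dataset defines it.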