
    Unlocking the capabilities of explainable few-shot learning in remote sensing

    Recent advancements have significantly improved the efficiency and effectiveness of deep learning methods for image-based remote sensing tasks. However, the requirement for large amounts of labeled data can limit the applicability of deep neural networks to existing remote sensing datasets. To overcome this challenge, few-shot learning has emerged as a valuable approach for enabling learning with limited data. While previous research has evaluated the effectiveness of few-shot learning methods on satellite-based datasets, little attention has been paid to exploring the applications of these methods to datasets obtained from UAVs, which are increasingly used in remote sensing studies. In this review, we provide an up-to-date overview of both existing and newly proposed few-shot classification techniques, along with appropriate datasets that are used for both satellite-based and UAV-based data. Our systematic approach demonstrates that few-shot learning can effectively adapt to the broader and more diverse perspectives that UAV-based platforms can provide. We also evaluate some state-of-the-art (SOTA) few-shot approaches on a UAV disaster scene classification dataset, yielding promising results. We emphasize the importance of integrating explainable AI (XAI) techniques, such as attention maps and prototype analysis, to increase the transparency, accountability, and trustworthiness of few-shot models for remote sensing. Key challenges and future research directions are identified, including tailored few-shot methods for UAVs, extension to unseen tasks such as segmentation, and the development of optimized XAI techniques suited to few-shot remote sensing problems. 
This review aims to provide researchers and practitioners with an improved understanding of few-shot learning's capabilities and limitations in remote sensing, while highlighting open problems to guide future progress in efficient, reliable, and interpretable few-shot methods. Comment: Under review; once the paper is accepted, the copyright will be transferred to the corresponding journal.
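The prototype analysis that this abstract highlights can be sketched minimally. The following is an illustrative prototypical-network-style classifier over precomputed embeddings (the arrays and names are hypothetical toy data, not drawn from the review): it averages the support embeddings of each class into a prototype and assigns every query to the nearest prototype.

```python
import numpy as np

def prototype_classify(support, support_labels, queries):
    """Assign each query embedding to the class of its nearest prototype,
    where a prototype is the mean support embedding of that class."""
    classes = np.unique(support_labels)
    # One prototype per class: the mean of its support embeddings.
    prototypes = np.stack([support[support_labels == c].mean(axis=0)
                           for c in classes])
    # Squared Euclidean distance from every query to every prototype.
    d = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return classes[d.argmin(axis=1)]

# Toy 2-way, 2-shot episode with 3-dimensional embeddings.
support = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0],
                    [1.0, 1.0, 1.0], [0.8, 1.0, 1.0]])
labels = np.array([0, 0, 1, 1])
queries = np.array([[0.1, 0.1, 0.0], [0.9, 0.9, 1.0]])
print(prototype_classify(support, labels, queries))  # -> [0 1]
```

Because the prediction is a distance to an interpretable class mean, the per-class distances themselves can serve as the kind of prototype-based explanation the review discusses.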

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools, and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial, and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing.

    GeoAI-enhanced Techniques to Support Geographical Knowledge Discovery from Big Geospatial Data

    abstract: Big data that contain geo-referenced attributes have significantly reformed the way we process and analyze geospatial data. Compared with the expected benefits of a data-rich environment, more data have not always contributed to more accurate analysis. "Big but valueless" has become a critical concern for the community of GIScience and data-driven geography. As a highly utilized function of GeoAI, deep learning models designed for processing geospatial data integrate powerful computing hardware and deep neural networks into various dimensions of geography to effectively discover the representation of data. However, limitations of these deep learning models have also been reported: practitioners may have to spend much time preparing training data before implementing a deep learning model. The objective of this dissertation research is to promote state-of-the-art deep learning models in discovering the representation, value, and hidden knowledge of GIS and remote sensing data, through three research approaches. The first methodological framework aims to unify multifarious shadow shapes into a limited number of representative shadow patterns through convolutional neural network (CNN)-powered shape classification, enabling efficient shadow-based building height estimation. The second research focus integrates semantic analysis into a framework of various state-of-the-art CNNs to support human-level understanding of map content. The final research approach of this dissertation focuses on normalizing geospatial domain knowledge to promote the transferability of a CNN model to land-use/land-cover classification. This research reports a method designed to discover detailed land-use/land-cover types that might be challenging for a state-of-the-art CNN model that previously performed well on land-cover classification only. Dissertation/Thesis. Doctoral Dissertation, Geography, 201

    CMIR-NET : A Deep Learning Based Model For Cross-Modal Retrieval In Remote Sensing

    We address the problem of cross-modal information retrieval in the domain of remote sensing. In particular, we are interested in two application scenarios: i) cross-modal retrieval between panchromatic (PAN) and multi-spectral imagery, and ii) multi-label image retrieval between very high resolution (VHR) images and speech-based label annotations. Notice that these multi-modal retrieval scenarios are more challenging than traditional uni-modal retrieval approaches, given the inherent differences in distributions between the modalities. However, with the growing availability of multi-source remote sensing data and the scarcity of semantic annotations, the task of multi-modal retrieval has recently become extremely important. In this regard, we propose a novel deep neural network-based architecture designed to learn a discriminative shared feature space for all the input modalities, suitable for semantically coherent information retrieval. Extensive experiments are carried out on the benchmark large-scale PAN - multi-spectral DSRSID dataset and the multi-label UC-Merced dataset. For the UC-Merced dataset, we also generate a corpus of speech signals corresponding to the labels. Superior performance with respect to the current state-of-the-art is observed in all cases.
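The shared-feature-space idea described above can be illustrated with a minimal sketch (this is not the proposed CMIR-NET architecture): two modality-specific encoders project differently-sized feature vectors into one common space, where retrieval reduces to cosine-similarity ranking. The linear encoders and feature dimensions below are hypothetical stand-ins for learned deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical linear encoders: 4-d PAN features and 8-d multi-spectral
# features are both mapped into a shared 3-d space. A trained model
# would learn these weights so matching pairs land close together.
W_pan = rng.standard_normal((4, 3))
W_ms = rng.standard_normal((8, 3))

def embed(x, W):
    """Project into the shared space and L2-normalize each row."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def rank_gallery(query_pan, gallery_ms):
    """Rank multi-spectral gallery items for each PAN query by cosine
    similarity in the shared space (best match first)."""
    sims = embed(query_pan, W_pan) @ embed(gallery_ms, W_ms).T
    return np.argsort(-sims, axis=1)

ranks = rank_gallery(rng.standard_normal((2, 4)),
                     rng.standard_normal((5, 8)))
print(ranks.shape)  # -> (2, 5): a full ranking of 5 items per query
```

The design point is that once both modalities live in one space, any nearest-neighbor index works unchanged regardless of which modality the query comes from.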

    Multitemporal Very High Resolution from Space: Outcome of the 2016 IEEE GRSS Data Fusion Contest

    In this paper, the scientific outcomes of the 2016 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society are discussed. The 2016 Contest was an open-topic competition based on a multitemporal and multimodal dataset, which included a temporal pair of very high resolution panchromatic and multispectral Deimos-2 images and a video captured by the Iris camera on board the International Space Station. The problems addressed and the techniques proposed by the participants in the Contest spanned a rather broad range of topics, mixing ideas and methodologies from remote sensing, video processing, and computer vision. In particular, the winning team developed a deep learning method to jointly address spatial scene labeling and temporal activity modeling using the available image and video data. The second-place team proposed a random field model to simultaneously perform coregistration of multitemporal data, semantic segmentation, and change detection. The methodological key ideas of both these approaches and the main results of the corresponding experimental validation are discussed in this paper.

    A Semi-supervised Learning Framework Based on CycleGAN for Very High Resolution Image Classification

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Civil and Environmental Engineering, 2021.8. Kim Yongil.
Image classification of Very High Resolution (VHR) images is a fundamental task in the remote sensing domain for various applications such as land cover mapping, vegetation mapping, and urban planning. In recent years, deep convolutional neural networks have shown promising performance in image classification studies. In particular, semantic segmentation models with fully convolutional architecture-based networks demonstrated great improvements in terms of computational cost, which has become especially important with the large accumulation of VHR images in recent years. However, deep learning-based approaches are generally limited by the need for a sufficient amount of labeled data to obtain stable accuracy, and acquiring reference labels of remotely-sensed VHR images is very labor-intensive and expensive. To overcome this problem, this thesis proposed a semi-supervised learning framework for VHR image classification. Semi-supervised learning uses both labeled and unlabeled data together, thus reducing the model's dependency on data labels. To address this issue, this thesis employed a modified CycleGAN model to utilize large amounts of unlabeled images. 
CycleGAN is an image translation model developed from Generative Adversarial Networks (GANs) for image generation. CycleGAN trains on unpaired datasets by using a cycle consistency loss with two generators and two discriminators. Inspired by the concept of cycle consistency, this thesis modified CycleGAN to enable the use of unlabeled VHR data in model training by considering the unlabeled images as images unpaired with their corresponding ground truth maps. To utilize a large amount of unlabeled VHR data and a relatively small amount of labeled VHR data, this thesis combined a supervised learning classification model with the modified CycleGAN architecture. The proposed framework contains three phases: a cyclic phase, an adversarial phase, and a supervised learning phase. Through these three phases, both labeled and unlabeled data can be utilized simultaneously to train the model in an end-to-end manner. The result of the proposed framework was evaluated using an open-source VHR image dataset, referred to as the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen dataset. To validate the accuracy of the proposed framework, benchmark models including both supervised and semi-supervised learning methods were compared on the same dataset. Furthermore, two additional experiments were conducted to confirm the impact of labeled and unlabeled data on classification accuracy and the adaptation of the CycleGAN model to other classification models. These results were evaluated by three popular metrics for image classification: Overall Accuracy (OA), F1-score, and mean Intersection over Union (mIoU). The proposed framework achieved the highest accuracy (OA: 0.796, 0.786, and 0.784, respectively, in three test sites) in comparison to the other five benchmarks. 
In particular, in a test site containing numerous objects with various properties, the largest increase in accuracy was observed, due to the regularization effect of the semi-supervised method using unlabeled data with the modified CycleGAN. Moreover, by controlling the amount of labeled and unlabeled data, results indicated that a relatively large amount of unlabeled data, compared to labeled data, is required to increase the accuracy when using the semi-supervised CycleGAN. Lastly, this thesis applied the proposed CycleGAN method to other classification models, such as the feature pyramid network (FPN) and the pyramid scene parsing network (PSPNet), in place of UNet. In all cases, the proposed framework returned significantly improved results, demonstrating its applicability for semi-supervised image classification on remotely-sensed VHR images.
Contents: 1. Introduction; 2. Background and Related Works (2.1. Deep Learning for Image Classification; 2.1.1. Image-level Classification; 2.1.2. Fully Convolutional Architectures; 2.1.3. Semantic Segmentation for Remote Sensing Images; 2.2. Generative Adversarial Networks (GAN); 2.2.1. Introduction to GAN; 2.2.2. Image Translation; 2.2.3. GAN for Semantic Segmentation); 3. Proposed Framework (3.1. Modification of CycleGAN; 3.2. Feed-forward Path of the Proposed Framework; 3.2.1. Cyclic Phase; 3.2.2. Adversarial Phase; 3.2.3. Supervised Learning Phase; 3.3. Loss Function for Back-propagation; 3.4. Proposed Network Architecture; 3.4.1. Generator Architecture; 3.4.2. Discriminator Architecture); 4. Experimental Design (4.1. Overall Workflow; 4.2. Vaihingen Dataset; 4.3. Implementation Details; 4.4. Metrics for Quantitative Evaluation); 5. Results and Discussion (5.1. Performance Evaluation of the Proposed Framework; 5.2. Comparison of Classification Performance in the Proposed Framework and Benchmarks; 5.3. Impact of Labeled and Unlabeled Data for Semi-supervised Learning; 5.4. Cycle Consistency in Semi-supervised Learning; 5.5. Adaptation of the GAN Framework for Other Classification Models); 6. Conclusion; References; Abstract in Korean
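The cycle consistency loss this framework builds on can be illustrated with a minimal sketch. The one-dimensional toy generators below are hypothetical stand-ins for CycleGAN's two image generators; the loss simply measures, with L1 reconstruction error in both directions, how well each mapping inverts the other.

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """L1 cycle loss in the CycleGAN style: F(G(x)) should reconstruct
    x (forward cycle) and G(F(y)) should reconstruct y (backward cycle)."""
    forward = np.abs(F(G(x)) - x).mean()
    backward = np.abs(G(F(y)) - y).mean()
    return forward + backward

# Toy generators on scalars instead of images: G shifts domain X to Y
# by adding 1, F shifts back. They are exact inverses, so the loss is 0.
G = lambda a: a + 1.0
F = lambda a: a - 1.0
x = np.array([0.0, 2.0])
y = np.array([1.0, 3.0])
print(cycle_consistency_loss(x, y, G, F))  # -> 0.0
```

In the thesis's setting, one "domain" is the image and the other its label map, so unlabeled images can still contribute a training signal through this reconstruction constraint even when no paired ground truth exists.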

    Deep learning-based change detection in remote sensing images: a review

    Images gathered from different satellites are vastly available these days due to the fast development of remote sensing (RS) technology. These images significantly enhance the data sources of change detection (CD). CD is a technique for recognizing dissimilarities between images acquired at distinct intervals and is used for numerous applications, such as urban area development, disaster management, and land cover object identification. In recent years, deep learning (DL) techniques have been used tremendously in change detection processes, achieving great success in practical applications. Some researchers have even claimed that DL approaches outperform traditional approaches and enhance change detection accuracy. Therefore, this review focuses on deep learning techniques, such as supervised, unsupervised, and semi-supervised methods, for different change detection datasets, such as SAR, multispectral, hyperspectral, VHR, and heterogeneous images, and their advantages and disadvantages are highlighted. In the end, some significant challenges are discussed to understand the context of improvements in change detection datasets and deep learning models. Overall, this review will be beneficial for the future development of CD methods.
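For context on the task itself, the classical difference-image baseline, which the reviewed DL methods replace with learned features, can be sketched as follows; the threshold value and the toy bi-temporal arrays are illustrative assumptions only.

```python
import numpy as np

def change_map(img_t1, img_t2, threshold=0.2):
    """Classical difference-image change detection: the per-pixel
    magnitude of the spectral difference between two dates is
    thresholded into a binary change/no-change map."""
    diff = np.linalg.norm(img_t2.astype(float) - img_t1.astype(float),
                          axis=-1)
    return diff > threshold

# Toy 2x2 scene with 3 spectral bands; one pixel changes between dates.
t1 = np.zeros((2, 2, 3))
t2 = np.zeros((2, 2, 3))
t2[0, 0] = 0.5
print(change_map(t1, t2))  # only pixel (0, 0) is flagged as changed
```

The weakness of this hand-set rule (sensitivity to the threshold, to illumination, and to registration errors) is exactly what motivates the supervised, unsupervised, and semi-supervised DL methods the review surveys.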