24 research outputs found

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    Full text link
    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, such as computer vision (CV), speech recognition, and natural language processing. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing DL models. Comment: 64 pages, 411 references. To appear in the Journal of Applied Remote Sensing.
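
    As a rough illustration of challenge (vi), transfer learning, the sketch below fine-tunes an ImageNet-pretrained CNN on a small remote sensing scene-classification dataset. It is a minimal sketch assuming PyTorch/torchvision; the class count and the training step are hypothetical and not taken from the survey.

```python
# Minimal transfer-learning sketch for remote sensing scene classification.
# Assumes PyTorch/torchvision; dataset layout and class count are hypothetical.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 10  # hypothetical number of land-cover classes

# Start from ImageNet weights and replace the classifier head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Freeze the backbone; fine-tune only the new head on the small RS dataset.
for name, param in model.named_parameters():
    if not name.startswith("fc."):
        param.requires_grad = False

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One optimisation step on a batch of RS image chips."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```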

    Deep Vision in Optical Imagery: From Perception to Reasoning

    Get PDF
    Deep learning has achieved extraordinary success in a wide range of tasks in the computer vision field over the past years. Remote sensing data present different properties compared to natural images/videos, due to their unique imaging technique, shooting angle, etc. For instance, hyperspectral images usually have hundreds of spectral bands, offering additional information, and the size of objects (e.g., vehicles) in remote sensing images is quite limited, which brings challenges for detection or segmentation tasks. This thesis focuses on two kinds of remote sensing data, namely hyper/multi-spectral and high-resolution images, and explores several methods to find answers to the following questions: - In comparison with natural images or videos in computer vision, the unique asset of hyper/multi-spectral data is their rich spectral information. But what does this “additional” information bring for learning a network? And how do we take full advantage of these spectral bands? - Remote sensing images at high resolution have quite different characteristics, bringing challenges for several tasks, for example, small object segmentation. Can we devise tailored networks for such tasks? - Deep networks have produced stunning results in a variety of perception tasks, e.g., image classification, object detection, and semantic segmentation, while the capacity to reason about relations over space is vital for intelligent species. Can a network/module with the capacity for reasoning benefit the parsing of remote sensing data? To this end, a couple of networks are devised to figure out what a network learns from hyperspectral images and how to efficiently use spectral bands. In addition, a multi-task learning network is investigated for the instance segmentation of vehicles from aerial images and videos. Finally, relational reasoning modules are designed to improve semantic segmentation of aerial images.
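
    The relational reasoning idea mentioned above can be illustrated with a generic non-local (self-attention) block that lets every spatial position attend to every other position in a feature map. This is only a sketch of the general concept, assuming PyTorch; it is not the specific module proposed in the thesis.

```python
# A generic non-local (self-attention) block over spatial positions, sketched
# as one possible form of spatial relational reasoning; this is not the exact
# module from the thesis, just an illustration of the idea.
import torch
import torch.nn as nn

class SpatialRelationBlock(nn.Module):
    def __init__(self, channels, reduced=None):
        super().__init__()
        reduced = reduced or channels // 2
        self.query = nn.Conv2d(channels, reduced, 1)
        self.key = nn.Conv2d(channels, reduced, 1)
        self.value = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c')
        k = self.key(x).flatten(2)                     # (b, c', hw)
        v = self.value(x).flatten(2).transpose(1, 2)   # (b, hw, c)
        # Pairwise relations between all spatial positions.
        attn = torch.softmax(q @ k / (q.size(-1) ** 0.5), dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                 # residual connection

# Usage: refine a feature map from a segmentation backbone.
feats = torch.randn(2, 64, 32, 32)
refined = SpatialRelationBlock(64)(feats)
```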

    Leveraging Supervoxels for Medical Image Volume Segmentation With Limited Supervision

    Get PDF
    The majority of existing methods for machine learning-based medical image segmentation are supervised models that require large amounts of fully annotated images. These types of datasets are typically not available in the medical domain and are difficult and expensive to generate. Widespread use of machine learning-based models for medical image segmentation therefore requires the development of data-efficient algorithms that only require limited supervision. To address these challenges, this thesis presents new machine learning methodology for unsupervised lung tumor segmentation and few-shot learning-based organ segmentation. When working in the limited supervision paradigm, exploiting the available information in the data is key. The methodology developed in this thesis leverages automatically generated supervoxels in various ways to exploit the structural information in the images. The work on unsupervised tumor segmentation explores the opportunity of performing clustering at a population level in order to provide the algorithm with as much information as possible. To facilitate this population-level, across-patient clustering, supervoxel representations are exploited to reduce the number of samples, and thereby the computational cost. In the work on few-shot learning-based organ segmentation, supervoxels are used to generate pseudo-labels for self-supervised training. Further, to obtain a model that is robust to the typically large and inhomogeneous background class, a novel anomaly detection-inspired classifier is proposed to ease the modelling of the background. To encourage the resulting segmentation maps to respect edges defined in the input space, a supervoxel-informed feature refinement module is proposed to refine the embedded feature vectors during inference. Finally, to improve trustworthiness, an architecture-agnostic mechanism to estimate model uncertainty in few-shot segmentation is developed. Results demonstrate that supervoxels are versatile tools for leveraging structural information in medical data when training segmentation models with limited supervision.
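
    As an illustration of the supervoxel representations used throughout the thesis, the sketch below generates supervoxels on a 3D volume with SLIC from scikit-image and reduces the volume to one mean-intensity feature per region. The stand-in volume and parameter values are assumptions for illustration, not the settings used in the thesis.

```python
# A minimal sketch of supervoxel generation on a 3D medical volume with SLIC
# (scikit-image); parameter values are illustrative, not those of the thesis.
import numpy as np
from skimage.segmentation import slic

volume = np.random.rand(64, 128, 128).astype(np.float32)  # stand-in CT/MR volume

# channel_axis=None tells SLIC the input is a single-channel 3D volume,
# so each output label is a supervoxel rather than a 2D superpixel.
supervoxels = slic(volume, n_segments=500, compactness=0.1, channel_axis=None)

# Mean-intensity representation per supervoxel: one feature per region, which
# is how supervoxels reduce the number of samples (and the computational cost)
# for across-patient clustering.
labels = np.unique(supervoxels)
features = np.array([volume[supervoxels == lab].mean() for lab in labels])
print(features.shape)  # (number_of_supervoxels,)
```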

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Full text link
    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalization and chipping, strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including data augmentation, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery. Comment: 145 pages with 32 figures.
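
    Two of the pre-processing steps covered by the review, per-band normalisation and chipping a large scene into fixed-size tiles, can be sketched as below. The array shapes, chip size, and NumPy-only implementation are illustrative assumptions rather than recommendations from the paper.

```python
# A small sketch of two pre-processing steps for EO imagery:
# per-band normalisation and chipping into fixed-size tiles.
import numpy as np

def normalise_bands(scene: np.ndarray) -> np.ndarray:
    """Standardise each spectral band to zero mean, unit variance.
    scene: (bands, height, width)"""
    mean = scene.mean(axis=(1, 2), keepdims=True)
    std = scene.std(axis=(1, 2), keepdims=True) + 1e-8
    return (scene - mean) / std

def chip(scene: np.ndarray, size: int = 256):
    """Cut the scene into non-overlapping size x size chips (edges dropped)."""
    _, h, w = scene.shape
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield scene[:, y:y + size, x:x + size]

scene = np.random.rand(4, 1024, 1024).astype(np.float32)  # e.g. 4-band imagery
chips = list(chip(normalise_bands(scene)))
print(len(chips), chips[0].shape)  # 16 chips of shape (4, 256, 256)
```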

    Graph learning and its applications : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science, Massey University, Albany, Auckland, New Zealand

    Get PDF
    Since graph features consider the correlations between two data points to provide high-order information, i.e., more complex correlations than the low-order information that considers correlations within individual data points, they have attracted much attention in real applications. The key to graph feature extraction is graph construction. Previous studies have demonstrated that the quality of the graph usually determines the effectiveness of the graph feature. However, the graph is usually constructed from the original data, which often contain noise and redundancy. To address this issue, graph learning is designed to iteratively adjust the graph and model parameters, thereby improving the quality of the graph and outputting optimal model parameters. As a result, graph learning has become a very popular research topic in traditional machine learning and deep learning. Although previous graph learning methods have been applied in many fields by adding a graph regularization term to the objective function, they still have some issues to be addressed. This thesis focuses on the study of graph learning, aiming to overcome the drawbacks of previous methods for different applications. We list the proposed methods as follows. • We propose a traditional graph learning method under supervised learning to consider the robustness and interpretability of graph learning. Specifically, we propose utilizing self-paced learning to assign important samples large weights, conducting feature selection to remove redundant features, and learning a graph matrix from a low-dimensional representation of the original data to preserve the local structure of the data. As a consequence, both important samples and useful features are used to select support vectors in the SVM framework. • We propose a traditional graph learning method under semi-supervised learning to explore parameter-free fusion in graph learning. Specifically, we first employ the discrete wavelet transform and the Pearson correlation coefficient to obtain multiple fully connected Functional Connectivity brain Networks (FCNs) for every subject, and then learn a sparsely connected FCN for every subject. Finally, the ℓ1-SVM is employed to learn the important features and conduct disease diagnosis. • We propose a deep graph learning method to consider graph fusion in graph learning. Specifically, we first employ the Simple Linear Iterative Clustering (SLIC) method to obtain multi-scale features for every image, and then design a new graph fusion method to fine-tune the features at every scale. As a result, multi-scale feature fine-tuning, graph learning, and feature learning are embedded into a unified framework. All proposed methods are evaluated on real-world data sets and compared to state-of-the-art methods. Experimental results demonstrate that our methods outperform all comparison methods.
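
    The graph regularisation that such methods build on can be sketched as follows: construct a k-NN affinity graph from the data, form the graph Laplacian L = D - W, and evaluate the smoothness penalty tr(F^T L F). The toy data, neighbourhood size, and scikit-learn graph construction are illustrative assumptions, not the formulation used in the thesis.

```python
# A minimal sketch of graph construction plus a Laplacian smoothness term,
# the common ingredient of graph-regularised objectives.
import numpy as np
from sklearn.neighbors import kneighbors_graph

X = np.random.rand(100, 20)            # 100 samples, 20 features

# Symmetric k-NN graph with binary (connectivity) weights, for simplicity.
W = kneighbors_graph(X, n_neighbors=10, mode="connectivity").toarray()
W = np.maximum(W, W.T)

D = np.diag(W.sum(axis=1))             # degree matrix
L = D - W                              # unnormalised graph Laplacian

F = np.random.rand(100, 5)             # e.g. a low-dimensional embedding
smoothness = np.trace(F.T @ L @ F)     # the graph regularisation term
print(smoothness)
```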

    A comprehensive review of graph convolutional networks: approaches and applications

    Get PDF
    Convolutional neural networks (CNNs) utilize local translation invariance in the Euclidean domain and have achieved remarkable results in computer vision tasks. However, there are many data types with non-Euclidean structures, such as social networks, chemical molecules, knowledge graphs, etc., which are crucial to real-world applications. The graph convolutional neural network (GCN), a derivative of CNNs, was established for such non-Euclidean graph data. In this paper, we mainly survey the progress of GCNs and introduce in detail several basic models based on GCNs. First, we review the challenges in building GCNs, including large-scale graph data, directed graphs and multi-scale graph tasks. We also briefly discuss some applications of GCNs, including computer vision, transportation networks and other fields. Furthermore, we point out some open issues and highlight some future research trends for GCNs.
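
    A minimal sketch of the basic GCN propagation rule that such models build on, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W), is given below; the toy graph, dimensions, and NumPy implementation are illustrative assumptions.

```python
# A sketch of one graph convolution layer with self-loops and symmetric
# normalisation (Kipf & Welling style propagation rule).
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt       # symmetric normalisation
    return np.maximum(A_norm @ H @ W, 0)           # ReLU activation

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node chain graph
H = np.random.rand(3, 4)         # node features
W = np.random.rand(4, 2)         # learnable weights
print(gcn_layer(A, H, W).shape)  # (3, 2)
```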

    GEOBIA 2016: Solutions and Synergies, 14-16 September 2016, University of Twente Faculty of Geo-Information and Earth Observation (ITC): open access e-book

    Get PDF

    Machine learning strategies for diagnostic imaging support on histopathology and optical coherence tomography

    Full text link
    This thesis presents cutting-edge solutions based on computer vision (CV) and machine learning (ML) algorithms to assist experts in clinical diagnosis. It focuses on two relevant areas at the forefront of medical imaging: digital pathology and ophthalmology. This work proposes different machine learning and deep learning paradigms to address various supervisory scenarios in the study of prostate cancer, bladder cancer and glaucoma. In particular, conventional supervised methods are considered for segmenting and classifying prostate-specific structures in digitised histological images. For bladder-specific pattern recognition, fully unsupervised approaches based on deep-clustering techniques are carried out. Regarding glaucoma detection, long short-term memory algorithms (LSTMs) are applied to perform recurrent learning from spectral-domain optical coherence tomography (SD-OCT) volumes. Finally, the use of prototypical neural networks (PNNs) in a few-shot learning framework is proposed to determine the severity level of glaucoma from circumpapillary OCT images. The artificial intelligence (AI) methods detailed in this thesis provide a valuable tool to aid diagnostic imaging, whether for the histological diagnosis of prostate and bladder cancer or glaucoma assessment from OCT data.
    García Pardo, JG. (2022). Machine learning strategies for diagnostic imaging support on histopathology and optical coherence tomography [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/182400
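
    The prototypical-network classification rule used in the few-shot glaucoma setting can be sketched as follows: class prototypes are the mean support embeddings, and each query is assigned to the nearest prototype. The embedding dimension, episode size, and PyTorch implementation are illustrative assumptions; the embedding network from the thesis is not reproduced here.

```python
# A minimal sketch of prototypical-network classification in a few-shot episode.
import torch

def prototype_classify(support, support_labels, query, n_classes):
    """support: (n_support, d) embeddings; query: (n_query, d) embeddings."""
    prototypes = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_classes)]
    )                                              # (n_classes, d)
    dists = torch.cdist(query, prototypes)         # Euclidean distances
    return torch.softmax(-dists, dim=1)            # class probabilities

# Toy 3-way, 5-shot episode with 16-dimensional embeddings.
support = torch.randn(15, 16)
support_labels = torch.arange(3).repeat_interleave(5)
query = torch.randn(4, 16)
print(prototype_classify(support, support_labels, query, n_classes=3).shape)  # (4, 3)
```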