
    Application of Convolutional Neural Network in the Segmentation and Classification of High-Resolution Remote Sensing Images

    Numerous convolutional neural networks increase classification accuracy for remote sensing scene images at the expense of the model's space and time complexity. This makes the model run slowly and prevents a satisfactory trade-off between model accuracy and running time. Moreover, the loss of deep features as the network deepens makes it impossible to recover the key features with a simple double-branching structure, which is detrimental to the classification of remote sensing scene images.

    Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification

    Designing discriminative, powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distributions of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input, with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Binary Patterns encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit texture information, provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories, and the recently introduced large-scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to a standard RGB deep model of the same network architecture. Our late-fusion TEX-Net architecture consistently improves the overall performance compared to the standard RGB network on both recognition problems. Our final combination outperforms the state of the art without employing fine-tuning or an ensemble of RGB network architectures. Comment: To appear in ISPRS Journal of Photogrammetry and Remote Sensing.
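The early/late fusion distinction above can be sketched in miniature. This is a hypothetical illustration, not the TEX-Net code: the "branches" are toy stand-ins for full CNNs, and `lbp_code` is a simplified 8-neighbour local binary pattern. Early fusion stacks the texture-coded map with the RGB channels before feature extraction; late fusion extracts features per branch and concatenates them before the classifier.

```python
# Toy sketch of early vs. late fusion of an RGB branch and an LBP texture
# branch. Images are nested lists: each channel is a 2D grid of numbers.

def lbp_code(gray):
    """Simplified 8-neighbour local binary pattern for a 2D grayscale grid."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            center = gray[i][j]
            code = 0
            for bit, (di, dj) in enumerate(offsets):
                if gray[i + di][j + dj] >= center:
                    code |= 1 << bit
            out[i][j] = code
    return out

def branch_features(channels):
    """Stand-in for a CNN branch: one mean feature per input channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in channels]

def early_fusion(rgb, gray):
    # one branch sees RGB + texture channels stacked together at the input
    return branch_features(rgb + [lbp_code(gray)])

def late_fusion(rgb, gray):
    # separate branches; their features are concatenated before the classifier
    return branch_features(rgb) + branch_features([lbp_code(gray)])
```

In the paper's late-fusion TEX-Net the concatenated features feed a shared classifier; the toy `branch_features` here only marks where the fusion point sits in each scheme.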

    Optimum Pipeline for Visual Terrain Classification Using Improved Bag of Visual Words and Fusion Methods

    We propose an optimum pipeline and develop a hybrid representation to produce an effective and efficient visual terrain classification system. The bag of visual words (BOVW) framework has emerged as a promising and effective paradigm for visual terrain classification. The method includes four main steps: (1) feature extraction, (2) codebook generation, (3) feature coding, and (4) pooling and normalization. Recent research has primarily focused on feature extraction, developing new handcrafted descriptors specific to visual terrain. However, the effects of the other steps on visual terrain classification are still unknown. At the same time, fusion methods are often used to boost classification performance by exploiting the complementarity of diverse features. We provide a comprehensive study of all steps in the BOVW framework and of different fusion methods for visual terrain classification. Multiple approaches for each step and their effects are explored on the visual terrain dataset. Finally, the feature preprocessing technique, improved BOVW framework, and fusion method are combined into an optimum pipeline for visual terrain classification. The hybrid representation produced by this pipeline performs effectively and rapidly on the terrain dataset, outperforming current methods. Furthermore, it is robust to diverse noises and illumination alterations.
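The four BOVW steps listed above can be sketched end to end. This is a minimal stand-in, not the paper's pipeline: local descriptors are assumed to be given (step 1), the codebook is built by randomly sampling descriptors instead of k-means (step 2), coding uses hard assignment to the nearest codeword (step 3), and pooling is a histogram with L1 normalization (step 4).

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_codebook(descriptors, k, seed=0):
    # Step 2 stand-in: sample k descriptors as codewords (real pipelines
    # typically run k-means here).
    rng = random.Random(seed)
    return rng.sample(descriptors, k)

def encode(descriptors, codebook):
    # Step 3: hard-assign each descriptor to its nearest codeword.
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda i: sq_dist(d, codebook[i]))
        hist[nearest] += 1.0
    # Step 4: sum pooling followed by L1 normalization.
    total = sum(hist) or 1.0
    return [h / total for h in hist]
```

A real system would feed the normalized histogram to a classifier such as an SVM; swapping the coding or pooling variant changes only the `encode` step, which is what makes the framework convenient for the step-by-step study the paper performs.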

    Integration of stacked-autoencoders and convolutional neural networks for hyperspectral image classification

    Advisor: Prof. Dr. Jorge Antônio Silva Centeno. Doctoral thesis, Universidade Federal do Paraná, Setor de Ciências da Terra, Graduate Program in Geodetic Sciences. Defense: Curitiba, 24/05/2021. References: p. 97-103. Abstract: Deep learning opened new possibilities for hyperspectral data pre-processing, processing, and analysis using multiple neural network layers, and can be used as a feature extraction tool. In this research, a pixel-based hybrid model is developed that integrates Stacked Autoencoders (SAE) and Convolutional Neural Networks (CNN) for hyperspectral image classification. The core of the integrated model (SAE-1DCNN) is an autoencoder improved by using convolutional layers in the encoding and decoding steps. This improves data discrimination during unsupervised training and reduces processing time, because it allows a feature-based description of each pixel's hyperspectral signature and takes advantage of the effectiveness of a deep architecture based on convolutional and pooling layers. As one-dimensional filters are applied, the processing time is considerably lower than with 2D-CNN filters. In a first stage, the SAE-1DCNN model is used for feature extraction, and these results are then used in a final supervised classification stage. Thus, in the first stage the parameters of the network are adjusted using training samples, and in the second stage a fine-tuning approach followed by logistic regression based on the softmax activation function is applied for classification. Three aspects are analyzed in detail: the capacity of the model to exclude noisy bands, its ability to reduce dimensionality, and its potential to perform land cover classification from hyperspectral data. Experiments were performed using different hyperspectral data sets: Indian Pines, University of Pavia, and Salinas, widely used by the scientific community, and a hyperspectral image captured at the Canguiri Farm of the Federal University of Paraná (UFPR) in Paraná, Brazil. To validate the proposed methodology, the results were compared to traditional machine learning methods to verify the potential of integrating autoencoders (AE) and convolutional networks. The results showed accuracy similar to traditional methods for hyperspectral classification while demanding less processing time; the proposed SAE-1DCNN methodology is therefore considered promising and solid, and can be an alternative for hyperspectral data pre-processing and processing.
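The encoder idea behind SAE-1DCNN can be illustrated with a toy sketch (not the thesis code): a 1D convolution slides along a single pixel's spectral signature, and max pooling then halves the length, so stacked stages progressively compress the spectrum. The filter weights below are arbitrary placeholders; in the SAE-1DCNN they are learned during unsupervised autoencoder training.

```python
# Toy 1D conv + max-pool encoder over a pixel's spectral signature.

def conv1d(signal, kernel):
    """Valid-mode 1D convolution (no padding)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool(signal, size=2):
    """Non-overlapping max pooling with the given window size."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]

def spectral_encode(signal, kernels):
    """One conv + pool stage per kernel, stacked into an encoder."""
    out = signal
    for kern in kernels:
        out = max_pool(conv1d(out, kern))
    return out
```

For example, a 200-band signature passed through two stages with length-3 kernels compresses to 48 values, which is the kind of dimensionality reduction the encoder provides before the supervised softmax stage. Because each filter slides along one axis only, the cost grows with the number of bands rather than with a 2D window, which is why the 1D variant is faster than 2D-CNN filtering.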

    Large Area Land Cover Mapping Using Deep Neural Networks and Landsat Time-Series Observations

    This dissertation focuses on the analysis and implementation of deep learning methodologies in the field of remote sensing to enhance land cover classification accuracy, which has important applications in many areas of environmental planning and natural resources management. The first manuscript conducted a land cover analysis on 26 Landsat scenes in the United States, considering six classifier variants. An extensive grid search was conducted to optimize classifier parameters using only the spectral components of each pixel. Results showed no gain from deep networks over conventional classifiers when using only spectral components, possibly due to the small reference sample size and richness of features. The effects of changing training data size, class distribution, and scene heterogeneity were also studied, and all were found to have a significant effect on classifier accuracy. The second manuscript reviewed 103 research papers on the application of deep learning methodologies in remote sensing, with emphasis on per-pixel classification of mono-temporal data utilizing the spectral and spatial data dimensions. A meta-analysis quantified the improvement of deep network architectures over selected conventional classifiers. The effects of network size, learning methodology, input data dimensionality, and training data size were also studied, with deep models providing enhanced performance over conventional ones when using spectral and spatial data. The analysis found that the input dataset was a major limitation and that available datasets had already been utilized to their maximum capacity. The third manuscript described the steps to build the full environment for dataset generation based on Landsat time-series data, using the spectral, spatial, and temporal information available for each pixel.
A large dataset containing one sample block from each of 84 ecoregions in the conterminous United States (CONUS) was created and then processed by a hybrid convolutional+recurrent deep network, whose structure was optimized with thousands of simulations. The developed model achieved an overall accuracy of 98% on the test dataset. The model was also evaluated for its overall and per-class performance under different conditions, including individual blocks, individual or combined Landsat sensors, and different sequence lengths. The analysis found that although the deep model's per-block performance is superior to the other candidates, it still varies considerably from block to block, suggesting an extension of the work through model fine-tuning for local areas. The analysis also found that including more timestamps or combining observations from different Landsat sensors in the model input significantly enhances model performance.
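The hybrid convolutional+recurrent flow described above can be sketched schematically. This is a hypothetical stand-in, not the dissertation's model: a per-pixel sample is a sequence of spectral vectors (one per Landsat observation), a toy "conv" summarizes each timestep's bands, and a toy recurrent cell (an exponential moving average) carries state across the sequence; the real model uses learned CNN and RNN layers in place of both.

```python
# Schematic of the hybrid flow: convolution along the spectral axis per
# timestep, then a recurrent pass along the time axis of the sequence.

def conv_step(bands, kernel=(0.25, 0.5, 0.25)):
    """Toy 1D convolution over one timestep's spectral bands."""
    k = len(kernel)
    return [sum(bands[i + j] * kernel[j] for j in range(k))
            for i in range(len(bands) - k + 1)]

def recurrent_pass(sequence, alpha=0.5):
    """Toy recurrent cell: blend each timestep's features into a state."""
    state = None
    for bands in sequence:            # time axis (one Landsat observation each)
        feats = conv_step(bands)      # spectral axis
        if state is None:
            state = feats
        else:
            state = [alpha * f + (1 - alpha) * s
                     for f, s in zip(feats, state)]
    return state                      # final hidden state -> classifier head
```

The sketch makes the finding about sequence length concrete: each extra timestamp is one more iteration of the recurrent loop, so longer sequences (or merged multi-sensor sequences) give the recurrent state more observations to integrate before classification.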