5,321 research outputs found

    Tile2Vec: Unsupervised representation learning for spatially distributed data

    Full text link
    Geospatial analysis lacks methods like the word vector representations and pre-trained networks that significantly boost performance across a wide range of natural language and computer vision tasks. To fill this gap, we introduce Tile2Vec, an unsupervised representation learning algorithm that extends the distributional hypothesis from natural language -- words appearing in similar contexts tend to have similar meanings -- to spatially distributed data. We demonstrate empirically that Tile2Vec learns semantically meaningful representations on three datasets. Our learned representations significantly improve performance in downstream classification tasks and, similar to word vectors, visual analogies can be obtained via simple arithmetic in the latent space.Comment: 8 pages, 4 figures in main text; 9 pages, 11 figures in appendi

    Advances in Hyperspectral Image Classification Methods for Vegetation and Agricultural Cropland Studies

    Get PDF
    Hyperspectral data are becoming more widely available via sensors on airborne and unmanned aerial vehicle (UAV) platforms, as well as proximal platforms. While space-based hyperspectral data continue to be limited in availability, multiple spaceborne Earth-observing missions on traditional platforms are scheduled for launch, and companies are experimenting with small satellites for constellations to observe the Earth, as well as for planetary missions. Land cover mapping via classification is one of the most important applications of hyperspectral remote sensing and will increase in significance as time series of imagery are more readily available. However, while the narrow bands of hyperspectral data provide new opportunities for chemistry-based modeling and mapping, challenges remain. Hyperspectral data are high dimensional, and many bands are highly correlated or irrelevant for a given classification problem. For supervised classification methods, the quantity of training data is typically limited relative to the dimension of the input space. The resulting Hughes phenomenon, often referred to as the curse of dimensionality, increases potential for unstable parameter estimates, overfitting, and poor generalization of classifiers. This is particularly problematic for parametric approaches such as Gaussian maximum likelihoodbased classifiers that have been the backbone of pixel-based multispectral classification methods. This issue has motivated investigation of alternatives, including regularization of the class covariance matrices, ensembles of weak classifiers, development of feature selection and extraction methods, adoption of nonparametric classifiers, and exploration of methods to exploit unlabeled samples via semi-supervised and active learning. Data sets are also quite large, motivating computationally efficient algorithms and implementations. This chapter provides an overview of the recent advances in classification methods for mapping vegetation using hyperspectral data. Three data sets that are used in the hyperspectral classification literature (e.g., Botswana Hyperion satellite data and AVIRIS airborne data over both Kennedy Space Center and Indian Pines) are described in Section 3.2 and used to illustrate methods described in the chapter. An additional high-resolution hyperspectral data set acquired by a SpecTIR sensor on an airborne platform over the Indian Pines area is included to exemplify the use of new deep learning approaches, and a multiplatform example of airborne hyperspectral data is provided to demonstrate transfer learning in hyperspectral image classification. Classical approaches for supervised and unsupervised feature selection and extraction are reviewed in Section 3.3. In particular, nonlinearities exhibited in hyperspectral imagery have motivated development of nonlinear feature extraction methods in manifold learning, which are outlined in Section 3.3.1.4. Spatial context is also important in classification of both natural vegetation with complex textural patterns and large agricultural fields with significant local variability within fields. Approaches to exploit spatial features at both the pixel level (e.g., co-occurrencebased texture and extended morphological attribute profiles [EMAPs]) and integration of segmentation approaches (e.g., HSeg) are discussed in this context in Section 3.3.2. Recently, classification methods that leverage nonparametric methods originating in the machine learning community have grown in popularity. An overview of both widely used and newly emerging approaches, including support vector machines (SVMs), Gaussian mixture models, and deep learning based on convolutional neural networks is provided in Section 3.4. Strategies to exploit unlabeled samples, including active learning and metric learning, which combine feature extraction and augmentation of the pool of training samples in an active learning framework, are outlined in Section 3.5. Integration of image segmentation with classification to accommodate spatial coherence typically observed in vegetation is also explored, including as an integrated active learning system. Exploitation of multisensor strategies for augmenting the pool of training samples is investigated via a transfer learning framework in Section 3.5.1.2. Finally, we look to the future, considering opportunities soon to be provided by new paradigms, as hyperspectral sensing is becoming common at multiple scales from ground-based and airborne autonomous vehicles to manned aircraft and space-based platforms

    An uncertainty prediction approach for active learning - application to earth observation

    Get PDF
    Mapping land cover and land usage dynamics are crucial in remote sensing since farmers are encouraged to either intensify or extend crop use due to the ongoing rise in the world’s population. A major issue in this area is interpreting and classifying a scene captured in high-resolution satellite imagery. Several methods have been put forth, including neural networks which generate data-dependent models (i.e. model is biased toward data) and static rule-based approaches with thresholds which are limited in terms of diversity(i.e. model lacks diversity in terms of rules). However, the problem of having a machine learning model that, given a large amount of training data, can classify multiple classes over different geographic Sentinel-2 imagery that out scales existing approaches remains open. On the other hand, supervised machine learning has evolved into an essential part of many areas due to the increasing number of labeled datasets. Examples include creating classifiers for applications that recognize images and voices, anticipate traffic, propose products, act as a virtual personal assistant and detect online fraud, among many more. Since these classifiers are highly dependent from the training datasets, without human interaction or accurate labels, the performance of these generated classifiers with unseen observations is uncertain. Thus, researchers attempted to evaluate a number of independent models using a statistical distance. However, the problem of, given a train-test split and classifiers modeled over the train set, identifying a prediction error using the relation between train and test sets remains open. Moreover, while some training data is essential for supervised machine learning, what happens if there is insufficient labeled data? After all, assigning labels to unlabeled datasets is a time-consuming process that may need significant expert human involvement. When there aren’t enough expert manual labels accessible for the vast amount of openly available data, active learning becomes crucial. However, given a large amount of training and unlabeled datasets, having an active learning model that can reduce the training cost of the classifier and at the same time assist in labeling new data points remains an open problem. From the experimental approaches and findings, the main research contributions, which concentrate on the issue of optical satellite image scene classification include: building labeled Sentinel-2 datasets with surface reflectance values; proposal of machine learning models for pixel-based image scene classification; proposal of a statistical distance based Evidence Function Model (EFM) to detect ML models misclassification; and proposal of a generalised sampling approach for active learning that, together with the EFM enables a way of determining the most informative examples. Firstly, using a manually annotated Sentinel-2 dataset, Machine Learning (ML) models for scene classification were developed and their performance was compared to Sen2Cor the reference package from the European Space Agency – a micro-F1 value of 84% was attained by the ML model, which is a significant improvement over the corresponding Sen2Cor performance of 59%. Secondly, to quantify the misclassification of the ML models, the Mahalanobis distance-based EFM was devised. This model achieved, for the labeled Sentinel-2 dataset, a micro-F1 of 67.89% for misclassification detection. Lastly, EFM was engineered as a sampling strategy for active learning leading to an approach that attains the same level of accuracy with only 0.02% of the total training samples when compared to a classifier trained with the full training set. With the help of the above-mentioned research contributions, we were able to provide an open-source Sentinel-2 image scene classification package which consists of ready-touse Python scripts and a ML model that classifies Sentinel-2 L1C images generating a 20m-resolution RGB image with the six studied classes (Cloud, Cirrus, Shadow, Snow, Water, and Other) giving academics a straightforward method for rapidly and effectively classifying Sentinel-2 scene images. Additionally, an active learning approach that uses, as sampling strategy, the observed prediction uncertainty given by EFM, will allow labeling only the most informative points to be used as input to build classifiers; Sumário: Uma Abordagem de Previsão de Incerteza para Aprendizagem Ativa – Aplicação à Observação da Terra O mapeamento da cobertura do solo e a dinâmica da utilização do solo são cruciais na deteção remota uma vez que os agricultores são incentivados a intensificar ou estender as culturas devido ao aumento contínuo da população mundial. Uma questão importante nesta área é interpretar e classificar cenas capturadas em imagens de satélite de alta resolução. Várias aproximações têm sido propostas incluindo a utilização de redes neuronais que produzem modelos dependentes dos dados (ou seja, o modelo é tendencioso em relação aos dados) e aproximações baseadas em regras que apresentam restrições de diversidade (ou seja, o modelo carece de diversidade em termos de regras). No entanto, a criação de um modelo de aprendizagem automática que, dada uma uma grande quantidade de dados de treino, é capaz de classificar, com desempenho superior, as imagens do Sentinel-2 em diferentes áreas geográficas permanece um problema em aberto. Por outro lado, têm sido utilizadas técnicas de aprendizagem supervisionada na resolução de problemas nas mais diversas áreas de devido à proliferação de conjuntos de dados etiquetados. Exemplos disto incluem classificadores para aplicações que reconhecem imagem e voz, antecipam tráfego, propõem produtos, atuam como assistentes pessoais virtuais e detetam fraudes online, entre muitos outros. Uma vez que estes classificadores são fortemente dependente do conjunto de dados de treino, sem interação humana ou etiquetas precisas, o seu desempenho sobre novos dados é incerta. Neste sentido existem propostas para avaliar modelos independentes usando uma distância estatística. No entanto, o problema de, dada uma divisão de treino-teste e um classificador, identificar o erro de previsão usando a relação entre aqueles conjuntos, permanece aberto. Mais ainda, embora alguns dados de treino sejam essenciais para a aprendizagem supervisionada, o que acontece quando a quantidade de dados etiquetados é insuficiente? Afinal, atribuir etiquetas é um processo demorado e que exige perícia, o que se traduz num envolvimento humano significativo. Quando a quantidade de dados etiquetados manualmente por peritos é insuficiente a aprendizagem ativa torna-se crucial. No entanto, dada uma grande quantidade dados de treino não etiquetados, ter um modelo de aprendizagem ativa que reduz o custo de treino do classificador e, ao mesmo tempo, auxilia a etiquetagem de novas observações permanece um problema em aberto. A partir das abordagens e estudos experimentais, as principais contribuições deste trabalho, que se concentra na classificação de cenas de imagens de satélite óptico incluem: criação de conjuntos de dados Sentinel-2 etiquetados, com valores de refletância de superfície; proposta de modelos de aprendizagem automática baseados em pixels para classificação de cenas de imagens de satétite; proposta de um Modelo de Função de Evidência (EFM) baseado numa distância estatística para detetar erros de classificação de modelos de aprendizagem; e proposta de uma abordagem de amostragem generalizada para aprendizagem ativa que, em conjunto com o EFM, possibilita uma forma de determinar os exemplos mais informativos. Em primeiro lugar, usando um conjunto de dados Sentinel-2 etiquetado manualmente, foram desenvolvidos modelos de Aprendizagem Automática (AA) para classificação de cenas e seu desempenho foi comparado com o do Sen2Cor – o produto de referência da Agência Espacial Europeia – tendo sido alcançado um valor de micro-F1 de 84% pelo classificador, o que representa uma melhoria significativa em relação ao desempenho Sen2Cor correspondente, de 59%. Em segundo lugar, para quantificar o erro de classificação dos modelos de AA, foi concebido o Modelo de Função de Evidência baseado na distância de Mahalanobis. Este modelo conseguiu, para o conjunto de dados etiquetado do Sentinel-2 um micro-F1 de 67,89% na deteção de classificação incorreta. Por fim, o EFM foi utilizado como uma estratégia de amostragem para a aprendizagem ativa, uma abordagem que permitiu atingir o mesmo nível de desempenho com apenas 0,02% do total de exemplos de treino quando comparado com um classificador treinado com o conjunto de treino completo. Com a ajuda das contribuições acima mencionadas, foi possível desenvolver um pacote de código aberto para classificação de cenas de imagens Sentinel-2 que, utilizando num conjunto de scripts Python, um modelo de classificação, e uma imagem Sentinel-2 L1C, gera a imagem RGB correspondente (com resolução de 20m) com as seis classes estudadas (Cloud, Cirrus, Shadow, Snow, Water e Other), disponibilizando à academia um método direto para a classificação de cenas de imagens do Sentinel-2 rápida e eficaz. Além disso, a abordagem de aprendizagem ativa que usa, como estratégia de amostragem, a deteção de classificacão incorreta dada pelo EFM, permite etiquetar apenas os pontos mais informativos a serem usados como entrada na construção de classificadores

    A Deep Learning Framework in Selected Remote Sensing Applications

    Get PDF
    The main research topic is designing and implementing a deep learning framework applied to remote sensing. Remote sensing techniques and applications play a crucial role in observing the Earth evolution, especially nowadays, where the effects of climate change on our life is more and more evident. A considerable amount of data are daily acquired all over the Earth. Effective exploitation of this information requires the robustness, velocity and accuracy of deep learning. This emerging need inspired the choice of this topic. The conducted studies mainly focus on two European Space Agency (ESA) missions: Sentinel 1 and Sentinel 2. Images provided by the ESA Sentinel-2 mission are rapidly becoming the main source of information for the entire remote sensing community, thanks to their unprecedented combination of spatial, spectral and temporal resolution, as well as their open access policy. The increasing interest gained by these satellites in the research laboratory and applicative scenarios pushed us to utilize them in the considered framework. The combined use of Sentinel 1 and Sentinel 2 is crucial and very prominent in different contexts and different kinds of monitoring when the growing (or changing) dynamics are very rapid. Starting from this general framework, two specific research activities were identified and investigated, leading to the results presented in this dissertation. Both these studies can be placed in the context of data fusion. The first activity deals with a super-resolution framework to improve Sentinel 2 bands supplied at 20 meters up to 10 meters. Increasing the spatial resolution of these bands is of great interest in many remote sensing applications, particularly in monitoring vegetation, rivers, forests, and so on. The second topic of the deep learning framework has been applied to the multispectral Normalized Difference Vegetation Index (NDVI) extraction, and the semantic segmentation obtained fusing Sentinel 1 and S2 data. The S1 SAR data is of great importance for the quantity of information extracted in the context of monitoring wetlands, rivers and forests, and many other contexts. In both cases, the problem was addressed with deep learning techniques, and in both cases, very lean architectures were used, demonstrating that even without the availability of computing power, it is possible to obtain high-level results. The core of this framework is a Convolutional Neural Network (CNN). {CNNs have been successfully applied to many image processing problems, like super-resolution, pansharpening, classification, and others, because of several advantages such as (i) the capability to approximate complex non-linear functions, (ii) the ease of training that allows to avoid time-consuming handcraft filter design, (iii) the parallel computational architecture. Even if a large amount of "labelled" data is required for training, the CNN performances pushed me to this architectural choice.} In our S1 and S2 integration task, we have faced and overcome the problem of manually labelled data with an approach based on integrating these two different sensors. Therefore, apart from the investigation in Sentinel-1 and Sentinel-2 integration, the main contribution in both cases of these works is, in particular, the possibility of designing a CNN-based solution that can be distinguished by its lightness from a computational point of view and consequent substantial saving of time compared to more complex deep learning state-of-the-art solutions

    Adapting Computer Vision Models To Limitations On Input Dimensionality And Model Complexity

    Get PDF
    When considering instances of distributed systems where visual sensors communicate with remote predictive models, data traffic is limited to the capacity of communication channels, and hardware limits the processing of collected data prior to transmission. We study novel methods of adapting visual inference to limitations on complexity and data availability at test time, wherever the aforementioned limitations exist. Our contributions detailed in this thesis consider both task-specific and task-generic approaches to reducing the data requirement for inference, and evaluate our proposed methods on a wide range of computer vision tasks. This thesis makes four distinct contributions: (i) We investigate multi-class action classification via two-stream convolutional neural networks that directly ingest information extracted from compressed video bitstreams. We show that selective access to macroblock motion vector information provides a good low-dimensional approximation of the underlying optical flow in visual sequences. (ii) We devise a bitstream cropping method by which AVC/H.264 and H.265 bitstreams are reduced to the minimum amount of necessary elements for optical flow extraction, while maintaining compliance with codec standards. We additionally study the effect of codec rate-quality control on the sparsity and noise incurred on optical flow derived from resulting bitstreams, and do so for multiple coding standards. (iii) We demonstrate degrees of variability in the amount of data required for action classification, and leverage this to reduce the dimensionality of input volumes by inferring the required temporal extent for accurate classification prior to processing via learnable machines. (iv) We extend the Mixtures-of-Experts (MoE) paradigm to adapt the data cost of inference for any set of constituent experts. We postulate that the minimum acceptable data cost of inference varies for different input space partitions, and consider mixtures where each expert is designed to meet a different set of constraints on input dimensionality. To take advantage of the flexibility of such mixtures in processing different input representations and modalities, we train biased gating functions such that experts requiring less information to make their inferences are favoured to others. We finally note that, our proposed data utility optimization solutions include a learnable component which considers specified priorities on the amount of information to be used prior to inference, and can be realized for any combination of tasks, modalities, and constraints on available data

    Deep Pyramidal Residual Networks for Spectral-Spatial Hyperspectral Image Classification

    Get PDF
    Convolutional neural networks (CNNs) exhibit good performance in image processing tasks, pointing themselves as the current state-of-the-art of deep learning methods. However, the intrinsic complexity of remotely sensed hyperspectral images still limits the performance of many CNN models. The high dimensionality of the HSI data, together with the underlying redundancy and noise, often makes the standard CNN approaches unable to generalize discriminative spectral-spatial features. Moreover, deeper CNN architectures also find challenges when additional layers are added, which hampers the network convergence and produces low classification accuracies. In order to mitigate these issues, this paper presents a new deep CNN architecture specially designed for the HSI data. Our new model pursues to improve the spectral-spatial features uncovered by the convolutional filters of the network. Specifically, the proposed residual-based approach gradually increases the feature map dimension at all convolutional layers, grouped in pyramidal bottleneck residual blocks, in order to involve more locations as the network depth increases while balancing the workload among all units, preserving the time complexity per layer. It can be seen as a pyramid, where the deeper the blocks, the more feature maps can be extracted. Therefore, the diversity of high-level spectral-spatial attributes can be gradually increased across layers to enhance the performance of the proposed network with the HSI data. Our experiments, conducted using four well-known HSI data sets and 10 different classification techniques, reveal that our newly developed HSI pyramidal residual model is able to provide competitive advantages (in terms of both classification accuracy and computational time) over the state-of-the-art HSI classification methods

    Use of Remote Imagery and Object-based Image Methods to Count Plants in an Open-field Container Nursery

    Get PDF
    In general, the nursery industry lacks an automated inventory control system. Object-based image analysis (OBIA) software and aerial images could be used to count plants in nurseries. The objectives of this research were: 1) to evaluate the effect of an unmanned aerial vehicle (UAV) flight altitude and plant canopy separation of container-grown plants on count accuracy using aerial images and 2) to evaluate the effect of plant canopy shape, presence of flowers, and plant status (living and dead) on counting accuracy of container-grown plants using remote sensing images. Images were analyzed using Feature Analyst® (FA) and an algorithm trained using MATLAB®. Total count error, false positives and unidentified plants were recorded from output images using FA; only total count error was reported for the MATLAB algorithm. For objective 1, images were taken at 6, 12 and 22 m above the ground using a UAV. Plants were placed on black fabric and gravel, and spaced as follows: 5 cm between canopy edges, canopy edges touching, and 5 cm of canopy edge overlap. In general, when both methods were considered, total count error was smaller [ranging from -5 (undercount) to 4 (over count)] when plants were fully separated with the exception of images taken at 22 m. FA showed a smaller total count error (-2) than MATLAB (-5) when plants were placed on black fabric than those placed on gravel. For objective 2, the plan was to continue using the UAV, however, due to the unexpected disruption of the GPS-based navigation by heightened solar flare activity in 2013, a boom lift that could provide images on a more reliable basis was used. When images obtained using a boom lift were analyzed using FA there was no difference between variables measured when an algorithm trained with an image displaying regular or irregular plant canopy shape was applied to images displaying both plant canopy shapes even though the canopy shape of `Sea Green\u27 juniper is less compact than `Plumosa Compacta\u27. There was a significant difference in all variables measured between images of flowering and non-flowering plants, when non-flowering `samples\u27 were used to train the counting algorithm and analyzed with FA. No dead plants were counted as living and vice versa, when data were analyzed using FA. When the algorithm trained in MATLAB was applied, there was no significant difference in total count errors when plant canopy shape and presence of flowers were evaluated. Based on the combined results from these separate experiments, FA and MATLAB algorithms appear to be fairly robust when used to count container-grown plants from images taken at the heights specified
    • …
    corecore