79 research outputs found

    Deep Learning Approaches for Seagrass Detection in Multispectral Imagery

    Seagrass forms the basis for critically important marine ecosystems. Seagrass is an important factor to balance marine ecological systems, and it is of great interest to monitor its distribution in different parts of the world. Remote sensing imagery is considered as an effective data modality based on which seagrass monitoring and quantification can be performed remotely. Traditionally, researchers utilized multispectral satellite images to map seagrass manually. Automatic machine learning techniques, especially deep learning algorithms, recently achieved state-of-the-art performances in many computer vision applications. This dissertation presents a set of deep learning models for seagrass detection in multispectral satellite images. It also introduces novel domain adaptation approaches to adapt the models for new locations and for temporal image series. In Chapter 3, I compare a deep capsule network (DCN) with a deep convolutional neural network (DCNN) for seagrass detection in high-resolution multispectral satellite images. These methods are tested on three satellite images in Florida coastal areas and obtain comparable performances. In addition, I also propose a few-shot deep learning strategy to transfer knowledge learned by DCN from one location to the others for seagrass detection. In Chapter 4, I develop a semi-supervised domain adaptation method to generalize a trained DCNN model to multiple locations for seagrass detection. First, the model utilizes a generative adversarial network (GAN) to align marginal distribution of data in the source domain to that in the target domain using unlabeled data from both domains. Second, it uses a few labeled samples from the target domain to align class-specific data distributions between the two. The model achieves the best results in 28 out of 36 scenarios as compared to other state-of-the-art domain adaptation methods. 
In Chapter 5, I develop a semantic segmentation method for seagrass detection in multispectral time-series images. First, I train a state-of-the-art image segmentation method using an active learning approach where I use the DCNN classifier in the loop. Then, I develop an unsupervised domain adaptation (UDA) algorithm to detect seagrass across temporal images. I also extend this unsupervised domain adaptation work to seagrass detection across locations. In Chapter 6, I present an automated bathymetry estimation model based on multispectral satellite images. Bathymetry refers to the depth of the ocean floor and plays a predominant role in identifying marine species in seawater. Accurate bathymetry information for coastal areas will facilitate seagrass detection by reducing false positives, because seagrass usually does not grow beyond a certain depth. However, bathymetry information for most parts of the world is obsolete or missing. Traditional bathymetry measurement systems require extensive labor. I utilize an ensemble machine learning-based approach to estimate bathymetry based on a few in-situ sonar measurements and evaluate the proposed model in three coastal locations in Florida.

    Statistical methods for tissue array images - algorithmic scoring and co-training

    Recent advances in tissue microarray technology have allowed immunohistochemistry to become a powerful medium-to-high throughput analysis tool, particularly for the validation of diagnostic and prognostic biomarkers. However, as study size grows, the manual evaluation of these assays becomes a prohibitive limitation; it vastly reduces throughput and greatly increases variability and expense. We propose an algorithm - Tissue Array Co-Occurrence Matrix Analysis (TACOMA) - for quantifying cellular phenotypes based on textural regularity summarized by local inter-pixel relationships. The algorithm can be easily trained for any staining pattern, is free of sensitive tuning parameters and has the ability to report salient pixels in an image that contribute to its score. Pathologists' input via informative training patches is an important aspect of the algorithm that allows training for any specific marker or cell type. With co-training, the error rate of TACOMA can be reduced substantially for a very small training sample (e.g., with size 30). We give theoretical insights into the success of co-training via thinning of the feature set in a high-dimensional setting when there is "sufficient" redundancy among the features. TACOMA is flexible, transparent and provides a scoring process that can be evaluated with clarity and confidence. In a study based on an estrogen receptor (ER) marker, we show that TACOMA is comparable to, or outperforms, pathologists' performance in terms of accuracy and repeatability. Comment: Published at http://dx.doi.org/10.1214/12-AOAS543 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
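The "local inter-pixel relationships" mentioned in this abstract are commonly summarized with a gray-level co-occurrence matrix. The following is a minimal sketch of that generic idea, not the TACOMA implementation itself; the function name `cooccurrence_matrix`, the toy patch, and the choice of a single horizontal offset are assumptions made for illustration.

```python
import numpy as np

def cooccurrence_matrix(image, levels, offset=(0, 1)):
    """Count pairs (i, j): gray level i at a pixel, gray level j at the
    pixel displaced by `offset` (rows, cols). Hypothetical helper."""
    image = np.asarray(image)
    dr, dc = offset
    rows, cols = image.shape
    # Crop so that both the reference pixel and its neighbor stay in bounds.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    ref = image[r0:r1, c0:c1]
    nbr = image[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    glcm = np.zeros((levels, levels), dtype=np.int64)
    # np.add.at accumulates correctly when the same (i, j) pair repeats.
    np.add.at(glcm, (ref.ravel(), nbr.ravel()), 1)
    return glcm

# Toy 3x3 patch with three gray levels (illustrative data only).
patch = [[0, 0, 1],
         [0, 1, 1],
         [2, 2, 2]]
glcm = cooccurrence_matrix(patch, levels=3)
print(glcm)
```

Each entry counts how often two gray levels appear side by side, so a regular texture concentrates mass in a few cells. Features derived from such a matrix could then feed a classifier.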

    Remote Sensing for Monitoring the Mountaintop Mining Landscape: Applications for Land Cover Mapping at the Individual Mine Complex Scale

    The aim of this dissertation was to investigate the potential for mapping land cover associated with mountaintop mining in Southern West Virginia using high spatial resolution aerial- and satellite-based multispectral imagery, as well as light detection and ranging (LiDAR) elevation data and terrain derivatives. The following research themes were explored: comparing aerial- and satellite-based imagery, combining data sets of multiple dates and types, incorporating measures of texture, using nonparametric, machine learning classification algorithms, and employing a geographical object-based image analysis (GEOBIA) framework. This research is presented as four interrelated manuscripts.;In a comparison of aerial National Agriculture Imagery Program (NAIP) orthophotography and satellite-based RapidEye data, the aerial imagery was found to provide statistically less accurate classifications of land cover. These lower accuracies are most likely due to inconsistent viewing geometry and radiometric normalization associated with the aerial imagery. Nevertheless, NAIP orthophotography has many characteristics that make it useful for surface mine mapping and monitoring, including its availability for multiple years, a general lack of cloud cover, contiguous coverage of large areas, ease of availability, and low cost. The lower accuracies of the NAIP classifications were somewhat remediated by decreasing the spatial resolution and reducing the number of classes mapped.;Combining LiDAR with multispectral imagery statistically improved the classification of mining and mine reclamation land cover in comparison to only using multispectral data for both pixel-based and GEOBIA classification. 
This suggests that the reduced spectral resolution of high spatial resolution data can be combated by incorporating data from another sensor.;Generally, the support vector machines (SVM) algorithm provided higher classification accuracies in comparison to random forests (RF) and boosted classification and regression trees (CART) for both pixel-based and GEOBIA classification. It also outperformed k-nearest neighbor, the algorithm commonly used for GEOBIA classification. However, optimizing user-defined parameters for the SVM algorithm tends to be more complex in comparison to the other algorithms. In particular, RF has fewer parameters, and the program seems robust regarding the parameter settings. RF also offers measures to assess model performance, such as estimates of variable importance and overall accuracy.;Textural measures were found to be of marginal value for pixel-based classification. For GEOBIA, neither measures of texture nor object-specific geometry improved the classification accuracy. Notably, the incorporation of additional information from LiDAR provided a greater improvement in classification accuracy than deriving complex textural and geometric measures.;Pre- and post-mining terrain data classified using GEOBIA and machine learning algorithms resulted in significantly more accurate differentiation of mine-reclaimed and non-mining grasslands than was possible with spectral data. The combination of pre- and post-mining terrain data or just pre-mining data generally outperformed post-mining data. Elevation change data were shown to be of particular value, as were terrain shape parameters. 
GEOBIA was a valuable tool for combining data collected using different sensors and gridded at variable cell sizes, and machine learning algorithms were particularly useful for incorporating the ancillary data derived from the digital elevation models (DEMs), since these most likely would not have met the basic assumptions of multivariate normality required for parametric classifiers.;Collectively, this research suggests that high spatial resolution remotely sensed data are valuable for mapping and monitoring surface mining and mine reclamation, especially when elevation and spectral data are combined. Machine learning algorithms and GEOBIA are useful for integrating such diverse data

    Mapping Chestnut Stands Using Bi-Temporal VHR Data

    This study analyzes the potential of very high resolution (VHR) remote sensing images and extended morphological profiles for mapping chestnut stands on Tenerife Island (Canary Islands, Spain). Given their relevance for ecosystem services in the region (cultural and provisioning services), the public sector demands up-to-date information on chestnut, and a simple, straightforward approach is presented in this study. We used two VHR WorldView images (March and May 2015) to cover different phenological phases. Moreover, we included spatial information in the classification process by extended morphological profiles (EMPs). Random forest is used for the classification process, and we analyzed the impact of the bi-temporal information as well as of the spatial information on the classification accuracies. The detailed accuracy assessment clearly reveals the benefit of bi-temporal VHR WorldView images and spatial information, derived by EMPs, in terms of the mapping accuracy. The bi-temporal classification outperforms or at least performs equally well when compared to the classification accuracies achieved by the mono-temporal data. The inclusion of spatial information by EMPs further increases the classification accuracy by 5% and reduces the quantity and allocation disagreements on the final map. Overall, the newly proposed classification strategy proves useful for mapping chestnut stands in a heterogeneous and complex landscape, such as the municipality of La Orotava, Tenerife.

    A review of machine learning applications in wildfire science and management

    Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then the field has rapidly progressed congruently with the wide adoption of machine learning (ML) in the environmental sciences. Here, we present a scoping review of ML in wildfire science and management. Our objective is to improve awareness of ML among wildfire scientists and managers, as well as illustrate the challenging range of problems in wildfire science available to data scientists. We first present an overview of popular ML approaches used in wildfire science to date, and then review their use in wildfire science within six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. We also discuss the advantages and limitations of various ML approaches and identify opportunities for future advances in wildfire science and management within a data science context. We identified 298 relevant publications, where the most frequently used ML methods included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. There exist opportunities to apply more current ML methods (e.g., deep learning and agent-based learning) in wildfire science. However, despite the ability of ML models to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods requires sophisticated knowledge for their application. Finally, we stress that the wildfire research and management community plays an active role in providing relevant, high quality data for use by practitioners of ML methods. Comment: 83 pages, 4 figures, 3 tables

    Tackling Uncertainties and Errors in the Satellite Monitoring of Forest Cover Change

    This study aims at improving the reliability of automatic forest change detection. Forest change detection is of vital importance for understanding global land cover as well as the carbon cycle. Remote sensing and machine learning have been widely adopted for such studies with increasing degrees of success. However, contemporary global studies still suffer from lower-than-satisfactory accuracies and robustness problems whose causes were largely unknown. Global geographical observations are complex, as a result of the hidden interweaving geographical processes. Is it possible that some geographical complexities were not expected in contemporary machine learning? Could they cause uncertainties and errors when contemporary machine learning theories are applied for remote sensing? This dissertation adopts the philosophy of error elimination. We start by explaining the mathematical origins of possible geographic uncertainties and errors in chapter two. Uncertainties are unavoidable but might be mitigated. Errors are hidden but might be found and corrected. Then in chapter three, experiments are specifically designed to assess whether or not the contemporary machine learning theories can handle these geographic uncertainties and errors. In chapter four, we identify an unreported systemic error source: the proportion distribution of classes in the training set. A subsequent Bayesian Optimal solution is designed to combine Support Vector Machine and Maximum Likelihood. Finally, in chapter five, we demonstrate how this type of error is widespread not just in classification algorithms, but also embedded in the conceptual definition of geographic classes before the classification. In chapter six, the sources of errors and uncertainties and their solutions are summarized, with theoretical implications for future studies. The most important finding is that how we design a classification largely pre-determines what we eventually get out of it. 
This applies to many contemporary popular classifiers, including various types of neural nets, decision trees, and support vector machines. This is a cause of the so-called overfitting problem in contemporary machine learning. Therefore, we propose that the emphasis of classification work be shifted to the planning stage before the actual classification. Geography should not just be the analysis of collected observations, but also about the planning of observation collection. This is where geography, machine learning, and survey statistics meet
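To illustrate why the class proportions of a training set matter, the sketch below applies the standard Bayes prior-shift correction to a classifier's posterior probabilities. It is not the dissertation's Bayesian Optimal solution, which the abstract does not detail; the function name `reweight_posteriors` and all numbers are assumptions chosen for illustration.

```python
import numpy as np

def reweight_posteriors(posteriors, train_priors, target_priors):
    """Rescale posteriors estimated under `train_priors` so that they
    reflect `target_priors`, then renormalize each row to sum to 1."""
    posteriors = np.asarray(posteriors, dtype=float)
    ratio = (np.asarray(target_priors, dtype=float)
             / np.asarray(train_priors, dtype=float))
    adjusted = posteriors * ratio
    return adjusted / adjusted.sum(axis=1, keepdims=True)

# A classifier trained on a balanced forest/non-forest sample is undecided
# about a pixel, but forest is assumed to cover 90% of the mapped area.
undecided = [[0.5, 0.5]]
adjusted = reweight_posteriors(undecided,
                               train_priors=[0.5, 0.5],
                               target_priors=[0.9, 0.1])
print(adjusted)
```

The same pixel evidence yields a different decision once the assumed class proportions change, which is one way the design of the training sample can pre-determine the resulting map.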

    Inference in supervised spectral classifiers for on-board hyperspectral imaging: An overview

    Machine learning techniques are widely used for pixel-wise classification of hyperspectral images. These methods can achieve high accuracy, but most of them are computationally intensive models. This poses a problem for their implementation in low-power and embedded systems intended for on-board processing, in which energy consumption and model size are as important as accuracy. With a focus on embedded and on-board systems (in which only the inference step is performed after an off-line training process), in this paper we provide a comprehensive overview of the inference properties of the most relevant techniques for hyperspectral image classification. For this purpose, we compare the size of the trained models and the operations required during the inference step (which are directly related to the hardware and energy requirements). Our goal is to search for appropriate trade-offs between on-board implementation (such as model size and energy consumption) and classification accuracy.
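As a rough illustration of the two quantities compared in this abstract, model size and operations per inference, the sketch below counts parameters and multiply-accumulate operations for a fully connected pixel classifier. The helper name `dense_inference_cost` and the layer sizes (200 spectral bands, 16 classes) are assumptions for illustration, not figures from the paper.

```python
def dense_inference_cost(layer_sizes):
    """Return (parameters, multiply-accumulates) needed to classify one
    pixel with a fully connected network of the given layer widths."""
    params = 0
    macs = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        params += n_in * n_out + n_out  # weight matrix plus bias vector
        macs += n_in * n_out            # one multiply-accumulate per weight
    return params, macs

# Hypothetical setups: 200 input bands, 16 output classes.
linear = dense_inference_cost([200, 16])
small_mlp = dense_inference_cost([200, 64, 16])
print("linear classifier:", linear)
print("one hidden layer: ", small_mlp)
```

Counts like these give only a first-order estimate. Memory traffic and numeric precision also affect energy use on embedded hardware.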

    GIS-based urban land use characterization and population modeling with subpixel information measured from remote sensing data

    This dissertation provides a deeper understanding of the application of the Vegetation-Impervious Surface-Soil (V-I-S) model to urban land use characterization and population modeling, focusing on the New Orleans area. Previous research on the V-I-S model in urban land use classification emphasized accuracy improvement while ignoring the stability of classifiers. I developed an evaluation framework that uses randomization techniques and the decision tree method to assess and compare the performance of classifiers and input features. The proposed evaluation framework is applied to demonstrate the superiority of V-I-S fractions and LST for urban land use classification. It could also be applied to the assessment of input features and classifiers in other remote sensing image classification contexts. An innovative urban land use classification based on the V-I-S model is implemented and tested in this dissertation. Because the shape of the V-I-S bivariate histogram resembles topological surfaces, a pattern that honors Lu-Weng’s urban model, the V-I-S feature space is rasterized into a grey-scale image and subsequently partitioned by marker-controlled watershed segmentation, leading to an urban land use classification. This new approach is shown to be insensitive to the selection of initial markers as long as they are positioned around the underlying watershed centers. This dissertation links the population distribution of New Orleans with its physiogeographic conditions indicated by the V-I-S sub-pixel composition and the land use information. It shows that the V-I-S fractions cannot be directly used to model the population distribution. Both the OLS and GWR models produced poor model fit. In contrast, the land use information extracted from the V-I-S information and LST significantly improved regression models. A three-class land use model is fitted adequately. 
The GWR model reveals spatial nonstationarity: the relationship between the population distribution and the land use is relatively poor in the city center and becomes stronger towards the city fringe, depicting a classic urban concentric pattern. It highlights that New Orleans is a complex metropolitan area, and its population distribution cannot be fully modeled with the physiogeographic measurements.