    Anomaly detection in hyperspectral signatures using automated derivative spectroscopy methods

    The goal of this research was to detect anomalies in remotely sensed hyperspectral images using automated derivative-based methods. The database of hyperspectral signatures contained simulated additive Gaussian anomalies that modeled a weakly concentrated aerosol in several spectral bands. The automated pattern detection system comprised four steps: (1) feature extraction, (2) feature reduction through linear discriminant analysis, (3) performance characterization through receiver operating characteristic curves, and (4) signature classification using nearest mean and maximum likelihood classifiers. The database contained signatures with anomaly concentrations ranging from weakly present to moderately present, with anomalies located in various reflective and absorptive spectral bands. The automated derivative-based detection system gave classification accuracies of 97 percent for a Gaussian anomaly of SNR -45 dB and 70 percent for a Gaussian anomaly of SNR -85 dB. This demonstrates the applicability of derivative analysis methods to pattern detection and classification in remotely sensed hyperspectral images.
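
The feature extraction and nearest mean classification steps described above can be illustrated with a minimal sketch. It assumes finite-difference first and second derivatives as the derivative features and uses synthetic spectra in which a Gaussian bump stands in for the anomaly. All function names, the wavelength grid, and the amplitudes are hypothetical, and the linear discriminant analysis and ROC steps are omitted.

```python
import numpy as np


def derivative_features(spectra, wavelengths):
    """Return first and second spectral derivatives, stacked per signature.

    Assumption: finite differences via np.gradient stand in for whatever
    derivative estimator the original system used.
    """
    spectra = np.asarray(spectra, dtype=float)
    d1 = np.gradient(spectra, wavelengths, axis=1)
    d2 = np.gradient(d1, wavelengths, axis=1)
    return np.hstack([d1, d2])


class NearestMeanClassifier:
    """Assign each sample to the class whose training mean is closest."""

    def fit(self, X, y):
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.means_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Euclidean distance from every sample to every class mean.
        dists = np.linalg.norm(X[:, None, :] - self.means_[None, :, :], axis=2)
        return self.classes_[np.argmin(dists, axis=1)]


def make_synthetic_signatures(n_per_class, wavelengths, rng):
    """Hypothetical data: a smooth baseline, with or without a Gaussian bump."""
    baseline = 0.5 + 0.1 * np.sin(wavelengths / 300.0)
    bump = 0.05 * np.exp(-((wavelengths - 1500.0) / 30.0) ** 2)
    noise = rng.normal(0.0, 1e-3, size=(2 * n_per_class, wavelengths.size))
    spectra = np.tile(baseline, (2 * n_per_class, 1)) + noise
    spectra[n_per_class:] += bump  # second half carries the anomaly
    labels = np.repeat([0, 1], n_per_class)
    return spectra, labels


wavelengths = np.linspace(400.0, 2500.0, 200)  # nm, illustrative grid
rng = np.random.default_rng(0)
train_X, train_y = make_synthetic_signatures(20, wavelengths, rng)
test_X, test_y = make_synthetic_signatures(20, wavelengths, rng)

clf = NearestMeanClassifier().fit(derivative_features(train_X, wavelengths), train_y)
accuracy = np.mean(clf.predict(derivative_features(test_X, wavelengths)) == test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

The synthetic bump is deliberately easy to detect and does not reproduce the SNR levels reported above; the sketch only shows how derivative features feed a nearest mean rule.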

    New methods in Palaeopalynology: Classification of pollen through pollen chemistry

    Pollen grains are one of the primary tools palaeoecologists use to reconstruct past vegetation changes. The description, counting and analysis of pollen grains (palynology) has contributed to our understanding of the establishment and dynamics of past and present plant communities. Advances in identification accuracy, precision and taxonomic resolution have greatly improved our understanding of biogeography and plant community interactions. Nevertheless, the techniques by which palynological studies are performed have not fundamentally changed. Taxonomic resolution and automation have been identified as key challenges for palynology and palaeoecology. Chemical methods have been proposed as a potential alternative to morphological approaches and have shown promising results in the classification of modern pollen grains and in the analysis of pollen chemical responses to UV-B radiation. The application of chemical methods to palynological needs has not been thoroughly explored, with analysis of (sub-)fossil pollen lagging behind that of modern pollen. Infrared methods in particular have gained popularity as an alternative to traditional morphological approaches. In this thesis, I explore the use of infrared methods for palynological applications by examining the chemical variation in modern pollen grains and by analysing fossil pollen grains with IR microscope approaches. The thesis has three research objectives: (1) collect modern pollen and explore the variation in its chemical composition, (2) apply chemical methods to fossil material, and (3) explore microscopy-based chemical methods on modern pollen. The thesis is structured into four papers that address these objectives.
Papers I and II explore variation and classification based on the chemical composition of modern *Quercus* pollen using two IR approaches, Fourier transform infrared spectroscopy (FTIR) and Fourier transform Raman spectroscopy (FT-Raman). Building on this work on modern pollen, paper III investigates FTIR methods for the analysis of fossil pollen, using spectra of Holocene *Pinus* pollen. It also describes the effects of acetolysis and density separation on *Pinus* pollen. Paper IV addresses the challenge of scattering signals when measuring small pollen grains of four *Quercus* species with FTIR microscopy, and ways to suppress or weaken those signals. The results from papers I and II show classification success surpassing traditional morphological approaches at the *Quercus* section level, and ~90% recall at the species level, with both IR approaches. The chemical bands most useful for classification are those of lipids, sporopollenin and proteins, for both FT-Raman and FTIR. The importance of chemical functional groups for classification differs between the methods: FT-Raman relies more on sporopollenin chemistry, while FTIR draws more on variation in lipid bands. After finding considerable variation in sporopollenin chemistry in modern pollen samples, FTIR methods were applied to pollen from sediment cores spanning the Holocene. Paper III examines the differences between modern and sub-fossil pollen and reports large differences between them, mainly the loss of labile components, such as lipids and proteins, from the sub-fossil spectra during diagenesis. Additionally, paper III finds changes to pollen chemistry caused by acetolysis in the 1200-1000 cm^-1^ region when comparing acetolysed with non-acetolysed spectra. The paper concludes with findings of unwanted inorganic signals (BSi) and contamination from density separation media in the sediment pollen spectra.
Paper IV demonstrates two successful methods of removing scattering signals from pollen spectra: embedding the pollen, and processing the spectra with signal correction algorithms. Spectra from embedded pollen have no scattering anomalies, but part of each spectrum is unusable because of absorbance by the embedding matrix (paraffin). The signal processing algorithm removes most of the scatter components and allows them to be extracted. Classification of the different datasets (spectra without correction, embedded spectra, processed spectra, scatter parameters) reveals that scatter correction methods reduce classification success and that scatter parameters contain taxonomic information. This suggests that scatter correction may not be the best approach for applications focused mainly on classification or identification, whereas reconstructions of, for example, UV-B radiation may benefit from scatter correction when measuring single-grain spectra. This thesis shows that IR methods outperform traditional morphological methods for pollen classification and that a considerable amount of taxonomic information is stored in functional groups associated with sporopollenin (phenylpropanoids). In a study on fossil pollen, this thesis demonstrates that conventional chemical extraction methods, such as acetolysis, alter the chemical composition of pollen and may not be ideal for palaeochemical purposes. Additionally, the scatter correction methods show that IR can provide non-chemical information in the form of scatter parameters, which contain taxonomic information. These results are useful additions to the growing knowledge on chemical methods for palaeoecological and palynological analyses.
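
The idea that a correction algorithm both cleans the spectra and yields scatter parameters can be illustrated with basic multiplicative scatter correction (MSC). This is an assumption made for illustration: the abstract does not name the algorithm used in paper IV, and the function name and synthetic data below are hypothetical. Each spectrum is modelled as an offset plus a scaled copy of a reference spectrum; the fitted offset and scale are the scatter parameters.

```python
import numpy as np


def multiplicative_scatter_correction(spectra, reference=None):
    """Basic MSC: fit x ~ a + b * reference, then return (x - a) / b.

    Returns the corrected spectra and an (n_spectra, 2) array of the fitted
    scatter parameters [a, b]. Assumption: plain MSC, not necessarily the
    correction used in the thesis.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = spectra.mean(axis=0)  # mean spectrum as reference
    reference = np.asarray(reference, dtype=float)

    # Least-squares fit of every spectrum against [1, reference] at once.
    design = np.column_stack([np.ones_like(reference), reference])
    coeffs, *_ = np.linalg.lstsq(design, spectra.T, rcond=None)
    offsets, slopes = coeffs[0], coeffs[1]

    corrected = (spectra - offsets[:, None]) / slopes[:, None]
    scatter_params = np.column_stack([offsets, slopes])
    return corrected, scatter_params


# Hypothetical example: two distorted copies of one underlying spectrum.
true_spectrum = np.linspace(0.1, 1.0, 50) ** 2
measured = np.vstack([0.1 + 2.0 * true_spectrum, -0.3 + 0.5 * true_spectrum])
corrected, scatter_params = multiplicative_scatter_correction(
    measured, reference=true_spectrum
)
print(scatter_params)
```

In a setup like this, `corrected` would be passed to a classifier as the scatter-free spectra, while `scatter_params` could be classified separately, mirroring the comparison of datasets described above.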

    Data mining techniques for the assessment of factors contributing to the damage of residential houses in Australia

    This paper reports on the preparation and management of inconsistent data on damage to residential houses in Victoria, Australia. No specific and fully relevant database is readily available; the only sources are incomplete paper-based and electronic reports. Extracting from these reports all the information needed to analyse damage to residential houses founded on expansive soils is therefore complicated and time-consuming. Data mining is adopted to develop a database, and statistical and artificial intelligence methods are used to quantify the quality of the data. The paper concludes that the development of such a database could enable BHC to evaluate the usefulness of the reports prepared on damaged properties for further analysis.

    Mesogeos: A multi-purpose dataset for data-driven wildfire modeling in the Mediterranean

    We introduce Mesogeos, a large-scale multi-purpose dataset for wildfire modeling in the Mediterranean. Mesogeos integrates variables representing wildfire drivers (meteorology, vegetation, human activity) and historical records of wildfire ignitions and burned areas for 17 years (2006-2022). It is designed as a cloud-friendly spatio-temporal dataset, namely a datacube, harmonizing all variables on a grid of 1 km x 1 km x 1 day resolution. The datacube structure offers opportunities to assess machine learning (ML) usage in various wildfire modeling tasks. We extract two ML-ready datasets that establish distinct tracks to demonstrate this potential: (1) short-term wildfire danger forecasting and (2) final burned area estimation given the point of ignition. We define appropriate metrics and baselines to evaluate the performance of models in each track. By publishing the datacube, along with the code to create the ML datasets and models, we encourage the community to implement additional tracks for mitigating the increasing threat of wildfires in the Mediterranean.

    Spatiotemporal graphical modeling for cyber-physical systems

    Cyber-Physical Systems (CPSs) are combinations of physical processes and network computation. Modern CPSs such as smart buildings, power plants, transportation networks, and power grids have shown tremendous potential for increased efficiency, robustness, and resilience. However, such CPSs encounter a large variety of physical faults and cyber anomalies, and in many cases are vulnerable to catastrophic fault propagation because of strong connectivity among their sub-systems. To address these issues, this study proposes a graphical modeling framework to monitor and predict the performance of CPSs in a scalable and robust way. This thesis investigates two critical CPS applications to evaluate the effectiveness of the proposed framework: (i) health monitoring of highway traffic sensors and (ii) building energy consumption prediction. In highway traffic sensor networks, accurate sensor data is essential for traffic operation management systems, and the acquisition of real-time traffic surveillance data depends heavily on the reliability of the physical systems. Detecting the health status of the sensors in a traffic sensor network is therefore critical for departments of transportation as well as other public and private entities, especially where real-time decision making is required. To determine the traffic network status efficiently and identify failed sensors, this study proposes a cost-effective spatiotemporal graphical modeling approach called the spatiotemporal pattern network (STPN). Traffic speed and volume measurement sensors are used to formulate and analyze the proposed sensor health monitoring system. Historical time-series data from the networked traffic sensors on Interstate 35 (I-35) within the state of Iowa is used for validation.
Based on the validation results, this study demonstrates that the proposed graphical modeling approach can (i) extract spatiotemporal dependencies among the different sensors, which leads to an efficient graphical representation of the sensor network in the information space, and (ii) distinguish and quantify a sensor issue by leveraging the extracted spatiotemporal relationship of the candidate sensor(s) to the other sensors in the network. In the building energy consumption prediction case, we consider the fact that the energy performance of buildings is primarily affected by heat exchange between the building's outer skin and the surrounding environment. In addition, it is common practice in building energy simulation (BES) to predict energy usage with a variable degree of accuracy. Therefore, to estimate building energy consumption accurately, especially in urban environments with many anthropogenic heat sources, it is necessary to consider the microclimate conditions around the building. These conditions are influenced by the immediate environment, such as surrounding buildings, hard surfaces, and trees. Moreover, deploying sensors to monitor the microclimate of a building can be quite challenging and is therefore not scalable. Instead of applying local weather data directly to BES tools, this work proposes an STPN-based machine learning framework to predict the microclimate information from local weather station data, which leads to better energy consumption prediction in buildings.

    Multisensor Fusion Remote Sensing Technology For Assessing Multitemporal Responses In Ecohydrological Systems

    Earth's ecosystems and environment have been changing rapidly as a result of advancing technology and human development. Because the changes are rapid, the impacts of human activities are difficult to measure and evaluate. Remote sensing (RS) technology has been applied to environmental management and is now widely used to measure and monitor the Earth's environment and its changes. RS allows large-scale measurements over a wide region within a very short period of time. Continuous, repeatable measurement is an indispensable feature of RS. Soil moisture is a critical element in the hydrological cycle, especially in semiarid or arid regions. Characterizing the soil moisture distribution contiguously across a vast watershed from point measurements is difficult because soil moisture patterns may vary greatly in time and space. Space-borne radar imaging satellites have become popular because they can make all-weather observations. Yet methods for estimating soil moisture from active or passive satellite imagery remain uncertain. This study aims to present a systematic soil moisture estimation method for the Choke Canyon Reservoir Watershed (CCRW), a semiarid watershed with an area of over 14,200 km² in south Texas. With the aid of five corner reflectors, RADARSAT-1 Synthetic Aperture Radar (SAR) images of the study area acquired in April and September 2004 were first radiometrically and geometrically calibrated. New soil moisture estimation models derived by the genetic programming (GP) technique were then developed and applied to support the soil moisture distribution analysis. The GP-based nonlinear function derived in the evolutionary process uniquely links a series of crucial topographic and geographic features.
These features include slope, aspect, vegetation cover, and soil permeability, which complement the well-calibrated SAR data. The novel application of GP proved useful for generating a highly nonlinear regression structure that exhibits very strong statistical correlations between the model estimates and the ground truth measurements (volumetric water content) on unseen data sets. Producing soil moisture distributions across seasons in this way makes it possible to characterize local- to regional-scale soil moisture variability and to estimate water storage in the terrestrial hydrosphere. A new supervised classification scheme based on evolutionary computation (Riparian Classification Algorithm, RICAL) was developed and used to identify temporal and spatial changes in the riparian zones of a semi-arid watershed. The case study incorporates both vegetation indices and soil moisture estimates, derived from Landsat 5 TM and RADARSAT-1 imagery, to improve riparian classification in the Choke Canyon Reservoir Watershed (CCRW), south Texas. The CCRW, the area draining to the reservoir, was selected as the study area; it is mostly agricultural and range land in a semi-arid coastal environment. Change detection in riparian buffers is significant there because the buffers intercept non-point source impacts and help maintain ecosystem integrity region-wide. The previously developed soil moisture estimates based on RADARSAT-1 SAR imagery were used. Eight commonly used vegetation indices were calculated from the reflectance obtained from Landsat 5 TM satellite images. Each vegetation index was used individually, together with the genetic programming algorithm, to classify vegetation cover.
The soil moisture and vegetation indices were integrated into Landsat TM images using a per-pixel channel approach for riparian classification. Two classification algorithms were used: genetic programming, and a combination of ISODATA and maximum likelihood supervised classification. The white-box nature of genetic programming revealed the comparative advantage of each input parameter. The GP algorithm yielded more than 90% accuracy on unseen ground data using a vegetation index and Landsat reflectance bands 1, 2, 3, and 4. The detection of changes in the buffer zone proved technically feasible with high accuracy. Overall, the development of the RICAL algorithm may lead to more effective management strategies for non-point source pollution control, bird habitat monitoring, and grazing and livestock management in the future. Soil properties, landscapes, channels, fault lines, erosion/deposition patches, and bedload transport history show the geologic and geomorphologic features of a variety of watersheds. In response to these unique watershed characteristics, the hydrology of large-scale watersheds is often very complex. Precipitation, infiltration and percolation, stream flow, plant transpiration, soil moisture changes, and groundwater recharge are intimately related to one another and together form the water balance dynamics on the surface of these watersheds. This chapter presents an optimal site selection technique that uses a grey integer programming (GIP) model to assimilate remote sensing-based geo-environmental patterns in an uncertain environment, subject to technical and resource constraints. It enables us to retrieve the hydrological trends and pinpoint the most critical locations for the deployment of monitoring stations in a vast watershed.
Geo-environmental information amassed in this study includes soil permeability, surface temperature, soil moisture, precipitation, leaf area index (LAI) and normalized difference vegetation index (NDVI). With the aid of a remote sensing-based GIP analysis, only five locations out of more than 800 candidate sites were selected by the spatial analysis and then confirmed by a field investigation. The methodology developed in this remote sensing-based GIP analysis will significantly advance the state of the art in the optimum arrangement and distribution of water sensor platforms for maximum sensing coverage and information-extraction capacity. Effective water resources management is a critically important priority across the globe. While water scarcity limits the uses of water in many ways, floods have caused extensive damage and loss of life. To use limited water more efficiently and to provide adequate time for flood warning, advanced techniques for improving streamflow forecasting are needed. The objective of this section of the research is to combine sea surface temperature (SST), Next Generation Radar (NEXRAD) data and meteorological characteristics with historical stream data to forecast the actual streamflow using genetic programming. This case study concerns the forecasting of stream discharge in a complex-terrain, semi-arid watershed. It relates microclimatological factors to the resulting stream flow rate in the river system, given the influence of dynamic basin features such as soil moisture, soil temperature, ambient relative humidity, air temperature, sea surface temperature, and precipitation. The forecasting results are evaluated in terms of the percentage error (PE), the root-mean-square error (RMSE), and the square of the Pearson product moment correlation coefficient (r-squared value).
The developed models can predict streamflow with very good accuracy, with an r-squared of 0.84 and a PE of 1% for a 30-day prediction.
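
The three evaluation measures named above can be sketched as follows. RMSE and the squared Pearson correlation follow their standard definitions. The abstract does not define PE, so the mean signed percentage error used here is an assumption, as are the function names and the sample values.

```python
import numpy as np


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


def pearson_r_squared(observed, predicted):
    """Square of the Pearson product moment correlation coefficient."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return float(r ** 2)


def mean_percentage_error(observed, predicted):
    """Assumed PE definition: mean of 100 * (predicted - observed) / observed.

    Requires non-zero observations; the source may define PE differently.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(100.0 * np.mean((predicted - observed) / observed))


# Hypothetical streamflow values, for illustration only.
observed_flow = np.array([10.0, 20.0, 30.0, 40.0])
predicted_flow = np.array([11.0, 19.0, 33.0, 38.0])

print("RMSE:", rmse(observed_flow, predicted_flow))
print("r-squared:", pearson_r_squared(observed_flow, predicted_flow))
print("PE (%):", mean_percentage_error(observed_flow, predicted_flow))
```

Note that a signed PE lets over- and under-predictions cancel, so a small PE can coexist with a sizeable RMSE; reporting the measures together guards against reading too much into either.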

    Mapping and monitoring forest remnants: a multiscale analysis of spatio-temporal data

    KEYWORDS: Landsat, time series, machine learning, semideciduous Atlantic forest, Brazil, wavelet transforms, classification, change detection. Forests play a major role in important global matters such as the carbon cycle, climate change, and biodiversity. Forests also influence soil and water dynamics, with major consequences for ecological relations and decision-making. One basic requirement for quantifying and modelling these processes is the availability of accurate maps of forest cover. Data acquisition and analysis at appropriate scales are key to achieving the mapping accuracy needed for the development and reliable use of ecological models. The current and upcoming production of high-resolution data sets, together with the ever-growing time series collected since the 1970s, must be effectively explored. Missing values and distortions further complicate the analysis of these data sets. Thus, integration and proper analysis are of utmost importance for environmental research. New conceptual models in environmental sciences, such as the perception of multiple scales, require the development of effective implementation techniques. This thesis presents new methodologies to map and monitor forests in large, highly fragmented areas with complex land use patterns. The use of temporal information is extensively explored to distinguish natural forests from other land cover types that are spectrally similar. In chapter 4, novel schemes based on multiscale wavelet analysis are introduced, which enabled effective preprocessing of long time series of Landsat data and improved their applicability to environmental assessment. In chapter 5, the resulting time series, together with other information on spectral and spatial characteristics, were used to classify forested areas in an experiment comparing a number of combinations of attribute features.
Feature sets were defined based on expert knowledge and on data mining techniques and used as input to traditional and machine learning algorithms for pattern recognition, viz. maximum likelihood, univariate and multivariate decision trees, and neural networks. The results showed that maximum likelihood classification using temporal texture descriptors extracted with wavelet transforms was the most accurate for classifying the semideciduous Atlantic forest in the study area. In chapter 6, a multiscale approach to digital change detection was developed to deal with multisensor and noisy remotely sensed images. Changes were extracted according to size classes, minimising the effects of geometric and radiometric misregistration. Finally, in chapter 7, an automated procedure for GIS updating based on feature extraction, segmentation and classification was developed to monitor the remnants of semideciduous Atlantic forest. The procedure showed significant improvements over post-classification comparison and direct multidate classification based on artificial neural networks.
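
One way to picture a wavelet-based temporal descriptor is the energy of a pixel's time series at each scale of a multilevel decomposition. The sketch below is a hypothetical illustration: it assumes an orthonormal Haar wavelet and per-level energies as the descriptor, neither of which is stated in the abstract, and the function names and sample series are invented.

```python
import numpy as np


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    signal = np.asarray(signal, dtype=float)
    if signal.size % 2 != 0:
        raise ValueError("signal length must be even")
    approx = (signal[0::2] + signal[1::2]) / np.sqrt(2.0)
    detail = (signal[0::2] - signal[1::2]) / np.sqrt(2.0)
    return approx, detail


def wavelet_energy_descriptor(series, levels):
    """Energy of the detail coefficients at each level, plus the final
    approximation energy. Assumed descriptor, for illustration only.
    """
    approx = np.asarray(series, dtype=float)
    if approx.size % (2 ** levels) != 0:
        raise ValueError("series length must be divisible by 2**levels")
    energies = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(float(np.sum(detail ** 2)))  # fine to coarse scales
    energies.append(float(np.sum(approx ** 2)))      # remaining smooth part
    return np.array(energies)


# Hypothetical 16-step reflectance series for a single pixel.
series = np.array([0.30, 0.32, 0.35, 0.40, 0.46, 0.50, 0.48, 0.42,
                   0.36, 0.33, 0.31, 0.30, 0.31, 0.33, 0.36, 0.41])
descriptor = wavelet_energy_descriptor(series, levels=3)
print(descriptor)
```

Because the transform is orthonormal, the energies sum to the total energy of the series, so the descriptor shows how that energy is split between short-term fluctuations and longer-term trends. A feature vector like this could then be passed to a classifier such as maximum likelihood.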