
    Advancing Data Analysis for Spectroscopic Imaging by Combining Wavelet Compression with Chemometrics

    Spectroscopic imaging is a vital tool for studying heterogeneous samples such as bacteria and tissue. Its ability to acquire spatially resolved information allows for identification and classification of the various constituents within a sample. Spectroscopic imagers quickly acquire thousands to tens of thousands of spectra per measurement. These data are often arranged in the form of a 3-dimensional (3D) data cube which contains two spatial dimensions and one spectral dimension. This large amount of data is beneficial for gaining a thorough understanding of the distributions of chemical information. If too little information is measured, important chemical behavior may be overlooked. Statistical analysis algorithms (chemometrics) are required to determine the relevant spectroscopic information within a data cube. Applying chemometrics to such large volumes of data presents computational difficulties regarding computer memory and processing speed. To overcome these burdens, wavelet transform compression is applied prior to chemometric evaluation to accelerate computations and reduce data storage requirements. To optimize compression by enhancing acceleration and reducing approximation errors, different wavelets, or "hybrid wavelets", can be applied to the different dimensions of a 3D data set. Determining which combination of wavelets will yield the most compression and best data representation is difficult since many possibilities exist. A compression method is presented that automatically determines the optimum wavelet combinations for a given data set. Principal component analysis (PCA) is used to demonstrate the capabilities of this new procedure, but the compression routine is advantageous for many chemometric techniques. Although linear algorithms like PCA work well in many situations, they are not well-adapted for explaining nonlinear relationships. 
Kernel principal component analysis (KPCA) has recently been developed to overcome the limitations of linear algorithms. However, when applied to spectroscopic imaging, KPCA calculations require multiple gigabytes of RAM just for holding the data. Therefore, routine use of the algorithm is often prohibited on personal computers. To circumvent such situations, a wavelet compression algorithm is presented that avoids ever having to hold all data in memory at any point during the calculations. The goal is to enable the application of KPCA to large imaging data sets of heterogeneous samples.
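As a rough illustration of the compression idea in the abstract above, the sketch below applies one level of a Haar wavelet transform to a single spectrum and keeps only the approximation coefficients, halving the number of values passed on to a method such as PCA. The choice of the Haar wavelet, the function names, and the single-level scheme are assumptions made for illustration; the abstract does not say which wavelets its automatic selection picks.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_level(signal):
    """One level of the orthonormal Haar transform of an even-length signal."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_level: rebuild the signal from approximation and detail."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def compress_spectrum(spectrum):
    """Lossy 2:1 compression: keep the approximation, discard the detail."""
    approx, _ = haar_level(spectrum)
    return approx


def reconstruct(approx):
    """Approximate reconstruction with all detail coefficients set to zero."""
    return haar_inverse(approx, [0.0] * len(approx))
```

Discarding the detail coefficients is exact only where neighbouring channels are equal; on real spectra it introduces an approximation error, which is the trade-off the abstract describes.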

    Methods for Improving Signal to Noise Ratio in Raman Spectra

    Raman microspectroscopy is an optoelectronic technique based on the inelastic scattering of light. This technique has been demonstrated to have potential to identify different materials based on subtle differences in the Raman spectral profile using various multivariate statistical classification tools. However, Raman scattering is an inherently weak process. Low photon counts coupled with non-ideal collection efficiencies mean that Raman spectroscopy is vulnerable to noise. This makes system optimisations, as well as efficient and reliable noise removal, a necessity in sensitive applications such as chemical classification or diagnostics. Provided in this thesis are software and experimental methodologies to evaluate system performance, predict system performance under various conditions, and identify the optimal system configuration/set-up in order to achieve the highest possible signal to noise ratio. Modelling methodologies presented in this thesis allow the user to systematically evaluate minimum acquisition times, optimise camera read-out modes, and predict system behaviour with alternative optical elements in order to maximise signal to noise ratio. The denoising algorithms presented in this thesis have been shown to provide superior signal to noise ratio when compared with their traditional counterparts. When compared with the double acquisition method, the proposed cosmic ray removal algorithm resulted in a 10% improvement. An algorithm that enhances Savitzky-Golay smoothing with maximum likelihood estimation produced spectra with up to double the signal to noise ratio when compared to the raw spectra and consistently outperformed the algorithms it was compared to. The use of reflective substrates is also investigated and was shown to approximately triple the collected Raman scatter when compared with transparent substrates. By utilising the methodologies detailed in this thesis it is possible to improve the efficiency of the Raman system in question.
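For context on the baseline mentioned above, one common form of the double acquisition method can be sketched as follows: the same spectrum is recorded twice, and wherever the two recordings disagree strongly, the lower value is kept on the assumption that the higher one contains a cosmic ray spike. This is a sketch of the traditional baseline only, not of the thesis's proposed algorithm; the function name and the fixed `threshold` parameter are assumptions.

```python
def remove_cosmic_rays(acq_a, acq_b, threshold):
    """Combine two acquisitions of the same spectrum, rejecting spikes.

    Where the two recordings differ by more than `threshold` counts,
    the smaller value is kept (a spike only ever adds counts).
    Elsewhere the two recordings are averaged.
    """
    if len(acq_a) != len(acq_b):
        raise ValueError("acquisitions must have the same length")
    cleaned = []
    for a, b in zip(acq_a, acq_b):
        if abs(a - b) > threshold:
            cleaned.append(min(a, b))      # spike suspected in the larger value
        else:
            cleaned.append((a + b) / 2.0)  # ordinary noise: average
    return cleaned
```

In practice the threshold would be tied to the expected noise level of the detector rather than fixed by hand.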

    On the Use of Imaging Spectroscopy from Unmanned Aerial Systems (UAS) to Model Yield and Assess Growth Stages of a Broadacre Crop

    Snap bean production was valued at $363 million in 2018. Moreover, the increasing need for food production, caused by the exponential increase in population, makes this crop vitally important to study. Traditionally, harvest time determination and yield prediction are performed by collecting a limited number of samples. While this approach can work, it is inaccurate, labor-intensive, and based on a small sample size. The ambiguous nature of this approach furthermore leaves the grower with under-ripe and over-mature plants, decreasing the final net profit and the overall quality of the product. A more cost-effective method would be a site-specific approach that saves time and labor for farmers and growers, while providing them with exact detail on when and where to harvest and how much is to be harvested (while forecasting yield). In this study we used hyperspectral (i.e., point-based and image-based), as well as biophysical data, to identify spectral signatures and biophysical attributes that could schedule harvest and forecast yield prior to harvest. Over the past two decades, there have been immense advances in the field of yield and harvest modeling using remote sensing data. Nevertheless, there still exists a wide gap in the literature covering yield and harvest assessment as a function of time using both ground-based and unmanned aerial systems. There is a need for a study focusing on crop-specific yield and harvest assessment using a rapid, affordable system. We hypothesize that a down-sampled multispectral system, tuned with spectral features identified from hyperspectral data, could address the mentioned gaps. Moreover, we hypothesize that the airborne data will contain noise that could negatively impact the performance and the reliability of the utilized models. Thus, we address these knowledge gaps with the three objectives below: 1. 
Assess yield prediction of snap bean crop using spectral and biophysical data and identify discriminating spectral features via statistical and machine learning approaches. 2. Evaluate snap bean harvest maturity at both the plant growth stage and pod maturity level, by means of spectral and biophysical indicators, and identify the corresponding discriminating spectral features. 3. Assess the feasibility of using a deep learning architecture for reducing noise in the hyperspectral data. In light of these objectives, we carried out a greenhouse study in the winter and spring of 2019, where we studied temporal change in spectra and physical attributes of snap-bean crop, of the Huntington cultivar, using a handheld spectrometer in the visible- to shortwave-infrared domain (400-2500 nm). Chapter 3 of this dissertation focuses on yield assessment of the greenhouse study. Findings from this best-case scenario yield study showed that the most accurate yield predictions are obtained approximately 20-25 days prior to harvest. The proposed approach was able to explain variability as high as R² = 0.72, with spectral features residing in absorption regions for chlorophyll, protein, lignin, and nitrogen, among others. The captured data from this study contained minimal noise, even in the detector fall-off regions. Moving the focus to harvest maturity assessment, Chapter 4 presents findings from this objective in the greenhouse environment. Our findings showed that four stages of maturity, namely vegetative growth, budding, flowering, and pod formation, are distinguishable with 79% and 78% accuracy via the two introduced vegetation indices, the snap-bean growth index (SGI) and the normalized difference snap-bean growth index (NDSI), respectively. 
Moreover, pod-level maturity classification showed that ready-to-harvest and not-ready-to-harvest pods can be separated with 78% accuracy, with identified wavelengths residing in the green, red-edge, and shortwave-infrared regions. Chapters 5 and 6 focus on transitioning the learned concepts from the greenhouse scenario to the UAS domain. We transitioned from a handheld spectrometer in the visible-to-shortwave-infrared domain (400-2500 nm) to a UAS-mounted hyperspectral imager in the visible-to-near-infrared region (400-1000 nm). Two years' worth of data, at two different geographical locations, were collected in upstate New York and examined for yield modeling and harvest scheduling objectives. For analysis of the collected data, we introduced a feature selection library in Python, named “Jostar”, to identify the most discriminating wavelengths. The findings from the yield modeling UAS study show that pod weight and seed length, as two different yield indicators, can be explained with R² as high as 0.93 and 0.98, respectively. Identified wavelengths resided in the blue, green, red, and red-edge regions, and 44-55 days after planting (DAP) proved to be the optimal time for yield assessment. Chapter 6, on the other hand, evaluates maturity assessment, in terms of pod classification, from the UAS perspective. Results from this study showed that the identified features resided in the blue, green, red, and red-edge regions, contributing to an F1 score as high as 0.91 for differentiating between ready-to-harvest and not-ready-to-harvest pods. The identified features from this study are in line with those detected in the UAS yield assessment study. In order to have a parallel comparison of the greenhouse study against the UAS study, we adopted the methodology employed for the UAS studies and applied it to the greenhouse studies, in Chapter 7. 
Since the greenhouse data were captured in the visible-to-shortwave-infrared (400-2500 nm) domain, and the UAS study data were captured in the VNIR (400-1000 nm) domain, we truncated the spectral range of the collected data from the greenhouse study to the VNIR domain. The comparison experiment between the greenhouse study and the UAS studies for yield assessment, at two harvest stages, early and late, showed that spectral features in the 450-470, 500-520, 650, and 700-730 nm regions were repeated on days with the highest coefficient of determination. Moreover, 46-48 DAP, with a high coefficient of determination for yield prediction, was repeated in five out of six data sets (two early stages, each three data sets). On the other hand, the harvest maturity comparison between the greenhouse study and the UAS data sets showed that similar identified wavelengths reside in the ∼450, ∼530, ∼715, and ∼760 nm regions, with a performance metric (F1 score) of 0.78, 0.84, and 0.9 for greenhouse, 2019 UAS, and 2020 UAS data, respectively. However, the noise in the data captured during the UAS study, along with the high computational cost of the classical mathematical approach employed for denoising hyperspectral data, inspired us to improve the computational performance of hyperspectral denoising by assessing the feasibility of transferring the learned concepts to deep learning models. In Chapter 8, we approached hyperspectral denoising in the spectral domain (1D fashion) for two types of noise, integrated noise and non-independent and non-identically distributed (non-i.i.d.) noise. We utilized Memory Networks for hyperspectral denoising due to their power in image denoising, introduced a new loss, and benchmarked it against several data sets and models. The proposed model, HypeMemNet, ranked first, by up to 40% in terms of signal-to-noise ratio (SNR), for resolving integrated noise, and first or second, by a small margin, for resolving non-i.i.d. noise. 
Our findings showed that a proper receptive field and a suitable number of filters are crucial for denoising integrated noise, while parameter size was shown to be of the highest importance for non-i.i.d. noise. Results from the conducted studies provide a comprehensive understanding encompassing yield modeling, harvest scheduling, and hyperspectral denoising. Our findings bode well for transitioning from an expensive hyperspectral imager to a multispectral imager, tuned with the identified bands, as well as employing a rapid deep learning model for hyperspectral denoising.
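The abstract above names a normalized difference snap-bean growth index (NDSI) but does not give its bands. The general form of a normalized difference index can be sketched as below; the band centres used in the usage note (800 nm and 670 nm) and both function names are hypothetical and are not the bands identified in the dissertation.

```python
def reflectance_at(wavelengths, reflectance, target_nm):
    """Return the reflectance of the band closest to target_nm."""
    idx = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))
    return reflectance[idx]


def normalized_difference(r_a, r_b):
    """Generic normalized difference index: (r_a - r_b) / (r_a + r_b)."""
    denom = r_a + r_b
    if denom == 0:
        raise ValueError("sum of the two reflectances is zero")
    return (r_a - r_b) / denom
```

With a spectrum sampled at a handful of wavelengths, `normalized_difference(reflectance_at(wl, refl, 800), reflectance_at(wl, refl, 670))` would give an index in the range -1 to 1 for non-negative reflectances; swapping in the bands selected by a feature selection step is then a two-argument change.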

    Advanced Image Acquisition, Processing Techniques and Applications

    "Advanced Image Acquisition, Processing Techniques and Applications" is the first book of a series that provides image processing principles and practical software implementation on a broad range of applications. The book integrates material from leading researchers on Applied Digital Image Acquisition and Processing. An important feature of the book is its emphasis on software tools and scientific computing in order to enhance results and arrive at problem solutions.

    Quantitative Mapping of Soil Property Based on Laboratory and Airborne Hyperspectral Data Using Machine Learning

    Soil visible and near-infrared spectroscopy provides a non-destructive, rapid and low-cost approach to quantify various soil physical and chemical properties based on their reflectance in the spectral range of 400–2500 nm. With an increasing number of large-scale soil spectral libraries established across the world and new space-borne hyperspectral sensors, there is a need to explore methods to extract informative features from reflectance spectra and produce accurate soil spectroscopic models using machine learning. Features generated from regional or large-scale soil spectral data play a key role in quantitative spectroscopic models for soil properties. The Land Use/Land Cover Area Frame Survey (LUCAS) soil library was used to explore components derived from partial least squares (PLS) and fractal features generated from soil spectra in this study. The gradient-boosting method performed well when coupled with extracted features on the estimation of several soil properties. Transfer learning based on convolutional neural networks (CNNs) was proposed to make the model developed from laboratory data transferable to airborne hyperspectral data. The soil clay map was successfully derived using HyMap imagery and the fine-tuned CNN model developed from LUCAS mineral soils, as deep learning has the potential to learn transferable features that generalise from the source domain to the target domain. External environmental factors such as the presence of vegetation restrict the application of imaging spectroscopy. The reflectance data can be transformed into a vegetation-suppressed domain with a forced invariance approach, the performance of which was evaluated in an agricultural area using CASI airborne hyperspectral data. 
However, the relationship between vegetation and acquired spectra is complicated, and more effort should be put into removing the effects of external factors to make the model transferable from one sensor to another.
Table of contents: Abstract; Kurzfassung; Table of Contents; List of Figures; List of Tables; List of Abbreviations.
1 Introduction: 1.1 Motivation; 1.2 Soil spectra from different platforms; 1.3 Soil property quantification using spectral data; 1.4 Feature representation of soil spectra; 1.5 Objectives; 1.6 Thesis structure.
2 Combining Partial Least Squares and the Gradient-Boosting Method for Soil Property Retrieval Using Visible Near-Infrared Shortwave Infrared Spectra: 2.1 Abstract; 2.2 Introduction; 2.3 Materials and methods (2.3.1 The LUCAS soil spectral library; 2.3.2 Partial least squares algorithm; 2.3.3 Gradient-Boosted Decision Trees; 2.3.4 Calculation of relative variable importance; 2.3.5 Assessment); 2.4 Results (2.4.1 Overview of the spectral measurement; 2.4.2 Results of PLS regression for the estimation of soil properties; 2.4.3 Results of PLS-GBDT for the estimation of soil properties; 2.4.4 Relative important variables derived from PLS regression and the gradient-boosting method); 2.5 Discussion (2.5.1 Dimension reduction for high-dimensional soil spectra; 2.5.2 GBDT for quantitative soil spectroscopic modelling); 2.6 Conclusions.
3 Quantitative Retrieval of Organic Soil Properties from Visible Near-Infrared Shortwave Infrared Spectroscopy Using Fractal-Based Feature Extraction: 3.1 Abstract; 3.2 Introduction; 3.3 Materials and Methods (3.3.1 The LUCAS topsoil dataset; 3.3.2 Fractal feature extraction method; 3.3.3 Gradient-boosting regression model; 3.3.4 Evaluation); 3.4 Results (3.4.1 Fractal features for soil spectroscopy; 3.4.2 Effects of different step and window size on extracted fractal features; 3.4.3 Modelling soil properties with fractal features; 3.4.3 Comparison with PLS regression); 3.5 Discussion (3.5.1 The importance of fractal dimension for soil spectra; 3.5.2 Modelling soil properties with fractal features); 3.6 Conclusions.
4 Transfer Learning for Soil Spectroscopy Based on Convolutional Neural Networks and Its Application in Soil Clay Content Mapping Using Hyperspectral Imagery: 4.1 Abstract; 4.2 Introduction; 4.3 Materials and Methods (4.3.1 Datasets; 4.3.2 Methods; 4.3.3 Assessment); 4.4 Results and Discussion (4.4.1 Interpretation of mineral and organic soils from LUCAS dataset; 4.4.2 1D-CNN and spectral index for LUCAS soil clay content estimation; 4.4.3 Application of transfer learning for soil clay content mapping using the pre-trained 1D-CNN model; 4.4.4 Comparison between spectral index and transfer learning; 4.4.5 Large-scale soil spectral library for digital soil mapping at the local scale using hyperspectral imagery); 4.5 Conclusions.
5 A Case Study of Forced Invariance Approach for Soil Salinity Estimation in Vegetation-Covered Terrain Using Airborne Hyperspectral Imagery: 5.1 Abstract; 5.2 Introduction; 5.3 Materials and Methods (5.3.1 Study area of Zhangye Oasis; 5.3.2 Data description; 5.3.3 Methods; 5.3.3 Model performance assessment); 5.4 Results and Discussion (5.4.1 The correlation between NDVI and soil salinity; 5.4.2 Vegetation suppression performance using the Forced Invariance Approach; 5.4.3 Estimation of soil properties using airborne hyperspectral data); 5.5 Conclusions.
6 Conclusions and Outlook; Bibliography; Acknowledgements.

    Caracterização e estudo comparativo de exsudações de hidrocarbonetos e plays petrolíferos em bacias terrestres das regiões central do Irã e sudeste do Brasil usando sensoriamento remoto espectral

    Advisor: Carlos Roberto de Souza Filho. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Geociências.
Summary (translated from the Portuguese Resumo): The objective of this research was to explore the signatures of hydrocarbon seepages at the surface using spectral remote sensing technology. This was achieved first by carrying out a comprehensive review of the capabilities and potential of the techniques for direct and indirect detection. Next, the technique was applied to investigate two test sites located in Iran and Brazil, known to host active microseepage systems and bituminous outcrops, respectively. The first study area is located near the city of Qom (Iran) and lies within the Alborz oilfield, buried under Oligocene sediments of the Upper Red Formation. The second site is located near the city of Anhembi (SP), on the eastern margin of the Paraná Basin in Brazil, and includes bitumen accumulations in Triassic sandstones of the Pirambóia Formation. The work in the Qom area integrated evidence from (i) petrographic and geochemical studies in the laboratory, (ii) outcrop investigations in the field, and (iii) broad-scale anomaly mapping using ASTER and Sentinel-2 multispectral data sets. The outcome of this study was new mineralogical and geochemical indicators for microseepage exploration and an updated microseepage model. During this work, we developed new methodologies for spectroscopic data analysis. Using simulated data, we indicated that the WorldView-3 satellite instrument has potential for direct hydrocarbon detection. Following this, real data over sandstone and oil outcrops in the Anhembi area were investigated. The area was further imaged on the ground and using the AisaFENIX hyperspectral imaging system. This was followed by field studies and sampling, including close-range spectroscopy of the samples in the laboratory using imaging (i.e., sisuCHEMA) and non-imaging (i.e., FieldSpec-4) instruments. The study demonstrated that a multi-scale spectroscopic approach could provide a complete picture of the variations in the content and composition of bitumen and accompanying alteration minerals. The hydrocarbon signature, especially the one centered at 2300 nm, proved consistent and comparable across scales and capable of estimating the bitumen content of oil sands at all imaging scales.
Abstract: The objective of this research was to explore the signatures of seeping hydrocarbons on the surface using spectral remote sensing technology. It was achieved firstly by conducting a comprehensive review of the capacities and potentials of the technique for direct and indirect seepage detection. Next, the technique was applied to investigate two distinctive test sites located in Iran and Brazil known to retain active microseepage systems and bituminous outcrops, respectively. The first study area is located near the city of Qom in Iran, and comprises the Alborz oilfield buried under Oligocene sediments of the Upper-Red Formation. The second site is located near the town of Anhembi on the eastern edge of the Paraná Basin in Brazil and includes bitumen accumulations in the Triassic sandstones of the Pirambóia Formation. Our work in the Qom area integrated evidence from (i) petrographic, spectroscopic, and geochemical studies in the laboratory, (ii) outcrop investigations in the field, and (iii) broad-scale anomaly mapping via orbital remote sensing data. The outcomes of this study were novel mineralogical and geochemical indicators for microseepage characterization and a classification scheme for the microseepage-induced alterations. Our study indicated that active microseepage systems occur in large parts of the lithofacies in the Qom area, implying that the extent of the petroleum reservoir is much larger than previously thought. During this work, we also developed new methodologies for spectroscopic data analysis and processing. In addition, by using simulated data, we indicated that the WorldView-3 satellite instrument has the potential for direct hydrocarbon detection. Following this demonstration, real datasets were acquired over oil-sand outcrops of the Anhembi area. The area was further imaged on the ground and from the air by using an AisaFENIX hyperspectral imaging system. This was followed by outcrop studies and sampling in the field and close-range spectroscopy in the laboratory using both imaging (i.e. sisuCHEMA) and nonimaging instruments. The study demonstrated that a multi-scale spectroscopic approach could provide a complete picture of the variations in the content and composition of bitumen and associated alteration mineralogy. The oil signature, especially the one centered at 2300 nm, was shown to be consistent and comparable among scales, and capable of estimating the bitumen content of oil-sands at all imaging scales.
Doctorate. Geology and Natural Resources. Doctor in Geosciences. 2015/06663-7. FAPES

    Hyperspectral Image Unmixing Incorporating Adjacency Information

    While the spectral information contained in hyperspectral images is rich, the spatial resolution of such images is in many cases very low. Many pixel spectra are mixtures of pure materials’ spectra and therefore need to be decomposed into their constituents. This work investigates new decomposition methods that take into account spectral, spatial and global 3D adjacency information. This allows for faster and more accurate decomposition results.
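To make the decomposition problem above concrete, the sketch below unmixes a single pixel under the standard linear mixing model with two known pure spectra (endmembers), constraining the two fractions to be non-negative and sum to one. It deliberately ignores the adjacency information that the work above investigates; the two-endmember restriction and the function name are assumptions chosen to keep the example in closed form.

```python
def unmix_two_endmembers(pixel, end_a, end_b):
    """Least-squares fractions (f_a, f_b) with f_a + f_b = 1 and 0 <= f <= 1.

    Model: pixel ~ f_a * end_a + (1 - f_a) * end_b.
    Minimising the squared error over f_a gives a projection onto the
    line between the endmembers, clipped to the interval [0, 1].
    """
    diff = [a - b for a, b in zip(end_a, end_b)]
    norm_sq = sum(d * d for d in diff)
    if norm_sq == 0:
        raise ValueError("endmembers must differ")
    resid = [p - b for p, b in zip(pixel, end_b)]
    frac_a = sum(r * d for r, d in zip(resid, diff)) / norm_sq
    frac_a = min(1.0, max(0.0, frac_a))  # enforce physical abundances
    return frac_a, 1.0 - frac_a
```

With more than two endmembers the same model leads to a constrained least-squares problem that generally needs an iterative solver.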

    Evaluation of machine learning classifiers for mineralogy mapping based on near infrared hyperspectral imaging

    The exploration of mineral resources is a major challenge in a world that seeks sustainable energy, renewable energy, advanced engineering, and new commercial technological devices. The rapid decrease in mineral reserves shifted the focus to under-explored and low-accessibility areas, which led to the use of on-site portable techniques for mineral mapping purposes, such as near infrared hyperspectral image sensors. The large datasets acquired with these instruments need data pre-processing, a series of mathematical manipulations that can be achieved using machine learning. The aim of this thesis is to improve an existing method for mineralogy mapping, by focusing on the mineral classification phase. More specifically, a spectral similarity index was utilized to support machine learning classifiers. This was introduced because of the inability of the employed classification models to recognize samples that are not part of a given database; the models always classified samples based on one of the known labels of the database. This could be a problem in hyperspectral images as the pure component found in a sample could correspond to a mineral but also to noise or artefacts due to a variety of reasons, such as baseline correction. The spectral similarity index calculates the similarity between a sample spectrum and its assigned database class spectrum; this happens through the use of a threshold that defines whether the sample belongs to a class or not. The metrics utilized in the spectral similarity index were the spectral angle mapper (SAM), the correlation coefficient and five different distances. The machine learning classifiers used to evaluate the spectral similarity index were the decision tree, k-nearest neighbor, and support vector machine. Simulated distortions were also introduced in the dataset to test the robustness of the indexes and to choose the best classifier. 
The spectral similarity index was assessed with a dataset of nine minerals acquired from the Geological Survey of Finland retrieved from a Specim SWIR camera. The validation of the indexes was assessed with two mine samples obtained with a VTT active hyperspectral sensor prototype. The support vector machine was chosen after the comparison between the three classifiers as it showed higher tolerance to distorted data. The evaluation of the spectral similarity indexes showed that the best performances were achieved with SAM and Chebyshev distance, which maintained high stability with smaller and bigger threshold changes. The best threshold value found is the one that, in the dataset analysed, corresponded to the number of spectra available for each class. No reference was available for the validation procedure; for this reason, the results of the mine samples obtained with the spectral similarity index were compared with results that can be obtained through visual interpretation, which were in agreement. The proposed method can be useful to future mineral exploration, as it is of great importance to correctly classify minerals found during explorations, regardless of the database utilized.
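The thresholding idea described above can be sketched with the spectral angle mapper: the angle between a sample spectrum and the reference spectrum of its assigned class is computed, and the class label is kept only if the angle is below a threshold. The function names and the `max_angle` parameter are assumptions; the thesis's actual threshold selection is not reproduced here.

```python
import math


def spectral_angle(a, b):
    """Spectral angle (radians) between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_angle)


def accept_label(sample, class_spectrum, max_angle):
    """Keep the classifier's label only if the sample resembles its class."""
    return spectral_angle(sample, class_spectrum) <= max_angle
```

Because the angle ignores overall brightness, a sample that is a scaled copy of the reference is accepted, while a spectrum with a different shape (for example noise or an artefact) can be rejected as unknown.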

    Mass spectral imaging of clinical samples using deep learning

    A better interpretation of tumour heterogeneity and variability is vital for the improvement of novel diagnostic techniques and personalized cancer treatments. Tumour tissue heterogeneity is characterized by biochemical heterogeneity, which can be investigated by unsupervised metabolomics. Mass Spectrometry Imaging (MSI) combined with Machine Learning techniques has generated increasing interest as an analytical and diagnostic tool for the analysis of spatial molecular patterns in tissue samples. Considering the high complexity of data produced by the application of MSI, which can consist of many thousands of spectral peaks, statistical analysis and in particular machine learning and deep learning have been investigated as novel approaches to deduce the relationships between the measured molecular patterns and the local structural and biological properties of the tissues. Machine learning has historically been divided into two main categories: supervised and unsupervised learning. In MSI, supervised learning methods may be used to segment tissues into histologically relevant areas, e.g. the classification of tissue regions in H&E (Haematoxylin and Eosin) stained samples. Initial classification by an expert histopathologist through visual inspection enables the development of univariate or multivariate models, based on tissue regions that have significantly up/down-regulated ions. However, complex data may result in underdetermined models, and alternative methods that can cope with high dimensionality and noisy data are required. Here, we describe, apply, and test a novel diagnostic procedure built using a combination of MSI and deep learning with the objective of delineating and identifying biochemical differences between cancerous and non-cancerous tissue in metastatic liver cancer and epithelial ovarian cancer. 
The workflow investigates the robustness of single (1D) to multidimensional (3D) tumour analyses and also highlights possible biomarkers which are not accessible from classical visual analysis of the H&E images. The identification of key molecular markers may provide a deeper understanding of tumour heterogeneity and potential targets for intervention. Open Access