111 research outputs found

    Deep Learning Techniques to Improve the Performance of Olive Oil Classification

    Olive oil assessment relies on a standardized sensory analysis following the “panel test” method. However, there is considerable interest in designing novel strategies for olive oil classification based on gas chromatography (GC) coupled to mass spectrometry (MS) or ion mobility spectrometry (IMS), together with chemometric data treatment. This is an essential task, both for obtaining a model that remains robust over time and for avoiding price fraud and determining whether an oil is suitable for consumption. The aim of this paper is to combine chemical techniques and Deep Learning approaches to automatically classify olive oil samples from two different harvests into their three corresponding classes: extra virgin olive oil (EVOO), virgin olive oil (VOO), and lampante olive oil (LOO). Our Deep Learning model is built with 701 samples obtained from two olive oil campaigns (2014–2015 and 2015–2016). The data from the two harvests consist of specific olive oil markers selected from the whole spectral fingerprint obtained with the GC-IMS method. To obtain the best results, we configured the parameters of our model according to the nature of the data. The results show that a deep learning approach applied to data from chemical instrumental techniques is a good method for classifying oil samples into their corresponding categories, with higher success rates than those obtained in previous works. Funding: Ministerio de Economía y Competitividad TIN2017-88209-C2-2-

    Predicting and Mapping of Soil Organic Carbon Using Machine Learning Algorithms in Northern Iran

    Estimation of the soil organic carbon (SOC) content is of utmost importance in understanding the chemical, physical, and biological functions of the soil. This study proposes the machine learning algorithms of support vector machines, artificial neural networks, regression tree, random forest, extreme gradient boosting, and a conventional deep neural network (DNN) for advancing prediction models of SOC. Models are trained with 1879 composite surface soil samples and 105 auxiliary data as predictors. A genetic algorithm is used as a feature selection approach to identify effective variables. The results indicate that precipitation is the most important predictor, driving 15 percent of SOC spatial variability, followed by the normalized difference vegetation index, the day temperature index of the Moderate Resolution Imaging Spectroradiometer, multiresolution valley bottom flatness, and land use, respectively. Based on 10-fold cross-validation, the DNN model was reported as the superior algorithm, with the lowest prediction error and uncertainty. In terms of accuracy, the DNN yielded a mean absolute error of 59 percent, a root mean squared error of 75 percent, a coefficient of determination of 0.65, and a Lin's concordance correlation coefficient of 0.83. The SOC content was highest in the udic soil moisture regime class, with mean values of 4 percent, followed by the aquic and xeric classes, respectively. Soils in dense forestlands had the highest SOC contents, whereas soils of younger geological age and alluvial fans had lower SOC. The proposed DNN is a promising algorithm for handling large numbers of auxiliary data at a province scale; owing to its flexible structure and its ability to extract more information from the auxiliary data surrounding the sampled observations, it predicted the SOC baseline map with high accuracy and minimal uncertainty. Comment: 30 pages, 9 figures
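
The four accuracy measures named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the coefficient of determination is assumed here to be 1 - SS_res / SS_tot (some studies report the squared Pearson correlation instead).

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, R^2 and Lin's concordance correlation coefficient.

    Hypothetical helper for illustration; inputs are equal-length
    sequences of numbers with at least two distinct observed values.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Assumed definition of R^2: 1 - residual sum of squares / total sum of squares.
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot

    # Lin's CCC penalises both scatter and systematic shift from the 1:1 line.
    var_o = ss_tot / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    ccc = 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)

    return {"mae": mae, "rmse": rmse, "r2": r2, "ccc": ccc}


if __name__ == "__main__":
    # Made-up values, only to show the call.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))
```

A constant offset between predictions and observations leaves the Pearson correlation at 1 but lowers the concordance coefficient, which is why the two are often reported side by side.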

    Quantitative Mapping of Soil Property Based on Laboratory and Airborne Hyperspectral Data Using Machine Learning

    Soil visible and near-infrared spectroscopy provides a non-destructive, rapid and low-cost approach to quantify various soil physical and chemical properties based on their reflectance in the spectral range of 400–2500 nm. With an increasing number of large-scale soil spectral libraries established across the world and new space-borne hyperspectral sensors, there is a need to explore methods to extract informative features from reflectance spectra and produce accurate soil spectroscopic models using machine learning. Features generated from regional or large-scale soil spectral data play a key role in quantitative spectroscopic models of soil properties. In this study, the Land Use/Land Cover Area Frame Survey (LUCAS) soil library was used to explore components derived from partial least squares (PLS) and fractal features generated from soil spectra. The gradient-boosting method performed well when coupled with the extracted features for estimating several soil properties. Transfer learning based on convolutional neural networks (CNNs) was proposed to make the model developed from laboratory data transferable to airborne hyperspectral data. The soil clay map was successfully derived using HyMap imagery and the fine-tuned CNN model developed from LUCAS mineral soils, as deep learning has the potential to learn transferable features that generalise from the source domain to the target domain. External environmental factors such as the presence of vegetation restrict the application of imaging spectroscopy. The reflectance data can be transformed into a vegetation-suppressed domain with a forced invariance approach, the performance of which was evaluated in an agricultural area using CASI airborne hyperspectral data.
However, the relationship between vegetation and acquired spectra is complicated, and more effort should be put into removing the effects of external factors to make the model transferable from one sensor to another.

    Table of contents:
    Abstract; Kurzfassung; Table of Contents; List of Figures; List of Tables; List of Abbreviations
    1 Introduction: 1.1 Motivation; 1.2 Soil spectra from different platforms; 1.3 Soil property quantification using spectral data; 1.4 Feature representation of soil spectra; 1.5 Objectives; 1.6 Thesis structure
    2 Combining Partial Least Squares and the Gradient-Boosting Method for Soil Property Retrieval Using Visible Near-Infrared Shortwave Infrared Spectra: 2.1 Abstract; 2.2 Introduction; 2.3 Materials and methods; 2.3.1 The LUCAS soil spectral library; 2.3.2 Partial least squares algorithm; 2.3.3 Gradient-Boosted Decision Trees; 2.3.4 Calculation of relative variable importance; 2.3.5 Assessment; 2.4 Results; 2.4.1 Overview of the spectral measurement; 2.4.2 Results of PLS regression for the estimation of soil properties; 2.4.3 Results of PLS-GBDT for the estimation of soil properties; 2.4.4 Relative important variables derived from PLS regression and the gradient-boosting method; 2.5 Discussion; 2.5.1 Dimension reduction for high-dimensional soil spectra; 2.5.2 GBDT for quantitative soil spectroscopic modelling; 2.6 Conclusions
    3 Quantitative Retrieval of Organic Soil Properties from Visible Near-Infrared Shortwave Infrared Spectroscopy Using Fractal-Based Feature Extraction: 3.1 Abstract; 3.2 Introduction; 3.3 Materials and Methods; 3.3.1 The LUCAS topsoil dataset; 3.3.2 Fractal feature extraction method; 3.3.3 Gradient-boosting regression model; 3.3.4 Evaluation; 3.4 Results; 3.4.1 Fractal features for soil spectroscopy; 3.4.2 Effects of different step and window size on extracted fractal features; 3.4.3 Modelling soil properties with fractal features; 3.4.3 Comparison with PLS regression; 3.5 Discussion; 3.5.1 The importance of fractal dimension for soil spectra; 3.5.2 Modelling soil properties with fractal features; 3.6 Conclusions
    4 Transfer Learning for Soil Spectroscopy Based on Convolutional Neural Networks and Its Application in Soil Clay Content Mapping Using Hyperspectral Imagery: 4.1 Abstract; 4.2 Introduction; 4.3 Materials and Methods; 4.3.1 Datasets; 4.3.2 Methods; 4.3.3 Assessment; 4.4 Results and Discussion; 4.4.1 Interpretation of mineral and organic soils from LUCAS dataset; 4.4.2 1D-CNN and spectral index for LUCAS soil clay content estimation; 4.4.3 Application of transfer learning for soil clay content mapping using the pre-trained 1D-CNN model; 4.4.4 Comparison between spectral index and transfer learning; 4.4.5 Large-scale soil spectral library for digital soil mapping at the local scale using hyperspectral imagery; 4.5 Conclusions
    5 A Case Study of Forced Invariance Approach for Soil Salinity Estimation in Vegetation-Covered Terrain Using Airborne Hyperspectral Imagery: 5.1 Abstract; 5.2 Introduction; 5.3 Materials and Methods; 5.3.1 Study area of Zhangye Oasis; 5.3.2 Data description; 5.3.3 Methods; 5.3.3 Model performance assessment; 5.4 Results and Discussion; 5.4.1 The correlation between NDVI and soil salinity; 5.4.2 Vegetation suppression performance using the Forced Invariance Approach; 5.4.3 Estimation of soil properties using airborne hyperspectral data; 5.5 Conclusions
    6 Conclusions and Outlook
    Bibliography; Acknowledgements

    Early Detection of Wild Rocket Tracheofusariosis Using Hyperspectral Image-Based Machine Learning

    Fusarium oxysporum f. sp. raphani is responsible for wilting of wild rocket (Diplotaxis tenuifolia L. [D.C.]). A machine learning model based on hyperspectral data was constructed to monitor disease progression. To this end, pathogenesis after artificial inoculation was monitored over a 15-day period by symptom assessment, qPCR pathogen quantification, and hyperspectral imaging. Host colonization by the pathogen progressed in line with symptoms, as confirmed by qPCR. Spectral data showed differences as early as 5 days post infection, and 12 hyperspectral vegetation indices were selected to follow disease development. The hyperspectral dataset was fed to the XGBoost machine learning algorithm with the aim of developing a model that discriminates between healthy and infected plants over time. The multiple cross-prediction strategy of the pixel-level models was able to detect hyperspectral disease profiles with an average accuracy of 0.8. For healthy pixel detection, the mean Precision was 0.78, the Recall was 0.88, and the F1 Score was 0.82. For infected pixel detection, the average evaluation metrics were Precision: 0.73, Recall: 0.57, and F1 Score: 0.63. Machine learning paves the way for automatic early detection of infected plants, even a few days after infection.
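
As a quick check on how the reported metrics relate, the F1 score is the harmonic mean of precision and recall. The sketch below uses a hypothetical helper, `f1_score`, to recompute F1 from the mean precision and recall quoted above. The results come out slightly higher than the reported 0.82 and 0.63, presumably because the abstract reports F1 averaged across models, and a mean of F1 scores need not equal the F1 of the mean precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Mean precision and recall as quoted in the abstract.
    print("healthy:", round(f1_score(0.78, 0.88), 3))
    print("infected:", round(f1_score(0.73, 0.57), 3))
```

The gap between the two classes is driven mainly by recall: 0.57 for infected pixels means that a little under half of the infected pixels are missed.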

    Quantifying soybean phenotypes using UAV imagery and machine learning, deep learning methods

    Crop breeding programs aim to introduce new cultivars with improved traits to help solve the food crisis. The growth rate of food production would need to double to feed the increasing number of people by 2050. Soybean is one of the major grains in the world, and the US alone contributes around 35 percent of world soybean production. To increase soybean production, breeders still rely on a conventional breeding strategy, which is mainly a 'trial and error' process. These constraints limit the expected progress of crop breeding programs. The goal was to quantify the soybean phenotypes of plant lodging and pubescence color using UAV-based imagery and advanced machine learning. Plant lodging and pubescence color are two of the most important phenotypes for soybean breeding programs. Both are conventionally evaluated visually by breeders, which is time-consuming and subject to human error. This study investigated the potential of unmanned aerial vehicle (UAV)-based imagery combined with machine learning for assessing lodging conditions, and with deep learning for assessing the pubescence color, of soybean breeding lines. A UAV imaging system equipped with an RGB (red-green-blue) camera was used to collect imagery of 1,266 four-row plots in a soybean breeding field at the reproductive stage. Soybean lodging scores and pubescence scores were visually assessed by experienced breeders. Lodging scores were grouped into four classes, i.e., non-lodging, moderate lodging, high lodging, and severe lodging. Pubescence color scores were grouped into three classes, i.e., gray, tawny, and segregation. UAV images were stitched to build orthomosaics, and soybean plots were segmented using a grid method. Twelve image features were extracted from the collected images to assess the lodging scores of each breeding line.
Four models, i.e., extreme gradient boosting (XGBoost), random forest (RF), K-nearest neighbor (KNN), and artificial neural network (ANN), were evaluated to classify soybean lodging classes. Five data pre-processing methods were used to treat the imbalanced dataset and improve the classification accuracy. Results indicate that the pre-processing method SMOTE-ENN consistently performs well for all four classifiers (XGBoost, RF, KNN, and ANN), achieving the highest overall accuracy (OA), lowest misclassification, higher F1-score, and higher Kappa coefficient. This suggests that Synthetic Minority Over-sampling-Edited Nearest Neighbor (SMOTE-ENN) may be an excellent pre-processing method for classification tasks on unbalanced datasets. Furthermore, an overall accuracy of 96 percent was obtained using the SMOTE-ENN dataset and the ANN classifier. To classify the soybean pubescence color, seven pre-trained deep learning models, i.e., DenseNet121, DenseNet169, DenseNet201, ResNet50, InceptionResNet-V2, Inception-V3, and EfficientNet, were used, and images of each plot were fed into the models. The data were augmented using two rotation and two scaling factors to enlarge the dataset. Among the seven pre-trained deep learning models, the ResNet50 and DenseNet121 classifiers showed a higher overall accuracy of 88 percent, along with higher precision, recall, and F1-score for all three classes of pubescence color. In conclusion, the developed UAV-based high-throughput phenotyping system can gather image features to estimate and classify crucial soybean phenotypes, which will help breeders assess phenotypic variation in breeding trials. The RGB imagery-based classification could also be a cost-effective choice for breeders and associated researchers in plant breeding programs for identifying superior genotypes. Includes bibliographical references.
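
Two of the measures used above to compare classifiers, overall accuracy (OA) and the Kappa coefficient, can both be read off a confusion matrix. The following is a minimal sketch assuming Cohen's kappa is the coefficient meant. The function name `overall_accuracy_and_kappa` and the four-class matrix are hypothetical illustrations, not data from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    if any(len(row) != k for row in confusion):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal (this is OA).
    observed = sum(confusion[i][i] for i in range(k)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Hypothetical counts for four lodging classes (rows: true, columns: predicted).
    example = [
        [50, 3, 1, 0],
        [4, 30, 2, 1],
        [1, 2, 20, 2],
        [0, 1, 2, 10],
    ]
    oa, kappa = overall_accuracy_and_kappa(example)
    print(f"OA = {oa:.3f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement a classifier would reach by chance given the class frequencies, which is one reason it is commonly reported alongside OA for imbalanced datasets such as the lodging classes here.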