    Principal variable selection to explain grain yield variation in winter wheat from features extracted from UAV imagery

    Background: Automated phenotyping technologies are continually advancing the breeding process. However, collecting various secondary traits throughout the growing season and processing massive amounts of data still take great effort and time. Selecting a minimum number of secondary traits that have the maximum predictive power has the potential to reduce phenotyping efforts. The objective of this study was to select the principal features extracted from UAV imagery, and the critical growth stages, that contributed the most to explaining winter wheat grain yield. Five dates of multispectral images and seven dates of RGB images were collected by a UAV system during the spring growing season in 2018. Two classes of features (variables), totaling 172 variables, were extracted for each plot from the vegetation index and plant height maps, including pixel statistics and dynamic growth rates. A parametric algorithm, LASSO regression (the least absolute shrinkage and selection operator), and a non-parametric algorithm, random forest, were applied for variable selection. The regression coefficients estimated by LASSO and the permutation importance scores provided by random forest were used to determine the ten most important variables influencing grain yield from each algorithm. Results: Both selection algorithms assigned the highest importance scores to the variables related to plant height around the grain filling stage. Some variables related to vegetation indices were also selected by the algorithms, mainly at early to mid growth stages and during senescence. Compared with yield prediction using all 172 variables derived from measured phenotypes, prediction using the selected variables performed comparably or even better. We also noticed that the prediction accuracy on the adapted NE lines (r = 0.58–0.81) was higher than on the other lines with different genetic backgrounds (r = 0.21–0.59) included in this study.
Conclusions: With the ultra-high resolution plot imagery obtained by UAS-based phenotyping, we are now able to derive more features, such as the variation of plant height or vegetation indices within a plot rather than just an averaged number, that are potentially very useful for breeding purposes. However, too many features or variables can be derived in this way. The promising results from this study suggest that a selected subset of those variables can achieve grain yield prediction accuracies comparable to those of the full set, while possibly resulting in a better allocation of effort and resources for phenotypic data collection and processing.
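
The LASSO side of the variable selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the penalty value `lam`, the six hypothetical features, and the helper names (`standardize`, `soft_threshold`, `lasso_coordinate_descent`) are all assumptions made for illustration. It fits LASSO by coordinate descent on standardized features and ranks variables by the absolute size of their coefficients, which is the general idea behind picking the most important variables from LASSO coefficients.

```python
import random

def standardize(col):
    """Scale one feature column to zero mean and unit (population) variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(columns, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1.

    Assumes every column is standardized (so x_j . x_j / n == 1)
    and y is centered, which removes the need for an intercept.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)  # equals y - X b for the current b (b starts at 0)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Correlation of feature j with the partial residual.
            rho = sum(c * r for c, r in zip(col, residual)) / n + beta[j]
            new_beta = soft_threshold(rho, lam)
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                beta[j] = new_beta
    return beta

# Synthetic example (assumption): 60 plots, 6 features, yield driven by
# features 0 and 3 only; the remaining features are pure noise.
random.seed(0)
n_plots, n_features = 60, 6
raw = [[random.gauss(0, 1) for _ in range(n_plots)] for _ in range(n_features)]
grain_yield = [3.0 * raw[0][i] + 1.5 * raw[3][i] + random.gauss(0, 0.5)
               for i in range(n_plots)]

X = [standardize(col) for col in raw]
mean_yield = sum(grain_yield) / n_plots
y = [v - mean_yield for v in grain_yield]

beta = lasso_coordinate_descent(X, y, lam=0.3)

# Rank features by absolute coefficient and keep the top k (k = 2 here;
# the study above kept the ten most important variables).
ranked = sorted(range(n_features), key=lambda j: abs(beta[j]), reverse=True)
top_k = ranked[:2]
print("coefficients:", [round(b, 3) for b in beta])
print("selected features:", top_k)
```

Standardizing the features first matters because the L1 penalty acts on coefficient magnitudes, so features on different scales would otherwise be penalized unevenly. The random forest permutation importance mentioned in the abstract is a separate, model-agnostic ranking and is not shown here.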

    Unmanned Aerial Vehicle Data Analysis For High-throughput Plant Phenotyping

    Continuing population growth is placing unprecedented demands on worldwide crop production and quality. Improving genomic selection in the breeding process is one essential aspect of solving this dilemma. Benefiting from advances in high-throughput genotyping, researchers have already gained a better understanding of genetic traits. However, given the comparatively lower efficiency of current phenotyping techniques, the significance of phenotypic traits has still not been fully exploited in genomic selection. Therefore, improving the efficiency of high-throughput plant phenotyping (HTPP) has become an urgent task for researchers. As one of the platforms used for collecting HTPP data, the unmanned aerial vehicle (UAV) allows high-quality data to be collected in a short time and with less labor. There are currently many options for customized UAV systems on the market; however, data analysis efficiency is still a limitation for the full implementation of HTPP. To this end, the focus of this program was the analysis of UAV-acquired data. The specific objectives were twofold. The first was to investigate statistical correlations between UAV-derived phenotypic traits and manually measured sorghum biomass, nitrogen, and chlorophyll content. The second was to conduct variable selection on the phenotypic parameters calculated from UAV-derived vegetation index (VI) and plant height maps, aiming to identify the principal parameters that contribute most to explaining winter wheat grain yield. Correspondingly, two studies were carried out. Good correlations between UAV-derived VI/plant height and sorghum biomass/nitrogen/chlorophyll in the first study suggested that UAV-based HTPP has great potential for facilitating genetic improvement. For the second study, variable selection results from the single-year data showed that plant height related parameters, especially from later in the season, contributed more to explaining grain yield. Advisor: Yeyin Sh

    Digital phenotyping and genotype-to-phenotype (G2P) models to predict complex traits in cereal crops

    The revolution in digital phenotyping, combined with the new layers of omics and envirotyping tools, offers great promise to improve selection and accelerate genetic gains for crop improvement. This chapter examines the latest methods involving digital phenotyping tools to predict complex traits in cereal crops. The chapter has two parts. In the first part, entitled “Digital phenotyping as a tool to support breeding programs”, the secondary phenotypes measured by high-throughput plant phenotyping that are potentially useful for breeding are reviewed. In the second part, “Implementing complex G2P models in breeding programs”, the integration of data from digital phenotyping into genotype-to-phenotype (G2P) models to improve the prediction of complex traits using genomic information is discussed. The current status of statistical models to incorporate secondary traits in univariate and multivariate models, as well as how to better handle longitudinal traits (for example light interception, biomass accumulation, canopy height), is reviewed.

    Mehitamata õhusõiduki rakendamine põllukultuuride saagikuse ja maa harimisviiside tuvastamisel (Application of unmanned aerial vehicles in identifying crop yield and land cultivation practices)

    A thesis for applying for the degree of Doctor of Philosophy in Environmental Protection. This thesis aims to examine how machine learning (ML) technologies have aided significant advancements in image analysis in the area of precision agriculture (PA). These multimodal computing technologies extend the use of machine learning to a broader spectrum of data collection and selection for the advancement of agricultural practices (Nawar et al., 2017). These techniques will assist complicated cropping systems in making more informed decisions with less human intervention, and provide a scalable framework for incorporating expert knowledge of the PA system (Chlingaryan et al., 2018). Complexity, on the other hand, can be a disadvantage in crop trials: machine learning models require training/testing databases, and trials involve limited areas with insufficient sample sizes, time and space specificity, and interference from environmental factors, all of which complicate parameter selection and make using a single empirical model for an entire region impractical. During the early stages of writing this thesis, we used relatively traditional machine learning methods, i.e., random forest regression (RFR), support vector regression (SVR), and artificial neural network (ANN), to address the regression problem of crop yield and biomass prediction, predicting dry matter (DM) yields of red clover. This approach obtained favourable results; however, hyperparameter selection, the lengthy algorithm selection process, data cleaning, and redundant collinearity issues significantly limited the application of machine learning.
We further discuss the recent trend of automated machine learning (AutoML), which has been driving significant technological innovation in the application of artificial intelligence through automated algorithm selection and hyperparameter optimization of a deployable pipeline model for solving substantive problems. However, a knowledge gap exists in the integration of machine learning technology with unmanned aerial systems (UAS) and hyperspectral-based imaging data in classification and regression applications. In this thesis, we explored a state-of-the-art (SOTA) and entirely open-source AutoML framework, Auto-sklearn, which is built on one of the most frequently used machine learning systems, Scikit-learn. It was integrated with two unique AutoML visualization tools to examine the identification and application of multispectral vegetation index (VI) data collected from UAS and hyperspectral narrow-band VIs across a varied spectrum of agricultural management practices (AMP). These practices comprise soil tillage method (STM), cultivation method (CM), and manure application (MA), applied across four crop combination fields (i.e., red clover-grass mixture, spring wheat, pea-oat mixture, and spring barley). Additionally, they have not been thoroughly evaluated and lack characteristics that are accessible in agricultural remote sensing applications. This thesis further explores the existing gaps in the knowledge base for several critical crop categories and cultivation management methods with respect to biomass and yield analysis, and seeks a better understanding of the potential for remotely sensed solutions on field-based and multifunctional platforms to meet precision agriculture demands. To overcome these knowledge gaps, this research introduces a rapid, non-destructive, and low-cost framework for field-based biomass and grain yield modelling, as well as the identification of agricultural management practices.
The results may aid agronomists and farmers in establishing more accurate agricultural methods and in monitoring environmental conditions more effectively. Publication of this thesis is supported by the Estonian University of Life Sciences and by the Doctoral School of Earth Sciences and Ecology created under the auspices of the European Social Fund.

    Toward Automated Machine Learning-Based Hyperspectral Image Analysis in Crop Yield and Biomass Estimation

    The incorporation of autonomous computation and artificial intelligence (AI) technologies into smart agriculture concepts is becoming an expected scientific procedure. The airborne hyperspectral system, with its vast area coverage, high spectral resolution, and varied narrow-band selection, is an excellent tool for predicting crop physiological characteristics and yield. However, the extensive and redundant processing and computation of three-dimensional (3D) cube data have made the popularization of this tool a challenging task. This research integrated two important open-source systems (R and Python), combined with automated hyperspectral narrowband vegetation index calculation and state-of-the-art AI-based automated machine learning (AutoML) technology, to estimate yield and biomass for three crop categories (spring wheat, pea and oat mixture, and spring barley with red clover) under multifunctional cultivation practices in northern Europe and Estonia. Our study showed that the estimation capacity of the empirical AutoML regression model was significant. The best coefficient of determination (R²) and normalized root mean square error (NRMSE) for single-variety wheat plantings were 0.96 and 0.12, respectively; for mixed peas and oats, they were 0.76 and 0.18 in the booting to heading stage, while for mixed legumes and spring barley, they were 0.88 and 0.16 in the reproductive growth stages. In terms of straw mass estimation, R² was 0.96, 0.83, and 0.86, and NRMSE was 0.12, 0.24, and 0.33, respectively. This research contributes to, and confirms, the use of the AutoML framework in hyperspectral image analysis to increase implementation flexibility and reduce learning costs under a variety of agricultural resource conditions. It delivers expert yield and straw mass estimates to decision-makers two months before harvest.
This study also highlights that the hyperspectral system provides economic and environmental benefits and will play a critical role in the construction of sustainable and intelligent agriculture techniques in the upcoming years.
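
For readers unfamiliar with the two accuracy metrics reported above, the following minimal sketch shows how R² and NRMSE can be computed from observed and predicted values. The abstract does not state how NRMSE was normalized; dividing the RMSE by the mean of the observations is an assumption made here (normalizing by the observed range is another common convention). The yield values and the function names `r_squared` and `nrmse` are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def nrmse(observed, predicted):
    """RMSE divided by the mean of the observations (assumed convention)."""
    n = len(observed)
    rmse = (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5
    return rmse / (sum(observed) / n)

# Hypothetical plot-level yields in t/ha, not data from the study.
observed_yield = [4.0, 5.0, 6.0, 7.0]
predicted_yield = [4.5, 4.5, 6.5, 6.5]

r2 = r_squared(observed_yield, predicted_yield)
err = nrmse(observed_yield, predicted_yield)
print(f"R2 = {r2:.2f}, NRMSE = {err:.3f}")
```

Because NRMSE is unitless, it allows errors for grain yield and straw mass to be compared even though the two quantities differ in magnitude.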

    High-throughput estimation of crop traits: A review of ground and aerial phenotyping platforms

    Crop yields need to be improved in a sustainable manner to meet the expected worldwide increase in population over the coming decades, as well as the effects of anticipated climate change. Recently, genomics-assisted breeding has become a popular approach to food security; in this regard, the crop breeding community must better link the relationships between the phenotype and the genotype. While high-throughput genotyping is feasible at a low cost, high-throughput crop phenotyping methods and data analytical capacities need to be improved. High-throughput phenotyping offers a powerful way to assess particular phenotypes in large-scale experiments, using high-tech sensors, advanced robotics, and image-processing systems to monitor and quantify plants in breeding nurseries and field experiments at multiple scales. In addition, new bioinformatics platforms are able to embrace large-scale, multidimensional phenotypic datasets. Through the combined analysis of phenotyping and genotyping data, environmental responses and gene functions can now be dissected at unprecedented resolution. This will aid in finding solutions to the currently limited and incremental improvements in crop yields.

    Towards synthesis for nitrogen fertilisation using a decision support system

    Nitrogen (N) fertilisation in crops can be made more efficient by moving from uniform application to meeting variable crop requirements within fields. Within-field variable rate N fertilisation of winter wheat (Triticum aestivum L.) is practically feasible using information from web-based decision support systems (DSS). Data from different source platforms, such as satellites, unmanned aerial vehicles (UAV), or weather stations, can be used for fertilisation planning. System output offers information that can be used to instruct variable rate fertilizer spreaders to increase or decrease the fertilizer application rate on the go. In Sweden, satellite-based variable rate N fertilisation was available for winter wheat via a DSS; however, the existing module could be improved in different ways. In this thesis work, a new N-uptake model was estimated, and opportunities for UAV-based modelling of grain quality were tested. Testing the transferability of UAV-based models to the satellite data scale improved understanding of the complexity of transferring data from the UAV scale to the satellite scale for use in a DSS. Furthermore, it was possible to model crop phenology from historical data, which can improve the accuracy of currently implemented models by taking the timing of field operations into account.

    Remote Sensing in Agriculture: State-of-the-Art

    The Special Issue on “Remote Sensing in Agriculture: State-of-the-Art” gives an exhaustive overview of the ongoing remote sensing technology transfer into the agricultural sector. It consists of 10 high-quality papers focusing on a wide range of remote sensing models and techniques to forecast crop production and yield, to map agricultural landscapes, and to evaluate plant and soil biophysical features. Satellite, RPAS, and SAR data were involved. This preface briefly describes each contribution published in this Special Issue.

    Analyses of the Impact of Soil Conditions and Soil Degradation on Vegetation Vitality and Crop Productivity Based on Airborne Hyperspectral VNIR–SWIR–TIR Data in a Semi-Arid Rainfed Agricultural Area (Camarena, Central Spain)

    Soils are an essential factor contributing to the agricultural production of rainfed crops such as barley and triticale cereals. Changing environmental conditions and inadequate land management are endangering soil quality and productivity and, in turn, crop quality and productivity are affected. Advances in hyperspectral remote sensing are of great use for the spatial characterization and monitoring of the soil degradation status, as well as its impact on crop growth and agricultural productivity. In this study, hyperspectral airborne data covering the visible, near-infrared, short-wave infrared, and thermal infrared (VNIR–SWIR–TIR, 0.4–12 µm) were acquired in a Mediterranean agricultural area of central Spain and used to analyze the spatial differences in vegetation vitality and grain yield in relation to the soil degradation status. Specifically, leaf area index (LAI), crop water stress index (CWSI), and the biomass of the crop yield are derived from the remote sensing data and discussed regarding their spatial differences and relationship to a classification of soil erosion and accumulation stages (SEAS) based on previous remote sensing analyses during bare soil conditions. LAI and harvested crop biomass yield could be well estimated by PLS regression based on the hyperspectral and in situ reference data (R² of 0.83, r of 0.91, and an RMSE of 0.2 m² m⁻² for LAI, and an R² of 0.85, r of 0.92, and an RMSE of 0.48 t ha⁻¹ for grain yield). In addition, the SEAS were successfully predicted based on the canopy spectral signal of vegetated crop fields using a random forest machine learning approach. An overall accuracy above 71% was achieved by combining the VNIR–SWIR–TIR canopy reflectance and emissivity of the growing season with topographic information after reducing the redundancy in the spectral dataset.
The results show that the estimated crop traits are spatially related to the soil’s degradation status, with shallow and highly eroded soils, as well as sandy accumulation zones, being associated with areas of low LAI, low crop yield, and high crop water stress. Overall, the results of this study illustrate the enormous potential of imaging spectroscopy for a combined analysis of the plant-soil system in the frame of land and soil degradation monitoring.
