    Reconstruction of Networks with Direct and Indirect Genetic Effects

    Genetic variance of a phenotypic trait can originate from direct genetic effects, or from indirect effects, i.e., through genetic effects on other traits, affecting the trait of interest. This distinction is often of great importance, for example, when trying to improve crop yield and simultaneously control plant height. As suggested by Sewall Wright, assessing contributions of direct and indirect effects requires knowledge of (1) the presence or absence of direct genetic effects on each trait, and (2) the functional relationships between the traits. Because experimental validation of such relationships is often unfeasible, it is increasingly common to reconstruct them using causal inference methods. However, most current methods require all genetic variance to be explained by a small number of quantitative trait loci (QTL) with fixed effects. Only a few authors have considered the “missing heritability” case, where contributions of many undetectable QTL are modeled with random effects. Usually, these are treated as nuisance terms that need to be eliminated by taking residuals from a multi-trait mixed model (MTM). But fitting such an MTM is challenging, and it is impossible to infer the presence of direct genetic effects. Here, we propose an alternative strategy, where genetic effects are formally included in the graph. This has important advantages: (1) genetic effects can be directly incorporated in causal inference, implemented via our PCgen algorithm, which can analyze many more traits; and (2) we can test the existence of direct genetic effects, and improve the orientation of edges between traits. Finally, we show that reconstruction is much more accurate if individual plant or plot data are used, instead of genotypic means. We have implemented the PCgen algorithm in the R package pcgen.

    Acción : diario de Teruel y su provincia: Año III Número 633 - (11/12/34)

    New types of phenotyping tools generate large amounts of data on many aspects of plant physiology and morphology with high spatial and temporal resolution. These new phenotyping data are potentially useful to improve understanding and prediction of complex traits, like yield, that are characterized by strong environmental context dependencies, i.e., genotype by environment interactions. For an evaluation of the utility of new phenotyping information, we will look at how this information can be incorporated in different classes of genotype-to-phenotype (G2P) models. G2P models predict phenotypic traits as functions of genotypic and environmental inputs. In the last decade, access to high-density single nucleotide polymorphism markers (SNPs) and sequence information has boosted the development of a class of G2P models called genomic prediction models that predict phenotypes from genome-wide marker profiles. The challenge now is to build G2P models that simultaneously incorporate extensive genomic information alongside new phenotypic information. Beyond the modification of existing G2P models, new G2P paradigms are required. We present candidate G2P models for the integration of genomic and new phenotyping information and illustrate their use in examples. Special attention will be given to the modelling of genotype by environment interactions. The G2P models provide a framework for model-based phenotyping and the evaluation of the utility of phenotyping information in the context of breeding programs.

    Exome sequences and multi-environment field trials elucidate the genetic basis of adaptation in barley

    Broadening the genetic base of crops is crucial for developing varieties to respond to global agricultural challenges such as climate change. Here, we analysed a diverse panel of 371 domesticated lines of the model crop barley to explore the genetics of crop adaptation. We first collected exome sequence data and phenotypes of key life history traits from contrasting multi-environment common garden trials. Then we applied refined statistical methods, including some based on exomic haplotype states, for genotype-by-environment (G×E) modelling. Sub-populations defined from exomic profiles were coincident with barley's biology, geography and history, and explained a high proportion of trial phenotypic variance. Clear G×E interactions indicated adaptation profiles that varied for landraces and cultivars. Exploration of circadian clock-related genes, associated with the environmentally-adaptive days to heading trait (crucial for the crop's spread from the Fertile Crescent), illustrated complexities in G×E effect directions, and the importance of latitudinally-based genic context in the expression of large effect alleles. Our analysis supports a gene-level scientific understanding of crop adaptation and leads to practical opportunities for crop improvement, allowing the prioritisation of genomic regions and particular sets of lines for breeding efforts seeking to cope with climate change and other stresses.

    Modelling of genotype by environment interaction and prediction of complex traits across multiple environments as a synthesis of crop growth modelling, genetics and statistics

    The main objective of plant breeders is to create and identify genotypes that are well-adapted to the target population of environments (TPE). The TPE corresponds to the future growing conditions in which the varieties produced by a breeding program will be grown. All possible genotypes that could be considered as selection candidates for a specific TPE are said to belong to the target population of genotypes (TPG). Genotypes commonly show different sensitivities to environmental gradients, in which case genotype by environment interaction (GxE) is observed. GxE can lead to changes in genotypic ranking, complicating the breeding process. The main aim of this thesis was to investigate statistical models and the combination of statistical and crop growth models to improve phenotype prediction across multiple environments. One aspect that determines the quality of phenotype prediction is the set of genotypes used to train the prediction model, especially when the TPG is structured. We proposed a method that uniformly covers the genetic space of the TPG, leading to a larger prediction accuracy than random sampling. We produced positive results for wheat, maize and rice. A second aspect that influences the accuracy of phenotype predictions is the choice of environments used to train the prediction model, which should capture the heterogeneity in the TPE. When accounting for heterogeneity in environmental quality, it is important to distinguish repeatable, well-predictable elements in the environmental conditions from those that are poorly predictable. We proposed statistical methods based on the AMMI model and on mixed models to identify groups of environments that show repeatable GxE, illustrating our ideas with multi-environment wheat data in North-Western Europe. The importance of training set construction strategies and multi-environment genomic prediction models was also demonstrated for barley data.
If breeders are interested in identifying the genetic basis of the target traits, it is advantageous to have a higher SNP density. In this thesis, we used exome sequence data of the EU-Whealbi-barley germplasm, which corresponds to a unique set of genotypes with a diverse origin, growth habit and breeding history. For this diverse data, we assessed the effects of QTLs and haplotypes across multiple environments for awn length, grain weight, heading date and plant height. Our results show that the EU-Whealbi-barley collection possesses a large diversity of promising alleles regulating the four traits we analysed. The last major topic addressed in this thesis is the use of a combination of statistical-genetic models and crop growth models (APSIM) as a strategy to assess the traits and phenotyping schemes to improve the prediction accuracy of a target trait like yield. We assess the potential of the combined modelling approach to characterize a sample of the TPG and TPE, and illustrate how trait correlations are modified by environmental conditions and by the genetic architecture of the sample of the TPE. We discuss the topics mentioned above from a didactical perspective, proposing a list of subjects that should be covered in a GxE course for plant breeders. Finally, we discuss challenges and opportunities presented by the characterization of the TPE and TPG when using simulations based on statistical and crop growth models.

    Modelling of Genotype by Environment Interaction and Prediction of Complex Traits across Multiple Environments as a Synthesis of Crop Growth Modelling, Genetics and Statistics

    Selection processes in plant breeding depend critically on the quality of phenotype predictions. The phenotype is classically predicted as a function of genotypic and environmental information. Models for phenotype prediction contain a mixture of statistical, genetic and physiological elements. In this chapter, we discuss prediction from linear mixed models (LMMs), with an emphasis on statistics, and prediction from crop growth models (CGMs), with an emphasis on physiology. Three modalities of prediction are distinguished: predictions for new genotypes under known environmental conditions, predictions for known genotypes under new environmental conditions, and predictions for new genotypes under new environmental conditions. For LMMs, the genotypic input information includes molecular marker variation, while the environmental input can consist of meteorological, soil and management variables. However, integrated types of environmental characterizations obtained from CGMs can also serve as environmental covariables in LMMs. LMMs consist of a fixed part, corresponding to the mean for a particular genotype in a particular environment, and a random part defined by genotypic and environmental variances and correlations. For prediction via the fixed part, genotypic and/or environmental covariables are required as in classical regression. For predictions via the random part, correlations need to be estimated between observed and new genotypes, between observed and new environments, or both. These correlations can be based on similarities calculated from genotypic and environmental covariables. A simple type of covariable assigns genotypes to sub-populations and environments to regions. Such groupings can improve phenotype prediction. For a second type of phenotype prediction, we consider CGMs. CGMs predict a target phenotype as a non-linear function of underlying intermediate phenotypes.
The intermediate phenotypes are outcomes of functions defined on genotype-dependent CGM parameters and classical environmental descriptors. While the intermediate phenotypes may still show some genotype by environment interaction, the genotype-dependent CGM parameters should be consistent across environmental conditions. The CGM parameters are regressed on molecular marker information to allow phenotype prediction from molecular marker information and standard physiologically relevant environmental information. Both LMMs and CGMs require extensive characterization of genotypes and environments. High-throughput technologies for genotyping and phenotyping provide new opportunities for upscaling phenotype prediction and increasing the response to selection in the breeding process.

    Nonlinear Observability Analysis and Joint State and Parameter Estimation in a Lettuce Greenhouse using Ensemble Kalman Filtering

    Estimating crop states accurately and reliably through climate sensing is a promising alternative to investments in high-tech crop sensing. This paper explores and demonstrates the applicability of joint crop parameter and crop state estimation through indoor climate monitoring in a lettuce greenhouse system via Ensemble Kalman filtering combined with a nonlinear observability analysis via the empirical observability Gramian. The observability analysis indicated that crop dry-weight can be estimated from the indoor CO2 concentration, temperature and humidity, while simultaneously the parameter that represents the light use efficiency can be estimated and even corrected for. These outcomes were confirmed by a simulation study. This showed that the method is robust against one level of process and measurement noise, and a 50% error in the model parameter that represents the light use efficiency. More precisely, it has been shown that improvements of 50% of the dry-weight estimation in terms of average root mean squared error can be achieved with respect to the case where no Ensemble Kalman filtering and parameter update is used.
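
As an illustration of joint state and parameter estimation with an ensemble Kalman filter, the following minimal sketch augments each ensemble member with its own copy of an uncertain parameter. It is not the greenhouse model from the paper: the linear growth model, the direct observation of dry weight, the noise levels, and all function and argument names are assumptions chosen to keep the example short.

```python
import random
import statistics


def simulate_observations(theta_true, light, obs_sd, seed=1):
    """Generate noisy readings from the toy model x[k+1] = x[k] + theta * light[k]."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for u in light:
        x += theta_true * u
        ys.append(x + rng.gauss(0.0, obs_sd))
    return ys, x


def enkf_joint_estimate(observations, light, theta_guess, n_members=200,
                        obs_sd=0.05, proc_sd=0.01, theta_sd=0.3, seed=0):
    """Jointly estimate the state x and the parameter theta (toy model, assumed)."""
    rng = random.Random(seed)
    # Augmented ensemble: every member carries a state and its own parameter draw.
    xs = [0.0] * n_members
    thetas = [rng.gauss(theta_guess, theta_sd) for _ in range(n_members)]

    for y, u in zip(observations, light):
        # Forecast: propagate each member with its own parameter plus process noise.
        xs = [x + th * u + rng.gauss(0.0, proc_sd) for x, th in zip(xs, thetas)]

        # Ensemble variance of the predicted observation and its covariance with theta.
        x_mean = statistics.fmean(xs)
        th_mean = statistics.fmean(thetas)
        var_x = sum((x - x_mean) ** 2 for x in xs) / (n_members - 1)
        cov_th_x = sum((th - th_mean) * (x - x_mean)
                       for th, x in zip(thetas, xs)) / (n_members - 1)
        denom = var_x + obs_sd ** 2
        gain_x, gain_th = var_x / denom, cov_th_x / denom

        # Analysis: update each member against a perturbed copy of the observation.
        new_xs, new_thetas = [], []
        for x, th in zip(xs, thetas):
            innovation = y + rng.gauss(0.0, obs_sd) - x
            new_xs.append(x + gain_x * innovation)
            new_thetas.append(th + gain_th * innovation)
        xs, thetas = new_xs, new_thetas

    return statistics.fmean(xs), statistics.fmean(thetas)
```

Calling `enkf_joint_estimate(ys, light, theta_guess=1.5)` on data simulated with a true parameter of 1.0 mimics the 50% parameter error scenario in spirit only. The parameter is corrected solely through its ensemble covariance with the observed state.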

    Improvement of predictive ability by uniform coverage of the target genetic space

    Genome-enabled prediction provides breeders with the means to increase the number of genotypes that can be evaluated for selection. One of the major challenges in genome-enabled prediction is how to construct a training set of genotypes from a calibration set that represents the target population of genotypes, where the calibration set is composed of a training and validation set. A random sampling protocol of genotypes from the calibration set will lead to low quality coverage of the total genetic space by the training set when the calibration set contains population structure. As a consequence, predictive ability will be affected negatively, because some parts of the genotypic diversity in the target population will be under-represented in the training set, whereas other parts will be over-represented. Therefore, we propose a training set construction method that uniformly samples the genetic space spanned by the target population of genotypes, thereby increasing predictive ability. To evaluate our method, we constructed training sets alongside the identification of corresponding genomic prediction models for four genotype panels that differed in the amount of population structure they contained (maize Flint, maize Dent, wheat, and rice). Training sets were constructed using uniform sampling, stratified-uniform sampling, stratified sampling and random sampling. We compared these methods with a method that maximizes the generalized coefficient of determination (CD). Several training set sizes were considered. We investigated four genomic prediction models: multi-locus QTL models, GBLUP models, combinations of QTL and GBLUPs, and Reproducing Kernel Hilbert Space (RKHS) models. For the maize and wheat panels, construction of the training set under uniform sampling led to a larger predictive ability than under stratified and random sampling. The results of our methods were similar to those of the CD method.
For the rice panel, all training set construction methods led to similar predictive ability, a reflection of the very strong population structure in this panel.
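
The idea of covering the genetic space evenly, rather than sampling at random, can be sketched with a greedy farthest-point (maximin) heuristic. This is a swapped-in stand-in for illustration, not the authors' uniform sampling procedure. The coordinates are assumed to be low-dimensional scores derived from marker data (for example principal components), and the function name is hypothetical.

```python
import math


def maximin_training_set(coords, size, start=0):
    """Greedily pick genotypes so that each new pick is as far as possible
    from the genotypes already chosen (farthest-point heuristic)."""
    chosen = [start]
    # Distance from every genotype to its nearest already-chosen genotype.
    nearest = [math.dist(c, coords[start]) for c in coords]
    while len(chosen) < size:
        nxt = max(range(len(coords)), key=lambda i: nearest[i])
        chosen.append(nxt)
        nearest = [min(d, math.dist(c, coords[nxt]))
                   for d, c in zip(nearest, coords)]
    return chosen


# Toy panel: a dense group of eight genotypes and a small, distant group of two.
coords = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (0.2, 0.0), (0.0, 0.2), (0.2, 0.2), (0.1, 0.2),
          (10.0, 10.0), (10.1, 10.0)]
training = maximin_training_set(coords, size=3)
```

In this toy panel a random draw of three genotypes would often miss the small group entirely, whereas the greedy rule reaches it on its second pick. That is the under-representation problem the abstract describes for structured calibration sets.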

    Predicting responses in multiple environments : Issues in relation to genotype × Environment interactions

    Prediction of the phenotypes for a set of genotypes across multiple environments is a fundamental task in any plant breeding program. Genomic prediction (GP) can assist selection decisions by combining incomplete phenotypic information over multiple environments (MEs) with dense sets of markers. We compared a range of ME-GP models differing in the way environment-specific genetic effects were modeled. Information among environments was shared either implicitly via the response variable, or by the introduction of explicit environmental covariables. We discuss the models not only in the light of their accuracy, but also in their ability to predict the different parts of the incomplete genotype × environment interaction (G × E) table: (Gt; Et), (Gu; Et), (Gt; Eu), and (Gu; Eu), where G is genotype, E is environment, both tested (t; in one or more instances) and untested (u). Using the ‘Steptoe’ × ‘Morex’ barley (Hordeum vulgare L.) population as an example, we show the advantage of ME-GP models that account for G × E. In addition, for our example data set, we show that for prediction in the most challenging scenario of untested environments (Eu), the use of explicit environmental information is preferable over the simpler approach of predicting from a main effects model. Besides producing the most general ME-GP model, the use of environmental covariables naturally links with ecophysiological and crop-growth models (CGMs) for G × E. We conclude with a list of future research topics in ME-GP, where we see CGMs playing a central role.
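
The four parts of the incomplete G × E table can be made concrete with a small sketch that labels every genotype-environment cell by whether its genotype and its environment were tested. The function name, genotype names and environment names are hypothetical and serve only to illustrate the partition.

```python
def classify_cells(genotypes, environments, tested_g, tested_e):
    """Label each (genotype, environment) cell as (Gt; Et), (Gu; Et),
    (Gt; Eu) or (Gu; Eu), depending on which of the two were tested."""
    labels = {}
    for g in genotypes:
        for e in environments:
            g_tag = "Gt" if g in tested_g else "Gu"
            e_tag = "Et" if e in tested_e else "Eu"
            labels[(g, e)] = f"({g_tag}; {e_tag})"
    return labels


# Hypothetical example: line_C was never phenotyped and env_3 was never used.
labels = classify_cells(
    genotypes=["line_A", "line_B", "line_C"],
    environments=["env_1", "env_2", "env_3"],
    tested_g={"line_A", "line_B"},
    tested_e={"env_1", "env_2"},
)
```

Cells labelled (Gu; Eu), such as `("line_C", "env_3")` here, correspond to the hardest prediction scenario discussed in the abstract, because neither the genotype nor the environment contributes phenotypic records.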

    Combining crop growth modeling and statistical genetic modeling to evaluate phenotyping strategies

    Genomic prediction of complex traits, say yield, benefits from including information on correlated component traits. Statistical criteria to decide which yield components to consider in the prediction model include the heritability of the component traits and their genetic correlation with yield. Not all component traits are easy to measure. Therefore, it may be attractive to include proxies to yield components, where these proxies are measured in (high-throughput) phenotyping platforms during the growing season. Using the Agricultural Production Systems Simulator (APSIM)-wheat cropping systems model, we simulated phenotypes for a wheat diversity panel segregating for a set of physiological parameters regulating phenology, biomass partitioning, and the ability to capture environmental resources. The distribution of the additive quantitative trait locus effects regulating the APSIM physiological parameters approximated the same distribution of quantitative trait locus effects on real phenotypic data for yield and heading date. We use the crop growth model APSIM-wheat to simulate phenotypes in three Australian environments with contrasting water deficit patterns. The APSIM output contained the dynamics of biomass and canopy cover, plus yield at the end of the growing season. Each water deficit pattern triggered different adaptive mechanisms and the impact of component traits differed between drought scenarios. We evaluated multiple phenotyping schedules by adding plot and measurement error to the dynamics of biomass and canopy cover. We used these trait dynamics to fit parametric models and P-splines to extract parameters with a larger heritability than the phenotypes at individual time points. We used those parameters in multi-trait prediction models for final yield. The combined use of crop growth models and multi-trait genomic prediction models provides a procedure to assess the efficiency of phenotyping strategies and compare methods to model trait dynamics. 
It also allows us to quantify the impact of yield components on yield prediction accuracy in different environment types. In scenarios with mild or no water stress, yield prediction accuracy benefitted from including biomass and green canopy cover parameters. The advantage of the multi-trait model was smaller for the early-drought scenario, due to the reduced correlation between the secondary and the target trait. Therefore, multi-trait genomic prediction models for yield require scenario-specific correlated traits.

    Genotype-specific P-spline response surfaces assist interpretation of regional wheat adaptation to climate change

    Yield is a function of environmental quality and the sensitivity with which genotypes react to it. Environmental quality is characterized by meteorological data, soil and agronomic management, whereas genotypic sensitivity is embodied by combinations of physiological traits that determine the crop capture and partitioning of environmental resources over time. This paper illustrates how environmental quality and genotype responses can be studied by a combination of crop simulation and statistical modelling. We characterized the genotype by environment interaction for grain yield of a wheat population segregating for flowering time by simulating it using the Agricultural Production Systems sIMulator (APSIM) cropping systems model. For sites in the NE Australian wheat-belt, we used meteorological information as integrated by APSIM to classify years according to water, heat and frost stress. Results highlight that the frequency of years with more severe water and temperature stress has largely increased in recent years. Consequently, it is likely that future varieties will need to cope with more stressful conditions than in the past, making it important to select for flowering habits contributing to temperature and water-stress adaptation. Conditional on year types, we fitted yield response surfaces as functions of genotype, latitude and longitude to virtual multi-environment trials. Response surfaces were fitted by two-dimensional P-splines in a mixed-model framework to predict yield at high spatial resolution. Predicted yields demonstrated how relative genotype performance changed with location and year type and how genotype by environment interactions can be dissected. Predicted response surfaces for yield can be used for performance recommendations, quantification of yield stability and environmental characterization.