13 research outputs found

    QTL linkage analysis of connected populations using ancestral marker and pedigree information

    The common assumption in quantitative trait locus (QTL) linkage mapping studies that parents of multiple connected populations are unrelated is unrealistic for many plant breeding programs. We remove this assumption and propose a Bayesian approach that clusters the alleles of the parents of the current mapping populations from locus-specific identity by descent (IBD) matrices that capture ancestral marker and pedigree information. Moreover, we demonstrate how the parental IBD data can be incorporated into a QTL linkage analysis framework by using two approaches: a Threshold IBD model (TIBD) and a Latent Ancestral Allele Model (LAAM). The TIBD and LAAM models are empirically tested via numerical simulation based on the structure of a commercial maize breeding program. The simulations included a pilot dataset with closely linked QTL on a single linkage group and 100 replicated datasets with five linkage groups harboring four unlinked QTL. The simulation results show that including parental IBD data (similarly for TIBD and LAAM) significantly improves the power and, in particular, the accuracy of QTL mapping (e.g., position, effect size and individuals’ genotype probability) without significantly increasing computational demand.

    Mixed model approaches for the identification of QTLs within a maize hybrid breeding program

    Two outlines for mixed model based approaches to quantitative trait locus (QTL) mapping in existing maize hybrid selection programs are presented: a restricted maximum likelihood (REML) and a Bayesian Markov Chain Monte Carlo (MCMC) approach. The methods use the in silico mapping procedure developed by Parisseaux and Bernardo (2004) as a starting point. The original single-point approach is extended to a multi-point approach that facilitates interval mapping procedures. For computational and conceptual reasons, we partition the full set of relationships from founders to parents of hybrids into two types of relations by defining so-called intermediate founders. QTL effects are defined in terms of those intermediate founders. Marker based identity by descent relationships between intermediate founders define structuring matrices for the QTL effects that change along the genome. The dimension of the vector of QTL effects is reduced by the fact that there are fewer intermediate founders than parents. Furthermore, additional reduction in the number of QTL effects follows from the identification of founder groups by various algorithms. As a result, we obtain a powerful mixed model based statistical framework to identify QTLs in genetic backgrounds relevant to the elite germplasm of a commercial breeding program. The identification of such QTLs will provide the foundation for effective marker assisted and genome wide selection strategies. Analyses of an example data set show that QTLs are primarily identified in different heterotic groups and point to complementation of additive QTL effects as an important factor in hybrid performance.

    Use of crop growth models with whole-genome prediction: application to a maize multienvironment trial

    High throughput genotyping, phenotyping, and envirotyping applied within plant breeding multienvironment trials (METs) provide the data foundations for selection and tackling genotype × environment interactions (GEIs) through whole-genome prediction (WGP). Crop growth models (CGM) can be used to enable predictions for yield and other traits for different genotypes and environments within a MET if genetic variation for the influential traits and their responses to environmental variation can be incorporated into the CGM framework. Furthermore, such CGMs can be integrated with WGP to enable whole-genome prediction with crop growth models (CGM-WGP) through use of computational methods such as approximate Bayesian computation. We previously used simulated data sets to demonstrate proof of concept for application of the CGM-WGP methodology to plant breeding METs. Here the CGM-WGP methodology is applied to an empirical maize (Zea mays L.) drought MET data set to evaluate the steps involved in reduction to practice. Positive prediction accuracy was achieved for hybrid grain yield in two drought environments for a sample of doubled haploids (DHs) from a cross. This was achieved by including genetic variation for five component traits into the CGM to enable the CGM-WGP methodology. The five component traits were a priori considered to be important for yield variation among the maize hybrids in the two target drought environments included in the MET. Here, we discuss lessons learned while applying the CGM-WGP methodology to the empirical data set. We also identify areas for further research to improve prediction accuracy and to advance the CGM-WGP for a broader range of situations relevant to plant breeding.

    Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation

    Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E) continue to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics.

    Predicted vs. observed grain yield of 1500 DH lines in testing set for prediction methods CGM-WGP (top row) and GBLUP (bottom row).

    The estimation environment was 2012. Results shown are from a representative example data set. In this example, the accuracy for observed environment predictions was 0.83 (CGM-WGP) and 0.69 (GBLUP). For new environment predictions it was 0.39 (CGM-WGP) and 0.11 (GBLUP).

    Pseudocode of ABC rejection sampling algorithm.

    Basic ABC rejection sampling algorithm to sample from the approximate posterior distribution of θ.
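The pseudocode itself is not reproduced in this listing. As a rough illustration of what a basic ABC rejection sampler typically looks like, here is a minimal Python sketch. It is a generic textbook form, not the authors' pseudocode: the function names, the toy Gaussian simulator, the uniform prior, and the tolerance value are all assumptions made for this example, and the crop growth model of the paper is replaced by a trivial stand-in.

```python
import random
import statistics


def abc_rejection(observed, sample_prior, simulate, summarize, distance,
                  tolerance, n_accept, rng, max_draws=1_000_000):
    """Generic ABC rejection sampler (illustrative sketch).

    Draw theta from the prior, simulate data under theta, and keep theta
    only if the simulated summary lies within `tolerance` of the observed
    summary. The accepted draws approximate the posterior of theta.
    """
    s_obs = summarize(observed)
    accepted = []
    draws = 0
    while len(accepted) < n_accept and draws < max_draws:
        draws += 1
        theta = sample_prior(rng)           # candidate parameter value
        simulated = simulate(theta, rng)    # data generated under theta
        if distance(summarize(simulated), s_obs) <= tolerance:
            accepted.append(theta)
    return accepted


def run_toy_example(seed=1):
    """Hypothetical toy problem: infer the mean of a unit-variance normal."""
    rng = random.Random(seed)
    observed = [rng.gauss(2.0, 1.0) for _ in range(50)]
    posterior = abc_rejection(
        observed,
        sample_prior=lambda r: r.uniform(-5.0, 5.0),  # assumed flat prior
        simulate=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
        summarize=statistics.fmean,                   # summary: sample mean
        distance=lambda a, b: abs(a - b),
        tolerance=0.1,
        n_accept=100,
        rng=rng,
    )
    return observed, posterior
```

Calling `run_toy_example()` returns the toy observations together with 100 accepted draws of θ. A smaller tolerance tightens the approximation to the posterior at the cost of more rejected draws, which is the main tuning trade-off in this family of samplers.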

    Relationship between total leaf number (TLN) and grain yield.

    Results shown are from a representative example data set.

    Predicting the future of plant breeding: complementing empirical evaluation with genetic prediction

    For the foreseeable future, plant breeding methodology will continue to unfold as a practical application of the scaling of quantitative biology. These efforts to increase the effective scale of breeding programs will focus on the immediate and long-term needs of society. The foundations of the quantitative dimension will be integration of quantitative genetics, statistics, and gene-to-phenotype knowledge of traits embedded within crop growth and development models. The integration will be enabled by advances in quantitative genetics methodology and computer simulation. The foundations of the biology dimension will be integrated experimental and functional gene-to-phenotype modelling approaches that advance our understanding of functional germplasm diversity, and gene-to-phenotype trait relationships for the native and transgenic variation utilised in agricultural crops. The trait genetic knowledge created will span scales of biology, extending from molecular genetics to multi-trait phenotypes embedded within evolving genotype-environment systems. The outcomes sought and successes achieved by plant breeding will be measured in terms of sustainable improvements in agricultural production of food, feed, fibre, biofuels and other desirable plant products that meet the needs of society. In this review, examples will be drawn primarily from our experience gained through commercial maize breeding. Implications for other crops, in both the private and public sectors, will be discussed.

    Identity-by-Descent Matrix Decomposition Using Latent Ancestral Allele Models

    Genetic linkage and association studies are empowered by proper modeling of relatedness among individuals. Such relatedness can be inferred from marker and/or pedigree information. In this study, the genetic relatedness among n inbred individuals at a particular locus is expressed as an n × n square matrix Q. The elements of Q are identity-by-descent probabilities, that is, probabilities that two individuals share an allele descended from a common ancestor. In this representation the definition of the ancestral alleles and their number remains implicit. For human inspection and further analysis, an explicit representation in terms of the ancestral allele origin and the number of alleles is desirable. To this purpose, we decompose the matrix Q by a latent class model with K classes (latent ancestral alleles). Let P be an n × K matrix with assignment probabilities of n individuals to K classes constrained such that every element is nonnegative and each row sums to 1. The problem then amounts to approximating Q by PP^T, while disregarding the diagonal elements. This is not an eigenvalue problem because of the constraints on P. An efficient algorithm for calculating P is provided. We indicate the potential utility of the latent ancestral allele model. For representative locus-specific Q matrices constructed for a set of maize inbreds, the proposed model recovered the known ancestry.
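To make the constrained approximation concrete, the following Python sketch fits a row-stochastic P so that the off-diagonal entries of PP^T match those of Q. It is not the efficient algorithm mentioned in the abstract, which is not described here. Instead it substitutes a simple exponentiated-gradient (multiplicative) update, chosen because it keeps every entry nonnegative and every row summing to 1 without an explicit projection. The function names, step size, iteration count, and the toy Q matrix are assumptions for illustration only.

```python
import math
import random


def offdiag_loss(Q, P):
    """Sum of squared errors between Q and P P^T over pairs i < j only."""
    n, K = len(P), len(P[0])
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            fit = sum(P[i][k] * P[j][k] for k in range(K))
            total += (Q[i][j] - fit) ** 2
    return total


def decompose_ibd(Q, K, n_iter=1000, step=0.1, seed=0):
    """Approximate Q by P P^T off the diagonal, with row-stochastic P.

    Illustrative exponentiated-gradient descent, not the published method.
    """
    rng = random.Random(seed)
    n = len(Q)
    # Start near the uniform assignment, with a small random perturbation
    # so the update can break the symmetry between classes.
    P = []
    for _ in range(n):
        row = [1.0 + 0.1 * rng.random() for _ in range(K)]
        total = sum(row)
        P.append([v / total for v in row])
    for _ in range(n_iter):
        # Residuals Q - P P^T; the diagonal stays at zero, i.e. is ignored.
        R = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i != j:
                    fit = sum(P[i][k] * P[j][k] for k in range(K))
                    R[i][j] = Q[i][j] - fit
        new_P = []
        for i in range(n):
            # Gradient of offdiag_loss with respect to row i of P.
            grad = [-2.0 * sum(R[i][j] * P[j][k] for j in range(n))
                    for k in range(K)]
            # Multiplicative update followed by renormalisation of the row.
            row = [P[i][k] * math.exp(-step * grad[k]) for k in range(K)]
            total = sum(row)
            new_P.append([v / total for v in row])
        P = new_P
    return P


# Hypothetical toy Q: individuals 0-2 share one ancestral allele, 3-4 another.
Q_TOY = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
```

Running `decompose_ibd(Q_TOY, K=2)` gives a P whose largest entry per row groups individuals 0 to 2 into one class and 3 to 4 into the other. The class labels themselves are arbitrary, so only the grouping is meaningful.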