70 research outputs found

    Genome and Environment Based Prediction Models and Methods of Complex Traits Incorporating Genotype × Environment Interaction

    Genomic-enabled prediction models are of paramount importance for the successful implementation of genomic selection (GS) based on breeding values. Unlike animal breeding, plant breeding generates extensive multienvironment and multiyear field trial data. Hence, genomic-enabled prediction models should include genotype × environment (G × E) interaction, which usually increases prediction performance when the responses of lines differ from environment to environment. In this chapter, we describe a historical timeline, starting in 2012, of advances in GS models that take G × E interaction into account. We describe theoretical and practical aspects of those GS models, including the gains in prediction performance obtained by including G × E structures for complex traits on both continuous and categorical scales. We then detail and explain the main G × E genomic prediction models for complex traits measured on continuous and noncontinuous (categorical) scales. In relation to G × E interaction models, this review also examines the analysis of high-throughput phenotype (phenomic) data and the joint analysis of multitrait and multienvironment field trial data, which is also used in the general assessment of multitrait G × E interaction. The role of nongenomic data in increasing the accuracy and biological reliability of the G × E approach is also outlined. We show the recent advances in large-scale envirotyping (enviromics), and how mechanistic computational modeling can derive the crop growth and development aspects useful for predicting phenotypes and explaining G × E

    Scalable Sparse Testing Genomic Selection Strategy for Early Yield Testing Stage

    To enable a scalable sparse testing genomic selection (GS) strategy at preliminary yield trials in the CIMMYT maize breeding program, optimal approaches to incorporating genotype by environment interaction (GEI) in genomic prediction models are explored. Two cross-validation schemes were evaluated: CV1, predicting the genetic merit of new bi-parental populations that have been evaluated in some environments and not others, and CV2, predicting the genetic merit of half of a bi-parental population that has been phenotyped in some environments and not others, using the coefficient of determination (CDmean) to determine optimized subsets of a full-sib family to be evaluated in each environment. We report similar prediction accuracies in CV1 and CV2. However, CV2 has an intuitive appeal in that all bi-parental populations are represented across environments, allowing efficient use of information across environments. It is also ideal for building robust historical data because all individuals of a full-sib family have phenotypic data, albeit in different environments. Results show that grouping environments according to similar growing/management conditions improved prediction accuracy and reduced computational requirements, providing a scalable, parsimonious approach to multi-environmental trials and GS in early testing stages. We further demonstrate that complementing the full-sib calibration set with optimized historical data results in improved prediction accuracy for the cross-validation schemes

    Optimisation des stratégies de génétique d'association et de sélection génomique pour des populations de diversité variable : Application au maïs

    Major progress has been made in genotyping technologies, which makes it easier to decipher the relationship between genotype and phenotype. This has contributed to the understanding of the genetic architecture of traits (Genome Wide Association Studies, GWAS), and to better predictions of genetic value to improve breeding efficiency (Genomic Selection, GS). The objective of this thesis was to define efficient ways of conducting these approaches. We first derived analytically the power of the classical GWAS mixed model and showed that it was lower for markers with a low minor allele frequency, with strong differentiation among population subgroups, and with strong correlation with the markers used for estimating the kinship matrix K. We therefore considered two alternative estimators of K. Simulations showed that these were as effective as classical estimators at controlling false positives and provided more power. We confirmed these results on real datasets collected on two maize panels, and could increase the number of detected associations by up to 40%. These panels, genotyped with a 50k SNP array and phenotyped for flowering and biomass traits, were used to characterize the diversity of the Dent and Flint groups and to detect QTLs. In GS, studies have highlighted the importance of the relationship between the calibration set (CS) and the predicted set for the accuracy of predictions. Given the currently low cost of genotyping, we proposed a sampling algorithm for the CS based on the G-BLUP model, which resulted in higher accuracies than other sampling strategies for all the traits considered. It could reach the same accuracy as a randomly sampled CS with half of the phenotyping effort.

    Optimization of association genetics and genomic selection strategies for populations of different diversity levels : Application in maize (Zea mays L.)

    Major progress has been made in genotyping and sequencing, which makes it possible to better understand the genotype/phenotype relationship. It is possible to analyze the genetic architecture of traits (association genetics, AG), or to predict the genetic value of selection candidates (genomic selection, GS). The objective of this thesis was to develop tools to carry out these strategies optimally. We first derived analytically the power of the AG mixed model, and showed that power was lower for markers with low diversity, strong differentiation between subgroups, and strong correlation with the markers used to estimate kinship (K). We therefore considered two alternative estimators of K. Simulations showed that they are as effective as the classical method at controlling false positives and that they increase power. These results were confirmed on the Flint and Dent panels of the Cornfed program, with a 40% increase in the number of SNPs detected. These panels, genotyped with a 50k SNP array and phenotyped for earliness and biomass, made it possible to describe the diversity of these groups and to detect QTL. In GS, studies have shown the importance of the composition of the calibration set for the reliability of predictions. We proposed a sampling algorithm derived from G-BLUP theory that maximizes the reliability of predictions. Compared with a random sample, it would halve the phenotyping effort needed to reach the same prediction reliability on the Cornfed panels.

    Phénotypage haut débit de l'architecture du système racinaire du blé

    High-throughput phenotyping of wheat root system architecture. Journée scientifique de l'Association des Sélectionneurs Français

    Recovering power in association mapping panels with variable levels of linkage disequilibrium

    Association mapping has permitted the discovery of major QTL in many species. It can be applied to existing populations and, as a consequence, it is generally necessary to take into account structure and relatedness among individuals in the statistical model to control false positives. We analytically studied power in association studies by computing the noncentrality parameter of the tests and its relationship with parameters characterizing diversity (genetic differentiation between groups and allele frequencies) and kinship between individuals. Investigation of three different maize diversity panels genotyped with the 50k SNP array highlighted contrasting average power among panels and revealed gaps in the power of classical mixed models in regions with high linkage disequilibrium (LD). These gaps could be related to the fact that markers are used both for testing association and for estimating relatedness. We thus considered two alternative approaches to estimating the kinship matrix in order to recover power in regions of high LD. In the first one, we estimated the kinship with all the markers that are not located on the same chromosome as the tested SNP. In the second one, correlation between markers was taken into account to weight the contribution of each marker to the kinship. Simulations revealed that these two approaches were effective at controlling false positives and were more powerful than classical models. Peer reviewed
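
    The first alternative described in this abstract (a kinship matrix built only from markers that are not on the chromosome of the tested SNP) can be sketched as follows. This is a minimal illustration and not the authors' code: the 0/1/2 genotype coding, the VanRaden-style scaling, and the function names are assumptions made for the example, and the abstract does not say which kinship estimator was used.

```python
import numpy as np


def vanraden_kinship(genotypes):
    """Genomic relationship matrix from an (individuals x markers) array.

    Assumption: genotypes are coded as allele counts 0/1/2 and the
    VanRaden scaling is used; the abstract does not specify either.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    freqs = genotypes.mean(axis=0) / 2.0          # allele frequency per marker
    centered = genotypes - 2.0 * freqs            # center each marker column
    scale = 2.0 * np.sum(freqs * (1.0 - freqs))   # sum of marker variances
    if scale == 0.0:
        raise ValueError("all retained markers are monomorphic")
    return centered @ centered.T / scale


def kinship_excluding_chromosome(genotypes, marker_chromosomes, tested_chromosome):
    """Kinship estimated only from markers NOT on the tested SNP's chromosome."""
    marker_chromosomes = np.asarray(marker_chromosomes)
    keep = marker_chromosomes != tested_chromosome
    return vanraden_kinship(np.asarray(genotypes)[:, keep])


if __name__ == "__main__":
    # Toy data: 3 individuals, 4 markers, two markers per chromosome.
    toy_genotypes = np.array([[0, 2, 0, 2],
                              [2, 0, 2, 0],
                              [0, 0, 2, 2]])
    toy_chromosomes = np.array([1, 1, 2, 2])
    # Kinship to use when testing a SNP located on chromosome 1.
    print(kinship_excluding_chromosome(toy_genotypes, toy_chromosomes, 1))
```

    In a genome scan, one such matrix would be computed per chromosome and reused for every SNP tested on that chromosome, so the cost grows with the number of chromosomes rather than the number of markers.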

    Maximizing the Reliability of Genomic Selection by Optimizing the Calibration Set of Reference Individuals: Comparison of Methods in Two Diverse Groups of Maize Inbreds (Zea mays L.)

    Genomic selection refers to the use of genotypic information for predicting breeding values of selection candidates. A prediction formula is calibrated with the genotypes and phenotypes of reference individuals constituting the calibration set. The size and the composition of this set are essential parameters affecting the prediction reliabilities. The objective of this study was to maximize reliabilities by optimizing the calibration set. Different criteria based on the diversity or on the prediction error variance (PEV) derived from the realized additive relationship matrix–best linear unbiased predictions model (RA–BLUP) were used to select the reference individuals. For the latter, we considered the mean of the PEV of the contrasts between each selection candidate and the mean of the population (PEVmean) and the mean of the expected reliabilities of the same contrasts (CDmean). These criteria were tested with phenotypic data collected on two diversity panels of maize (Zea mays L.) genotyped with a 50k SNP array. In the two panels, samples chosen based on CDmean gave higher reliabilities than random samples for various calibration set sizes. CDmean also appeared superior to PEVmean, which can be explained by the fact that it takes into account the reduction of variance due to the relatedness between individuals. Selected samples were close to optimality for a wide range of trait heritabilities, which suggests that the strategy presented here can efficiently sample subsets in panels of inbred lines. A script to optimize reference samples based on CDmean is available on request. This research was jointly supported as the “Cornfed project” by the French National Agency for Research (ANR), the German Federal Ministry of Education and Research (BMBF), and the Spanish Ministry of Science and Innovation (MICINN). R. Rincent is jointly funded by Limagrain, Biogemma, Kleinwanzlebener Saatzucht AG (KWS), and the Association Nationale de la Recherche et de la Technologie (ANRT). Peer reviewed

    Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations

    Genomic selection refers to the use of genotypic information for predicting the performance of selection candidates. It has been shown that prediction accuracy depends on various parameters including the composition of the calibration set (CS). Assessing the level of accuracy of a given prediction scenario is of the highest importance because it can be used to optimize CS sampling before collecting phenotypes, and once the breeding values are predicted it informs breeders about the reliability of these predictions. Different criteria have been proposed to optimize CS sampling in highly diverse panels, which can be useful for screening collections of genotypes. However, plant breeders often work on structured material such as biparental or multiparental populations, for which these criteria are less well suited. From the generalized coefficient of determination (CD) theory, we derived different criteria to optimize CS sampling and to assess the reliability associated with predictions in structured populations. These criteria were evaluated on two nested association mapping (NAM) populations and two highly diverse panels of maize. They were effective for sampling optimized CS in most situations. They could also estimate, at least partly, the reliability associated with predictions between NAM families, but they could not estimate differences in the reliability associated with the predictions of NAM families using the highly diverse panels as calibration sets. We illustrated that the CD criteria could be adapted to various prediction scenarios including inter- and intra-family predictions, resulting in higher prediction accuracies