91 research outputs found

    Genomic selection: the option for new robustness traits?

    Genomic selection is rapidly becoming the state-of-the-art genetic selection methodology in dairy cattle breeding schemes around the world. The objective of this paper was to explore possibilities to apply genomic selection for traits related to dairy cow robustness. Deterministic simulations indicate that replacing progeny testing with genomic selection may favour genetic response for production traits at the expense of robustness traits, owing to a disproportional change in accuracies obtained across trait groups. Nevertheless, several options are available to improve the accuracy of genomic selection for robustness traits. Moreover, genomic selection opens up the opportunity to begin selection for new traits using specialised reference populations of limited size where phenotyping of large populations of animals is currently prohibitive. Reference populations for such traits may be nucleus-type herds, research herds or pooled data from (international) research experiments or research herds. The RobustMilk project has set an example for the latter approach, by collating international data for progesterone-based traits, feed intake and energy balance-related traits. Reference population design, both in terms of relatedness of the animals and variability in phenotypic performance, is important to optimise the accuracy of genomic selection. Use of indicator traits, combined with multi-trait genomic prediction models, can further contribute to improved accuracy of genomic prediction for robustness traits. Experience to date indicates that for newly recorded robustness traits that are negatively correlated with the main breeding goal, cow reference populations of ≥10 000 are required when genotyping is based on medium- or high-density single-nucleotide polymorphism arrays. Further genotyping advances (e.g. sequencing) combined with post-genomics technologies will enhance the opportunities for (genomic) selection to improve cow robustness.

    Adding gene transcripts into genomic prediction improves accuracy and reveals sampling time dependence.

    Recent developments allowed generating multiple high-quality 'omics' data that could increase the predictive performance of genomic prediction for phenotypes and genetic merit in animals and plants. Here, we have assessed the performance of parametric and nonparametric models that leverage transcriptomics in genomic prediction for 13 complex traits recorded in 478 animals from an outbred mouse population. Parametric models were implemented using the best linear unbiased prediction, while nonparametric models were implemented using the gradient boosting machine algorithm. We also propose a new model named GTCBLUP that aims to remove between-omics-layer covariance from predictors, whereas its counterpart GTBLUP does not do that. While gradient boosting machine models captured more phenotypic variation, their predictive performance did not exceed the best linear unbiased prediction models for most traits. Models leveraging gene transcripts captured higher proportions of the phenotypic variance for almost all traits when these were measured closer to the moment of measuring gene transcripts in the liver. In most cases, the combination of layers was not able to outperform the best single-omics models to predict phenotypes. Using only gene transcripts, the gradient boosting machine model was able to outperform best linear unbiased prediction for most traits except body weight, but the same pattern was not observed when using both single nucleotide polymorphism genotypes and gene transcripts. Although the GTCBLUP model was not able to produce the most accurate phenotypic predictions, it showed the highest accuracies for breeding values for 9 out of 13 traits. We recommend using the GTBLUP model for prediction of phenotypes and using the GTCBLUP for prediction of breeding values.

    Prediction performance of linear models and gradient boosting machine on complex phenotypes in outbred mice.

    We compared the performance of linear (GBLUP, BayesB, and elastic net) methods to a nonparametric tree-based ensemble (gradient boosting machine) method for genomic prediction of complex traits in mice. The dataset used contained genotypes for 50,112 SNP markers and phenotypes for 835 animals from 6 generations. Traits analyzed were bone mineral density, body weight at 10, 15, and 20 weeks, fat percentage, circulating cholesterol, glucose, insulin, triglycerides, and urine creatinine. The youngest generation was used as a validation subset, and predictions were based on all older generations. Model performance was evaluated by comparing predictions for animals in the validation subset against their adjusted phenotypes. Linear models outperformed gradient boosting machine for 7 out of 10 traits. For bone mineral density, cholesterol, and glucose, the gradient boosting machine model showed better prediction accuracy and lower relative root mean squared error than the linear models. Interestingly, for these 3 traits, there is evidence of a relevant portion of phenotypic variance being explained by epistatic effects. For some of the traits, fitting a subset of top markers selected from a gradient boosting machine model improved the accuracy of prediction in both the linear and the gradient boosting machine models. Our results indicate that gradient boosting machine is more strongly affected by data size and decreased connectedness between reference and validation sets than the linear models. Although the linear models outperformed gradient boosting machine for the polygenic traits, our results suggest that gradient boosting machine is a competitive method to predict complex traits with assumed epistatic effects.

    QTLMAS 2009: simulated dataset

    Background - The simulation of the data for the QTLMAS 2009 Workshop is described. The objective was to simulate observations from a growth curve that was influenced by a number of QTL. Results - The data consisted of markers, phenotypes and pedigree. Genotypes of 453 markers, distributed over 5 chromosomes of 1 Morgan each, were simulated for 2,025 individuals. Of those, 25 individuals were parents of the other 2,000 individuals. The 25 parents were genetically related. Phenotypes were simulated according to a logistic growth curve and were made available for 1,000 of the 2,000 offspring individuals. The logistic growth curve was specified by three parameters. Each parameter was influenced by six Quantitative Trait Loci (QTL), positioned on the five chromosomes. For each parameter, one QTL had a large effect and five QTL had small effects. The variance of the large QTL was five times the variance of a small QTL. Simulated data were made available at http://www.qtlmas2009.wur.nl/UK/Dataset
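The growth-curve setup described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the parameterisation of the logistic curve or the absolute QTL variances, so the function names, the three parameters (`asymptote`, `inflection`, `slope`) and the way variance is split over the six QTL are all assumptions made for illustration, not the workshop's actual simulation code.

```python
import math


def logistic_growth(t, asymptote, inflection, slope):
    """Three-parameter logistic curve (assumed form, not taken from the paper).

    Returns asymptote / (1 + exp((inflection - t) / slope)).
    """
    return asymptote / (1.0 + math.exp((inflection - t) / slope))


def qtl_variances(total_var, n_small=5, ratio=5.0):
    """Split total_var over one large QTL and n_small small QTL.

    The large QTL gets `ratio` times the variance of each small QTL,
    matching the 5:1 ratio stated in the abstract. How the total is
    chosen is a hypothetical detail.
    """
    small = total_var / (ratio + n_small)
    return [ratio * small] + [small] * n_small


# Hypothetical usage: variance shares for one curve parameter.
variances = qtl_variances(total_var=1.0)
half_size = logistic_growth(t=10, asymptote=100, inflection=10, slope=2)
```

With these assumed inputs the large QTL takes half of the parameter's QTL variance, and the curve reaches half of its asymptote at the inflection point.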

    Comparison of analyses of the QTLMAS XIII common dataset. I: genomic selection

    Background - Genomic selection, the use of markers across the whole genome, receives increasing amounts of attention and is having more and more impact on breeding programs. Development of statistical and computational methods to estimate breeding values based on markers is a very active area of research. A simulated dataset was analyzed by participants of the QTLMAS XIII workshop, allowing a comparison of the ability of different methods to estimate genomic breeding values. Methods - A best case scenario was analyzed by the organizers where QTL genotypes were known. Participants submitted estimated breeding values for 1000 unphenotyped individuals together with a description of the applied method(s). The submitted breeding values were evaluated for correlation with the simulated values (accuracy), rank correlation of the best 10% of individuals and error in predictions. Bias was tested by regression of simulated on estimated breeding values. Results - The accuracy obtained from the best case scenario was 0.94. Six research groups submitted 19 sets of estimated breeding values. Methods that assumed the same variance for markers showed accuracies, measured as correlations between estimated and simulated values, ranging from 0.75 to 0.89 and rank correlations between 0.58 and 0.70. Methods that allowed different marker variances showed accuracies ranging from 0.86 to 0.94 and rank correlations between 0.69 and 0.82. Methods assuming equal marker variances were generally more biased and showed larger prediction errors. Conclusions - The best performing methods achieved very high accuracies, close to accuracies achieved in a best case scenario where QTL genotypes were known without error. Methods that allowed different marker variances generally outperformed methods that assumed equal marker variances. 
    Genomic selection methods performed well compared to traditional, pedigree only, methods; all methods showed higher accuracies than those obtained for breeding values estimated solely on pedigree relationship.
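The evaluation criteria named in this abstract (accuracy as a correlation, rank correlation among the best 10%, and bias as the regression of simulated on estimated breeding values) can be sketched in plain Python as follows. This is an illustrative sketch only: the function names are hypothetical, the "best 10%" are assumed to be chosen on the simulated values, and ties in ranking are ignored.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def regression_slope(y, x):
    """Slope of y regressed on x; a slope of 1 suggests no scaling bias."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def ranks(values):
    """Rank positions (0 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def top_rank_correlation(simulated, estimated, fraction=0.10):
    """Rank correlation among the top `fraction` by simulated value (assumption)."""
    k = max(2, int(len(simulated) * fraction))
    top = sorted(range(len(simulated)), key=simulated.__getitem__, reverse=True)[:k]
    return pearson(ranks([simulated[i] for i in top]),
                   ranks([estimated[i] for i in top]))
```

Here `pearson(simulated, estimated)` plays the role of accuracy and `regression_slope(simulated, estimated)` the role of the bias test.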

    Across population genomic prediction scenarios in which Bayesian variable selection outperforms GBLUP

    Background: The use of information across populations is an attractive approach to increase the accuracy of genomic prediction for numerically small populations. However, accuracies of across population genomic prediction, in which reference and selection individuals are from different populations, are currently disappointing. It has been shown for within population genomic prediction that Bayesian variable selection models outperform GBLUP models when the number of QTL underlying the trait is low. Therefore, our objective was to identify across population genomic prediction scenarios in which Bayesian variable selection models outperform GBLUP in terms of prediction accuracy. In this study, high density genotype information of 1033 Holstein Friesian, 105 Groningen White Headed, and 147 Meuse-Rhine-Yssel cows was used. Phenotypes were simulated using two changing variables: (1) the number of QTL underlying the trait (3000, 300, 30, 3), and (2) the correlation between allele substitution effects of QTL across populations, i.e. the genetic correlation of the simulated trait between the populations (1.0, 0.8, 0.4). Results: The accuracy obtained by the Bayesian variable selection model depended on the number of QTL underlying the trait, with a higher accuracy when the number of QTL was lower. This trend was more pronounced for across population genomic prediction than for within population genomic prediction. It was shown that Bayesian variable selection models have an advantage over GBLUP when the number of QTL underlying the simulated trait was small. This advantage disappeared when the number of QTL underlying the simulated trait was large. The point where the accuracy of Bayesian variable selection and GBLUP became similar was approximately the point where the number of QTL was equal to the number of independent chromosome segments (M_e) across the populations.
    Conclusion: Bayesian variable selection models outperform GBLUP when the number of QTL underlying the trait is smaller than M_e. Across populations, M_e is considerably larger than within populations. So, it is more likely to find a number of QTL underlying a trait smaller than M_e across populations than within population. Therefore Bayesian variable selection models can help to improve the accuracy of across population genomic prediction.

    Genomic and pedigree-based genetic parameters for scarcely recorded traits when some animals are genotyped

    Genetic parameters were estimated using relationships between animals that were based either on pedigree, 43,011 single nucleotide polymorphisms, or a combination of these, considering genotyped and non-genotyped animals. The standard error of the estimates and a parametric bootstrapping procedure were used to investigate sampling properties of the estimated variance components. The data set contained milk yield, dry matter intake and body weight for 517 first-lactation heifers with genotypes and phenotypes, and another 112 heifers with phenotypes only. Multivariate models were fitted using the different relationships in ASReml software. Estimates of genetic variance were lower based on genomic relationships than using pedigree relationships. Genetic variances from genomic and pedigree relationships were, however, not directly comparable because they apply to different base populations. Standard errors indicated that using the genomic relationships gave more accurate estimates of heritability but equally accurate estimates of genetic correlation. However, the estimates of standard errors were affected by the differences in scale between the 2 relationship matrices, causing differences in values of the genetic parameters. The bootstrapping results (with genetic parameters at the same level) confirmed that both heritability and genetic correlations were estimated more accurately with genomic relationships than with pedigree relationships. Animals without genotype were included in the analysis by merging genomic and pedigree relationships. This allowed all phenotypes to be used, including those from non-genotyped animals. This combination of genomic and pedigree relationships gave the most accurate estimates of genetic variance. When only a small data set is available, it might be more advantageous for the estimation of genetic parameters to genotype existing animals, rather than collecting more phenotypes.

    Genomic prediction of dry matter intake in dairy cattle from an international data set consisting of research herds in Europe, North America, and Australasia

    Peer-reviewed. Financial support for gDMI from CRV (Arnhem, the Netherlands), ICBF (Cork, Ireland), CONAFE (Madrid, Spain), DairyCo (Warwickshire, UK) directly to the gDMI consortium, and The Natural Science and Engineering Research Council of Canada and DairyGen Council of Canadian Dairy Network (Guelph, ON, Canada) is gratefully appreciated, as well as the EU FP7 IRSES SEQSEL (Grant no. 317697). With the aim of increasing the accuracy of genomic estimated breeding values for dry matter intake (DMI) in Holstein-Friesian dairy cattle, data from 10 research herds in Europe, North America, and Australasia were combined. The DMI records were available on 10,701 parity 1 to 5 records from 6,953 cows, as well as on 1,784 growing heifers. Predicted DMI at 70 d in milk was used as the phenotype for the lactating animals, and the average DMI measured during a 60- to 70-d test period at approximately 200 d of age was used as the phenotype for the growing heifers. After editing, there were 583,375 genetic markers obtained from either actual high-density single nucleotide polymorphism (SNP) genotypes or imputed from 54,001 marker SNP genotypes. Genetic correlations between the populations were estimated using genomic REML. The accuracy of genomic prediction was evaluated for the following scenarios: (1) within-country only, by fixing the correlations among populations to zero, (2) using near-unity correlations among populations and assuming the same trait in each population, and (3) a sharing data scenario using estimated genetic correlations among populations. For these 3 scenarios, the data set was divided into 10 sub-populations stratified by progeny group of sires; 9 of these sub-populations were used (in turn) for the genomic prediction and the tenth was used for calculation of the accuracy (correlation adjusted for heritability).
    A fourth scenario to quantify the benefit for countries that do not record DMI was investigated (i.e., having an entire country as the validation population and excluding this country in the development of the genomic predictions). The optimal scenario, which was sharing data, resulted in a mean prediction accuracy of 0.44, ranging from 0.37 (Denmark) to 0.54 (the Netherlands). Assuming near-unity among-country genetic correlations, the mean accuracy of prediction dropped to 0.40, and the mean within-country accuracy was 0.30. If no records were available in a country, the accuracy based on the other populations ranged from 0.23 to 0.53 for the milking cows, but was only 0.03 and 0.19 for Australian and New Zealand heifers, respectively; the overall mean prediction accuracy was 0.37. Therefore, there is a benefit in collaboration, because phenotypic information for DMI from other countries can be used to augment the accuracy of genomic evaluations of individual countries.
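The 10-fold validation design described in this abstract can be sketched roughly as below. The abstract does not say how folds were formed or how the correlation was adjusted, so two assumptions are made here: all progeny of one sire are kept in the same fold, and the adjustment divides the correlation by the square root of the heritability. The function names are hypothetical.

```python
import math


def folds_by_sire(sire_ids, k=10):
    """Assign each animal a fold so that all progeny of one sire share a fold.

    This is one possible reading of "stratified by progeny group of sires".
    """
    unique_sires = sorted(set(sire_ids))
    fold_of_sire = {sire: i % k for i, sire in enumerate(unique_sires)}
    return [fold_of_sire[sire] for sire in sire_ids]


def adjusted_accuracy(correlation, heritability):
    """Correlation between prediction and phenotype divided by sqrt(h2).

    A common adjustment; assumed here, not confirmed by the abstract.
    """
    return correlation / math.sqrt(heritability)


# Hypothetical usage: 60 animals, 3 progeny per sire.
sires = ["sire%02d" % (i // 3) for i in range(60)]
folds = folds_by_sire(sires, k=10)
```

Each fold would then serve once as the validation set while the other nine are used to develop the genomic predictions.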