Genomic selection: the option for new robustness traits?
Genomic selection is rapidly becoming the state-of-the-art genetic selection methodology in dairy cattle breeding schemes around the world. The objective of this paper was to explore possibilities to apply genomic selection for traits related to dairy cow robustness. Deterministic simulations indicate that replacing progeny testing with genomic selection may favour genetic response for production traits at the expense of robustness traits, owing to a disproportional change in accuracies obtained across trait groups. Nevertheless, several options are available to improve the accuracy of genomic selection for robustness traits. Moreover, genomic selection opens up the opportunity to begin selection for new traits using specialised reference populations of limited size where phenotyping of large populations of animals is currently prohibitive. Reference populations for such traits may be nucleus-type herds, research herds or pooled data from (international) research experiments or research herds. The RobustMilk project has set an example for the latter approach, by collating international data for progesterone-based traits, feed intake and energy balance-related traits. Reference population design, both in terms of relatedness of the animals and variability in phenotypic performance, is important to optimise the accuracy of genomic selection. Use of indicator traits, combined with multi-trait genomic prediction models, can further contribute to improved accuracy of genomic prediction for robustness traits. Experience to date indicates that for newly recorded robustness traits that are negatively correlated with the main breeding goal, cow reference populations of ≥10 000 are required when genotyping is based on medium- or high-density single-nucleotide polymorphism arrays. Further genotyping advances (e.g. sequencing) combined with post-genomics technologies will enhance the opportunities for (genomic) selection to improve cow robustness.
Adding gene transcripts into genomic prediction improves accuracy and reveals sampling time dependence.
Recent developments have made it possible to generate multiple high-quality 'omics' data that could increase the predictive performance of genomic prediction for phenotypes and genetic merit in animals and plants. Here, we have assessed the performance of parametric and nonparametric models that leverage transcriptomics in genomic prediction for 13 complex traits recorded in 478 animals from an outbred mouse population. Parametric models were implemented using the best linear unbiased prediction, while nonparametric models were implemented using the gradient boosting machine algorithm. We also propose a new model named GTCBLUP that aims to remove between-omics-layer covariance from predictors, whereas its counterpart GTBLUP does not do that. While gradient boosting machine models captured more phenotypic variation, their predictive performance did not exceed the best linear unbiased prediction models for most traits. Models leveraging gene transcripts captured higher proportions of the phenotypic variance for almost all traits when these were measured closer to the moment of measuring gene transcripts in the liver. In most cases, the combination of layers was not able to outperform the best single-omics models to predict phenotypes. Using only gene transcripts, the gradient boosting machine model was able to outperform best linear unbiased prediction for most traits except body weight, but the same pattern was not observed when using both single nucleotide polymorphism genotypes and gene transcripts. Although the GTCBLUP model was not able to produce the most accurate phenotypic predictions, it showed the highest accuracies for breeding values for 9 out of 13 traits. We recommend using the GTBLUP model for prediction of phenotypes and using the GTCBLUP for prediction of breeding values.
Prediction performance of linear models and gradient boosting machine on complex phenotypes in outbred mice.
We compared the performance of linear (GBLUP, BayesB, and elastic net) methods to a nonparametric tree-based ensemble (gradient boosting machine) method for genomic prediction of complex traits in mice. The dataset used contained genotypes for 50,112 SNP markers and phenotypes for 835 animals from 6 generations. Traits analyzed were bone mineral density, body weight at 10, 15, and 20 weeks, fat percentage, circulating cholesterol, glucose, insulin, triglycerides, and urine creatinine. The youngest generation was used as a validation subset, and predictions were based on all older generations. Model performance was evaluated by comparing predictions for animals in the validation subset against their adjusted phenotypes. Linear models outperformed gradient boosting machine for 7 out of 10 traits. For bone mineral density, cholesterol, and glucose, the gradient boosting machine model showed better prediction accuracy and lower relative root mean squared error than the linear models. Interestingly, for these 3 traits, there is evidence of a relevant portion of phenotypic variance being explained by epistatic effects. Using a subset of top markers selected from a gradient boosting machine model helped for some of the traits to improve the accuracy of prediction when these were fitted into linear and gradient boosting machine models. Our results indicate that gradient boosting machine is more strongly affected by data size and decreased connectedness between reference and validation sets than the linear models. Although the linear models outperformed gradient boosting machine for the polygenic traits, our results suggest that gradient boosting machine is a competitive method to predict complex traits with assumed epistatic effects
QTLMAS 2009: simulated dataset
Background - The simulation of the data for the QTLMAS 2009 Workshop is described. Objective was to simulate observations from a growth curve which was influenced by a number of QTL. Results - The data consisted of markers, phenotypes and pedigree. Genotypes of 453 markers, distributed over 5 chromosomes of 1 Morgan each, were simulated for 2,025 individuals. From those, 25 individuals were parents of the other 2,000 individuals. The 25 parents were genetically related. Phenotypes were simulated according to a logistic growth curve and were made available for 1,000 of the 2,000 offspring individuals. The logistic growth curve was specified by three parameters. Each parameter was influenced by six Quantitative Trait Loci (QTL), positioned at the five chromosomes. For each parameter, one QTL had a large effect and five QTL had small effects. Variance of large QTL was five times the variance of small QTL. Simulated data was made available at http://www.qtlmas2009.wur.nl/UK/Dataset
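The simulation design described above can be sketched in a few lines. This is a minimal illustration, not the workshop's actual simulation code: the curve parameterisation, the parameter means, the effect sizes, the measurement times and the uniform sampling of genotypes are all hypothetical, and no residual noise is added. The only details taken from the abstract are the three-parameter logistic curve and the six QTL per parameter, one of which has five times the variance of the others.

```python
import math
import random


def logistic_growth(t, asymptote, inflection, scale):
    """Logistic curve with three parameters (assumed parameterisation)."""
    return asymptote / (1.0 + math.exp((inflection - t) / scale))


def qtl_effects(small_effect, n_small=5):
    """One large and n_small small additive effects for one curve parameter.

    Assumes all QTL share the same allele frequency, so a five-fold
    variance ratio corresponds to a sqrt(5)-fold effect ratio.
    """
    return [small_effect * math.sqrt(5.0)] + [small_effect] * n_small


def parameter_value(mean, effects, allele_counts):
    """Parameter value: mean plus additive QTL contributions.

    allele_counts holds 0, 1 or 2 copies per QTL, centred at 1.
    """
    return mean + sum(a * (g - 1) for a, g in zip(effects, allele_counts))


def simulate_individual(rng, times):
    """Growth records for one individual; all numeric values are hypothetical."""
    means = {"asymptote": 100.0, "inflection": 200.0, "scale": 50.0}
    small = {"asymptote": 2.0, "inflection": 3.0, "scale": 1.0}
    params = {}
    for name in means:
        # Six QTL per parameter, genotypes drawn uniformly for simplicity.
        counts = [rng.choice((0, 1, 2)) for _ in range(6)]
        params[name] = parameter_value(means[name], qtl_effects(small[name]), counts)
    return [logistic_growth(t, **params) for t in times]


records = simulate_individual(random.Random(1), times=[0, 132, 265, 397, 530])
```

Each call to `simulate_individual` returns one growth trajectory; a fuller sketch would draw genotypes from simulated marker data and add a residual to each record.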
Comparison of analyses of the QTLMAS XIII common dataset. I: genomic selection
Background - Genomic selection, the use of markers across the whole genome, receives increasing amounts of attention and is having more and more impact on breeding programs. Development of statistical and computational methods to estimate breeding values based on markers is a very active area of research. A simulated dataset was analyzed by participants of the QTLMAS XIII workshop, allowing a comparison of the ability of different methods to estimate genomic breeding values. Methods - A best case scenario was analyzed by the organizers where QTL genotypes were known. Participants submitted estimated breeding values for 1000 unphenotyped individuals together with a description of the applied method(s). The submitted breeding values were evaluated for correlation with the simulated values (accuracy), rank correlation of the best 10% of individuals and error in predictions. Bias was tested by regression of simulated on estimated breeding values. Results - The accuracy obtained from the best case scenario was 0.94. Six research groups submitted 19 sets of estimated breeding values. Methods that assumed the same variance for markers showed accuracies, measured as correlations between estimated and simulated values, ranging from 0.75 to 0.89 and rank correlations between 0.58 and 0.70. Methods that allowed different marker variances showed accuracies ranging from 0.86 to 0.94 and rank correlations between 0.69 and 0.82. Methods assuming equal marker variances were generally more biased and showed larger prediction errors. Conclusions - The best performing methods achieved very high accuracies, close to accuracies achieved in a best case scenario where QTL genotypes were known without error. Methods that allowed different marker variances generally outperformed methods that assumed equal marker variances. 
Genomic selection methods performed well compared to traditional, pedigree-only methods; all methods showed higher accuracies than those obtained for breeding values estimated solely on pedigree relationships.
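The evaluation criteria named in this abstract (accuracy as a correlation, rank correlation among the best 10% of individuals, prediction error, and bias as the regression of simulated on estimated values) could be computed roughly as follows. This is a sketch under assumptions: the abstract does not say how the best 10% were chosen or which error measure was used, so ranking by simulated value and using root mean squared error are choices made here for illustration, and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def regression_slope(y, x):
    """Slope of the least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def evaluate(simulated, estimated, top_fraction=0.10):
    """Summary statistics comparing estimated with simulated breeding values."""
    n = len(simulated)
    n_top = max(2, int(round(n * top_fraction)))
    # Assumption: the "best" individuals are those with the highest simulated value.
    top = sorted(range(n), key=lambda i: simulated[i], reverse=True)[:n_top]
    top_sim = [simulated[i] for i in top]
    top_est = [estimated[i] for i in top]
    return {
        "accuracy": pearson(simulated, estimated),
        "rank_correlation_top": pearson(ranks(top_sim), ranks(top_est)),
        # A slope of 1 indicates estimates that are neither inflated nor deflated.
        "bias_slope": regression_slope(simulated, estimated),
        "rmse": math.sqrt(sum((e - s) ** 2 for s, e in zip(simulated, estimated)) / n),
    }
```

A slope below 1 in `bias_slope` would suggest that the estimated breeding values are over-dispersed relative to the simulated ones, and a slope above 1 the opposite.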
Across population genomic prediction scenarios in which Bayesian variable selection outperforms GBLUP
Background: The use of information across populations is an attractive approach to increase the accuracy of genomic prediction for numerically small populations. However, accuracies of across population genomic prediction, in which reference and selection individuals are from different populations, are currently disappointing. It has been shown for within population genomic prediction that Bayesian variable selection models outperform GBLUP models when the number of QTL underlying the trait is low. Therefore, our objective was to identify across population genomic prediction scenarios in which Bayesian variable selection models outperform GBLUP in terms of prediction accuracy. In this study, high density genotype information of 1033 Holstein Friesian, 105 Groningen White Headed, and 147 Meuse-Rhine-Yssel cows was used. Phenotypes were simulated using two changing variables: (1) the number of QTL underlying the trait (3000, 300, 30, 3), and (2) the correlation between allele substitution effects of QTL across populations, i.e. the genetic correlation of the simulated trait between the populations (1.0, 0.8, 0.4). Results: The accuracy obtained by the Bayesian variable selection model depended on the number of QTL underlying the trait, with a higher accuracy when the number of QTL was lower. This trend was more pronounced for across population genomic prediction than for within population genomic prediction. It was shown that Bayesian variable selection models have an advantage over GBLUP when the number of QTL underlying the simulated trait was small. This advantage disappeared when the number of QTL underlying the simulated trait was large. The point where the accuracy of Bayesian variable selection and GBLUP became similar was approximately the point where the number of QTL was equal to the number of independent chromosome segments (M_e) across the populations. 
Conclusion: Bayesian variable selection models outperform GBLUP when the number of QTL underlying the trait is smaller than M_e. Across populations, M_e is considerably larger than within populations. So, it is more likely to find a number of QTL underlying a trait smaller than M_e across populations than within a population. Therefore, Bayesian variable selection models can help to improve the accuracy of across population genomic prediction.
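One way to obtain allele substitution effects with a chosen correlation between two populations, as in the second simulation variable above, is to mix a shared draw with an independent one. The sketch below assumes standard normal effects, which the abstract does not state; the function names are hypothetical.

```python
import math
import random


def correlated_effects(n_qtl, correlation, rng):
    """Draw QTL effects for two populations with the given expected correlation.

    Assumption: effects are standard normal in both populations.
    """
    effects_a, effects_b = [], []
    for _ in range(n_qtl):
        shared = rng.gauss(0.0, 1.0)
        independent = rng.gauss(0.0, 1.0)
        effects_a.append(shared)
        # Mixing weights keep unit variance and give the target correlation.
        effects_b.append(
            correlation * shared + math.sqrt(1.0 - correlation ** 2) * independent
        )
    return effects_a, effects_b


def sample_correlation(x, y):
    """Pearson correlation, used here only to check the simulated effects."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


effects_pop1, effects_pop2 = correlated_effects(3000, 0.8, random.Random(42))
```

With only 3 or 30 QTL, the realised correlation in a single replicate can deviate noticeably from the target value, so a simulation along these lines would normally be replicated.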
Genomic and pedigree-based genetic parameters for scarcely recorded traits when some animals are genotyped
Genetic parameters were estimated using relationships between animals that were based either on pedigree, 43,011 single nucleotide polymorphisms, or a combination of these, considering genotyped and non-genotyped animals. The standard error of the estimates and a parametric bootstrapping procedure was used to investigate sampling properties of the estimated variance components. The data set contained milk yield, dry matter intake and body weight for 517 first-lactation heifers with genotypes and phenotypes, and another 112 heifers with phenotypes only. Multivariate models were fitted using the different relationships in ASReml software. Estimates of genetic variance were lower based on genomic relationships than using pedigree relationships. Genetic variances from genomic and pedigree relationships were, however, not directly comparable because they apply to different base populations. Standard errors indicated that using the genomic relationships gave more accurate estimates of heritability but equally accurate estimates of genetic correlation. However, the estimates of standard errors were affected by the differences in scale between the 2 relationship matrices, causing differences in values of the genetic parameters. The bootstrapping results (with genetic parameters at the same level), confirmed that both heritability and genetic correlations were estimated more accurately with genomic relationships in comparison with using the pedigree relationships. Animals without genotype were included in the analysis by merging genomic and pedigree relationships. This allowed all phenotypes to be used, including those from non-genotyped animals. This combination of genomic and pedigree relationships gave the most accurate estimates of genetic variance. When a small data set is available it might be more advantageous for the estimation of genetic parameters to genotype existing animals, rather than collecting more phenotypes
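For readers unfamiliar with SNP-based relationships, the sketch below builds a genomic relationship matrix from allele counts. The abstract does not say which formulation was used; the one shown here (centring by observed allele frequencies and scaling by twice the sum of p(1 - p)) is a commonly used choice and should be read as an assumption, as should the function name and the toy genotypes.

```python
def genomic_relationship_matrix(genotypes):
    """Genomic relationships from allele counts (0, 1 or 2) per animal and SNP.

    genotypes is a list of rows, one row per animal, one column per SNP.
    Allele frequencies are estimated from the genotyped animals themselves.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    freqs = [
        sum(row[j] for row in genotypes) / (2.0 * n_animals) for j in range(n_snps)
    ]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all SNPs are monomorphic; relationships are undefined")
    # Centre each SNP column by twice its allele frequency.
    centred = [
        [row[j] - 2.0 * freqs[j] for j in range(n_snps)] for row in genotypes
    ]
    return [
        [sum(a * b for a, b in zip(zi, zk)) / scale for zk in centred]
        for zi in centred
    ]


# Toy example: 3 animals, 4 SNPs (hypothetical genotypes).
toy = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1]]
G = genomic_relationship_matrix(toy)
```

Because allele frequencies are taken from the genotyped sample, these relationships are expressed relative to that sample rather than to the pedigree base population, which is consistent with the abstract's remark that genomic and pedigree variances refer to different base populations.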
Inheritance of genomic regions and genes associated with number of oocytes and embryos in Gir cattle through daughter design.
Over the past decades, daughter designs, including genotyped sires and their genotyped daughters, have been used as an approach to identify QTL related to economic traits. The aim of this study was to identify genomic regions inherited by Gir sire families and genes associated with number of viable oocytes (VO), total number of oocytes (TO), and number of embryos (EMBR) based on a daughter design approach. In total, 15 Gir sire families were selected. The number of daughters per family ranged from 26 to 395, which were genotyped with different SNP panels and imputed to the Illumina BovineHD BeadChip (777K) and had phenotypes for oocyte and embryo production. Daughters had phenotypic data for VO, TO, and EMBR. The search for QTL was performed through GWAS based on GBLUP. The QTL were found for each trait among and within families based on the top 10 genomic windows with the greatest genetic variance. For EMBR, genomic windows identified among families were located on BTA4, BTA5, BTA6, BTA7, BTA8, BTA13, BTA16, and BTA17, and they were most frequent on BTA7 within families. For VO, genomic windows were located on BTA2, BTA4, BTA5, BTA7, BTA17, BTA21, BTA22, BTA23, and BTA27 among families, being most frequent on BTA8 within families. For TO, the top 10 genomic windows were identified on BTA2, BTA4, BTA5, BTA7, BTA17, BTA21, BTA22, BTA26, and BTA27, being most frequent on BTA7 and BTA8 within families. Considering all results, the greatest number of genomic windows was found on BTA7, where the VCAN, XRCC4, TRNAC-ACA, HAPLN1, and EDIL3 genes were identified in the common regions. In conclusion, 15 Gir sire families with 26 to 395 daughters per family with phenotypes for oocyte and embryo production helped to identify the inheritance of several genomic regions, especially on BTA7, where the EDIL3, HAPLN1, and VCAN candidate genes were associated with number of oocytes and embryos in Gir cattle families
Parent-of-origin effects for the number of oocytes and embryos in Gir cattle.
Imprinting is a phenomenon that alters the expression of genes according to the parental origin of their alleles. A quantitative form to evaluate the imprinting effect is known as parent-of-origin effect. Our aim with this work is to identify parent-of-origin effects that influence the number of oocytes and embryos in Gir dairy cattle. A dataset with 17,526 Ovum Pick Up observations from 1641 Gir donors was used to estimate parent-of-origin effects for the traits number of total oocytes (TO), number of viable oocytes (VO) and number of embryos (EM). To identify parent-of-origin effects, dam and sire gametic effects were included, individually or together, in an animal model for TO, VO and EM traits. For TO, inclusion of paternal origin effects in the model was significant (P < 0.05), and explained 6 % of the total phenotypic variance. For VO and EM no significant parent-of-origin effects were found for either parental line. In conclusion, paternal effects appear to influence the total oocyte production in the Gir cattle breed