42 research outputs found

    Comparing SNP Panels and Statistical Methods for Estimating Genomic Breed Composition of Individual Animals in Ten Cattle Breeds

    Background: SNPs are informative for estimating the genomic breed composition (GBC) of individual animals, but SNPs selected for this purpose were not available on commercial bovine SNP chips prior to the present study. The primary objective of the present study was to select five common SNP panels for estimating GBC of individual animals, initially involving 10 cattle breeds (two dairy breeds and eight beef breeds). The performance of the five common SNP panels was evaluated using an admixture model and a linear regression model. Finally, the downstream implication of GBC for genomic prediction accuracies was investigated and discussed in a Santa Gertrudis cattle population. Results: There were 15,708 common SNPs across five currently available commercial bovine SNP chips. From this set, four subsets (1,000, 3,000, 5,000, and 10,000 SNPs) were selected by maximizing the average Euclidean distance (AED) of SNP allelic frequencies among the ten cattle breeds. For 198 animals presented as Akaushi, the estimated GBC of the Akaushi breed (GBCA) based on the admixture model agreed very well among the five SNP panels, identifying 166 animals with GBCA = 1. Using the same SNP panels, the linear regression approach reported fewer animals with GBCA = 1. Nevertheless, GBCA estimates from the two models were highly correlated (r = 0.953 to 0.992). In the genomic prediction of a Santa Gertrudis population (and crosses), the predictability of molecular breeding values using SNP effects obtained from 1,225 animals with no less than 0.90 GBC of Santa Gertrudis (GBCSG) decreased in crossbred animals with lower GBCSG. Conclusions: Of the two statistical models used to compute GBC, the admixture model gave more consistent results among the five selected SNP panels than the linear regression model. The availability of these common SNP panels facilitates identification and estimation of breed compositions using currently available bovine SNP chips. In view of utility, the 1 K panel is the most cost-effective and is convenient to include as add-on content in future bovine SNP chips, whereas the 10 K and 16 K SNP panels can be more useful if used independently for imputation to intermediate- or high-density genotypes.
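The AED-based panel selection described in this abstract can be illustrated with a minimal sketch. It assumes each SNP is scored independently by the average pairwise distance between breed allele frequencies and that the top-scoring SNPs are kept; the abstract does not give the exact procedure, so the function names, the per-SNP scoring, and the example frequencies are all hypothetical.

```python
from itertools import combinations


def aed_score(freqs):
    """Average pairwise distance between breed allele frequencies at one SNP.

    Assumption: with a single frequency per breed, the Euclidean distance
    between two breeds reduces to an absolute difference.
    """
    pairs = list(combinations(freqs, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def select_snps(freq_table, k):
    """Return the k SNP ids with the highest AED score (simple ranking)."""
    ranked = sorted(
        freq_table,
        key=lambda snp: aed_score(freq_table[snp]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical allele frequencies for three breeds at four SNPs.
freq_table = {
    "snp1": [0.10, 0.90, 0.50],
    "snp2": [0.50, 0.50, 0.50],
    "snp3": [0.20, 0.30, 0.25],
    "snp4": [0.05, 0.95, 0.95],
}
top2 = select_snps(freq_table, 2)
```

A SNP with identical frequencies in every breed (here `snp2`) scores zero and is never preferred, which matches the intent of choosing markers that separate breeds.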

    Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U.S. Holsteins

    SNP chips are commonly used for genotyping animals in genomic selection, but strategies for selecting low-density (LD) SNPs for imputation-mediated genomic selection have not been addressed adequately. The main purpose of the present study was to compare the performance of eight LD (6K) SNP panels, each selected by a different strategy exploiting a combination of three major factors: evenly-spaced SNPs, increased minor allele frequencies (MAF), and SNP-trait associations either for single traits independently or for all three traits jointly. The imputation accuracies from 6K to 80K SNP genotypes were between 96.2 and 98.2%. Genomic prediction accuracies obtained using imputed 80K genotypes were between 0.817 and 0.821 for daughter pregnancy rate, between 0.838 and 0.844 for fat yield, and between 0.850 and 0.863 for milk yield. The two SNP panels optimized on the three major factors had the highest genomic prediction accuracy (0.821–0.863), and these accuracies were very close to those obtained using observed 80K genotypes (0.825–0.868). Further exploration of the underlying relationships showed that genomic prediction accuracies did not respond linearly to imputation accuracies, but were significantly affected by genotype (imputation) errors of SNPs in association with the traits to be predicted. SNPs optimal for map coverage and MAF were favorable for obtaining accurate imputation of genotypes, whereas trait-associated SNPs improved genomic prediction accuracies. Thus, optimal LD SNP panels were the ones that combined both strengths. The present results have practical implications for the design of LD SNP chips for imputation-enabled genomic prediction.

    Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE but required more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with >3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. The utility of this MOLO algorithm was also demonstrated in a real application, in which a 6K SNP panel was optimized conditional on 5,260 obligatory SNPs selected based on SNP-trait association in U.S. Holstein animals. With this MOLO algorithm, both imputation error rate and genomic prediction error rate were minimal.
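As a rough illustration of the locus-averaged Shannon entropy (LASE) mentioned in this abstract, the sketch below assumes biallelic SNPs and computes entropy from allele frequencies alone. The abstract does not state the exact definition or the uniformity adjustment, so this is only the unadjusted core quantity under that assumption, and the function names are hypothetical.

```python
import math


def locus_entropy(p):
    """Shannon entropy in bits of a biallelic locus with allele frequency p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fixed locus carries no information
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


def lase(allele_freqs):
    """Locus-averaged Shannon entropy over a set of SNPs (no adjustment)."""
    return sum(locus_entropy(p) for p in allele_freqs) / len(allele_freqs)


# Hypothetical panel: two maximally informative SNPs and one fixed SNP.
panel_entropy = lase([0.5, 0.5, 0.0])
```

Entropy peaks at an allele frequency of 0.5, which is why entropy-based selection favors SNPs with higher minor allele frequency.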

    Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation

    Background: Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Methods: Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Results: Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. Conclusions: These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but the recurrent inclusion of genotyped sires in retraining analyses will be necessary to routinely produce for the industry the direct genomic values with the highest accuracy.
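The cross-validation layout described in this abstract (train on four groups, validate in the fifth, for all five combinations) can be sketched as follows. The sketch assumes the K-means cluster labels have already been assigned; the clustering step itself is not shown, and the animal ids and function name are hypothetical.

```python
def leave_one_group_out(labels):
    """Yield (held_out_group, train_ids, valid_ids) for each cluster.

    labels maps an animal id to its cluster label, assumed to come from
    a prior K-means step on pedigree relationships.
    """
    groups = sorted(set(labels.values()))
    for held_out in groups:
        train = sorted(a for a, g in labels.items() if g != held_out)
        valid = sorted(a for a, g in labels.items() if g == held_out)
        yield held_out, train, valid


# Hypothetical cluster assignments for six bulls in five groups.
labels = {
    "bull_a": 0, "bull_b": 1, "bull_c": 2,
    "bull_d": 3, "bull_e": 4, "bull_f": 0,
}
folds = list(leave_one_group_out(labels))
```

Because whole clusters are held out, validation animals are less related to the training set than under random allocation, which is consistent with the lower accuracies the abstract reports for K-means folds.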

    Novel analytical method, Birth Date Selection Mapping, detects response of the Angus (Bos taurus) genome to selection on complex traits

    Background: Several methods have recently been developed to identify regions of the genome that have been exposed to strong selection. However, recent theoretical and empirical work suggests that polygenic models are required to identify the genomic regions that are more moderately responding to ongoing selection on complex traits. We examine the effects of multi-trait selection on the genome of a population of US registered Angus beef cattle born over a 50-year period representing approximately 10 generations of selection. We present results from the application of a quantitative genetic model, called Birth Date Selection Mapping, to identify signatures of recent ongoing selection. Results: We show that US Angus cattle have been systematically selected to alter their mean additive genetic merit for most of the 16 production traits routinely recorded by breeders. Using Birth Date Selection Mapping, we estimate the time-dependency of allele frequency for 44,817 SNP loci using genomic best linear unbiased prediction, generalized least squares, and BayesCpi analyses. Finally, we reconstruct the primary phenotypes that have historically been exposed to selection from a genome-wide analysis of the 16 production traits and gene ontology enrichment analysis. Conclusions: We demonstrate that Birth Date Selection Mapping utilizing mixed models corrects for time-dependent pedigree sampling effects that lead to spurious SNP associations and reveals genomic signatures of ongoing selection on complex traits. Because multiple traits have historically been selected in concert and most quantitative trait loci have small effects, selection has incrementally altered allele frequencies throughout the genome. Two quantitative trait loci of large effect were not the most strongly selected of the loci due to their antagonistic pleiotropic effects on strongly selected phenotypes. Birth Date Selection Mapping may readily be extended to temporally-stratified human or model organism populations.

    Breed-specific haplotypes for polled phenotypes in cattle

    The presence of horns within commercial cattle populations increases the chances of injuries. The Polled (hornless) condition in cattle has existed since domestication, and it has been selected by breeders because of its economic importance and ease of management. A dominant mutation is believed to cause the polled phenotype. The present invention relates to the identification of breed-specific single nucleotide polymorphisms (SNPs) and their haplotypes, in the bovine genome, in or near gene(s) encoding polypeptides associated with the Polled phenotype in beef production. The invention further encompasses methods and systems, including network-based processes, to manage the SNP data, haplotype data and other data relating to specific animals and herds of animals, veterinarian care, diagnostic and quality control data and management of livestock which, based on genotyping, have predictable production quality traits, husbandry conditions, animal welfare, food safety information, audit of existing processes and data from field location.
