    Estimation of pedigree errors in the UK dairy population using microsatellite markers and the impact on selection

    The proportion of cows in the UK dairy herd whose sires were misidentified was estimated using DNA markers. Genetic marker genotypes were determined on 568 cows (from 168 milk samples and 400 hair samples) and 96 putative sires (from semen samples). The estimated pedigree error rate from the hair samples was 8.8%, and from the milk samples, 13.1%, giving an overall estimate of the error rate of 10%. This level of pedigree errors will have a relatively large impact on the efficiency of progeny testing and the accuracy of cow predicted breeding values. We predict a loss of response to selection of approximately 2 to 3% given this error rate
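
    The overall figure can be checked with a short calculation. This is a minimal sketch that assumes the overall rate is a sample-size-weighted average of the two sample types; the abstract does not state how the 10% was derived, and the function name is illustrative only.

```python
def pooled_error_rate(groups):
    """Sample-size-weighted average of per-group error rates.

    groups: list of (n_samples, error_rate_percent) pairs.
    Assumption: the overall rate is a simple weighted mean.
    """
    total = sum(n for n, _ in groups)
    return sum(n * rate for n, rate in groups) / total


# Figures from the abstract: 400 hair samples at 8.8%, 168 milk samples at 13.1%.
overall = pooled_error_rate([(400, 8.8), (168, 13.1)])
print(round(overall, 1))  # 10.1
```

    Under this assumption the weighted mean is about 10.1%, consistent with the reported overall estimate of 10%.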

    Estimating Effects and Making Predictions from Genome-Wide Marker Data

    In genome-wide association studies (GWAS), hundreds of thousands of genetic markers (SNPs) are tested for association with a trait or phenotype. Reported effects tend to be larger in magnitude than the true effects of these markers, the so-called "winner's curse." We argue that the classical definition of unbiasedness is not useful in this context and propose to use a different definition of unbiasedness that is a property of the estimator we advocate. We suggest an integrated approach to the estimation of the SNP effects and to the prediction of trait values, treating SNP effects as random instead of fixed effects. Statistical methods traditionally used in the prediction of trait values in the genetics of livestock, which predate the availability of SNP data, can be applied to the analysis of GWAS, giving better estimates of the SNP effects and predictions of phenotypic and genetic values in individuals. Comment: Published at http://dx.doi.org/10.1214/09-STS306 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
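
    The winner's curse and the effect of treating SNP effects as random can be illustrated with a small simulation. This is a sketch under assumptions that are not in the abstract: true effects and estimation errors are independent and normally distributed with known variances, and the random-effects estimate is taken as the raw estimate multiplied by the shrinkage factor var(effect) / (var(effect) + var(error)). All names and parameter values are hypothetical.

```python
import random


def simulate_winners_curse(n_snps=20000, sd_effect=1.0, sd_error=2.0,
                           top_fraction=0.01, seed=1):
    """Compare raw and shrunken estimates among the top-ranked SNPs."""
    rng = random.Random(seed)
    # True effects drawn from a normal distribution (random-effects assumption).
    true = [rng.gauss(0.0, sd_effect) for _ in range(n_snps)]
    # Raw (fixed-effect style) estimates: true effect plus sampling error.
    raw = [b + rng.gauss(0.0, sd_error) for b in true]
    # Shrinkage factor for a normal prior with known variances.
    shrink = sd_effect ** 2 / (sd_effect ** 2 + sd_error ** 2)

    # Select the SNPs with the largest raw estimates, as a GWAS report would.
    n_top = max(1, int(n_snps * top_fraction))
    top = sorted(range(n_snps), key=lambda i: raw[i], reverse=True)[:n_top]

    def mean(values):
        return sum(values) / len(values)

    return {
        "mean_true": mean([true[i] for i in top]),
        "mean_raw": mean([raw[i] for i in top]),
        "mean_shrunk": mean([shrink * raw[i] for i in top]),
    }


result = simulate_winners_curse()
print(result)
```

    In this setup the mean raw estimate among the selected SNPs is far larger than the mean of their true effects, while the mean shrunken estimate stays close to it.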

    What if we had whole-genome sequence data for millions of individuals?

    Analysis of pooled DNA samples on high density arrays without prior knowledge of differential hybridization rates

    Array-based DNA pooling techniques facilitate genome-wide scale genotyping of large samples. We describe a structured analysis method for pooled data using internal replication information in large-scale genotyping sets. The method takes advantage of information from single nucleotide polymorphisms (SNPs) typed in parallel on a high-density array to construct a test statistic with desirable statistical properties. We utilize a general linear model to appropriately account for the structured multiple measurements available with array data. The method does not require the use of additional arrays for the estimation of unequal hybridization rates and hence scales readily to accommodate arrays with several hundred thousand SNPs. Tests for differences between cases and controls can be conducted with very few arrays. We demonstrate the method on 384 endometriosis cases and controls, typed using Affymetrix Genechip© HindIII 50 K arrays. For a subset of these data, accurate measures of hybridization rates were available. Assuming equal hybridization rates is shown to have a negligible effect upon the results. With a total of only six arrays, the method extracted one-third of the information (in terms of equivalent sample size) available with individual genotyping (requiring 768 arrays). With 20 arrays (10 for cases, 10 for controls), over half of the information could be extracted from this sample

    QTL detection and allelic effects for growth and fat traits in outbred pig populations

    Quantitative trait loci (QTL) for growth and fatness traits have previously been identified on chromosomes 4 and 7 in several experimental pig populations. The segregation of these QTL in commercial pigs was studied in a sample of 2713 animals from five different populations. Variance component analysis (VCA) using a marker-based identity by descent (IBD) matrix was applied. The IBD coefficient was estimated with simple deterministic (SMD) and Markov chain Monte Carlo (MCMC) methods. Data for two growth traits, average daily gain on test and whole life daily gain, and for back fat thickness were analysed. With both methods, seven out of 26 combinations of population, chromosome and trait were significant. Additionally, QTL genotypic and allelic effects were estimated when the QTL effect was significant. The range of QTL genotypic effects in a population varied from 4.8% to 10.9% of the phenotypic mean for growth traits and from 7.9% to 19.5% for back fat. Heritabilities of the QTL genotypic values ranged from 8.6% to 18.2% for growth traits, and from 14.5% to 19.2% for back fat. Very similar results were obtained with both SMD and MCMC. However, the MCMC method required a large number of iterations, and hence computation time, especially when the QTL test position was close to the marker

    Prediction of individual genetic risk to disease from genome-wide association studies

    Empirical studies suggest that the effect sizes of individual causal risk alleles underlying complex genetic diseases are small, with most genotype relative risks in the range of 1.1-2.0. Although the increased risk of disease for a carrier is small for any single locus, knowledge of multiple-risk alleles throughout the genome could allow the identification of individuals that are at high risk. In this study, we investigate the number and effect size of risk loci that underlie complex disease constrained by the disease parameters of prevalence and heritability. Then we quantify the value of prediction of genetic risk to disease using a range of realistic combinations of the number, size, and distribution of risk effects that underlie complex diseases. We propose an approach to assess the genetic risk of a disease in healthy individuals, based on dense genome-wide SNP panels. We test this approach using simulation. When the number of loci contributing to the disease is >50, a large case-control study is needed to identify a set of risk loci for use in predicting the disease risk of healthy people not included in the case-control study. For diseases controlled by 1000 loci of mean relative risk of only 1.04, a case-control study with 10,000 cases and controls can lead to selection of ∼75 loci that explain >50% of the genetic variance. The 5% of people with the highest predicted risk are three to seven times more likely to suffer the disease than the population average, depending on heritability and disease prevalence. Whether an individual with known genetic risk develops the disease depends on known and unknown environmental factors
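
    The way many small-effect loci can add up to a large difference in risk can be sketched numerically. This example assumes a multiplicative model across independent loci, which the abstract does not state, and it expresses risk relative to a person carrying no risk alleles rather than relative to the population average. The function name and inputs are hypothetical.

```python
import math


def combined_relative_risk(risk_allele_counts, relative_risks):
    """Combined relative risk versus a non-carrier.

    Assumption: per-allele relative risks multiply across independent loci.
    """
    if len(risk_allele_counts) != len(relative_risks):
        raise ValueError("one relative risk is needed per locus")
    # Sum on the log scale to avoid overflow when there are many loci.
    log_rr = sum(count * math.log(rr)
                 for count, rr in zip(risk_allele_counts, relative_risks))
    return math.exp(log_rr)


# One risk allele at each of 50 loci, each with relative risk 1.04.
fifty_loci = combined_relative_risk([1] * 50, [1.04] * 50)
print(round(fifty_loci, 2))  # 7.11
```

    Under this assumption, a per-locus relative risk as small as 1.04 compounds to roughly a sevenfold difference across 50 risk alleles.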

    Genetic architecture of body size in mammals

    Much of the heritability for human stature is caused by mutations of small-to-medium effect. This is because detrimental pleiotropy restricts large-effect mutations to very low frequencies

    Explaining additional genetic variation in complex traits

    Genome-wide association studies (GWAS) have provided valuable insights into the genetic basis of complex traits, discovering >6000 variants associated with >500 quantitative traits and common complex diseases in humans. The associations identified so far represent only a fraction of those that influence phenotype, because there are likely to be many variants across the entire frequency spectrum, each of which influences multiple traits, with only a small average contribution to the phenotypic variance. This presents a considerable challenge to further dissection of the remaining unexplained genetic variance within populations, which limits our ability to predict disease risk, identify new drug targets, improve and maintain food sources, and understand natural diversity. This challenge will be met within the current framework through larger sample size, better phenotyping, including recording of nongenetic risk factors, focused study designs, and an integration of multiple sources of phenotypic and genetic information. The current evidence supports the application of quantitative genetic approaches, and we argue that one should retain simpler theories until simplicity can be traded for greater explanatory power