623 research outputs found

    Estimating Effects and Making Predictions from Genome-Wide Marker Data

    In genome-wide association studies (GWAS), hundreds of thousands of genetic markers (SNPs) are tested for association with a trait or phenotype. Reported effects tend to be larger in magnitude than the true effects of these markers, the so-called "winner's curse." We argue that the classical definition of unbiasedness is not useful in this context and propose to use a different definition of unbiasedness that is a property of the estimator we advocate. We suggest an integrated approach to the estimation of the SNP effects and to the prediction of trait values, treating SNP effects as random instead of fixed effects. Statistical methods traditionally used in the prediction of trait values in the genetics of livestock, which predates the availability of SNP data, can be applied to analysis of GWAS, giving better estimates of the SNP effects and predictions of phenotypic and genetic values in individuals. Comment: Published at http://dx.doi.org/10.1214/09-STS306 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org
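
    The effect of treating a SNP effect as random rather than fixed can be illustrated with a minimal sketch. It assumes a normal prior with mean zero and variance `prior_var` for the true effect and a normally distributed estimate with standard error `se`; the function name and the numbers are hypothetical, and this is not necessarily the exact estimator proposed in the paper.

```python
def shrink_effect(b_hat: float, se: float, prior_var: float) -> float:
    """Posterior mean of a SNP effect under a normal-normal model.

    b_hat     -- estimated effect from a single-marker regression
    se        -- standard error of b_hat
    prior_var -- assumed variance of true SNP effects (illustrative assumption)
    """
    if se < 0 or prior_var < 0:
        raise ValueError("se and prior_var must be non-negative")
    total = prior_var + se ** 2
    if total == 0:
        raise ValueError("se and prior_var cannot both be zero")
    # The estimate is regressed toward zero; noisier estimates shrink more.
    weight = prior_var / total
    return weight * b_hat


# Hypothetical SNP: a large reported effect with a large standard error.
shrunk = shrink_effect(b_hat=0.5, se=0.1, prior_var=0.01)
print(shrunk)
```

    Under these assumptions the shrunken value is always closer to zero than the reported estimate, which is the direction needed to counter the winner's curse.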

    The use of epigenetic phenomena for the improvement of sheep and cattle

    This review considers the evidence for inheritance across generations of epigenetic marks and how this phenomenon could be exploited in the cattle and sheep industries. Epigenetic marks are chemical changes in the chromosomes that affect the expression of genes and hence the phenotype of the cell and are passed on during mitosis so that the daughter cells have the same chemical changes or epigenetic marks as the parent cell. Although most epigenetic marks are wiped clean in the process of forming a new zygote, some epigenetic marks (epimutations) may be passed on from parent to offspring. The inheritance of epigenetic marks across generations is difficult to prove as there are usually alternative explanations possible. There are few well-documented cases, mainly using inbred strains of mice. The epimutations are unstable and revert to wild type after a few generations. Although there are no known cases in sheep or cattle, it is likely that inherited epimutations occur in these species but it is unlikely that they explain a large part of the inherited or genetic variation. There is limited evidence in mice and rats that an environmental treatment can cause a change in the epigenetic marks of an animal and that this change can be passed on to the next generation. If inherited epimutations occur in sheep and cattle, they will already be utilized to some extent by existing genetic improvement programs. It would be possible to modify the statistical models used in the calculation of estimated breeding values to better recognize the variance controlled by epimutations, but it would probably have, at best, a small effect on the rate of genetic (inherited) gain achieved. Although not a genetic improvement, the inheritance of epigenetic marks caused by the environment experienced by the sire offers a new opportunity in sheep and cattle breeding. However, at present we do not know if this occurs or, if it does, what environmental treatment might have a beneficial effect.

    Multi-locus models of genetic risk of disease

    Background: Evidence for genetic contribution to complex diseases is described by recurrence risks to relatives of diseased individuals. Genome-wide association studies allow a description of the genetics of the same diseases in terms of risk loci, their effects and allele frequencies. To reconcile the two descriptions requires a model of how risks from individual loci combine to determine an individual's overall risk.
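
    One way risks from individual loci might combine is multiplicatively, with each risk allele multiplying the individual's risk by that locus's relative risk. The sketch below assumes this model purely for illustration; the abstract does not say which model the paper adopts, and the function name and input values are hypothetical.

```python
from math import prod
from typing import Sequence


def combined_relative_risk(relative_risks: Sequence[float],
                           allele_counts: Sequence[int]) -> float:
    """Overall relative risk under an assumed multiplicative model.

    relative_risks -- per-allele relative risk at each locus
    allele_counts  -- number of risk alleles carried at each locus (0, 1 or 2)
    """
    if len(relative_risks) != len(allele_counts):
        raise ValueError("one allele count is required per locus")
    if any(n not in (0, 1, 2) for n in allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    # Each copy of a risk allele multiplies the risk by that locus's relative risk.
    return prod(rr ** n for rr, n in zip(relative_risks, allele_counts))


# Hypothetical individual: homozygous at the first locus, heterozygous at the second.
print(combined_relative_risk([1.1, 1.2], [2, 1]))
```

    The result is relative to an individual carrying no risk alleles at the listed loci; other combination rules, such as additive risks or additive effects on a liability scale, would give different values.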

    Genetic architecture of body size in mammals

    Much of the heritability for human stature is caused by mutations of small-to-medium effect. This is because detrimental pleiotropy restricts large-effect mutations to very low frequencies.

    Prediction of individual genetic risk to disease from genome-wide association studies

    Empirical studies suggest that the effect sizes of individual causal risk alleles underlying complex genetic diseases are small, with most genotype relative risks in the range of 1.1-2.0. Although the increased risk of disease for a carrier is small for any single locus, knowledge of multiple-risk alleles throughout the genome could allow the identification of individuals that are at high risk. In this study, we investigate the number and effect size of risk loci that underlie complex disease constrained by the disease parameters of prevalence and heritability. Then we quantify the value of prediction of genetic risk to disease using a range of realistic combinations of the number, size, and distribution of risk effects that underlie complex diseases. We propose an approach to assess the genetic risk of a disease in healthy individuals, based on dense genome-wide SNP panels. We test this approach using simulation. When the number of loci contributing to the disease is >50, a large case-control study is needed to identify a set of risk loci for use in predicting the disease risk of healthy people not included in the case-control study. For diseases controlled by 1000 loci of mean relative risk of only 1.04, a case-control study with 10,000 cases and controls can lead to selection of ∼75 loci that explain >50% of the genetic variance. The 5% of people with the highest predicted risk are three to seven times more likely to suffer the disease than the population average, depending on heritability and disease prevalence. Whether an individual with known genetic risk develops the disease depends on known and unknown environmental factors

    Immune cell census in murine atherosclerosis: cytometry by time of flight illuminates vascular myeloid cell diversity

    Aims: Atherosclerosis is characterised by the abundant infiltration of myeloid cells starting at early stages of disease. Myeloid cells are key players in vascular immunity during atherogenesis. However, the subsets of vascular myeloid cells have eluded resolution due to shared marker expression and atypical heterogeneity in vascular tissues. We applied the high-dimensionality of mass cytometry to the study of myeloid cell subsets in atherosclerosis. Methods and Results: Apolipoprotein E-deficient (ApoE-/-) mice were fed a chow or a high fat (western) diet for 12 weeks. Single cell aortic preparations were probed with a panel of 35 metal-conjugated antibodies using Cytometry by time of flight (CyTOF). Clustering of marker expression on live CD45+ cells from the aortas of ApoE-/- mice identified 13 broad populations of leucocytes. Monocyte, macrophage, type 1 and type 2 conventional dendritic cell (cDC1 and cDC2), plasmacytoid dendritic cell (pDC), neutrophil, eosinophil, B cell, CD4+ and CD8+ T cell, γδ T cell, natural killer (NK) cell and innate lymphoid (ILC) cell populations accounted for approximately 95% of the live CD45+ aortic cells. Automated clustering algorithms applied to the Lin-CD11blo-hi cells revealed 20 clusters of myeloid cells. Comparison between chow and high fat fed animals revealed increases in monocytes (both Ly6C+ and Ly6C-), pDC and a CD11c+ macrophage subset with high fat feeding. Concomitantly, the proportions of CD206+ CD169+ subsets of macrophages were significantly reduced as were cDC2. Conclusions: A CyTOF-based comprehensive mapping of the immune cell subsets within atherosclerotic aortas from ApoE-/- mice offers tools for myeloid cell discrimination within the vascular compartment and it reveals that high fat feeding skews the myeloid cell repertoire towards inflammatory monocyte-macrophage populations rather than resident macrophage phenotypes and cDC2 during atherogenesis

    Sensitivity of genomic selection to using different prior distributions

    Genomic selection describes a selection strategy based on genomic estimated breeding values (GEBV) predicted from dense genetic markers such as single nucleotide polymorphism (SNP) data. Different Bayesian models have been suggested to derive the prediction equation, with the main difference centred around the specification of the prior distributions.

    Genome position specific priors for genomic prediction

    Background: The accuracy of genomic prediction is highly dependent on the size of the reference population. For small populations, including information from other populations could improve this accuracy. The usual strategy is to pool data from different populations; however, this has not proven as successful as hoped for with distantly related breeds. BayesRS is a novel approach to share information across populations for genomic predictions. The approach allows information to be captured even where the phase of SNP alleles and causative mutation alleles are reversed across populations, or the actual causative mutation is different between the populations but affects the same gene. Proportions of a four-distribution mixture for SNP effects in segments of fixed size along the genome are derived from one population and set as location-specific prior proportions of distributions of SNP effects for the target population. The model was tested using dairy cattle populations of different breeds: 540 Australian Jersey bulls, 2297 Australian Holstein bulls and 5214 Nordic Holstein bulls. The traits studied were protein-, fat- and milk yield. Genotypic data were Illumina 777K SNPs, real or imputed. Results: Results showed an increase in accuracy of up to 3.5% for the Jersey population when using BayesRS with a prior derived from Australian Holstein compared to a model without location-specific priors. The increase in accuracy was, however, lower than was achieved when reference populations were combined to estimate SNP effects, except in the case of fat yield. The small size of the Jersey validation set meant that these improvements in accuracy were not significant using a Hotelling-Williams t-test at the 5% level. An increase in accuracy of 1-2% for all traits was observed in the Australian Holstein population when using a prior derived from the Nordic Holstein population compared to using no prior information. These improvements were significant (P<0.05) using the Hotelling-Williams t-test for protein- and fat yield. Conclusion: For some traits the method might be advantageous compared to pooling of reference data for distantly related populations, but further investigation is needed to confirm the results. For closely related populations the method does not perform better than pooling reference data. However, it does give an increased accuracy compared to analysis based on only one reference population, without an increased computational burden. The approach described here provides a general setup for inclusion of location-specific priors: the approach could be used to include biological information in genomic predictions.

    The use of mid-infrared spectra to map genes affecting milk composition.

    The aim of this study was to investigate the feasibility of using mid-infrared (MIR) spectroscopy analysis of milk samples to increase the power and precision of genome-wide association studies (GWAS) for milk composition and to better distinguish linked quantitative trait loci (QTL). To achieve this goal, we analyzed phenotypic data of milk composition traits, related MIR spectra, and genotypic data comprising 626,777 SNP on 5,202 Holstein, Jersey, and crossbred cows. We performed a conventional GWAS on protein, lactose, fat, and fatty acid concentrations in milk, a GWAS on individual MIR wavenumbers, and a partial least squares regression (PLS), which is equivalent to a multi-trait GWAS, exploiting MIR data simultaneously to predict SNP genotypes. The PLS detected most of the QTL identified using single-trait GWAS, usually with a higher significance value, as well as previously undetected QTL for milk composition. Each QTL tends to have a different pattern of effects across the MIR spectrum and this explains the increased power. Because SNP tracking different QTL tend to have different patterns of effect, it was possible to distinguish closely linked QTL. Overall, the results of this study suggest that using MIR data through either GWAS or PLS analysis applied to genomic data can provide a powerful tool to distinguish milk composition QTL

    On the Relationship between the Uniqueness of the Moonshine Module and Monstrous Moonshine

    We consider the relationship between the conjectured uniqueness of the Moonshine Module, ${\cal V}^\natural$, and Monstrous Moonshine, the genus zero property of the modular invariance group for each Monster group Thompson series. We first discuss a family of possible $Z_n$ meromorphic orbifold constructions of ${\cal V}^\natural$ based on automorphisms of the Leech lattice compactified bosonic string. We reproduce the Thompson series for all 51 non-Fricke classes of the Monster group $M$ together with a new relationship between the centralisers of these classes and 51 corresponding Conway group centralisers (generalising a well-known relationship for 5 such classes). Assuming that ${\cal V}^\natural$ is unique, we then consider meromorphic orbifoldings of ${\cal V}^\natural$ and show that Monstrous Moonshine holds if and only if the only meromorphic orbifoldings of ${\cal V}^\natural$ give ${\cal V}^\natural$ itself or the Leech theory. This constraint on the meromorphic orbifoldings of ${\cal V}^\natural$ therefore relates Monstrous Moonshine to the uniqueness of ${\cal V}^\natural$ in a new way. Comment: 53 pages, PlainTex, DIAS-STP-93-0