149 research outputs found

    Computational and statistical approaches to analyzing variants identified by exome sequencing

    New sequencing technology has enabled the identification of thousands of single nucleotide polymorphisms in the exome, and many computational and statistical approaches to identify disease-association signals have emerged. Funding: National Institutes of Health (U.S.) (Grant R01-MH084676); National Institutes of Health (U.S.) (Grant R01-GM078598); National Institutes of Health (U.S.) (Training grant T32-HL07604-25); Brigham and Women's Hospital (Division of Cardiovascular Medicine

    Adaptive Mutations in the JC Virus Protein Capsid Are Associated with Progressive Multifocal Leukoencephalopathy (PML)

    PML is a progressive and mostly fatal demyelinating disease caused by JC virus infection and destruction of infected oligodendrocytes in multiple brain foci of susceptible individuals. While JC virus is highly prevalent in the human population, PML is a rare disease that afflicts only a small percentage of immunocompromised individuals, including those affected by HIV (AIDS) or immunosuppressive drugs. Viral- and/or host-specific factors, and not simply immune status, must be at play to account for the very large discrepancy between high viral prevalence and low disease incidence. Here, we show that several amino acids on the surface of the JC virus capsid protein VP1 display accelerated evolution in viral sequences isolated from PML patients but not in sequences isolated from healthy subjects. We provide strong evidence that at least some of these mutations are involved in binding of sialic acid, a known receptor for the JC virus. Using statistical methods of molecular evolution, we performed a comprehensive analysis of JC virus VP1 sequences isolated from 55 PML patients and 253 sequences isolated from the urine of healthy individuals and found that a subset of amino acids found exclusively among PML VP1 sequences is acquired via adaptive evolution. By modeling the 3-D structure of the JC virus capsid, we showed that these residues are located within the binding site for sialic acid, a JC virus receptor for cell infection. Finally, we demonstrate the involvement of some of these sites in receptor binding by showing a profound reduction in the hemagglutination properties of virus-like particles made of the VP1 protein carrying these mutations. Collectively, these results suggest that a more virulent, PML-causing phenotype of JC virus is acquired via adaptive evolution that changes viral specificity for its cellular receptor(s).

    Balancing Selection on a Regulatory Region Exhibiting Ancient Variation That Predates Human–Neandertal Divergence

    Ancient population structure shaping contemporary genetic variation has been recently appreciated and has important implications regarding our understanding of the structure of modern human genomes. We identified a ∼36-kb DNA segment in the human genome that displays an ancient substructure. The variation at this locus exists primarily as two highly divergent haplogroups. One of these haplogroups (the NE1 haplogroup) aligns with the Neandertal haplotype and contains a 4.6-kb deletion polymorphism in perfect linkage disequilibrium with 12 single nucleotide polymorphisms (SNPs) across diverse populations. The other haplogroup, which does not contain the 4.6-kb deletion, aligns with the chimpanzee haplotype and is likely ancestral. Africans have higher overall pairwise differences with the Neandertal haplotype than Eurasians do for this NE1 locus (p < 10⁻¹⁵). Moreover, the nucleotide diversity at this locus is higher in Eurasians than in Africans. These results mimic signatures of recent Neandertal admixture contributing to this locus. However, an in-depth assessment of the variation in this region across multiple populations reveals that African NE1 haplotypes, albeit rare, harbor more sequence variation than NE1 haplotypes found in Europeans, indicating an ancient African origin of this haplogroup and refuting recent Neandertal admixture. Population genetic analyses of the SNPs within each of these haplogroups, along with genome-wide comparisons, revealed significant FST (p = 0.00003) and positive Tajima's D (p = 0.00285) statistics, pointing to non-neutral evolution of this locus. The NE1 locus harbors no protein-coding genes, but contains transcribed sequences as well as sequences with putative regulatory function based on bioinformatic predictions and in vitro experiments. We postulate that the variation observed at this locus predates Human–Neandertal divergence and is evolving under balancing selection, especially among European populations.
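For readers unfamiliar with the Tajima's D statistic mentioned in this abstract, the sketch below computes it from three summary values using the standard textbook formula. The function name `tajimas_d` and its parameter names are hypothetical, the inputs in the usage line are illustrative only, and nothing here reproduces the study's data or its significance testing. A positive value arises when the mean pairwise difference exceeds the estimate based on the number of segregating sites, a pattern often read as compatible with balancing selection or population structure.

```python
import math


def tajimas_d(n, num_segregating, pi):
    """Tajima's D from summary statistics (hypothetical helper).

    n               -- number of sampled sequences (must be at least 4)
    num_segregating -- number of segregating sites S (must be at least 1)
    pi              -- mean number of pairwise differences between sequences
    """
    if n < 4 or num_segregating < 1:
        raise ValueError("need n >= 4 sequences and at least one segregating site")

    # Harmonic sums over 1 .. n-1
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))

    # Constants of the variance approximation
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    s = num_segregating
    theta_w = s / a1  # Watterson's estimator from segregating sites
    variance = e1 * s + e2 * s * (s - 1)
    return (pi - theta_w) / math.sqrt(variance)


# Illustrative inputs only (not taken from the abstract above).
print(tajimas_d(n=10, num_segregating=16, pi=8.0))
```

The function takes precomputed summaries rather than raw haplotypes so that it stays short; computing `pi` and `num_segregating` from aligned sequences would be a separate step.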

    Pooled Association Tests for Rare Variants in Exon-Resequencing Studies

    Deep sequencing will soon generate comprehensive sequence information in large disease samples. Although the power to detect association with an individual rare variant is limited, pooling variants by gene or pathway into a composite test provides an alternative strategy for identifying susceptibility genes. We describe a statistical method for detecting association of multiple rare variants in protein-coding genes with a quantitative or dichotomous trait. The approach is based on the regression of phenotypic values on individuals' genotype scores subject to a variable allele-frequency threshold, incorporating computational predictions of the functional effects of missense variants. Statistical significance is assessed by permutation testing with variable thresholds. We used a rigorous population-genetics simulation framework to evaluate the power of the method, and we applied the method to empirical sequencing data from three disease studies.
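The idea in this abstract (a pooled genotype score, an allele-frequency threshold that is varied, and a permutation-based p-value) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the particular z-score form, the one-sided direction of the test, and the optional per-variant `weights` (standing in for functional-effect predictions) are all assumptions made for the example.

```python
import math
import random


def vt_statistic(genotypes, phenotypes, weights=None):
    """Largest z-score over allele-frequency thresholds (assumed statistic form).

    genotypes  -- one row per individual; each entry is a minor-allele count (0, 1, 2)
    phenotypes -- one trait value per individual (quantitative or 0/1)
    weights    -- optional per-variant weights, e.g. predicted functional impact
    """
    n = len(genotypes)
    m = len(genotypes[0])
    if weights is None:
        weights = [1.0] * m
    mean_y = sum(phenotypes) / n
    centered = [y - mean_y for y in phenotypes]

    # Sample allele frequency of each variant
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]

    best = -math.inf
    for threshold in sorted({f for f in freqs if f > 0}):
        keep = [j for j in range(m) if 0 < freqs[j] <= threshold]
        # Pooled genotype score: weighted allele count over variants under the threshold
        scores = [sum(weights[j] * row[j] for j in keep) for row in genotypes]
        denom = math.sqrt(sum(s * s for s in scores))
        if denom > 0:
            z = sum(s * c for s, c in zip(scores, centered)) / denom
            best = max(best, z)
    return best


def vt_permutation_pvalue(genotypes, phenotypes, weights=None, n_perm=1000, seed=0):
    """One-sided permutation p-value; the threshold is re-optimized in every permutation."""
    rng = random.Random(seed)
    observed = vt_statistic(genotypes, phenotypes, weights)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if vt_statistic(genotypes, shuffled, weights) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (made up): five carriers of rare alleles, all with the trait.
toy_genotypes = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]] + [[0, 0, 0]] * 15
toy_phenotypes = [1] * 5 + [0] * 15
print(vt_permutation_pvalue(toy_genotypes, toy_phenotypes))
```

Because the maximum over thresholds is recomputed for each shuffled phenotype vector, the permutation p-value accounts for having searched over thresholds, which is the point of permuting with variable thresholds.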

    Individuality and temporal stability of the human gut microbiome

    Introduction: The breakthrough of next-generation sequencing technologies has enabled large-scale studies of natural microbial communities, and the 16S rRNA genes have been widely used as a phylogenetic marker to study community structure. However, major limitations of this approach are that it provides neither strain-level resolution nor the genomic context of microorganisms. This information, however, is crucial to answer fundamental questions about the temporal stability and distinctiveness of natural microbial communities. Material and methods: We developed a methodological framework for metagenomic single nucleotide polymorphism (SNP) variation analysis and applied it to publicly available data from 252 human fecal samples from 207 European and North American individuals. We further analyzed samples from 43 healthy subjects that were sampled at least twice over time intervals of up to one year and measured population similarities of dominant gut species. Results: We detected 10.3 million SNPs in 101 species, which nearly amounts to the number identified in more than 1,000 humans. Conclusion: The most striking result was that host-specific strains appear to be retained over long time periods. This indicates that individual-specific strains are not easily exchanged with the environment and, furthermore, that individuals appear to have a unique metagenomic genotype. This, in turn, has implications for human gut physiology, such as the stability of antibiotic resistance potential.

    Assessing the Evolutionary Impact of Amino Acid Mutations in the Human Genome

    Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein-coding genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid-replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27–29% of amino acid-changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30–42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10–20% of amino acid differences between humans and chimpanzees having been fixed by positive selection, with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.