96 research outputs found

    Analysis of Rare, Exonic Variation amongst Subjects with Autism Spectrum Disorders and Population Controls

    Get PDF
    We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analyses of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD. © 2013 Liu et al

    Deep resequencing reveals excess rare recent variants consistent with explosive population growth

    Get PDF
    Accurately determining the distribution of rare variants is an important goal of human genetics, but resequencing of a sample large enough for this purpose has been unfeasible until now. Here, we applied Sanger sequencing of genomic PCR amplicons to resequence the diabetes-associated genes KCNJ11 and HHEX in 13,715 people (10,422 European Americans and 3,293 African Americans) and validated amplicons potentially harbouring rare variants using 454 pyrosequencing. We observed far more variation (expected variant-site count ∼578) than would have been predicted on the basis of earlier surveys, which could only capture the distribution of common variants. By comparison with earlier estimates based on common variants, our model shows a clear genetic signal of accelerating population growth, suggesting that humanity harbours a myriad of rare, deleterious variants, and that disease risk and the burden of disease in contemporary populations may be heavily influenced by the distribution of rare variants

    MEPE loss-of-function variant associates with decreased bone mineral density and increased fracture risk

    Get PDF
    A major challenge in genetic association studies is that most associated variants fall in the non-coding part of the human genome. We searched for variants associated with bone mineral density (BMD) after enriching the discovery cohort for loss-of-function (LoF) mutations by sequencing a subset of the Nord-Trøndelag Health Study, followed by imputation in the remaining sample (N = 19,705), and identified ten known BMD loci. However, one previously unreported variant, LoF mutation in MEPE, p.(Lys70IlefsTer26, minor allele frequency [MAF] = 0.8%), was associated with decreased ultradistal forearm BMD (P-value = 2.1 × 10−18), and increased osteoporosis (P-value = 4.2 × 10−5) and fracture risk (P-value = 1.6 × 10−5). The MEPE LoF association with BMD and fractures was further evaluated in 279,435 UK (MAF = 0.05%, heel bone estimated BMD P-value = 1.2 × 10−16, any fracture P-value = 0.05) and 375,984 Icelandic samples (MAF = 0.03%, arm BMD P-value = 0.12, forearm fracture P-value = 0.005). Screening for the MEPE LoF mutations before adulthood could potentially prevent osteoporosis and fractures due to the lifelong effect on BMD observed in the study. A key implication for precision medicine is that high-impact functional variants missing from the publicly available cosmopolitan panels could be clinically more relevant than polygenic risk scores.publishedVersio

    Subtle genetic changes enhance virulence of methicillin resistant and sensitive Staphylococcus aureus

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Community acquired (CA) methicillin-resistant <it>Staphylococcus aureus </it>(MRSA) increasingly causes disease worldwide. USA300 has emerged as the predominant clone causing superficial and invasive infections in children and adults in the USA. Epidemiological studies suggest that USA300 is more virulent than other CA-MRSA. The genetic determinants that render virulence and dominance to USA300 remain unclear.</p> <p>Results</p> <p>We sequenced the genomes of two pediatric USA300 isolates: one CA-MRSA and one CA-methicillin susceptible (MSSA), isolated at Texas Children's Hospital in Houston. DNA sequencing was performed by Sanger dideoxy whole genome shotgun (WGS) and 454 Life Sciences pyrosequencing strategies. The sequence of the USA300 MRSA strain was rigorously annotated. In USA300-MRSA 2658 chromosomal open reading frames were predicted and 3.1 and 27 kilobase (kb) plasmids were identified. USA300-MSSA contained a 20 kb plasmid with some homology to the 27 kb plasmid found in USA300-MRSA. Two regions found in US300-MRSA were absent in USA300-MSSA. One of these carried the arginine deiminase operon that appears to have been acquired from <it>S. epidermidis</it>. The USA300 sequence was aligned with other sequenced <it>S. aureus </it>genomes and regions unique to USA300 MRSA were identified.</p> <p>Conclusion</p> <p>USA300-MRSA is highly similar to other MRSA strains based on whole genome alignments and gene content, indicating that the differences in pathogenesis are due to subtle changes rather than to large-scale acquisition of virulence factor genes. The USA300 Houston isolate differs from another sequenced USA300 strain isolate, derived from a patient in San Francisco, in plasmid content and a number of sequence polymorphisms. Such differences will provide new insights into the evolution of pathogens.</p

    Somatic mutations affect key pathways in lung adenocarcinoma

    Full text link
    Determining the genetic basis of cancer requires comprehensive analyses of large collections of histopathologically well- classified primary tumours. Here we report the results of a collaborative study to discover somatic mutations in 188 human lung adenocarcinomas. DNA sequencing of 623 genes with known or potential relationships to cancer revealed more than 1,000 somatic mutations across the samples. Our analysis identified 26 genes that are mutated at significantly high frequencies and thus are probably involved in carcinogenesis. The frequently mutated genes include tyrosine kinases, among them the EGFR homologue ERBB4; multiple ephrin receptor genes, notably EPHA3; vascular endothelial growth factor receptor KDR; and NTRK genes. These data provide evidence of somatic mutations in primary lung adenocarcinoma for several tumour suppressor genes involved in other cancers - including NF1, APC, RB1 and ATM - and for sequence changes in PTPRD as well as the frequently deleted gene LRP1B. The observed mutational profiles correlate with clinical features, smoking status and DNA repair defects. These results are reinforced by data integration including single nucleotide polymorphism array and gene expression array. Our findings shed further light on several important signalling pathways involved in lung adenocarcinoma, and suggest new molecular targets for treatment.National Human Genome Research InstituteWe thank A. Lash, M.F. Zakowski, M.G. Kris and V. Rusch for intellectual contributions, and many members of the Baylor Human Genome Sequencing Center, the Broad Institute of Harvard and MIT, and the Genome Center at Washington University for support. This work was funded by grants from the National Human Genome Research Institute to E.S.L., R.A.G. and R.K.W.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/62885/1/nature07423.pd

    Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure

    Get PDF
    Heart failure (HF) is a leading cause of morbidity and mortality worldwide. A small proportion of HF cases are attributable to monogenic cardiomyopathies and existing genome-wide association studies (GWAS) have yielded only limited insights, leaving the observed heritability of HF largely unexplained. We report results from a GWAS meta-analysis of HF comprising 47,309 cases and 930,014 controls. Twelve independent variants at 11 genomic loci are associated with HF, all of which demonstrate one or more associations with coronary artery disease (CAD), atrial fibrillation, or reduced left ventricular function, suggesting shared genetic aetiology. Functional analysis of non-CAD-associated loci implicate genes involved in cardiac development (MYOZ1, SYNPO2L), protein homoeostasis (BAG3), and cellular senescence (CDKN1A). Mendelian randomisation analysis supports causal roles for several HF risk factors, and demonstrates CAD-independent effects for atrial fibrillation, body mass index, and hypertension. These findings extend our knowledge of the pathways underlying HF and may inform new therapeutic strategies

    Comprehensive Pan-Genomic Characterization of Adrenocortical Carcinoma

    Get PDF
    SummaryWe describe a comprehensive genomic characterization of adrenocortical carcinoma (ACC). Using this dataset, we expand the catalogue of known ACC driver genes to include PRKAR1A, RPL22, TERF2, CCNE1, and NF1. Genome wide DNA copy-number analysis revealed frequent occurrence of massive DNA loss followed by whole-genome doubling (WGD), which was associated with aggressive clinical course, suggesting WGD is a hallmark of disease progression. Corroborating this hypothesis were increased TERT expression, decreased telomere length, and activation of cell-cycle programs. Integrated subtype analysis identified three ACC subtypes with distinct clinical outcome and molecular alterations which could be captured by a 68-CpG probe DNA-methylation signature, proposing a strategy for clinical stratification of patients based on molecular markers

    TRY plant trait database – enhanced coverage and open access

    Get PDF
    Plant traits - the morphological, anatomical, physiological, biochemical and phenological characteristics of plants - determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait‐based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits - almost complete coverage for ‘plant growth form’. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait–environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives

    Finishing the euchromatic sequence of the human genome

    Get PDF
    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human enome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead
    corecore