
    High Functional Diversity in Mycobacterium tuberculosis Driven by Genetic Drift and Human Demography

    Mycobacterium tuberculosis infects one third of the human world population and kills someone every 15 seconds. For more than a century, scientists and clinicians have been distinguishing between the human- and animal-adapted members of the M. tuberculosis complex (MTBC). However, all human-adapted strains of MTBC have traditionally been considered to be essentially identical. We surveyed sequence diversity within a global collection of strains belonging to MTBC using seven megabase pairs of DNA sequence data. We show that the members of MTBC affecting humans are more genetically diverse than generally assumed, and that this diversity can be linked to human demographic and migratory events. We further demonstrate that these organisms are under extremely reduced purifying selection and that, as a result of increased genetic drift, much of this genetic diversity is likely to have functional consequences. Our findings suggest that the current increases in human population, urbanization, and global travel, combined with the population genetic characteristics of M. tuberculosis described here, could contribute to the emergence and spread of drug-resistant tuberculosis

    Population-specific genetic modification of Huntington's disease in Venezuela.

    Modifiers of Mendelian disorders can provide insights into disease mechanisms and guide therapeutic strategies. A recent genome-wide association (GWA) study discovered genetic modifiers of Huntington's disease (HD) onset in Europeans. Here, we performed whole genome sequencing and GWA analysis of a Venezuelan HD cluster whose families were crucial for the original mapping of the HD gene defect. The Venezuelan HD subjects develop motor symptoms earlier than their European counterparts, implying the potential for population-specific modifiers. The main Venezuelan HD family inherits HTT haplotype hap.03, which differs subtly at the sequence level from European HD hap.03, suggesting a different ancestral origin but not explaining the earlier age at onset in these Venezuelans. GWA analysis of the Venezuelan HD cluster suggests both population-specific and population-shared genetic modifiers. Genome-wide significant signals at 7p21.2-21.1 and suggestive association signals at 4p14 and 17q21.2 are evident only in Venezuelan HD, but genome-wide significant association signals at the established European chromosome 15 modifier locus are improved when Venezuelan HD data are included in the meta-analysis. Venezuelan-specific association signals on chromosome 7 center on SOSTDC1, which encodes a bone morphogenetic protein antagonist. The corresponding SNPs are associated with reduced expression of SOSTDC1 in non-Venezuelan tissue samples, suggesting that interaction of reduced SOSTDC1 expression with a population-specific genetic or environmental factor may be responsible for modification of HD onset in Venezuela. Detection of population-specific modification in Venezuelan HD supports the value of distinct disease populations in revealing novel aspects of a disease and population-relevant therapeutic strategies

    Rare variants implicate NMDA receptor signaling and cerebellar gene networks in risk for bipolar disorder

    Bipolar disorder is an often-severe mental health condition characterized by alternation between extreme mood states of mania and depression. Despite strong heritability and the recent identification of 64 common variant risk loci of small effect, pathophysiological mechanisms remain unknown. Here, we analyzed genome sequences from 41 multiply-affected pedigrees and identified variants in 741 genes with nominally significant linkage or association with bipolar disorder. These 741 genes overlapped known risk genes for neurodevelopmental disorders and clustered within gene networks enriched for synaptic and nuclear functions. The top variant in this analysis - prioritized by statistical association, predicted deleteriousness, and network centrality - was a missense variant in the gene encoding D-amino acid oxidase (DAO G131V). Heterologous expression of DAO G131V in human cells resulted in decreased DAO protein abundance and enzymatic activity. In a knock-in mouse model of DAO G131, Dao G130V/+, we similarly found decreased DAO protein abundance in hindbrain regions, as well as enhanced stress susceptibility and blunted behavioral responses to pharmacological inhibition of N-methyl-D-aspartate receptors (NMDARs). RNA sequencing of cerebellar tissue revealed that Dao G130V resulted in decreased expression of two gene networks that are enriched for synaptic functions and for genes expressed, respectively, in Purkinje neurons or granule neurons. These gene networks were also down-regulated in the cerebellum of patients with bipolar disorder compared to healthy controls and were enriched for additional rare variants associated with bipolar disorder risk. These findings implicate dysregulation of NMDAR signaling and of gene expression in cerebellar neurons in bipolar disorder pathophysiology and provide insight into its genetic architecture

    Application of Affymetrix array and massively parallel signature sequencing for identification of genes involved in prostate cancer progression

    BACKGROUND: Affymetrix GeneChip Array and Massively Parallel Signature Sequencing (MPSS) are two high throughput methodologies used to profile transcriptomes. Each method has certain strengths and weaknesses; however, no comparison has been made between the data derived from Affymetrix arrays and MPSS. In this study, two lineage-related prostate cancer cell lines, LNCaP and C4-2, were used for transcriptome analysis with the aim of identifying genes associated with prostate cancer progression. METHODS: Affymetrix GeneChip array and MPSS analyses were performed. Data were analyzed with GeneSpring 6.2 and in-house Perl scripts. Expression array results were verified with RT-PCR. RESULTS: Comparison of the data revealed that both technologies detected genes the other did not. In LNCaP, 3,180 genes were only detected by Affymetrix and 1,169 genes were only detected by MPSS. Similarly, in C4-2, 4,121 genes were only detected by Affymetrix and 1,014 genes were only detected by MPSS. Analysis of the combined transcriptomes identified 66 genes unique to LNCaP cells and 33 genes unique to C4-2 cells. Expression analysis of these genes in prostate cancer specimens showed CA1 to be highly expressed in bone metastasis but not expressed in primary tumor and EPHA7 to be expressed in normal prostate and primary tumor but not bone metastasis. CONCLUSION: Our data indicate that transcriptome profiling with a single methodology will not fully assess the expression of all genes in a cell line. A combination of transcription profiling technologies such as DNA array and MPSS provides a more robust means to assess the expression profile of an RNA sample. Finally, genes that were differentially expressed in cell lines were also differentially expressed in primary prostate cancer and its metastases

    Uncovering a Macrophage Transcriptional Program by Integrating Evidence from Motif Scanning and Expression Dynamics

    Macrophages are versatile immune cells that can detect a variety of pathogen-associated molecular patterns through their Toll-like receptors (TLRs). In response to microbial challenge, the TLR-stimulated macrophage undergoes an activation program controlled by a dynamically inducible transcriptional regulatory network. Mapping a complex mammalian transcriptional network poses significant challenges and requires the integration of multiple experimental data types. In this work, we inferred a transcriptional network underlying TLR-stimulated murine macrophage activation. Microarray-based expression profiling and transcription factor binding site motif scanning were used to infer a network of associations between transcription factor genes and clusters of co-expressed target genes. The time-lagged correlation was used to analyze temporal expression data in order to identify potential causal influences in the network. A novel statistical test was developed to assess the significance of the time-lagged correlation. Several associations in the resulting inferred network were validated using targeted ChIP-on-chip experiments. The network incorporates known regulators and gives insight into the transcriptional control of macrophage activation. Our analysis identified a novel regulator (TGIF1) that may have a role in macrophage activation
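
    The time-lagged correlation mentioned in this abstract can be illustrated with a minimal sketch. It assumes two equally sampled expression time series (a transcription factor and a candidate target) and scores each candidate lag by Pearson correlation. The function names, the lag range, and the toy data are hypothetical; the abstract's own significance test is not described in enough detail here to reproduce and is not attempted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(tf, target, lag):
    """Correlate tf[t] with target[t + lag] for a non-negative lag."""
    if len(tf) != len(target):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(tf) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    # Drop the last `lag` points of the regulator and the first `lag`
    # points of the target so that the two windows line up.
    return pearson(tf[: len(tf) - lag], target[lag:])


def best_lag(tf, target, max_lag):
    """Return (lag, correlation) with the largest absolute correlation."""
    scores = {lag: lagged_correlation(tf, target, lag)
              for lag in range(max_lag + 1)}
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]


# Toy data (hypothetical): the target repeats the regulator two steps later.
tf_expr = [0, 1, 4, 9, 7, 3, 1, 0, 2, 5]
target_expr = [0, 0] + tf_expr[:8]
lag, score = best_lag(tf_expr, target_expr, max_lag=3)
```

    A high correlation at a positive lag is only suggestive of a regulator-to-target influence; on real data the score would still need a significance assessment before being read as evidence.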

    Insights into hominid evolution from the gorilla genome sequence.

    Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution

    An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge

    There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. RESULTS: A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization. CONCLUSIONS: The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006. Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts