75 research outputs found

    The effects of protein interactions, gene essentiality and regulatory regions on expression variation

    Background: Identifying factors affecting gene expression variation is a challenging problem in genetics. Previous studies have shown that the presence of a TATA box, the number of cis-regulatory elements, gene essentiality, and protein interactions significantly affect gene expression variation. Nonetheless, obtaining a more complete understanding of such factors and of how their interactions influence gene expression variation remains a challenge. The growth rates of yeast cells under several DNA-damaging conditions have been studied, and a gene's toxicity degree is defined as the number of such conditions under which the growth rate of the yeast deletion strain is significantly affected. Since toxicity degree reflects a gene's importance to cell survival under DNA-damaging conditions, we expect that it is negatively associated with gene expression variation. Mutations in both cis-regulatory elements and transcription factors (TFs) regulating a gene affect the gene's expression, and thus we study the relationship between gene expression variation and the number of TFs regulating a gene. Most importantly, we study how these factors interact with each other in influencing gene expression variation. Results: Using yeast as a model system, we evaluated the effects of four separate factors and their interactions on gene expression variation: protein interaction degree, toxicity degree, number of TFs, and the presence of a TATA box. Results showed that 1) gene expression variation is negatively correlated with the protein interaction degree in the protein interaction network, 2) essential genes tend to have less expression variation than non-essential genes and gene expression variation decreases with toxicity degree, and 3) the number of TFs regulating a gene is the most important factor influencing gene expression variation (R² = 8–14%). In addition, the number of TFs regulating a gene was found to be an important factor influencing gene expression variation for both TATA-containing and non-TATA-containing genes, but with different association strength. Moreover, gene expression variation was significantly negatively correlated with toxicity degree only for TATA-containing genes. Conclusion: The finding that distinct mechanisms may influence gene expression variation in TATA-containing and non-TATA-containing genes provides new insights into the mechanisms that underlie the evolution of gene expression.

    Chromatin Regulation and Gene Centrality Are Essential for Controlling Fitness Pleiotropy in Yeast

    There is a wide range of phenotypes that are due to loss-of-function or null mutations. Previously, the functions of gene products that distinguish essential from nonessential genes were characterized. However, the functions of products of non-essential genes that contribute to fitness remain minimally understood. Using data from Saccharomyces cerevisiae, we investigated several measurable gene characteristics that are significantly associated with a gene's fitness pleiotropy. Fitness pleiotropy is a measurement of the gene's importance to fitness. These characteristics include: 1) whether the gene's product functions in chromatin regulation, 2) whether the regulation of the gene is influenced by chromatin state, measured by chromatin regulation effect (CRE), 3) whether the gene's product functions as a transcription factor (TF) and the number of genes a TF regulates, 4) whether the gene contains a TATA box, and 5) whether the gene's product is central in a protein interaction network. Partial correlation analysis was used to study how these characteristics interact to influence fitness pleiotropy. We show that all five characteristics that were measured are statistically significantly associated with fitness pleiotropy. However, fitness pleiotropy is not associated with the presence of a TATA box when CRE is controlled. In particular, two characteristics were found to play the most important roles in affecting a gene's fitness pleiotropy: 1) whether the regulation of a gene is more likely to be influenced by chromatin state, and 2) whether the gene product is central in a protein interaction network, as measured by the number of protein interactions. These findings highlight the significance of both epigenetic gene regulation and protein interaction networks in influencing fitness pleiotropy.

    Mapping Genetically Compensatory Pathways from Synthetic Lethal Interactions in Yeast

    Background: Synthetic lethal genetic interaction analysis has been successfully applied to predicting the functions of genes and their pathway identities. In the context of synthetic lethal interaction data alone, the global similarity of synthetic lethal interaction patterns between two genes is used to predict gene function. With physical interaction data, such as protein-protein interactions, the enrichment of physical interactions within subsets of genes and the enrichment of synthetic lethal interactions between those subsets of genes are used as an indication of compensatory pathways. Result: In this paper, we propose a method of mapping genetically compensatory pathways from synthetic lethal interactions. Our method is designed to discover pairs of gene-sets in which synthetic lethal interactions are depleted among the genes in an individual set and where such gene-set pairs are connected by many synthetic lethal interactions. By its nature, our method could select compensatory pathway pairs that buffer the deleterious effect of the failure of either one, without the need for physical interaction data. By focusing on compensatory pathway pairs where genes in each individual pathway have a highly homogeneous cellular function, we show that many cellular functions have genetically compensatory properties. Conclusion: We conclude that synthetic lethal interaction data are a powerful source to map genetically compensatory pathways, especially in systems lacking physical interaction information, and that the cellular function network contain

    A highly efficient and effective motif discovery method for ChIP-seq/ChIP-chip data using positional information

    Identification of DNA motifs from ChIP-seq/ChIP-chip [chromatin immunoprecipitation (ChIP)] data is a powerful method for understanding the transcriptional regulatory network. However, most established methods are designed for small sample sizes and are inefficient for ChIP data. Here we propose a new k-mer occurrence model to reflect the fact that functional DNA k-mers often cluster around ChIP peak summits. With this model, we introduced a new measure to discover functional k-mers. Using simulation, we demonstrated that our method is more robust against noise in ChIP data than available methods. A novel word clustering method is also implemented to group similar k-mers into position weight matrices (PWMs). Our method was applied to a diverse set of ChIP experiments to demonstrate its high sensitivity and specificity. Importantly, our method is much faster than several other methods for large sample sizes. Thus, we have developed an efficient and effective motif discovery method for ChIP experiments.

    Comprehensive molecular and clinical characterization of NUP98 fusions in pediatric acute myeloid leukemia

    NUP98 fusions comprise a family of rare recurrent alterations in AML, associated with adverse outcomes. In order to define the underlying biology and clinical implications of this family of fusions, we performed comprehensive transcriptome, epigenome, and immunophenotypic profiling of 2,235 children and young adults with AML and identified 160 NUP98 rearrangements (7.2%), including 108 NUP98-NSD1 (4.8%), 32 NUP98-KDM5A (1.4%) and 20 NUP98-X cases (0.9%) with 13 different fusion partners. Fusion partners defined disease characteristics and biology; patients with NUP98-NSD1 or NUP98-KDM5A had distinct immunophenotypic, transcriptomic, and epigenomic profiles. Unlike the two most prevalent NUP98 fusions, NUP98-X variants are typically not cryptic. Furthermore, NUP98-X cases are associated with WT1 mutations, and have epigenomic profiles that resemble either NUP98-NSD1 or NUP98-KDM5A. Cooperating FLT3-ITD and WT1 mutations define NUP98-NSD1, and chromosome 13 aberrations are highly enriched in NUP98-KDM5A. Importantly, we demonstrate that NUP98 fusions portend dismal overall survival, with the noteworthy exception of patients bearing abnormal chromosome 13 (ClinicalTrials.gov identifiers: NCT00002798, NCT00070174, NCT00372593, NCT01371981).

    Synchronization of cytoplasmic and transferred mitochondrial ribosomal protein gene expression in land plants is linked to Telo-box motif enrichment

    Background: Chloroplasts and mitochondria evolved from the endosymbionts of once free-living eubacteria, and they transferred most of their genes to the host nuclear genome during evolution. The mechanisms used by plants to coordinate the expression of such transferred genes, as well as other genes in the host nuclear genome, are still poorly understood. Results: In this paper, we use nuclear-encoded chloroplast (cpRPGs), as well as mitochondrial (mtRPGs) and cytoplasmic (euRPGs) ribosomal protein genes to study the coordination of gene expression between organelles and the host. Results show that the mtRPGs, but not the cpRPGs, exhibit strongly synchronized expression with euRPGs in all investigated land plants and that this phenomenon is linked to the presence of a telo-box DNA motif in the promoter regions of mtRPGs and euRPGs. This motif is also enriched in the promoter regions of genes involved in DNA replication. Sequence analysis further indicates that mtRPGs, in contrast to cpRPGs, acquired telo-box from the host nuclear genome. Conclusions: Based on our results, we propose a model of plant nuclear genome evolution where coordination of activities in mitochondria and chloroplast and other cellular functions, including cell cycle, might have served as a strong selection pressure for the differential acquisition of telo-box between mtRPGs and cpRPGs. This research also highlights the significance of physiological needs in shaping transcriptional regulatory evolution.

    Bivalent-Like Chromatin Markers Are Predictive for Transcription Start Site Distribution in Human

    Deep sequencing of 5′ capped transcripts has revealed a variety of transcription initiation patterns, from narrow, focused promoters to wide, broad promoters. Attempts have already been made to model empirically classified patterns, but virtually no quantitative models for transcription initiation have been reported. Even though both genetic and epigenetic elements have been associated with such patterns, the organization of regulatory elements is largely unknown. Here, linear regression models were derived from a pool of regulatory elements, including genomic DNA features, nucleosome organization, and histone modifications, to predict the distribution of transcription start sites (TSS). Importantly, models including both active and repressive histone modification markers, e.g. H3K4me3 and H4K20me1, were consistently found to be much more predictive than models with only single-type histone modification markers, indicating the possibility of “bivalent-like” epigenetic control of transcription initiation. The nucleosome positions are proposed to be coded in the active component of such bivalent-like histone modification markers. Finally, we demonstrated that models trained on one cell type could successfully predict TSS distribution in other cell types, suggesting that these models may have a broader application range

    Deregulation of DUX4 and ERG in acute lymphoblastic leukemia

    Chromosomal rearrangements deregulating hematopoietic transcription factors are common in acute lymphoblastic leukemia (ALL) [1,2]. Here, we show that deregulation of the homeobox transcription factor gene DUX4 and the ETS transcription factor gene ERG are hallmarks of a subtype of B-progenitor ALL that comprises up to 7% of B-ALL. DUX4 rearrangement and overexpression were present in all cases, and were accompanied by transcriptional deregulation of ERG, expression of a novel ERG isoform, ERGalt, and frequent ERG deletion. ERGalt utilizes a non-canonical first exon whose transcription was initiated by DUX4 binding. ERGalt retains the DNA-binding and transactivating domains of ERG, but inhibits wild-type ERG transcriptional activity and is transforming. These results illustrate a unique paradigm of transcription factor deregulation in leukemia, in which DUX4 deregulation results in loss-of-function of ERG, either by deletion or induction of expression of an isoform that is a dominant negative inhibitor of wild-type ERG function.