23,214 research outputs found

    Pan-Cancer Analysis of lncRNA Regulation Supports Their Targeting of Cancer Genes in Each Tumor Context

    Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1, MEG3, and TSIX, synergistically dysregulate cancer pathways in multiple tumor contexts

    Characterization of the ZFX family of transcription factors that bind downstream of the start site of CpG island promoters

    Our study focuses on a family of ubiquitously expressed human C₂H₂ zinc finger proteins composed of ZFX, ZFY and ZNF711. Although their protein structure suggests that ZFX, ZFY and ZNF711 are transcriptional regulators, the mechanisms by which they influence transcription have not yet been elucidated. We used CRISPR-mediated deletion to create bi-allelic knockouts of ZFX and/or ZNF711 in female HEK293T cells (which naturally lack ZFY). We found that loss of either ZFX or ZNF711 reduced cell growth and that the double knockout cells have major defects in proliferation. RNA-seq analysis revealed that thousands of genes showed altered expression in the double knockout clones, suggesting that these TFs are critical regulators of the transcriptome. To gain insight into how these TFs regulate transcription, we created mutant ZFX proteins and analyzed them for DNA binding and transactivation capability. We found that zinc fingers 11–13 are necessary and sufficient for DNA binding and, in combination with the N-terminal region, constitute a functional transactivator. Our functional analyses of the ZFX family provide important new insights into transcriptional regulation in human cells by members of the large but under-studied family of C₂H₂ zinc finger proteins

    USF binding sequences from the HS4 insulator element impose early replication timing on a vertebrate replicator

    The nuclear genomes of vertebrates show a highly organized program of DNA replication where GC-rich isochores are replicated early in S-phase, while AT-rich isochores are late replicating. GC-rich regions are gene dense and are enriched for active transcription, suggesting a connection between gene regulation and replication timing. Insulator elements can organize independent domains of gene transcription and are suitable candidates for being key regulators of replication timing. We have tested the impact of inserting a strong replication origin flanked by the β-globin HS4 insulator on the replication timing of naturally late replicating regions in two different avian cell types, DT40 (lymphoid) and 6C2 (erythroid). We find that the HS4 insulator has the capacity to impose a shift to earlier replication. This shift requires the presence of HS4 on both sides of the replication origin and results in an advance of replication timing of the target locus from the second half of S-phase to the first half when a transcribed gene is positioned nearby. Moreover, we find that the USF transcription factor binding site is the key cis-element inside the HS4 insulator that controls replication timing. Taken together, our data identify a combination of cis-elements that might constitute the basic unit of multi-replicon megabase-sized early domains of DNA replication

    Quantitative model for inferring dynamic regulation of the tumour suppressor gene p53

    Background: The availability of various "omics" datasets opens the prospect of studying genome-wide genetic regulatory networks. However, one of the major challenges of using mathematical models to infer genetic regulation from microarray datasets is the lack of information on protein concentrations and activities. Most previous studies assumed that the mRNA levels of a gene are consistent with its protein activities, though this is not always the case. Therefore, a more sophisticated modelling framework, together with the corresponding inference methods, is needed to accurately estimate genetic regulation from "omics" datasets. Results: This work developed a novel approach, based on a nonlinear mathematical model, to infer genetic regulation from microarray gene expression data. Using the p53 network as a test system, we used the nonlinear model to estimate the activities of the transcription factor (TF) p53 from the expression levels of its target genes, and to identify whether p53 activates or inhibits each of its target genes. The predicted top 317 putative p53 target genes were supported by DNA sequence analysis. A comparison between our prediction and other published predictions of p53 targets suggests that most putative p53 targets may share a common depleted or enriched sequence signal in their upstream non-coding region. Conclusions: The proposed quantitative model can be used not only to infer the regulatory relationship between a TF and its downstream genes, but also to estimate the protein activities of a TF from the expression levels of its target genes

    Automated design of bacterial genome sequences

    Background: Organisms have evolved ways of regulating transcription to better adapt to varying environments. Could the current functional genomics data and models support the possibility of engineering a genome with completely rearranged gene organization while the cell maintains its behavior under environmental challenges? How would we proceed to design a full nucleotide sequence for such genomes? Results: As a first step towards answering such questions, recent work showed that it is possible to design alternative transcriptomic models showing the same behavior under environmental variations as the wild-type model. A second step would require providing evidence that it is possible to provide a nucleotide sequence for a genome encoding such a transcriptional model. We used computational design techniques to design a rewired global transcriptional regulation of Escherichia coli that nevertheless shows a transcriptomic response similar to that of the wild type. Afterwards, we “compiled” the transcriptional networks into nucleotide sequences to obtain the final genome sequence. Our computational evolution procedure ensures that we can maintain the genotype-phenotype mapping during the rewiring of the regulatory network. We found that it is theoretically possible to reorganize the E. coli genome into 86% fewer regulated operons. Such refactored genomes consist of operons that contain sets of genes sharing around 60% of their biological functions and, if evolved under highly variable environmental conditions, have regulatory networks that respond more than 20% faster to multiple external perturbations. Conclusions: This work provides the first algorithm for producing a genome sequence encoding a rewired transcriptional regulation with wild-type behavior under alternative environments

    Network-based approaches to explore complex biological systems towards network medicine

    Network medicine relies on different types of networks, from molecular-level protein–protein interaction networks to gene regulatory networks and correlation studies of gene expression. Among network approaches based on the analysis of the topological properties of protein–protein interaction (PPI) networks, we discuss the widespread DIAMOnD (disease module detection) algorithm. Starting from the assumption that PPI networks can be viewed as maps where diseases can be identified with localized perturbations within a specific neighborhood (i.e., disease modules), DIAMOnD performs a systematic analysis of the human PPI network to uncover new disease-associated genes by exploiting connectivity significance instead of connection density. The past few years have witnessed increasing interest in understanding the molecular mechanisms of post-transcriptional regulation, with a special emphasis on non-coding RNAs, since they are emerging as key regulators of many cellular processes in both physiological and pathological states. Recent findings show that coding genes are not the only targets that microRNAs interact with. In fact, there is a pool of different RNAs, including long non-coding RNAs (lncRNAs), competing with each other to attract microRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The framework of regulatory networks provides a powerful tool to gather new insights into ceRNA regulatory mechanisms. Here, we describe a data-driven model recently developed to explore lncRNA-associated ceRNA activity in breast invasive carcinoma. A very promising example of a co-expression network approach is the one implemented by the software SWIM (switch miner), which combines topological properties of correlation networks with gene expression data in order to identify a small pool of genes, called switch genes, critically associated with drastic changes in cell phenotype. Here, we describe the SWIM tool along with its applications to cancer research and compare its predictions with DIAMOnD disease genes
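
The contrast drawn in this abstract between connectivity significance and connection density can be illustrated with a small sketch. It assumes the significance of a candidate node is scored with a hypergeometric tail probability (the chance that a node of a given degree has at least the observed number of links to seed genes), which is our reading of the idea rather than something the abstract spells out. The function names, the toy edge list, and the seed set below are all hypothetical.

```python
from math import comb


def connectivity_pvalue(n_nodes, n_seeds, degree, seed_links):
    """Hypergeometric tail: probability that a node with `degree` links has
    at least `seed_links` of them attached to seed genes by chance."""
    total = comb(n_nodes, degree)
    upper = min(degree, n_seeds)
    tail = sum(
        comb(n_seeds, i) * comb(n_nodes - n_seeds, degree - i)
        for i in range(seed_links, upper + 1)
    )
    return tail / total


def rank_candidates(edges, seeds):
    """Rank non-seed nodes by ascending p-value (most significant first)."""
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n_nodes = len(neighbors)
    pvalues = {
        node: connectivity_pvalue(n_nodes, len(seeds), len(nbrs), len(nbrs & seeds))
        for node, nbrs in neighbors.items()
        if node not in seeds
    }
    return sorted(pvalues.items(), key=lambda item: item[1])


# Hypothetical toy network: X and H both touch the two seed genes, but H is
# a hub, so its two seed links are far less surprising than those of X.
TOY_EDGES = [
    ("S1", "X"), ("S2", "X"),
    ("S1", "H"), ("S2", "H"),
    ("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"),
]
TOY_SEEDS = {"S1", "S2"}

if __name__ == "__main__":
    for node, p in rank_candidates(TOY_EDGES, TOY_SEEDS):
        print(f"{node}: p = {p:.4f}")
```

In this toy network X and H have the same number of seed links, yet X ranks first because a low-degree node reaching both seeds is less likely by chance. An iterative procedure would presumably add the top-ranked node to the module and rescore, but that loop is omitted here.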

    Cis-regulatory control of the SM50 gene, an early marker of skeletogenic lineage specification in the sea urchin embryo

    The SM50 gene encodes a minor matrix protein of the sea urchin embryo spicule. We carried out a detailed functional analysis of a cis-regulatory region of this gene, extending 440 bp upstream and 120 bp downstream of the transcription start site, that had been shown earlier to confer accurate skeletogenic expression of an injected expression vector. The distal portion of this fragment contains elements controlling amplitude of expression, while the region from −200 to +105 contains spatial control elements that position expression accurately in the skeletogenic lineages of the embryo. A systematic mutagenesis analysis of this region revealed four adjacent regulatory elements, namely two copies of a positively acting sequence (element D) that are positioned just upstream of the transcription start site; an indispensable spatial control element (element C) that is positioned downstream of the start site; and, further downstream, a second positively acting sequence (element A). We then constructed a series of synthetic expression constructs. These contained oligonucleotides representing normal and mutated versions of elements D, C, and A, in various combinations. We also changed the promoter of the SM50 gene from a TATA-less to a canonical TATA box form, without any effect on function. Perfect spatial regulation was also produced by a final series of constructs that consisted entirely of heterologous enhancers from the CyIIIa gene, the SV40 early promoter, and synthetic D, C, and A elements. We demonstrate that element C exercises the primary spatial control function of the region we analyzed. We term this a ‘locator’ element. This differs from conventional ‘tissue-specific enhancers’ in that while it is essential for expression, it has no transcriptional activity on its own, and it requires other, separable, positive regulatory elements for activity. In the normal configuration these ancillary positive functions are mediated by elements A and D. Only positively acting control elements were observed in the SM50 regulatory domain throughout this analysis

    chroGPS, a global chromatin positioning system for the functional analysis and visualization of the epigenome

    Development of tools to jointly visualize the genome and the epigenome remains a challenge. chroGPS is a computational approach that addresses this question. chroGPS uses multidimensional scaling techniques to represent similarity between epigenetic factors, or between genetic elements on the basis of their epigenetic state, in 2D/3D reference maps. We emphasize biological interpretability, statistical robustness, integration of genetic and epigenetic data from heterogeneous sources, and computational feasibility. Although chroGPS is a general methodology to create reference maps and study the epigenetic state of any class of genetic element or genomic region, we focus on two specific kinds of maps: chroGPSfactors, which visualizes functional similarities between epigenetic factors, and chroGPSgenes, which describes the epigenetic state of genes and integrates gene expression and other functional data. We use data from the modENCODE project on the genomic distribution of a large collection of epigenetic factors in Drosophila, a model system extensively used to study genome organization and function. Our results show that the maps allow straightforward visualization of relationships between factors and elements, capturing relevant information about their functional properties that helps to interpret epigenetic information in a functional context and derive testable hypotheses
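
To make the multidimensional scaling step mentioned in this abstract concrete, here is a minimal sketch of classical (Torgerson) MDS in plain Python. It is not taken from chroGPS, which may use a different MDS variant and its own dissimilarity measure; the function names and the small distance matrix in the usage note are hypothetical. The sketch assumes a symmetric distance matrix that is close to Euclidean, since the power iteration used here expects the dominant eigenvalues to be positive.

```python
import math


def double_center(dist):
    """Turn a symmetric distance matrix into a Gram matrix B = -1/2 J D^2 J."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    # Symmetry means row means double as column means.
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]


def top_eigenpair(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector by power iteration."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        prod = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in prod))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        vec = [x / norm for x in prod]
    prod = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(vec[i] * prod[i] for i in range(n))
    return eigenvalue, vec


def classical_mds(dist, dims=2):
    """Embed items in `dims` dimensions so that distances are approximated."""
    b = double_center(dist)
    n = len(b)
    coords = [[] for _ in range(n)]
    for _ in range(dims):
        lam, vec = top_eigenpair(b)
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i].append(scale * vec[i])
        # Deflate: remove the component just extracted before the next pass.
        b = [[b[i][j] - lam * vec[i] * vec[j] for j in range(n)] for i in range(n)]
    return coords
```

As a usage example, calling `classical_mds([[0, 3, 4], [3, 0, 5], [4, 5, 0]])` places three items in the plane so that their pairwise distances reproduce the 3-4-5 triangle. In a map of epigenetic factors, each row of the input would instead hold the dissimilarities between one factor and all others.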