327 research outputs found

    CleanEx: a database of heterogeneous gene expression data based on a consistent gene nomenclature and linked to an improved annotation system

    Résumé: The automation of genome sequencing and annotation, together with the large-scale application of gene expression measurement methods, generates a phenomenal quantity of data for model organisms such as human and mouse. In this deluge of data, it becomes very difficult to obtain information specific to an organism or a gene, and such a search frequently yields fragmented or even incomplete answers. Creating a database able to manage and integrate both genomic and transcriptomic data can greatly improve search speed and the quality of the results obtained, by allowing direct comparison of gene expression measurements from experiments performed with different techniques. The main objective of this project, called CleanEx, is to provide direct access to public expression data through official gene names, and to represent expression data produced under different protocols in a way that facilitates general analysis and comparison across several datasets. Consistent and regular updating of the gene nomenclature is ensured by associating each gene expression experiment with a permanent identifier of the target sequence, which gives a physical description of the RNA population targeted by the experiment. These identifiers are then associated at regular intervals with the constantly evolving gene catalogues of model organisms. This automatic tracking procedure relies partly on external genomic information resources such as UniGene and RefSeq. The central part of CleanEx is a gene index built weekly that contains links to all public expression data already incorporated into the system. In addition, the target sequence database provides a link to the corresponding gene, as well as quality control of that link, for different types of experimental resources such as clones or Affymetrix probes. The CleanEx online search system offers access to individual entries as well as to cross-dataset analysis tools. These tools have proven very effective for comparing gene expression and, to some extent, for detecting variation in that expression linked to alternative splicing. The CleanEx files and tools are accessible online (http://www.cleanex.isb-sib.ch/).

    Abstract: Automated genome sequencing and annotation, as well as large-scale gene expression measurement methods, generate a massive amount of data for model organisms. Searching for gene-specific or organism-specific information throughout all the different databases has become a very difficult task, and often results in fragmented and unrelated answers. A database that federates and integrates genomic and transcriptomic data will greatly improve the search speed as well as the quality of the results by allowing a direct comparison of expression results obtained by different techniques. The main goal of this project, called the CleanEx database, is thus to provide access to public gene expression data via unique gene names and to represent heterogeneous expression data produced by different technologies in a way that facilitates joint analysis and cross-dataset comparisons. A consistent and up-to-date gene nomenclature is achieved by associating each single gene expression experiment with a permanent target identifier consisting of a physical description of the targeted RNA population or the hybridization reagent used. These targets are then mapped at regular intervals to the growing and evolving catalogues of genes from model organisms, such as human and mouse. The completely automatic mapping procedure relies partly on external genome information resources such as UniGene and RefSeq. The central part of CleanEx is a weekly built gene index containing cross-references to all public expression data already incorporated into the system. In addition, the expression target database of CleanEx provides gene mapping and quality control information for various types of experimental resources, such as cDNA clones or Affymetrix probe sets. The Affymetrix mapping files are accessible as text files, for further use in external applications, and as individual entries, via the web-based interfaces. The CleanEx web-based query interfaces offer access to individual entries via text string searches or quantitative expression criteria, as well as cross-dataset analysis tools and cross-chip gene comparison. These tools have proven to be very efficient in expression data comparison and even, to a certain extent, in detection of differentially expressed splice variants. The CleanEx flat files and tools are available online at: http://www.cleanex.isb-sib.ch/

    Human MAF1 targets and represses active RNA polymerase III genes by preventing recruitment rather than inducing long-term transcriptional arrest.

    RNA polymerase III (Pol III) is tightly controlled in response to environmental cues, yet a genomic-scale picture of Pol III regulation and the role played by its repressor MAF1 is lacking. Here, we describe genome-wide studies in human fibroblasts that reveal a dynamic and gene-specific adaptation of Pol III recruitment to extracellular signals in an mTORC1-dependent manner. Repression of Pol III recruitment and transcription is tightly linked to MAF1, which selectively localizes at Pol III loci, even under serum-replete conditions, and increasingly targets transcribing Pol III in response to serum starvation. Combining Pol III binding profiles with EU-labeling and high-throughput sequencing of newly synthesized small RNAs, we show that Pol III occupancy closely reflects ongoing transcription. Our results exclude the long-term, unproductive arrest of Pol III on the DNA as a major regulatory mechanism and identify previously uncharacterized, differential coordination in Pol III binding and transcription under different growth conditions.

    HCF-2 inhibits cell proliferation and activates differentiation-gene expression programs.

    HCF-2 is a member of the host-cell-factor protein family, which arose in early vertebrate evolution as a result of gene duplication. Whereas its paralog, HCF-1, is known to act as a versatile chromatin-associated protein required for cell proliferation and differentiation, much less is known about HCF-2. Here, we show that HCF-2 is broadly present in human and mouse cells, and possesses activities distinct from HCF-1. Unlike HCF-1, which is excluded from nucleoli, HCF-2 is nucleolar, an activity conferred by one and a half C-terminal Fibronectin type 3 repeats and inhibited by the HCF-1 nuclear localization signal. Elevated HCF-2 synthesis in HEK-293 cells results in phenotypes reminiscent of HCF-1-depleted cells, including inhibition of cell proliferation and mitotic defects. Furthermore, increased HCF-2 levels in HEK-293 cells lead to inhibition of cell proliferation and metabolism gene-expression programs with parallel activation of differentiation and morphogenesis gene-expression programs. Thus, the HCF ancestor appears to have evolved into a small two-member protein family possessing contrasting nuclear versus nucleolar localization, and cell proliferation and differentiation functions.

    Transcriptional interference by RNA polymerase III affects expression of the Polr3e gene.

    Overlapping gene arrangements can potentially contribute to gene expression regulation. A mammalian interspersed repeat (MIR) nested in antisense orientation within the first intron of the Polr3e gene, encoding an RNA polymerase III (Pol III) subunit, is conserved in mammals and highly occupied by Pol III. Using a fluorescence assay, CRISPR/Cas9-mediated deletion of the MIR in mouse embryonic stem cells, and chromatin immunoprecipitation assays, we show that the MIR affects Polr3e expression through transcriptional interference. Our study reveals a mechanism by which a Pol II gene can be regulated at the transcription elongation level by transcription of an embedded antisense Pol III gene.

    CleanEx: new data extraction and merging tools based on MeSH term annotation

    The CleanEx expression database (http://www.cleanex.isb-sib.ch) provides access to public gene expression data via unique gene names as well as via the biomedical characteristics of experiments. To achieve this, a dual annotation of both sequences and experiments has been generated. First, the system links official gene symbols to any kind of sequence used for gene expression measurements (cDNA, Affymetrix, oligonucleotide arrays, SAGE or MPSS tags, Expressed Sequence Tags or other mRNA sequences, etc.). Second, for the biomedical annotation, we re-annotate each experiment in the CleanEx database with MeSH (Medical Subject Headings) terms, which are primarily used by the NLM (National Library of Medicine) for indexing articles for the MEDLINE/PubMed database. This annotation allows fast and easy retrieval of expression data with common biological or medical features. The numerical data can then be exported as matrix-like tab-delimited text files. Data can be extracted either from a single dataset or from heterogeneous datasets.

    CleanEx: a database of heterogeneous gene expression data based on a consistent gene nomenclature

    The main goal of CleanEx is to provide access to public gene expression data via unique gene names. A second objective is to represent heterogeneous expression data produced by different technologies in a way that facilitates joint analysis and cross-data set comparisons. A consistent and up-to-date gene nomenclature is achieved by associating each single experiment with a permanent target identifier consisting of a physical description of the targeted RNA population or the hybridization reagent used. These targets are then mapped at regular intervals to the growing and evolving catalogues of human genes and genes from model organisms. The completely automatic mapping procedure relies partly on external genome information resources such as UniGene and RefSeq. The central part of CleanEx is a weekly built gene index containing cross-references to all public expression data already incorporated into the system. In addition, the expression target database of CleanEx provides gene mapping and quality control information for various types of experimental resource, such as cDNA clones or Affymetrix probe sets. The web-based query interfaces offer access to individual entries via text string searches or quantitative expression criteria. CleanEx is accessible at: http://www.cleanex.isb-sib.ch/

    Genomic Study of RNA Polymerase II and III SNAP(c)-Bound Promoters Reveals a Gene Transcribed by Both Enzymes and a Broad Use of Common Activators.

    SNAP(c) is one of a few basal transcription factors used by both RNA polymerase (pol) II and pol III. To define the set of active SNAP(c)-dependent promoters in human cells, we have localized genome-wide four SNAP(c) subunits, GTF2B (TFIIB), BRF2, pol II, and pol III. Among some seventy loci occupied by SNAP(c) and other factors, including pol II snRNA genes, pol III genes with type 3 promoters, and a few un-annotated loci, most are primarily occupied by either pol II and GTF2B, or pol III and BRF2. A notable exception is the RPPH1 gene, which is occupied by significant amounts of both polymerases. We show that the large majority of SNAP(c)-dependent promoters recruit POU2F1 and/or ZNF143 on their enhancer region, and a subset also recruits GABP, a factor newly implicated in SNAP(c)-dependent transcription. These activators associate with pol II and III promoters in G1 slightly before the polymerase, and ZNF143 is required for efficient transcription initiation complex assembly. The results characterize a set of genes with unique properties and establish that polymerase specificity is not absolute in vivo.

    MAF1 is a chronic repressor of RNA polymerase III transcription in the mouse.

    Maf1-/- mice are lean, obesity-resistant and metabolically inefficient. Their increased energy expenditure is thought to be driven by a futile RNA cycle that reprograms metabolism to meet an increased demand for nucleotides stemming from the deregulation of RNA polymerase (pol) III transcription. Metabolic changes consistent with this model have been reported in both fasted and refed mice; however, the impact of the fasting-refeeding cycle on pol III function has not been examined. Here we show that changes in pol III occupancy in the liver of fasted versus refed wild-type mice are largely confined to low and intermediate occupancy genes; high occupancy genes are unchanged. However, in Maf1-/- mice, pol III occupancy of the vast majority of active loci in liver and the levels of specific precursor tRNAs in this tissue and other organs are higher than wild-type in both fasted and refed conditions. Thus, MAF1 functions as a chronic repressor of active pol III loci and can modulate transcription under different conditions. Our findings support the futile RNA cycle hypothesis, elaborate the mechanism of pol III repression by MAF1 and demonstrate a modest effect of MAF1 on global translation via reduced mRNA levels and translation efficiencies for several ribosomal proteins.

    Mechanism of selective recruitment of RNA polymerases II and III to snRNA gene promoters

    RNA polymerase II (Pol II) small nuclear RNA (snRNA) promoters and type 3 Pol III promoters have highly similar structures; both contain an interchangeable enhancer and "proximal sequence element" (PSE), which recruits the SNAP complex (SNAPc). The main distinguishing feature is the presence, in the type 3 promoters only, of a TATA box, which determines Pol III specificity. To understand the mechanism by which the absence or presence of a TATA box results in specific Pol recruitment, we examined how SNAPc and general transcription factors required for Pol II or Pol III transcription of SNAPc-dependent genes (i.e., TATA-box-binding protein [TBP], TFIIB, and TFIIA for Pol II transcription and TBP and BRF2 for Pol III transcription) assemble to ensure specific Pol recruitment. TFIIB and BRF2 could each, in a mutually exclusive fashion, be recruited to SNAPc. In contrast, TBP-TFIIB and TBP-BRF2 complexes were not recruited unless a TATA box was present, which allowed selective and efficient recruitment of the TBP-BRF2 complex. Thus, TBP both prevented BRF2 recruitment to Pol II promoters and enhanced BRF2 recruitment to Pol III promoters. On Pol II promoters, TBP recruitment was separate from TFIIB recruitment and enhanced by TFIIA. Our results provide a model for specific Pol recruitment at SNAPc-dependent promoters.