4 research outputs found

    Genome-wide evidence for an essential role of the human Staf/ZNF143 transcription factor in bidirectional transcription

    In the human genome, ~10% of the genes are arranged head to head so that their transcription start sites reside within <1 kbp on opposite strands. In this configuration, a bidirectional promoter generally drives expression of the two genes. How bidirectional expression is achieved from these promoters remains a puzzling question. Here, by a combination of in silico and biochemical approaches, we demonstrate that hStaf/ZNF143 is involved in controlling expression from a subset of divergent gene pairs. The binding sites for hStaf/ZNF143 (SBS) are overrepresented in bidirectional versus unidirectional promoters. Chromatin immunoprecipitation assays with a significant set of bidirectional promoters containing putative SBS revealed that 93% of them are associated with hStaf/ZNF143. Expression of dual reporter genes directed by bidirectional promoters is dependent on SBS integrity and requires hStaf/ZNF143. Furthermore, in some cases, functional SBS are located in bidirectional promoters of gene pairs consisting of a noncoding RNA gene and a protein-coding gene. Remarkably, hStaf/ZNF143 per se exhibits an inherently bidirectional transcription activity, and together our data demonstrate that hStaf/ZNF143 is indeed a transcription factor controlling the expression of divergent protein–protein and protein–non-coding RNA gene pairs.

    Establishment of biostatistics-based bioinformatics platform for integration and analysis of genomic, epigenetic and phylogenetic data : application to hSTAF/ZNF143 transcription factor binding sites

    The STAF factor is a protein with two distinct transcription activation regions, used depending on which transcription machinery is recruited. A recent study reporting about a thousand potential sites within the promoters of protein-coding genes, 400 of which were validated experimentally, suggests that the number of sites across the whole genome is even higher. This calls for other characterization methods able to interrogate the entire genome and large volumes of data. The problem raises a broader challenge: extracting knowledge from a genomic region. To determine which knowledge is relevant, it is essential to assess how it deviates from the values expected for the genome, and therefore to know those values. With this in mind we developed the GeCo architecture, a solution supported by an automated database and its web portal, whose power lies in its ability to determine statistical values for genes and for the complete genome. Characterized by a set of descriptors (sequence, epigenetic, phylogenetic), the resulting context is used to place any genomic question back in its global environment, quickly and reliably. Beyond the genomic context, we developed a development philosophy based on solid statistical tools. Its message is that producing results is no longer sufficient and that it is imperative to put them back in their context. This architecture revealed several thousand STAF binding sites and is already connected to other projects in the laboratory.

    Etablissement d'une architecture bioinformatique et biostatistique d'intégration et d'analyse des données génomiques, épigénétiques et phylogénétiques du génome humain (Application aux sites de fixation du facteur de transcription hStaf/ZNF143)

    The STAF factor is a protein with two distinct transcription activation regions, used depending on which transcription machinery is recruited. A recent study reporting about a thousand potential sites within the promoters of protein-coding genes, 400 of which were validated experimentally, suggests that the number of sites across the whole genome is even higher. This calls for other characterization methods able to interrogate the entire genome and large volumes of data. The problem raises a broader challenge: extracting knowledge from a genomic region. To determine which knowledge is relevant, it is essential to assess how it deviates from the values expected for the genome, and therefore to know those values. With this in mind we developed the GeCo architecture, a solution supported by an automated database and its web portal, whose power lies in its ability to determine statistical values for genes and for the complete genome. Characterized by a set of descriptors (sequence, epigenetic, phylogenetic), the resulting context is used to place any genomic question back in its global environment, quickly and reliably. Beyond the genomic context, we developed a development philosophy based on solid statistical tools. Its message is that producing results is no longer sufficient and that it is imperative to put them back in their context. This architecture revealed several thousand STAF binding sites and is already connected to other projects in the laboratory.
    STRASBOURG-Sc. et Techniques (674822102) / Sudoc, France

    PARSEC: PAtteRn SEarch and Contextualization

    SUMMARY: We present PARSEC (PAtteRn SEarch and Contextualization), a new open source platform for guided discovery, allowing localization and biological characterization of short genomic sites in entire eukaryotic genomes. PARSEC can search for a sequence or a degenerate pattern. The retrieved set of genomic sites can be characterized in terms of (i) conservation in model organisms, (ii) genomic context (proximity to genes) and (iii) function of neighboring genes. These modules allow the user to explore, visualize, filter and extract biological knowledge from a set of short genomic regions such as transcription factor binding sites. AVAILABILITY: Web site implemented in Java, JavaScript and C++, with all major browsers supported. Freely available at lbgi.fr/parsec. Source code is freely available at sourceforge.net/projects/genomicparsec.
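To make "search for a sequence or a degenerate pattern" concrete, here is a minimal Python sketch of a degenerate-pattern scan. It is not PARSEC's implementation (the abstract does not describe its algorithm). It assumes the pattern is written in IUPAC nucleotide codes and that both strands should be scanned. The function names and the example pattern are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement table for IUPAC codes (e.g. R = A/G pairs with Y = C/T).
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(pattern):
    """Return the reverse complement of an IUPAC pattern."""
    return pattern.upper().translate(_COMPLEMENT)[::-1]


def to_regex(pattern):
    """Compile an IUPAC pattern; the lookahead lets overlapping sites match."""
    body = "".join(IUPAC[code] for code in pattern.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(sequence, pattern):
    """Return sorted (start, strand) hits, 0-based, on the given sequence.

    Minus-strand hits are found by scanning the forward sequence with the
    reverse-complemented pattern; they are skipped for palindromic patterns
    to avoid reporting the same site twice.
    """
    sequence = sequence.upper()
    hits = [(m.start(), "+") for m in to_regex(pattern).finditer(sequence)]
    rc = reverse_complement(pattern)
    if rc != pattern.upper():
        hits += [(m.start(), "-") for m in to_regex(rc).finditer(sequence)]
    return sorted(hits)
```

For example, `find_sites("ACGTTTCCA", "TTYCCA")` returns `[(3, "+")]`, because `Y` matches the `T` at position 5. A pattern containing a character outside the IUPAC table raises `KeyError`. A genome-scale tool would need indexing rather than a linear regex scan.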