12 research outputs found

    Signal search analysis server

    Signal search analysis is a general method to discover and characterize sequence motifs that are positionally correlated with a functional site (e.g. a transcription or translation start site). The method has played an instrumental role in the analysis of eukaryotic promoter elements. The signal search analysis server provides access to four different computer programs as well as to a large number of precompiled functional site collections. The programs offered allow: (i) the identification of non-random sequence regions under evolutionary constraint; (ii) the detection of consensus sequence-based motifs that are over- or under-represented at a particular distance from a functional site; (iii) the analysis of the positional distribution of a consensus sequence- or weight matrix-based sequence motif around a functional site; and (iv) the optimization of a weight matrix description of a locally over-represented sequence motif. These programs can be accessed at: http://www.isrec.isb-sib.ch/ssa/
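
    The idea behind item (iii), the positional distribution of a consensus motif around a functional site, can be illustrated with a minimal Python sketch. This is not the server's implementation: the function names, the toy sequences and the TATAWA consensus are assumptions chosen for illustration, and the sketch only counts exact IUPAC consensus matches by their offset from the site.

```python
from collections import Counter

# IUPAC nucleotide codes mapped to the bases each code matches.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(consensus, window):
    """Return True if `window` matches the IUPAC `consensus` at every position."""
    return len(window) == len(consensus) and all(
        base in IUPAC[code] for code, base in zip(consensus, window)
    )


def positional_distribution(sequences, site_index, consensus):
    """Count motif occurrences by start position relative to the functional site.

    `sequences` are aligned so that the functional site (e.g. a TSS) sits at
    the 0-based index `site_index` in every sequence (an assumption of this sketch).
    """
    counts = Counter()
    k = len(consensus)
    for seq in sequences:
        seq = seq.upper()
        for start in range(len(seq) - k + 1):
            if matches(consensus, seq[start:start + k]):
                counts[start - site_index] += 1
    return counts


# Made-up toy sequences, each with the hypothetical site at index 10.
toy_sequences = [
    "CCTATAAAGGACGT",
    "GCTATATAGCACTG",
    "GGCTATAAAGACGT",
]

distribution = positional_distribution(toy_sequences, site_index=10, consensus="TATAWA")
for offset in sorted(distribution):
    print(offset, distribution[offset])
```

    On this toy input the motif starts 8 bases upstream of the site in two sequences and 7 bases upstream in one. A real analysis would compare such counts against a background expectation to judge over- or under-representation.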

    EPD in its twentieth year: towards complete promoter coverage of selected model organisms

    The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters, experimentally defined by a transcription start site (TSS). Access to promoter sequences is provided by pointers to positions in the corresponding genomes. Promoter evidence comes from conventional TSS mapping experiments for individual genes, or, starting from release 73, from mass genome annotation projects. Subsets of promoter sequences with customized 5′ and 3′ extensions can be downloaded from the EPD website. The focus of current development efforts is to reach complete promoter coverage for important model organisms as soon as possible. To speed up this process, a new class of preliminary promoter entries has been introduced as of release 83, which requires less stringent admission criteria. As part of a continuous integration process, new web-based interfaces have been developed, which allow joint analysis of promoter sequences with other bioinformatics resources developed by our group, in particular programs offered by the Signal Search Analysis Server, and gene expression data stored in the CleanEx database. EPD can be accessed at
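
    Because a promoter is stored as a pointer to a TSS position rather than as a sequence, extracting a promoter with custom 5′ and 3′ extensions amounts to slicing the genome around that position and reverse-complementing for minus-strand genes. The following Python sketch shows the idea only; the function names, the 0-based coordinate convention and the toy sequence are assumptions and do not reflect EPD's actual file formats or tools.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_window(genome_seq, tss, strand, upstream, downstream):
    """Extract the promoter region around a TSS.

    `tss` is a 0-based index into `genome_seq` (an assumed convention).
    The result covers `upstream` bases 5' of the TSS, the TSS itself, and
    `downstream` bases 3' of it, reported in the orientation of transcription.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream + 1
    elif strand == "-":
        # On the minus strand, "upstream" lies at higher genomic coordinates.
        start, end = tss - downstream, tss + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    if start < 0 or end > len(genome_seq):
        raise ValueError("window extends beyond the sequence")
    window = genome_seq[start:end]
    return window if strand == "+" else reverse_complement(window)


# Made-up toy sequence with a hypothetical TSS at index 5.
toy_genome = "AACCGGTTAC"
plus_promoter = promoter_window(toy_genome, tss=5, strand="+", upstream=3, downstream=2)
minus_promoter = promoter_window(toy_genome, tss=5, strand="-", upstream=3, downstream=2)
print(plus_promoter)   # CCGGTT
print(minus_promoter)  # TAACCG
```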

    The Eukaryotic Promoter Database EPD: the impact of in silico primer extension

    The Eukaryotic Promoter Database (EPD) is an annotated non‐redundant collection of eukaryotic POL II promoters, experimentally defined by a transcription start site (TSS). There may be multiple promoter entries for a single gene. The underlying experimental evidence comes from journal articles and, starting from release 73, from 5′ ESTs of full‐length cDNA clones used for so‐called in silico primer extension. Access to promoter sequences is provided by pointers to TSS positions in nucleotide sequence entries. The annotation part of an EPD entry includes a description of the type and source of the initiation site mapping data, links to other biological databases and bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. Web‐based interfaces have been developed that enable the user to view EPD entries in different formats, to select and extract promoter sequences according to a variety of criteria and to navigate to related databases exploiting different cross‐references. Tools for analysing sequence motifs around TSSs defined in EPD are provided by the signal search analysis server. EPD can be accessed at http://www.epd.isb‐sib.c

    EPD and EPDnew, high-quality promoter resources in the next-generation sequencing era

    The Eukaryotic Promoter Database (EPD), available online at http://epd.vital-it.ch, is a collection of experimentally defined eukaryotic POL II promoters which has been maintained for more than 25 years. A promoter is represented by a single position in the genome, typically the major transcription start site (TSS). EPD primarily serves biologists interested in analysing the motif content, chromatin structure or DNA methylation status of co-regulated promoter subsets. Initially, promoter evidence came from TSS mapping experiments targeted at single genes and published in journal articles. Today, the TSS positions provided by EPD are inferred from next-generation sequencing data distributed in electronic form. Traditionally, EPD has been a high-quality database with low coverage. The focus of recent efforts has been to reach complete gene coverage for important model organisms. To this end, we introduced a new section called EPDnew, which is automatically assembled from multiple, carefully selected input datasets. As another novelty, we started to use chromatin signatures in addition to mRNA 5′ tags to locate promoters of weakly expressed genes. Regarding user interfaces, we introduced a new promoter viewer which enables users to explore promoter-defining experimental evidence in a UCSC genome browser windo

    Transcript Profiling in Host–Pathogen Interactions

    Using genomic technologies, it is now possible to address research hypotheses in the context of entire developmental or biochemical pathways, gene networks, and chromosomal location of relevant genes and their inferred evolutionary history. Through a range of platforms, researchers can survey an entire transcriptome under a variety of experimental and field conditions. Interpretation of such data has led to new insights and revealed previously undescribed phenomena. In the area of plant-pathogen interactions, transcript profiling has provided unparalleled perception into the mechanisms underlying gene-for-gene resistance and basal defense, host vs nonhost resistance, biotrophy vs necrotrophy, and pathogenicity of vascular vs nonvascular pathogens, among many others. In this way, genomic technologies have facilitated a system-wide approach to unifying themes and unique features in the interactions of hosts and pathogens

    The Eukaryotic Promoter Database: expansion of EPDnew and new promoter analysis tools

    We present an update of EPDnew (http://epd.vital-it.ch), a recently introduced new part of the Eukaryotic Promoter Database (EPD) which has been described in more detail in a previous NAR Database Issue. EPD is an old database of experimentally characterized eukaryotic POL II promoters, which are conceptually defined as transcription initiation sites or regions. EPDnew is a collection of automatically compiled, organism-specific promoter lists complementing the old corpus of manually compiled promoter entries of EPD. This new part is exclusively derived from next-generation sequencing data from high-throughput promoter mapping experiments. We report on the recent growth of EPDnew, its extension to additional model organisms and its improved integration with other bioinformatics resources developed by our group, in particular the Signal Search Analysis and ChIP-Seq web servers

    Transcriptional regulatory logic of the diurnal cycle in the mouse liver.

    Many organisms exhibit temporal rhythms in gene expression that propel diurnal cycles in physiology. In the liver of mammals, these rhythms are controlled by transcription-translation feedback loops of the core circadian clock and by feeding-fasting cycles. To better understand the regulatory interplay between the circadian clock and feeding rhythms, we mapped DNase I hypersensitive sites (DHSs) in the mouse liver during a diurnal cycle. The intensity of DNase I cleavages cycled at a substantial fraction of all DHSs, suggesting that DHSs harbor regulatory elements that control rhythmic transcription. Using chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq), we found that hypersensitivity cycled in phase with RNA polymerase II (Pol II) loading and H3K27ac histone marks. We then combined the DHSs with temporal Pol II profiles in wild-type (WT) and Bmal1-/- livers to computationally identify transcription factors through which the core clock and feeding-fasting cycles control diurnal rhythms in transcription. While a similar number of mRNAs accumulated rhythmically in Bmal1-/- compared to WT livers, the amplitudes in Bmal1-/- were generally lower. The residual rhythms in Bmal1-/- reflected transcriptional regulators mediating feeding-fasting responses as well as responses to rhythmic systemic signals. Finally, the analysis of DNase I cuts at nucleotide resolution showed dynamically changing footprints consistent with dynamic binding of CLOCK:BMAL1 complexes. Structural modeling suggested that these footprints are driven by a transient heterotetramer binding configuration at peak activity. Together, our temporal DNase I mappings allowed us to decipher the global regulation of diurnal transcription rhythms in the mouse liver

    CleanEx: a database of heterogeneous gene expression data based on a consistent gene nomenclature and linked to an improved annotation system

    Automated genome sequencing and annotation, together with large-scale gene expression measurement methods, generate a massive amount of data for model organisms. Searching for gene-specific or organism-specific information throughout all the different databases has become a very difficult task, and often results in fragmented and unrelated answers. A database that federates and integrates genomic and transcriptomic data will greatly improve search speed as well as the quality of the results by allowing a direct comparison of expression results obtained by different techniques. The main goal of this project, called the CleanEx database, is thus to provide access to public gene expression data via unique gene names and to represent heterogeneous expression data produced by different technologies in a way that facilitates joint analysis and cross-dataset comparisons. A consistent and up-to-date gene nomenclature is achieved by associating each single gene expression experiment with a permanent target identifier consisting of a physical description of the targeted RNA population or the hybridization reagent used. These targets are then mapped at regular intervals to the growing and evolving catalogues of genes from model organisms, such as human and mouse. The completely automatic mapping procedure relies partly on external genome information resources such as UniGene and RefSeq. The central part of CleanEx is a weekly built gene index containing cross-references to all public expression data already incorporated into the system. In addition, the expression target database of CleanEx provides gene mapping and quality control information for various types of experimental resources, such as cDNA clones or Affymetrix probe sets. The Affymetrix mapping files are accessible as text files, for further use in external applications, and as individual entries, via the web-based interfaces. The CleanEx web-based query interfaces offer access to individual entries via text string searches or quantitative expression criteria, as well as cross-dataset analysis tools and cross-chip gene comparison. These tools have proven to be very efficient in expression data comparison and even, to a certain extent, in detection of differentially expressed splice variants. The CleanEx flat files and tools are available online at: http://www.cleanex.isb-sib.ch/
