7 research outputs found

    Non random distribution of transcription factor binding sites in the Arabidopsis thaliana genome

    The genome of the small flowering plant Arabidopsis thaliana is almost completely sequenced and annotated and is therefore a valuable resource for bioinformatics. In the studies underlying this thesis, distribution patterns of various transcription factor binding sites (TFBSs) within the A. thaliana genome were determined. Since the genomic distribution of TFBSs may be connected with AT/GC content, the AT content of gene regions such as UTRs, introns, and exons, as well as of intergenic regions, was assessed first. The starts and ends of UTRs, introns, and exons were shown to have preferred nucleotide compositions. In a further step, a genome-wide search for TFBSs of various transcription factors (TFs) was carried out. The search was based on matrices derived, for example, from the literature. TFBS distribution patterns relative to the translation start site and within the above-mentioned gene regions were then investigated. Three distribution patterns relative to the translation start site were detected: i) upstream, ii) downstream, and iii) indifferent. Upstream was not the predominant pattern, and the distribution patterns did not depend solely on AT/GC content. A further analysis showed that some TFBSs (e.g. those of AtMYB77) are enriched at conserved positions relative to the above-mentioned gene regions in different genes. This conserved location may indicate similar expression patterns of the putative target genes. To test this, the expression patterns of the candidate target genes were examined using microarray gene expression data. Significant correlations were found between the TFs AtMYB77, AtMYB84, AGL15, PIF3, and ATHB5 and their putative target genes, indicating possible co-regulation of these genes.
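The abstract does not say which correlation measure was applied to the microarray data. As a minimal sketch, assuming a Pearson correlation between the expression profile of a TF and that of a putative target gene across the same set of experiments, the comparison could look like this. The function name and any expression values passed to it are hypothetical and are not taken from the thesis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)
```

A value close to 1 for a TF and a candidate target would be consistent with co-regulation, but whether it counts as significant depends on the number of experiments and on the test used, which this sketch does not cover.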

    AthaMap web tools for the analysis and identification of co-regulated genes

    The AthaMap database generates a map of cis-regulatory elements for the whole Arabidopsis thaliana genome. This database has been extended by new tools to identify common cis-regulatory elements in specific regions of user-provided gene sets. A resulting table displays all cis-regulatory elements annotated in AthaMap, including positional information relative to the respective gene. Further tables give overviews of the number of individual transcription factor binding sites (TFBS) present and of the TFBS common to the whole set of genes. Overrepresented cis-elements are easily identified. These features were used to detect specific enrichment of drought-responsive elements in cold-induced genes. For identification of co-regulated genes, the output table of the colocalization function was extended to show the closest genes and their relative distances to the colocalizing TFBS. Gene sets determined by this function can be used for a co-regulation analysis in microarray gene expression databases such as Genevestigator or PathoPlant. Additional improvements of AthaMap include display of the gene structure in the sequence window and a significant data increase. AthaMap is freely available at

    AthaMap web tools for database-assisted identification of combinatorial cis-regulatory elements and the display of highly conserved transcription factor binding sites in Arabidopsis thaliana

    The AthaMap database generates a map of cis-regulatory elements for the Arabidopsis thaliana genome. AthaMap contains more than 7.4 × 10^6 putative binding sites for 36 transcription factors (TFs) from 16 different TF families. A newly implemented functionality allows the display of subsets of more highly conserved transcription factor binding sites (TFBSs). Furthermore, a web tool was developed that permits a user-defined search for co-localizing cis-regulatory elements. The user can individually specify the level of conservation for each TFBS and a spacer range between them. This web tool was employed for the identification of co-localizing sites of known interacting TFs and TFs containing two DNA-binding domains. More than 1.8 × 10^5 combinatorial elements were annotated in the AthaMap database. These elements can also be used to identify more complex co-localizing elements consisting of up to four TFBSs. The AthaMap database and the connected web tools are a valuable resource for the analysis and the prediction of gene expression regulation at
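The co-localization search described above can be sketched as a pairwise comparison of two site lists. This is an illustrative sketch and not AthaMap's implementation: the `Site` record, the function name, and the use of a minimum score as the "level of conservation" are assumptions made for the example.

```python
from typing import NamedTuple


class Site(NamedTuple):
    factor: str   # name of the transcription factor
    start: int    # 0-based start position, inclusive
    end: int      # 0-based end position, exclusive
    score: float  # match score, used here as a stand-in for conservation


def colocalizing_pairs(sites_a, sites_b, min_spacer, max_spacer,
                       min_score_a=0.0, min_score_b=0.0):
    """Return (site_a, site_b, spacer) for non-overlapping pairs whose
    spacer (bases between the two sites) lies in [min_spacer, max_spacer]."""
    pairs = []
    for a in sites_a:
        if a.score < min_score_a:
            continue
        for b in sites_b:
            if b.score < min_score_b:
                continue
            if a.end <= b.start:      # a lies before b
                spacer = b.start - a.end
            elif b.end <= a.start:    # b lies before a
                spacer = a.start - b.end
            else:                     # overlapping sites are skipped
                continue
            if min_spacer <= spacer <= max_spacer:
                pairs.append((a, b, spacer))
    return pairs
```

The nested loop is quadratic in the number of sites. For genome-scale site lists, sorting both lists by position and sweeping through them would be the more practical choice.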

    AthaMap: an online resource for in silico transcription factor binding sites in the Arabidopsis thaliana genome

    Gene expression is controlled mainly by the binding of transcription factors to regulatory sequences. To generate a genomic map of regulatory sequences, the Arabidopsis thaliana genome was screened for putative transcription factor binding sites. Using publicly available data from the TRANSFAC database and from publications, alignment matrices for 23 transcription factors of 13 different factor families were used with the pattern search program Patser to determine the genomic positions of more than 2.4 × 10^6 putative binding sites. Due to the dense clustering of genes and the observation that regulatory sequences are not restricted to upstream regions, the prediction of binding sites was performed for the whole genome. The genomic positions and the underlying data were imported into the newly developed AthaMap database. These data can be accessed by positional information or by the Arabidopsis Genome Initiative identification number. Putative binding sites are displayed in the defined region. Data on the matrices used and on the thresholds applied in these screens are given in the database. Considering the high density of sites, it will be a valuable resource for generating models of gene expression regulation. The data are available at http://www.athamap.de
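A matrix-based screen of this kind can be illustrated with a small position weight matrix scan. The sketch below is not Patser and does not reproduce its scoring. It assumes a log-odds score against a uniform background, scans the forward strand only, and uses an invented count matrix (`COUNTS`) that is not taken from TRANSFAC or AthaMap.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
# The counts are invented for illustration only.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]


def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Convert base counts to log2 odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, matched_sequence, score) for every forward-strand
    window whose summed score is at least `threshold`."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits
```

Calling `scan("TTACGTAA", log_odds_matrix(COUNTS), 4.0)` reports the single window `ACGT` at position 2. A whole-genome screen would also scan the reverse complement and would likely use a background that reflects the AT-rich composition of the genome.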

    IOS Press AthaMap: From in silico Data to Real Transcription Factor Binding Sites

    AthaMap generates a map of cis-regulatory sequences for the whole Arabidopsis thaliana genome. AthaMap was initially developed by matrix-based detection of putative transcription factor binding sites (TFBS), mostly determined from random binding site selection experiments. Experimentally verified TFBS have now also been included for 48 different Arabidopsis thaliana transcription factors (TF). Based on these sequences, 89,416 very similar putative TFBS were determined within the genome of A. thaliana and annotated in AthaMap. Matrix- and single sequence-based binding sites can be included in colocalization analysis for the identification of combinatorial cis-regulatory elements. As an example, putative target genes of the WRKY18 transcription factor, which is involved in plant-pathogen interaction, were determined. New functions of AthaMap include descriptions for all annotated Arabidopsis thaliana genes and direct links to TAIR, TIGR and MIPS. Transcription factors used in the binding site determination are linked to the TAIR and TRANSFAC databases. AthaMap is freely available at