15 research outputs found

    Genome wide prediction of HNF4α functional binding sites by the use of local and global sequence context

    An application of machine learning algorithms enables prediction of the functional context of transcription factor binding sites in the human genome.

    Genetic networks of antibacterial responses of eukaryotic cells. Bioinformatics analysis and modeling

    This work describes the development of new methods for constructing promoter models, a necessary step in the construction of regulatory networks. Identifying characteristic promoter features shows the role of specific transcription factors (TFs) in triggering the response, which in turn sheds light on the signaling pathways that activate these TFs. Combining reported results of microarray analyses with other available information about the genes expressed in the cellular systems under consideration, we search for distinguishing features of the promoters of coexpressed genes. Applying such promoter models makes it possible to identify additional candidate genes belonging to the same regulatory network. Four novel approaches are presented in this work: (i) the subtractive approach to matrix generation; (ii) the distance distribution approach; (iii) the "seed" sets approach; (iv) the complementary pairs approach. These approaches help to solve serious problems in promoter model construction, such as the doubtful reliability of positive training sets ("seed" sets approach) and the lack of knowledge about the exact signaling pathways that trigger gene expression (complementary pairs approach). The subtractive approach to matrix generation allows positional weight matrices (PWMs) to be refined for heterogeneous sets of binding sites, and thus improves the PWM search for single transcription factor binding sites (TFBSs). A significant improvement in the specificity of promoter analysis has been achieved by applying statistical methods that characterize TFBS combinations at over-represented distances, rather than merely identifying single potential TFBSs (distance distribution approach). The newly developed methods were applied to the description of four defensive eukaryotic systems in terms of transcription regulation. The resulting models gave us better insights into the pathways of the corresponding signaling networks.

    The Use of Functional Genomics in Synthetic Promoter Design


    Prediction of synergistic transcription factors by function conservation

    A new strategy is proposed for identifying synergistic transcription factors by function conservation, leading to the identification of 51 homotypic transcription-factor combinations.

    Most transcription factor binding sites are in a few mosaic classes of the human genome

    Background: Many algorithms for finding transcription factor binding sites have concentrated on characterising the binding site itself, and these algorithms produce a large number of false positive sites. The DNA sequence that does not bind has been modeled only to the extent necessary to complement this formulation. Results: We find that the human genome may be described by 19 pairs of mosaic classes, each defined by its base frequencies (or, more precisely, by the frequencies of doublets), so that typically a run of 10 to 100 bases belongs to the same class. Most experimentally verified binding sites are in the same four pairs of classes. In our sample of seventeen transcription factors, taken from different families of transcription factors, the average proportion of sites in this subset of classes was 75%, with values for individual factors ranging from 48% to 98%. By contrast, these same classes contain only 26% of the bases of the genome and only 31% of occurrences of the motifs of these factors, that is, places where one might expect the factors to bind. These results are not a consequence of the class composition in promoter regions. Conclusions: This method of analysis will help to find transcription factor binding sites and assist with the problem of false positives. These results also imply a profound difference between the mosaic classes.