135 research outputs found

    Transcriptional regulatory codes underlying Arabidopsis stress responses

    Plant adaptation to stress is dependent upon the initiation of molecular signalling networks that regulate the expression of stress-related genes. By examining high-resolution microarray datasets it has been possible to track gene expression changes over time during senescence and in response to infection by the fungal pathogen Botrytis cinerea in the model organism Arabidopsis thaliana. Dramatic variations in gene expression are observed at the onset of stress, with different groups of genes showing different expression time-courses. This observation must, in large part, be due to the action of different transcription factors (TFs) binding to the cis-regulatory DNA in the promoters of genes in each group, and it is this regulatory code that underpins the gene regulatory networks that regulate stress responses. This thesis presents an interdisciplinary investigation of the regulatory codes that are responsible for controlling plant stress responses. Computational analysis of non-coding sequences provides a powerful approach to identify patterns within DNA that may function to regulate gene expression. This thesis covers the development of Analysis of Plant Promoter-Linked Elements (APPLES), an object-orientated software framework for the analysis of non-coding DNA. Within this environment, methods were developed to probe the regulatory codes that exist within these non-coding sequences and identify regulatory motifs that may function to regulate stress responses in Arabidopsis. APPLES methods were used to identify a novel motif that is likely to play a role in regulating drought responses in Arabidopsis, with experimental approaches providing support for this view. Using known motifs that describe previously characterised TF binding sites, it was possible to identify motifs that are associated with clusters of co-regulated genes identified from the senescence and Botrytis microarray time-course datasets.
This analysis revealed cis-regulatory elements that may contribute to generating the observed expression patterns. In a contrasting approach to in silico identification of regulatory elements, the Yeast-1-Hybrid (Y1H) assay was used to experimentally identify interactions between TFs and non-coding DNA. The use of a TF library allowed the ability of approximately 1400 Arabidopsis TFs to interact with a given DNA sequence to be tested in a single assay. Using the stress-associated ANAC092 promoter as a test case, it was possible to use this high-throughput procedure to identify TFs that can bind to the promoter of this gene. This high-throughput Y1H system was then used to perform a detailed mapping of protein-DNA interactions that can occur across the core promoters of three highly related stress-inducible TF-encoding genes, ANAC019, ANAC055 and ANAC072. Microarrays were used to assess the regulatory consequence of a subset of these interactions by perturbing the expression of interacting TFs and observing the effect on target gene expression during multiple stresses. This approach confirmed predicted regulatory relationships and therefore enhanced the current understanding of the transcriptional regulatory networks that operate during stress responses in Arabidopsis.
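
One common way to ask whether a known motif is associated with a cluster of co-regulated genes is a hypergeometric over-representation test on promoter hits. The abstract does not say which statistic APPLES uses, so the following is only an illustrative sketch under that assumption; the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, genes_with_motif, cluster_size, cluster_hits):
    """Upper-tail hypergeometric p-value (illustrative, not the APPLES method).

    Probability of seeing at least `cluster_hits` motif-bearing promoters when
    `cluster_size` genes are drawn without replacement from `total_genes`
    genes, of which `genes_with_motif` carry the motif.
    """
    max_hits = min(cluster_size, genes_with_motif)
    all_draws = comb(total_genes, cluster_size)
    tail = sum(
        comb(genes_with_motif, k)
        * comb(total_genes - genes_with_motif, cluster_size - k)
        for k in range(cluster_hits, max_hits + 1)
    )
    return tail / all_draws


# Hypothetical counts: 1000 genes on the array, 100 with the motif,
# and a 50-gene cluster in which 15 promoters contain the motif.
p_value = hypergeom_enrichment_p(1000, 100, 50, 15)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value would suggest the motif occurs in the cluster more often than expected by chance; in practice the result would still need correcting for the number of motifs and clusters tested.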

    Transcriptional networks controlling the cell cycle.

    In this work, we map the transcriptional targets of 107 previously identified Drosophila genes whose loss caused the strongest cell-cycle phenotypes in a genome-wide RNA interference screen and mine the resulting data computationally. Besides confirming existing knowledge, the analysis revealed several regulatory systems, among which were two highly specific and interconnected feedback circuits, one between the ribosome and the proteasome that controls overall protein homeostasis, and the other between the ribosome and Myc/Max that regulates the protein synthesis capacity of cells. We also identified a set of genes that alter the timing of mitosis without affecting gene expression, indicating that the cyclic transcriptional program that produces the components required for cell division can be partially uncoupled from the cell division process itself. These genes all have a function in a pathway that regulates the phosphorylation state of Cdk1. We provide evidence showing that this pathway is involved in regulation of cell size, indicating that a Cdk1-regulated cell size checkpoint exists in metazoans.

    Vertical integration, global and modular analysis of molecular interaction networks of Escherichia coli

    Phenotypical characteristics of cells often arise from interactions between genes, proteins and metabolites. For a complete understanding of cellular processes and their regulation it is necessary to vertically integrate the molecular networks into an interactome and understand its global structure. In this thesis, an integrated molecular network (IMN) of Escherichia coli was reconstructed which comprises metabolic reactions, metabolite-protein interactions (MPI) and transcriptional regulation data. Three fundamental aspects of cellular processes were studied: (i) feedback regulation of gene expression, (ii) network motifs and (iii) global organization. Intriguingly, this work found that feedback regulation of gene expression in E. coli is mediated by MPIs, and 69 such feedback loops (FBLs) were identified. Motif studies identified the FBL as a significant pattern and detected 12 other three-node motifs comprising five composite motifs. Connectivity analysis discovered the existence of a bow-tie architecture, and motif analysis in the bow-tie components revealed that 77% of them interconnect to form the giant strong component, which is the backbone of the bow-tie. Further in this work, cluster and modular analyses were performed on the integrated molecular network of E. coli constructed from a diverse collection of datasets involving metabolic reactions, metabolite-protein interactions and transcriptional regulation. Modularity was used as the parameter of an appropriate, fast and robust method for clustering such a heterogeneous molecular circuitry of interactions. This work revealed that clustering this complex network significantly grouped together genes of known similar function in well-defined physiologically related modules. Identification of network motifs and correlating them with the modules of highly connected nodes may define their potential functional role.
To this end, twelve highly significant three-node network motifs, among which four are composite network motifs comprising multiple types of interactions, were detected and analyzed. Distribution analysis of these motifs within and between the various functional modules supported the view that these motifs represent basic patterns of regulation and organization of genes into modules. This thesis illustrates the potential of data integration of molecular networks to detect the feedback interactions in regulatory networks, and of its global analysis for better understanding cellular processes and their regulation. Moreover, this work also presents a basic framework for detecting functional modules and their interaction with various motifs in an integrated E. coli system.
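
In a bow-tie architecture, the giant strong component is the largest set of nodes that can all reach one another along directed edges, which is also where feedback loops live. A minimal sketch of how it could be located follows, using Kosaraju's algorithm; the abstract does not name the algorithm used in the thesis, and the toy network and node names here are hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Kosaraju's algorithm on a directed edge list; returns a list of node sets."""
    graph, reverse = defaultdict(list), defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: record nodes in order of depth-first completion on the forward graph.
    visited, order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: this node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing completion order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, todo = set(), [start]
        while todo:
            node = todo.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(component)
    return components


# Hypothetical toy network: a TF gene regulates an enzyme, the enzyme makes a
# metabolite, and the metabolite binds the TF (a metabolite-protein interaction),
# closing a feedback loop. "input" and "output" sit outside the loop.
toy_edges = [
    ("input", "tf_gene"),
    ("tf_gene", "enzyme"),
    ("enzyme", "metabolite"),
    ("metabolite", "tf_gene"),
    ("enzyme", "output"),
]
components = strongly_connected_components(toy_edges)
giant = max(components, key=len)
print(sorted(giant))
```

Nodes that can reach the giant component but are not in it form the "in" side of the bow-tie, and nodes reachable from it form the "out" side.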

    Evaluation of statistical correlation and validation methods for construction of gene co-expression networks

    High-throughput technologies such as microarrays have led to the rapid accumulation of large scale genomic data providing opportunities to systematically infer gene function and co-expression networks. Typical steps of co-expression network analysis using microarray data consist of estimation of pair-wise gene co-expression using some similarity measure, construction of co-expression networks, identification of clusters of co-expressed genes and post-cluster analyses such as cluster validation. This dissertation is primarily concerned with development and evaluation of approaches for the first and the last steps – estimation of gene co-expression matrices and validation of network clusters. Since clustering methods are not a focus, only a paraclique clustering algorithm will be used in this evaluation. First, a novel Bayesian approach is presented for combining the Pearson correlation with prior biological information from Gene Ontology, yielding a biologically relevant estimate of gene co-expression. The addition of biological information by the Bayesian approach reduced noise in the paraclique gene clusters as indicated by high silhouette and increased homogeneity of clusters in terms of molecular function. Standard similarity measures including correlation coefficients from Pearson, Spearman, Kendall’s Tau, Shrinkage, Partial, and Mutual information, and Euclidean and Manhattan distance measures were evaluated. Based on quality metrics such as cluster homogeneity and stability with respect to ontological categories, clusters resulting from partial correlation and mutual information were more biologically relevant than those from any other correlation measures. Second, statistical quality of clusters was evaluated using approaches based on permutation tests and Mantel correlation to identify significant and informative clusters that capture most of the covariance in the dataset. 
Third, the utility of statistical contrasts was studied for classification of temporal patterns of gene expression. Specifically, polynomial and Helmert contrast analyses were shown to provide a means of labeling the co-expressed gene sets because they showed similar temporal profiles.
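
The first steps described above (pair-wise similarity, then network construction) can be sketched as follows. This is a minimal illustration using plain Pearson correlation and a hard cut-off; the gene names, expression values and the 0.8 threshold are hypothetical, and neither the Bayesian estimate nor the paraclique clustering from the dissertation is reproduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.8):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold."""
    edges = []
    for gene1, gene2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene1], profiles[gene2])
        if abs(r) >= threshold:
            edges.append((gene1, gene2, r))
    return edges


# Hypothetical expression profiles over four time points.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}
for gene1, gene2, r in coexpression_edges(profiles):
    print(f"{gene1} - {gene2}: r = {r:+.2f}")
```

Swapping `pearson` for another similarity measure (for example Spearman or mutual information) changes only the edge weights, which is what makes side-by-side evaluation of the measures straightforward.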

    Introducing biological information in the superparamagnetic clustering algorithm of gene expression data

    Thesis (Doctorate in Nanosciences and Nanotechnology). Microarray technology allows researchers to examine the transcriptional activity of thousands of genes under different conditions. Microarrays have been used, for example, to discover key genes involved in cellular processes, disease classification, drug development and gene function annotation. Clustering algorithms have become an important step in microarray data analysis for discovering biologically relevant information. We modify the superparamagnetic clustering algorithm (SPC) by adding an extra weight to the interaction formula that accounts for genes regulated by the same transcription factor. This combined similarity measure for two genes relies on two types of information: their expression profiles generated by a microarray, and the number of shared transcription factors that have been shown experimentally to bind to their promoters. With this modified algorithm, which we call SPCTF, we analyze the Spellman et al. microarray data for cell cycle genes in yeast (Saccharomyces cerevisiae) and find clusters with more members than those obtained with the original SPC algorithm. Some of the genes incorporated by SPCTF were not detected at first by Spellman et al. but were later identified by other studies, whereas several others still remain unclassified. The clusters composed mostly of these unidentified genes were analyzed with MUSA, the "motif finding using an unsupervised approach" algorithm, and this allowed us to select the clusters whose genes contain cell cycle transcription factor binding sites as worthy of further experimental study, because they would probably lead to new cell cycle genes. Our idea of introducing the available information about transcription factors to optimize gene classification could be implemented for other distance-based clustering algorithms.
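
The abstract does not give the modified interaction formula, so the following is only a hypothetical sketch of the general idea: a distance-based coupling between two genes that is boosted by the number of transcription factors they share. The Gaussian form of the expression term, the multiplicative boost, the parameter names `scale` and `tf_weight`, and the placeholder data are all assumptions made for illustration.

```python
from math import exp, sqrt


def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def coupling_with_tf_weight(expr_i, expr_j, tfs_i, tfs_j, scale=1.0, tf_weight=0.5):
    """Hypothetical combined coupling, NOT the published SPCTF formula.

    The expression term decays with the distance between profiles; the result
    is multiplied by (1 + tf_weight * number of shared transcription factors).
    """
    distance = euclidean(expr_i, expr_j)
    expression_term = exp(-(distance ** 2) / (2 * scale ** 2))
    shared = len(set(tfs_i) & set(tfs_j))
    return expression_term * (1.0 + tf_weight * shared)


# Placeholder profiles and TF annotations (not real yeast data).
profile_1 = [0.2, 0.9, 0.4]
profile_2 = [0.3, 0.8, 0.5]
without_tfs = coupling_with_tf_weight(profile_1, profile_2, [], [])
with_tfs = coupling_with_tf_weight(profile_1, profile_2, ["TF_A", "TF_B"], ["TF_B", "TF_C"])
print(without_tfs, with_tfs)
```

With a weight of this kind, two genes whose expression profiles are only moderately similar can still end up in the same cluster if they share regulators, which is consistent with the larger clusters the abstract reports.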