14 research outputs found

    Extracting expression modules from perturbational gene expression compendia

    Background: Compendia of gene expression profiles under chemical and genetic perturbations constitute an invaluable resource from a systems biology perspective. However, the perturbational nature of such data imposes specific challenges on the computational methods used to analyze them. In particular, traditional clustering algorithms have difficulties in handling one of the prominent features of perturbational compendia, namely partial coexpression relationships between genes. Biclustering methods on the other hand are specifically designed to capture such partial coexpression patterns, but they show a variety of other drawbacks. For instance, some biclustering methods are less suited to identify overlapping biclusters, while others generate highly redundant biclusters. Also, none of the existing biclustering tools takes advantage of the staple of perturbational expression data analysis: the identification of differentially expressed genes.
    Results: We introduce a novel method, called ENIGMA, that addresses some of these issues. ENIGMA leverages differential expression analysis results to extract expression modules from perturbational gene expression data. The core parameters of the ENIGMA clustering procedure are automatically optimized to reduce the redundancy between modules. In contrast to the biclusters produced by most other methods, ENIGMA modules may show internal substructure, i.e. subsets of genes with distinct but significantly related expression patterns. The grouping of these (often functionally) related patterns in one module greatly aids in the biological interpretation of the data. We show that ENIGMA outperforms other methods on artificial datasets, using a quality criterion that, unlike other criteria, can be used for algorithms that generate overlapping clusters and that can be modified to take redundancy between clusters into account. Finally, we apply ENIGMA to the Rosetta compendium of expression profiles for Saccharomyces cerevisiae and we analyze one pheromone response-related module in more detail, demonstrating the potential of ENIGMA to generate detailed predictions.
    Conclusion: It is increasingly recognized that perturbational expression compendia are essential to identify the gene networks underlying cellular function, and efforts to build these for different organisms are currently underway. We show that ENIGMA constitutes a valuable addition to the repertoire of methods to analyze such data.

    Development of mathematical methods for modeling biological systems

    A comprehensive evaluation of module detection methods for gene expression data

    A critical step in the analysis of large genome-wide gene expression datasets is the use of module detection methods to group genes into co-expression modules. Because of limitations of classical clustering methods, numerous alternative module detection methods have been proposed, which improve upon clustering by handling co-expression in only a subset of samples, modelling the regulatory network, and/or allowing overlap between modules. In this study we use known regulatory networks to perform a comprehensive and robust evaluation of these different methods. Overall, decomposition methods outperform all other strategies, while we do not find a clear advantage of biclustering and network inference-based approaches on large gene expression datasets. Using our evaluation workflow, we also investigate several practical aspects of module detection, such as parameter estimation and the use of alternative similarity measures, and conclude with recommendations for the further development of these methods.

    Transcriptome-based predictive modeling approaches in Arabidopsis thaliana

    Network types and their application in natural variation studies in plants

    We are in the age of data-driven biology. Not even a decade after the invention of high-throughput sequencing technologies, there are methods that accurately monitor DNA polymorphisms, transcription profiles, methylation states, transcription factor binding sites, chromatin compactness, nucleosome positions, dynamic histone marks, and so on. We are starting to generate comparable amounts of protein or metabolite data. A key issue is how we are going to make sense of all this information. Network analysis is the most promising method to integrate, query and display large amounts of data for human interpretation. This review briefly summarizes the basic types of networks, their properties and limitations. In addition, I introduce the application of networks to the study of the molecular mechanisms behind natural phenotypic variation.

    Using single‐plant‐omics in the field to link maize genes to functions and phenotypes

    Most of our current knowledge on plant molecular biology is based on experiments in controlled laboratory environments. However, translating this knowledge from the laboratory to the field is often not straightforward, in part because field growth conditions are very different from laboratory conditions. Here, we test a new experimental design to unravel the molecular wiring of plants and study gene-phenotype relationships directly in the field. We molecularly profiled a set of individual maize plants of the same inbred background grown in the same field and used the resulting data to predict the phenotypes of individual plants and the function of maize genes. We show that the field transcriptomes of individual plants contain as much information on maize gene function as traditional laboratory-generated transcriptomes of pooled plant samples subject to controlled perturbations. Moreover, we show that field-generated transcriptome and metabolome data can be used to quantitatively predict individual plant phenotypes. Our results show that profiling individual plants in the field is a promising experimental design that could help narrow the lab-field gap.