
    Identifying modular function via edge annotation in gene correlation networks using Gene Ontology search

    Correlation networks provide a powerful tool for analyzing large sets of biological information. This method of high-throughput data modeling has important implications for uncovering novel knowledge of cellular function. Previous studies on other types of network models (protein-protein interaction networks, metabolomes, etc.) have demonstrated relationships between network structures and the organization of cellular function. Studies with correlation networks further confirm the existence of such a relationship between network structure and biological function. However, correlation networks are typically noisy, and the identified network structures, such as clusters, must be further investigated to verify actual cellular function. This is traditionally done using Gene Ontology enrichment of the genes in that cluster. In this study, a novel method to identify common cluster functions in correlation networks is proposed, which annotates edges rather than nodes, as the traditional analysis does. The results obtained using the proposed method reveal functional relationships in clusters that are not visible with the traditional approach.
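
    The contrast between node annotation and edge annotation can be illustrated with a minimal sketch. It assumes, purely for illustration, that an edge is annotated with the terms shared by its two endpoint genes; the abstract does not state how edges are annotated, and the gene names, term labels, and function names below are hypothetical.

```python
from collections import Counter


def annotate_edges(edges, gene_terms):
    """Label each edge with the annotation terms shared by both endpoint genes.

    Assumption: "edge annotation" is modeled here as a plain set intersection.
    """
    return {
        (u, v): gene_terms.get(u, set()) & gene_terms.get(v, set())
        for u, v in edges
    }


def cluster_edge_summary(edge_annotations):
    """Count how many edges in a cluster carry each shared term."""
    counts = Counter()
    for terms in edge_annotations.values():
        counts.update(terms)
    return counts


# Hypothetical toy cluster: four genes with made-up annotation terms.
gene_terms = {
    "geneA": {"term:apoptosis", "term:signaling"},
    "geneB": {"term:apoptosis"},
    "geneC": {"term:signaling", "term:apoptosis"},
    "geneD": {"term:transport"},
}
cluster_edges = [("geneA", "geneB"), ("geneA", "geneC"), ("geneC", "geneD")]

edge_annotations = annotate_edges(cluster_edges, gene_terms)
summary = cluster_edge_summary(edge_annotations)
```

    In this toy cluster the edge between geneC and geneD carries no shared term, which a purely node-based enrichment would not show directly.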

    Identifying aging-related genes in mouse hippocampus using gateway nodes

    BACKGROUND: High-throughput studies continue to produce volumes of metadata representing valuable sources of information to better guide biological research. With a stronger focus on data generation, analysis models that can readily identify actual signals have not received the same level of attention. This is due in part to high levels of noise and data heterogeneity, along with a lack of sophisticated algorithms for mining useful information. Networks have emerged as a powerful tool for modeling high-throughput data because they are capable of representing not only individual biological elements but also different types of relationships en masse. Moreover, well-established graph theoretic methodology can be applied to network models to increase efficiency and speed of analysis. In this project, we propose a network model that examines temporal data from mouse hippocampus at the transcriptional level via correlation of gene expression. Using this model, we formally define the concept of “gateway” nodes, loosely defined as nodes representing genes co-expressed in multiple states. We show that the proposed network model allows us to identify target genes implicated in hippocampal aging-related processes. RESULTS: By mining gateway genes related to hippocampal aging from networks built from gene expression in young and middle-aged mice, we provide a proof of concept for the existence and importance of gateway nodes. Additionally, these results highlight how network analysis can supplement traditional statistical analysis of differentially expressed genes. Finally, we use the gateway nodes identified by our method, together with functional databases and literature, to propose new targets for the study of aging in the mouse hippocampus. CONCLUSIONS: This research highlights the need for methods of temporal comparison using network models and provides a systems biology approach to extracting information from correlation networks of gene expression. Our results identify a number of genes previously implicated in the aging mouse hippocampus that are related to synaptic plasticity and apoptosis. Additionally, this model identifies a novel set of aging genes previously uncharacterized in the hippocampus. This research can be viewed as a first step toward identifying the processes behind comparative experiments in aging, and it is applicable to any type of temporal multi-state network.
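
    The loose definition of gateway nodes given above (genes co-expressed in multiple states) can be sketched as follows. This is a simplified reading, not the formal definition from the paper: it builds one thresholded Pearson correlation network per state and keeps the genes that have at least one edge in every state. The expression values, gene names, and the 0.9 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as 0.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def correlation_network(expression, threshold):
    """Return edges (gene pairs) whose absolute correlation meets the threshold."""
    return {
        (g1, g2)
        for g1, g2 in combinations(sorted(expression), 2)
        if abs(pearson(expression[g1], expression[g2])) >= threshold
    }


def gateway_candidates(networks):
    """Genes with at least one co-expression edge in every state's network."""
    per_state = [{g for edge in edges for g in edge} for edges in networks.values()]
    return set.intersection(*per_state) if per_state else set()


# Hypothetical expression profiles (one list of samples per gene, per state).
young = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 1, 3, 2]}
middle_aged = {"g1": [1, 2, 3, 4], "g2": [3, 1, 4, 2], "g3": [2, 4, 6, 8]}

networks = {
    "young": correlation_network(young, threshold=0.9),
    "middle_aged": correlation_network(middle_aged, threshold=0.9),
}
gateways = gateway_candidates(networks)
```

    Here g1 is co-expressed with g2 in the young network and with g3 in the middle-aged network, so it is the only candidate that links both states.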

    Inferring the functions of longevity genes with modular subnetwork biomarkers of Caenorhabditis elegans aging

    An algorithm for determining networks from gene expression data enables the identification of genes potentially linked to aging in worms

    Identifying functionally and topologically cohesive modules in protein interaction networks

    Abstract unavailable; please refer to the PDF

    Prior knowledge guided active modules identification: an integrated multi-objective approach

    BACKGROUND: An active module, defined as an area in a biological network that shows striking changes in molecular activity or phenotypic signatures, is important for revealing dynamic and process-specific information that is correlated with cellular or disease states. METHODS: A prior-information-guided active module identification approach is proposed to detect modules that are both active and enriched by prior knowledge. We formulate the active module identification problem as a multi-objective optimisation problem with two conflicting objectives: maximising the coverage of known biological pathways and maximising the activity of the active module. The network is constructed from a protein-protein interaction database. A beta-uniform-mixture model is used to estimate the distribution of p-values and generate scores for activity measurement from microarray data. A multi-objective evolutionary algorithm is used to search for Pareto optimal solutions. We also incorporate a novel constraint based on algebraic connectivity to ensure the connectedness of the identified active modules. RESULTS: Application of the proposed algorithm to a small yeast molecular network shows that it can identify modules with high activity and with more cross-talk nodes between related functional groups. The Pareto solutions generated by the algorithm offer different trade-offs between prior knowledge and novel information from data. The approach is then applied to microarray data from diclofenac-treated yeast cells to build a network and identify modules that elucidate the molecular mechanisms of diclofenac toxicity and resistance. Gene ontology analysis is applied to the identified modules for biological interpretation. CONCLUSIONS: Integrating knowledge of functional groups into the identification of active modules is an effective method and provides flexible control of the balance between a purely data-driven method and prior information guidance.
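
    The two ingredients named in the METHODS, Pareto optimality over two maximised objectives and a connectedness requirement, can be sketched in a few lines. This is not the authors' evolutionary algorithm: the candidate modules and their scores are hypothetical, and connectedness is checked with a plain graph traversal as a stand-in for the algebraic connectivity constraint (a graph is connected exactly when its algebraic connectivity is positive).

```python
def dominates(a, b):
    """True if score tuple a is no worse than b everywhere and better somewhere.

    Both objectives are assumed to be maximised.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored):
    """Keep the candidates that no other candidate dominates."""
    return {
        name: score
        for name, score in scored.items()
        if not any(
            dominates(other_score, score)
            for other, other_score in scored.items()
            if other != name
        )
    }


def is_connected(nodes, edges):
    """Check that every node is reachable from an arbitrary start node."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return seen == nodes


# Hypothetical candidates scored as (pathway coverage, module activity).
candidates = {
    "m1": (0.9, 0.2),
    "m2": (0.5, 0.6),
    "m3": (0.4, 0.5),  # dominated by m2 on both objectives
    "m4": (0.1, 0.95),
}
front = pareto_front(candidates)
```

    The surviving candidates m1, m2, and m4 represent different trade-offs between prior knowledge and data-driven activity; a connectedness check such as `is_connected` would be applied to each candidate's node and edge sets before scoring.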

    On Mining Biological Signals Using Correlation Networks

    Correlation networks have been used to analyze and model high-throughput biological data, such as gene expression from microarray or RNA-seq assays. Typically in biological network modeling, structures can be mined from these networks that represent biological functions; for example, a cluster of proteins in an interactome can represent a protein complex. In correlation networks built from high-throughput gene expression data, it has often been speculated or even assumed that clusters represent sets of genes that are coregulated. This research aims to validate this concept using network systems biology and data mining, by identifying correlation network clusters via multiple clustering approaches and cross-validating regulatory elements in these clusters via motif-finding software. The results show that the majority (81-100%) of genes in any given cluster share at least one predicted transcription factor binding site. With this in mind, new regulatory relationships can be proposed using known transcription factors and their binding sites by integrating regulatory information and the network model itself.

    Development and Application of Comparative Gene Co-expression Network Methods in Brachypodium distachyon

    Gene discovery and characterization is a long and labor-intensive process. Gene co-expression network analysis is a long-standing, powerful approach that can strongly enrich signals within gene expression datasets to predict genes critical for many cellular functions. Leveraging this approach with a large number of transcriptome datasets does not yield a concomitant increase in network granularity. Independently generated datasets that describe gene expression in various tissues, developmental stages, times of day, and environments can carry conflicting co-expression signals. The gene expression responses of the model C3 grass Brachypodium distachyon to abiotic stress are characterized by a co-expression-based analysis, identifying 22 modules of genes, annotated with putative DNA regulatory elements and functional terms. A great deal of co-expression elasticity is found among the genes characterized therein. An algorithm, dGCNA, designed to determine statistically significant changes in gene-gene co-expression relationships is presented. The algorithm is demonstrated on the very well-characterized circadian system of Arabidopsis thaliana, and identifies potential strong signals of molecular interactions between a specific transcription factor and putative target gene loci. Lastly, this network comparison approach based on edge-wise similarities is applied to many pairwise comparisons of independent microarray datasets, to show the utility of fine-grained network comparison over amassing as large a dataset as possible. This approach identifies a set of 182 gene loci which are differentially expressed under drought stress, change their co-expression strongly under loss of thermocycles or high-salinity stress, and are associated with cell-cycle and DNA replication functions. This set of genes provides excellent candidates for the generation of rhythmic growth under thermocycles in Brachypodium distachyon.
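
    To give a feel for what testing a change in a gene-gene co-expression relationship can look like, the sketch below compares two correlation coefficients with the standard Fisher z-transformation. This is a textbook test swapped in for illustration; the abstract does not describe how dGCNA computes significance, and the function name, correlations, and sample sizes are hypothetical.

```python
from math import atanh, erfc, sqrt


def correlation_change_pvalue(r1, n1, r2, n2):
    """Two-sided p-value for H0: the two underlying correlations are equal.

    Uses the Fisher z-transformation; r1 and r2 are Pearson correlations of the
    same gene pair in two independent datasets with n1 and n2 samples.
    """
    if not (-1 < r1 < 1 and -1 < r2 < 1):
        raise ValueError("correlations must lie strictly between -1 and 1")
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each dataset needs more than 3 samples")
    z_diff = atanh(r1) - atanh(r2)
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    statistic = z_diff / standard_error
    # Two-sided tail probability of a standard normal variable.
    return erfc(abs(statistic) / sqrt(2))


# Hypothetical gene pair: strongly co-expressed in one condition, barely in the other.
p_changed = correlation_change_pvalue(0.9, 30, 0.1, 30)
p_unchanged = correlation_change_pvalue(0.5, 30, 0.5, 30)
```

    In a network-wide comparison this test would be repeated for every gene pair, so some form of multiple-testing correction would be needed before calling any edge significantly changed.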

    PREDICTING COMPLEX PHENOTYPE-GENOTYPE RELATIONSHIPS IN GRASSES: A SYSTEMS GENETICS APPROACH

    It is becoming increasingly urgent to identify and understand the mechanisms underlying complex traits. Expected increases in the human population coupled with climate change make this especially urgent for grasses in the Poaceae family because these serve as major staples of human and livestock diets worldwide. In particular, Oryza sativa (rice), Triticum spp. (wheat), Zea mays (maize), and Saccharum spp. (sugarcane) are among the top agricultural commodities. Molecular marker tools such as linkage-based Quantitative Trait Loci (QTL) mapping, Genome-Wide Association Studies (GWAS), Multiple Marker Assisted Selection (MMAS), and Genome Selection (GS) techniques offer promise for understanding the mechanisms behind complex traits and for improving breeding programs. These methods have shown some success. Often, however, they can identify neither the causal genes underlying traits nor the biological context in which those genes function. To improve our understanding of complex traits as well as breeding techniques, additional tools are needed to augment existing methods. This work proposes a knowledge-independent systems-genetic paradigm that integrates results from genetic studies such as QTL mapping, GWAS, and mutational insertion lines such as Tos17 with gene co-expression networks for grasses, in particular for rice. The techniques described herein attempt to overcome the bias of limited human knowledge by relying solely on the underlying signals within the data to capture a holistic representation of gene interactions for a species. Through integration of gene co-expression networks with genetic signal, modules of genes with a potential effect on a given trait can be identified, and the biological function of those interacting genes can be determined.