8,167 research outputs found

    SWIM: A computational tool to unveiling crucial nodes in complex biological networks

    SWItchMiner (SWIM) is a wizard-like software implementation of a previously described procedure for extracting information contained in complex networks. Specifically, SWIM reveals a new class of hubs, called "fight-club hubs", characterized by a marked negative correlation with their first nearest neighbors. Among them, a special subset of genes, called "switch genes", is characterized by an unusual pattern of intra- and inter-module connections that gives them a crucial topological role, which is mirrored by evidence of their clinical and biological relevance. Here, we applied SWIM to a large panel of cancer datasets from The Cancer Genome Atlas to highlight switch genes that could be critically associated with the drastic changes in the physiological state of cells or tissues induced by cancer development. We found switch genes in all the cancers we studied; they encompass protein-coding genes and non-coding RNAs, recovering many known key cancer players as well as many new potential biomarkers not yet characterized in the cancer context. Furthermore, SWIM can detect switch genes in different organisms and cell conditions, with the potential to uncover important players in biologically relevant scenarios, including but not limited to human cancer.
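
The defining property of a fight-club hub, a marked negative correlation with its first nearest neighbors, can be illustrated with a short sketch. This is not SWIM's implementation: the function name `average_neighbor_correlation`, the toy expression matrix, and the fully connected toy network are all assumptions made for illustration.

```python
import numpy as np

def average_neighbor_correlation(expr, adjacency, node):
    """Mean Pearson correlation between `node` and its first nearest neighbors.

    expr      -- (genes x samples) expression matrix
    adjacency -- boolean (genes x genes) matrix of the correlation network
    """
    neighbors = np.flatnonzero(adjacency[node])
    if neighbors.size == 0:
        return float("nan")  # isolated node: no neighbors to average over
    corr = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    return float(corr[node, neighbors].mean())

# Hypothetical toy data: gene 0 falls wherever genes 1 and 2 rise.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.0, 2.0],
])
adjacency = np.array([
    [False, True, True],
    [True, False, True],
    [True, True, False],
])

score = average_neighbor_correlation(expr, adjacency, 0)
print(f"average correlation of gene 0 with its neighbors: {score:.2f}")
```

A strongly negative score for a high-degree node is the pattern the abstract attributes to fight-club hubs. How SWIM defines hubs and sets its thresholds is not stated in the abstract and is not modeled here.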

    Network-based approaches to explore complex biological systems towards network medicine

    Network medicine relies on different types of networks: from the molecular level of protein–protein interactions to gene regulatory network and correlation studies of gene expression. Among network approaches based on the analysis of the topological properties of protein–protein interaction (PPI) networks, we discuss the widespread DIAMOnD (disease module detection) algorithm. Starting from the assumption that PPI networks can be viewed as maps where diseases can be identified with localized perturbation within a specific neighborhood (i.e., disease modules), DIAMOnD performs a systematic analysis of the human PPI network to uncover new disease-associated genes by exploiting the connectivity significance instead of connection density. The past few years have witnessed the increasing interest in understanding the molecular mechanism of post-transcriptional regulation with a special emphasis on non-coding RNAs since they are emerging as key regulators of many cellular processes in both physiological and pathological states. Recent findings show that coding genes are not the only targets that microRNAs interact with. In fact, there is a pool of different RNAs—including long non-coding RNAs (lncRNAs) —competing with each other to attract microRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The framework of regulatory networks provides a powerful tool to gather new insights into ceRNA regulatory mechanisms. Here, we describe a data-driven model recently developed to explore the lncRNA-associated ceRNA activity in breast invasive carcinoma. On the other hand, a very promising example of the co-expression network is the one implemented by the software SWIM (switch miner), which combines topological properties of correlation networks with gene expression data in order to identify a small pool of genes—called switch genes—critically associated with drastic changes in cell phenotype. 
Here, we describe the SWIM tool along with its applications to cancer research and compare its predictions with DIAMOnD disease genes.
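
To make the contrast between connectivity significance and connection density concrete, the sketch below scores a candidate gene by a hypergeometric tail probability: the chance of seeing at least the observed number of links to known disease (seed) genes, given the candidate's degree. The abstract does not spell out the statistic, so the hypergeometric form, the function name `connectivity_pvalue`, and all numbers are assumptions for illustration, not the DIAMOnD implementation.

```python
from math import comb

def connectivity_pvalue(n_nodes, n_seeds, degree, seed_links):
    """P(X >= seed_links) when `degree` links are drawn at random
    from a network with `n_nodes` nodes, `n_seeds` of which are seeds."""
    total = comb(n_nodes, degree)
    upper = min(degree, n_seeds)
    tail = sum(
        comb(n_seeds, i) * comb(n_nodes - n_seeds, degree - i)
        for i in range(seed_links, upper + 1)
    )
    return tail / total

# Hypothetical network: 1000 genes, 20 of them known disease genes.
# Both candidates have 3 links to seed genes, but very different degrees.
p_low_degree = connectivity_pvalue(1000, 20, degree=3, seed_links=3)
p_hub = connectivity_pvalue(1000, 20, degree=50, seed_links=3)
print(p_low_degree < p_hub)
```

Both candidates have the same number of seed links, yet the low-degree gene is far more significant because a hub is expected to touch a few seeds by chance. That is the sense in which significance, rather than raw link counts, drives the ranking in this sketch.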

    Molecular epigenetics, chromatin, and NeuroAIDS/HIV: Immunopathological implications

    Epigenetics studies factors related to the organism and environment that modulate inheritance from generation to generation. Molecular epigenetics examines non-coding DNA (ncdDNA) vs. coding DNA (cdDNA), and pertains to every domain of physiology, including immune and brain function. Molecular cartography, including genomics, proteomics, and interactomics, seeks to recognize and to identify the multi-faceted and intricate array of interacting genes and gene products that characterize the function and specialization of each individual cell in the context of cell-cell interaction, tissue, and organ function. Molecular cartography, epigenetics, and chromatin assembly, repair and remodeling (CARR), which, together with the RNA interfering signaling complex (RISC), is responsible for much of the control and regulation of gene expression, intersect

    Expression cartography of human tissues using self organizing maps

    Background: Parallel, high-throughput microarray and sequencing experiments pose the challenge of how best to arrange and analyze the resulting mass of multidimensional data in a concerted way. Self-organizing maps (SOM), a machine learning method, enable a parallel sample- and gene-centered view of the data, combined with strong visualization and second-level analysis capabilities. The paper addresses aspects of the method with practical impact in the context of expression analysis of complex data sets.
Results: The method was applied to generate a SOM characterizing the whole-genome expression profiles of 67 healthy human tissues selected from ten tissue categories (adipose, endocrine, homeostasis, digestion, exocrine, epithelium, sexual reproduction, muscle, immune system and nervous tissues). SOM mapping reduces the dimension of expression data from tens of thousands of genes to a few thousand metagenes, where each metagene acts as a representative of a minicluster of co-regulated single genes. Tissue-specific and common properties shared between groups of tissues emerge as a handful of localized spots in the tissue maps, collecting groups of co-regulated and co-expressed metagenes. The functional context of the spots was discovered using overrepresentation analysis with respect to pre-defined gene sets of known functional impact. We found that tissue-related spots typically contain enriched populations of gene sets that correspond well to molecular processes in the respective tissues. Analysis techniques normally used at the gene level, such as two-way hierarchical clustering, provide a better signal-to-noise ratio and better representativeness when applied to the metagenes. Metagene-based clustering analyses aggregate the tissues into essentially three clusters containing nervous, immune system and the remaining tissues.
Conclusions: The global view on the behavior of a few well-defined modules of correlated and differentially expressed genes is more intuitive and more informative than the separate discovery of the expression levels of hundreds or thousands of individual genes. The metagene approach is less sensitive to a priori selection of genes. It can detect a coordinated expression pattern whose components would not pass single-gene significance thresholds and it is able to extract context-dependent patterns of gene expression in complex data sets.

    Correlation Network Analysis reveals a sequential reorganization of metabolic and transcriptional states during germination and gene-metabolite relationships in developing seedlings of Arabidopsis

    Background: Holistic profiling and systems biology studies of nutrient availability are providing increasing insight into the mechanisms by which gene expression responds to diverse nutrients and metabolites. Less is known about the mechanisms by which gene expression is affected by endogenous metabolites, which can change dramatically during development. Multivariate statistics and correlation network analysis approaches were applied to non-targeted profiling data to investigate transcriptional and metabolic states and to identify metabolites potentially influencing gene expression during the heterotrophic to autotrophic transition of seedling establishment.
Results: Microarray-based transcript profiles were obtained from extracts of Arabidopsis seeds or seedlings harvested from imbibition to eight days of age. ¹H-NMR metabolite profiles were obtained for corresponding samples. Analysis of transcript data revealed high differential gene expression through seedling emergence followed by a period of less change. Differential gene expression increased gradually to day 8, with two days, 5 and 7, showing a very high proportion of up-regulated genes, including transcription factor/signaling genes. Network cartography using spring embedding revealed two primary clusters of highly correlated metabolites, which appear to reflect temporally distinct metabolic states. Principal Component Analyses of both sets of profiling data produced a chronological spread of time points, which would be expected of a developmental series. The network cartography of the transcript data produced two distinct clusters comprising days 0 to 2 and days 3 to 8, whereas the corresponding analysis of metabolite data revealed a shift of day 2 into the day 3 to 8 group. A metabolite and transcript pair-wise correlation analysis encompassing all time points gave a set of 237 highly significant correlations. Of 129 genes correlated with sucrose, 44 were known to be sucrose responsive, including a number of transcription factors.
Conclusions: Microarray analysis during germination and establishment revealed major transitions in transcriptional activity at time points potentially associated with developmental transitions. Network cartography using spring embedding indicates that a shift in the state of nutritionally important metabolites precedes a major shift in the transcriptional state going from germination to seedling emergence. Pair-wise linear correlations of transcript and metabolite levels identified many genes known to be influenced by metabolites, and provided other targets to investigate metabolite regulation of gene expression during seedling establishment.

    Mining SOM expression portraits: Feature selection and integrating concepts of molecular function

    Background: Self-organizing maps (SOM) enable the straightforward portrayal of high-dimensional data from large sample collections in terms of sample-specific images. The analysis of their texture provides so-called spot-clusters of co-expressed genes, which require subsequent significance filtering and functional interpretation. We address feature selection in terms of the gene ranking problem and the interpretation of the obtained spot-related lists using concepts of molecular function.
Results: Different expression scores, based either on simple fold-change measures or on regularized Student's t-statistics, are applied to spot-related gene lists and compared, with special emphasis on the error characteristics of microarray expression data. The spot-clusters are analyzed using different methods of gene set enrichment analysis, with a focus on overexpression and/or overrepresentation of predefined sets of genes. Metagene-related overrepresentation of selected gene sets was mapped into the SOM images to assign gene function to different regions. Alternatively, we estimated set-related overexpression profiles over all samples studied using a gene set enrichment score. It was also applied to the spot-clusters to generate lists of enriched gene sets. We used the tissue body index data set, a collection of expression data of human tissues, as an illustrative example. We found that tissue-related spots typically contain enriched populations of gene sets that correspond well to molecular processes in the respective tissues. In addition, we display special sets of housekeeping genes and of consistently weakly and highly expressed genes using SOM data filtering.
Conclusions: The presented methods allow the comprehensive downstream analysis of SOM-transformed expression data in terms of cluster-related gene lists and enriched gene sets for functional interpretation. SOM clustering makes it possible either to define new gene sets using selected SOM spots or to verify and/or amend existing ones.
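
The difference between ranking genes by fold change and by a regularized t-statistic can be seen in a small sketch. The abstract does not give the regularization used, so the form below (a constant `s0` added to the standard error, in the style of moderated t-scores) is an assumption, as are the function names and the toy values, which are taken to be log-scale expression.

```python
import numpy as np

def log_fold_change(a, b):
    """Difference of group means per gene (rows), for log-scale data."""
    return a.mean(axis=1) - b.mean(axis=1)

def regularized_t(a, b, s0=0.5):
    """Welch-style t-score with a constant s0 added to the standard error.

    s0 = 0 gives the ordinary statistic; s0 > 0 damps genes whose
    variance happens to be estimated as very small.
    """
    diff = a.mean(axis=1) - b.mean(axis=1)
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return diff / (se + s0)

# Hypothetical data: gene 0 has a tiny but very "clean" difference,
# gene 1 has a large difference with ordinary noise.
a = np.array([[1.00, 1.01, 0.99], [3.0, 5.0, 4.0]])
b = np.array([[0.95, 0.96, 0.94], [1.0, 2.0, 0.0]])

lfc = log_fold_change(a, b)
plain = regularized_t(a, b, s0=0.0)
reg = regularized_t(a, b, s0=0.5)
print("fold change:", lfc)
print("plain t:", plain)
print("regularized t:", reg)
```

In this toy case the plain t-score ranks gene 0 first because its variance estimate is tiny, while the regularized score ranks gene 1 first, in agreement with the fold change. This illustrates why the choice of score matters for noisy, small-sample expression data.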

    GOurmet: A tool for quantitative comparison and visualization of gene expression profiles based on gene ontology (GO) distributions

    BACKGROUND: The ever-expanding population of gene expression profiles (EPs) from specified cells and tissues under a variety of experimental conditions is an important but difficult resource for investigators to utilize effectively. Software tools have been recently developed to use the distribution of gene ontology (GO) terms associated with the genes in an EP to identify specific biological functions or processes that are over- or under-represented in that EP relative to other EPs. Additionally, it is possible to use the distribution of GO terms inherent to each EP to relate that EP as a whole to other EPs. Because GO term annotation is organized in a tree-like cascade of variable granularity, this approach allows the user to relate (e.g., by hierarchical clustering) EPs of varying length and from different platforms (e.g., GeneChip, SAGE, EST library). RESULTS: Here we present GOurmet, a software package that calculates the distribution of GO terms represented by the genes in an individual expression profile (EP), clusters multiple EPs based on these integrated GO term distributions, and provides users several tools to visualize and compare EPs. GOurmet is particularly useful in meta-analysis to examine EPs of specified cell types (e.g., tissue-specific stem cells) that are obtained through different experimental procedures. GOurmet also introduces a new tool, the Targetoid plot, which allows users to dynamically render the multi-dimensional relationships among individual elements in any clustering analysis. The Targetoid plotting tool allows users to select any element as the center of the plot, and the program will then represent all other elements in the cluster as a function of similarity to the selected central element. CONCLUSION: GOurmet is a user-friendly, GUI-based software package that greatly facilitates analysis of results generated by multiple EPs. 
The clustering analysis features a dynamic targetoid plot that is generalizable for use with any clustering application.