Network module detection: Affinity search technique with the multi-node topological overlap measure
Background: Many clustering procedures only allow the user to input a pairwise dissimilarity or distance measure between objects. We propose a clustering method that can input a multi-point dissimilarity measure d(i1, i2, ..., iP) where the number of points P can be larger than 2. The work is motivated by gene network analysis where clusters correspond to modules of highly interconnected nodes. Here, we define modules as clusters of network nodes with high multi-node topological overlap. The topological overlap measure is a robust measure of interconnectedness which is based on shared network neighbors. In previous work, we have shown that the multi-node topological overlap measure yields biologically meaningful results when used as input of network neighborhood analysis. Findings: We adapt network neighborhood analysis for the use of module detection. We propose the Module Affinity Search Technique (MAST), which is a generalized version of the Cluster Affinity Search Technique (CAST). MAST can accommodate a multi-node dissimilarity measure. Clusters grow around user-defined or automatically chosen seeds (e.g. hub nodes). We propose both local and global cluster growth stopping rules. We use several simulations and a gene co-expression network application to argue that the MAST approach leads to biologically meaningful results. We compare MAST with hierarchical clustering and partitioning around medoid clustering. Conclusion: Our flexible module detection method is implemented in the MTOM software which can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/MTOM/
PTOMSM: A modified version of Topological Overlap Measure used for predicting Protein-Protein Interaction Network
A variety of methods have been developed to integrate diverse biological data in order to predict novel interactions between proteins. However, traditional integration can only generate protein interaction pairs within existing relationships. Therefore, we propose a modified version of the Topological Overlap Measure to identify not only extant direct PPI links, but also novel protein interactions that can be indirectly inferred from various relationships between proteins. Our method is more powerful than a naïve Bayesian-network-based integration in PPI prediction, and could generate more reliable candidate PPIs. Furthermore, we examined the influence of the sizes of the training and test datasets on prediction, and further demonstrated the effectiveness of PTOMSM in predicting PPIs. More importantly, this method can be extended naturally to predict other types of biological networks, and may be combined with Bayesian methods to further improve the prediction
Gene network interconnectedness and the generalized topological overlap measure
BACKGROUND: Network methods are increasingly used to represent the interactions of genes and/or proteins. Genes or proteins that are directly linked may have a similar biological function or may be part of the same biological pathway. Since the information on the connection (adjacency) between 2 nodes may be noisy or incomplete, it can be desirable to consider alternative measures of pairwise interconnectedness. Here we study a class of measures that are proportional to the number of neighbors that a pair of nodes share in common. For example, the topological overlap measure by Ravasz et al. [1] can be interpreted as a measure of agreement between the m = 1 step neighborhoods of 2 nodes. Several studies have shown that two proteins having a higher topological overlap are more likely to belong to the same functional class than proteins having a lower topological overlap. Here we address the question whether a measure of topological overlap based on higher-order neighborhoods could give rise to a more robust and sensitive measure of interconnectedness. RESULTS: We generalize the topological overlap measure from m = 1 step neighborhoods to m ≥ 2 step neighborhoods. This allows us to define the m-th order generalized topological overlap measure (GTOM) by (i) counting the number of m-step neighbors that a pair of nodes share and (ii) normalizing it to take a value between 0 and 1. Using theoretical arguments, a yeast co-expression network application, and a fly protein network application, we illustrate the usefulness of the proposed measure for module detection and gene neighborhood analysis. CONCLUSION: Topological overlap can serve as an important filter to counter the effects of spurious or missing connections between network nodes. The m-th order topological overlap measure allows one to trade-off sensitivity versus specificity when it comes to defining pairwise interconnectedness and network modules
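The two steps named in the abstract above (count shared m-step neighbors, then normalize to [0, 1]) can be sketched as follows. This is a minimal illustration for an unweighted, undirected graph, not the authors' implementation: the function name `gtom`, its parameters, and the specific normalization (the usual m = 1 topological overlap formula applied to m-step neighborhoods) are assumptions.

```python
import numpy as np

def gtom(adj, m=1):
    """Sketch of an m-th order topological overlap (hypothetical helper).

    adj: symmetric 0/1 adjacency matrix; m: neighborhood order (m >= 1).
    Assumed normalization: (shared + a_ij) / (min(size_i, size_j) + 1 - a_ij).
    """
    a = np.array(adj, dtype=int)
    np.fill_diagonal(a, 0)  # ignore self-loops

    # reach[i, j] == 1 if node j (j != i) is reachable from i within m steps
    reach = a.copy()
    step = a.copy()
    for _ in range(m - 1):
        step = ((step @ a) > 0).astype(int)       # nodes one step further out
        reach = ((reach + step) > 0).astype(int)  # accumulate the neighborhood
    np.fill_diagonal(reach, 0)

    shared = reach @ reach.T   # (i) number of shared m-step neighbors
    sizes = reach.sum(axis=1)  # size of each m-step neighborhood
    denom = np.minimum.outer(sizes, sizes) + 1 - a
    t = (shared + a) / denom   # (ii) normalize to a value in [0, 1]
    np.fill_diagonal(t, 1.0)
    return t
```

On a four-node path graph 0-1-2-3, the end nodes share no direct neighbors, so their m = 1 overlap is 0, while for m = 2 they share both interior nodes and the overlap rises to 2/3. This illustrates how larger m trades specificity for sensitivity.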
Going the distance for protein function prediction: a new distance metric for protein interaction networks
Due to an error introduced in the production process, the x-axes in the first panels of Figure 1 and Figure 7 are not formatted correctly. The correct Figure 1 can be viewed here: http://dx.doi.org/10.1371/annotation/343bf260-f6ff-48a2-93b2-3cc79af518a9 In protein-protein interaction (PPI) networks, functional similarity is often inferred based on the function of directly interacting proteins, or more generally, some notion of interaction network proximity among proteins in a local neighborhood. Prior methods typically measure proximity as the shortest-path distance in the network, but this has only a limited ability to capture fine-grained neighborhood distinctions, because most proteins are close to each other, and there are many ties in proximity. We introduce diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks. We present a tool that, given a PPI network as input, outputs the DSD distances between every pair of proteins. We show that replacing the shortest-path metric by DSD improves the performance of classical function prediction methods across the board. MC, HZ, NMD and LJC were supported in part by National Institutes of Health (NIH) R01 grant GM080330. JP was supported in part by NIH grant R01 HD058880. This material is based upon work supported by the National Science Foundation under grant numbers CNS-0905565, CNS-1018266, CNS-1012910, and CNS-1117039, and supported by the Army Research Office under grant W911NF-11-1-0227 (to MEC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript
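The abstract above says only that DSD is based on a graph diffusion property. One way to sketch such a distance is given below. It assumes a random-walk formulation: each node is described by the expected number of visits to every node during a walk of k steps (counting the starting position as step 0), and two nodes are compared by the L1 distance between those vectors. The function name `dsd_matrix`, the parameter `k`, and its default are illustrative assumptions, not the interface of the published tool.

```python
import numpy as np

def dsd_matrix(adj, k=5):
    """Sketch of a diffusion-based distance between all node pairs.

    adj: symmetric adjacency matrix with no isolated nodes (assumption).
    k:   number of random-walk steps (hypothetical default).
    """
    a = np.asarray(adj, dtype=float)
    p = a / a.sum(axis=1, keepdims=True)  # random-walk transition matrix

    n = a.shape[0]
    he = np.zeros((n, n))  # he[u, v]: expected visits to v starting from u
    walk = np.eye(n)       # distribution after 0 steps
    for _ in range(k + 1):
        he += walk
        walk = walk @ p

    # L1 distance between the visit-count vectors of every pair of nodes
    return np.abs(he[:, None, :] - he[None, :, :]).sum(axis=2)
```

Unlike shortest-path distance, which takes only a few distinct integer values in a small-world network, this quantity is real-valued and depends on the whole local neighborhood, so ties are far less common.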
Equilibrium statistical mechanics on correlated random graphs
Biological and social networks have recently attracted enormous attention
among physicists. Among several, two main aspects may be stressed: a nontrivial
topology of the graph describing the mutual interactions between agents
exists and/or, typically, such interactions are essentially (weighted)
imitative. Although these aspects are widely accepted and empirically confirmed,
the schemes currently exploited in order to generate the expected topology are
based on a-priori assumptions and in most cases still implement constant
intensities for links. Here we propose a simple shift in the definition of
patterns in a Hopfield model to convert frustration into dilution: by varying
the bias of the pattern distribution, the network topology (which is generated
by the reciprocal affinities among agents) crosses various well-known regimes
(fully connected, linearly diverging connectivity, extreme dilution scenario,
no network), coupled with small world properties, which, in this context, are
emergent and no longer imposed a-priori. The model is investigated at first
focusing on these topological properties of the emergent network, then its
thermodynamics is analytically solved (at a replica symmetric level) by
extending the double stochastic stability technique, and presented together
with its fluctuation theory for a picture of criticality. At least at
equilibrium, dilution simply decreases the strength of the coupling felt by the
spins, but leaves the paramagnetic/ferromagnetic flavors unchanged. The main
difference with respect to previous investigations and a naive picture is that
within our approach replicas do not appear: instead of (multi)-overlaps as
order parameters, we introduce a class of magnetizations on all the possible
sub-graphs belonging to the main one investigated: As a consequence, for these
objects a closure for a self-consistent relation is achieved. Comment: 30 pages, 4 figures
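The idea of generating the network from pattern overlaps can be sketched as follows. The abstract does not give the exact definition, so this sketch assumes that pattern entries take values in {0, 1} (instead of the usual {-1, +1}), that an entry equals 1 with probability (1 + bias) / 2, and that couplings follow the Hebbian rule J_ij = sum over patterns of xi_i * xi_j. The function name and all parameter names are hypothetical.

```python
import numpy as np

def hebbian_couplings(n_agents, n_patterns, bias, seed=0):
    """Sketch: couplings from biased 0/1 patterns (assumed construction).

    bias in [-1, 1]; each pattern entry is 1 with probability (1 + bias) / 2.
    Returns an integer matrix J with J[i, j] >= 0; a link exists where J > 0.
    """
    rng = np.random.default_rng(seed)
    p_one = (1.0 + bias) / 2.0
    xi = (rng.random((n_agents, n_patterns)) < p_one).astype(int)
    j = xi @ xi.T            # Hebbian sum over patterns
    np.fill_diagonal(j, 0)   # no self-interaction
    return j
```

Under these assumptions every coupling is non-negative, so there is no frustration, and a coupling vanishes whenever two agents share no active pattern entry, which is the dilution. At bias = 1 the graph is fully connected with constant weights, and at bias = -1 no links remain, matching the two extreme regimes listed in the abstract.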
ManiNetCluster: a novel manifold learning approach to reveal the functional links between gene networks.
BACKGROUND: The coordination of genomic functions is a critical and complex process across biological systems such as phenotypes or states (e.g., time, disease, organism, environmental perturbation). Understanding how the complexity of genomic function relates to these states remains a challenge. To address this, we have developed a novel computational method, ManiNetCluster, which simultaneously aligns and clusters gene networks (e.g., co-expression) to systematically reveal the links of genomic function between different conditions. Specifically, ManiNetCluster employs manifold learning to uncover and match local and non-linear structures among networks, and identifies cross-network functional links. RESULTS: We demonstrated that ManiNetCluster better aligns the orthologous genes from their developmental expression profiles across model organisms than state-of-the-art methods (p-value < 2.2×10^-16). This indicates the potential non-linear interactions of evolutionarily conserved genes across species in development. Furthermore, we applied ManiNetCluster to time series transcriptome data measured in the green alga Chlamydomonas reinhardtii to discover the genomic functions linking various metabolic processes between the light and dark periods of a diurnally cycling culture. We identified a number of genes putatively regulating processes across each lighting regime. CONCLUSIONS: ManiNetCluster provides a novel computational tool to uncover the genes linking various functions from different networks, providing new insight on how gene functions coordinate across different conditions. ManiNetCluster is publicly available as an R package at https://github.com/daifengwanglab/ManiNetCluster
Global Functional Atlas of Escherichia coli Encompassing Previously Uncharacterized Proteins
One-third of the 4,225 protein-coding genes of Escherichia coli K-12 remain functionally unannotated (orphans). Many map to distant clades such as Archaea, suggesting involvement in basic prokaryotic traits, whereas others appear restricted to E. coli, including pathogenic strains. To elucidate the orphans’ biological roles, we performed an extensive proteomic survey using affinity-tagged E. coli strains and generated comprehensive genomic context inferences to derive a high-confidence compendium for virtually the entire proteome consisting of 5,993 putative physical interactions and 74,776 putative functional associations, most of which are novel. Clustering of the respective probabilistic networks revealed putative orphan membership in discrete multiprotein complexes and functional modules together with annotated gene products, whereas a machine-learning strategy based on network integration implicated the orphans in specific biological processes. We provide additional experimental evidence supporting orphan participation in protein synthesis, amino acid metabolism, biofilm formation, motility, and assembly of the bacterial cell envelope. This resource provides a “systems-wide” functional blueprint of a model microbe, with insights into the biological and evolutionary significance of previously uncharacterized proteins
Climate Dynamics: A Network-Based Approach for the Analysis of Global Precipitation
Precipitation is one of the most important meteorological variables for defining the climate dynamics, but the spatial patterns of precipitation have not been fully investigated yet. Complex network theory, which provides a robust tool to investigate the statistical interdependence of many interacting elements, is used here to analyze the spatial dynamics of annual precipitation over seventy years (1941-2010). The precipitation network is built by associating a node with a geographical region, which has a temporal distribution of precipitation, and identifying possible links among nodes through the correlation function. The precipitation network reveals significant spatial variability with barely connected regions, such as Eastern China and Japan, and highly connected regions, such as the African Sahel, Eastern Australia and, to a lesser extent, Northern Europe. Sahel and Eastern Australia are remarkably dry regions, where low amounts of rainfall are uniformly distributed on continental scales and small-scale extreme events are rare. As a consequence, the precipitation gradient is low, making these regions well connected on a large spatial scale. On the contrary, the Asiatic South-East is often reached by extreme events such as monsoons, tropical cyclones and heat waves, which can all contribute to reducing the correlation to the short-range scale only. Some patterns emerging between mid-latitude and tropical regions suggest a possible impact of the propagation of planetary waves on precipitation at a global scale. Other links can be qualitatively associated with the atmospheric and oceanic circulation. To analyze the sensitivity of the network to the physical closeness of the nodes, short-term connections are broken. The African Sahel, Eastern Australia and Northern Europe regions again appear as the supernodes of the network, further confirming their long-range connection structure. 
Almost all North-American and Asian nodes vanish, revealing that extreme events can enhance high precipitation gradients, leading to a systematic absence of long-range patterns
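The network construction described in this abstract (one node per region, links identified through correlation of the regional precipitation series) can be sketched as follows. The abstract specifies neither the correlation measure nor how links are selected, so Pearson correlation, the absolute-value threshold, and the names `correlation_network` and `threshold` are all assumptions made for illustration.

```python
import numpy as np

def correlation_network(series, threshold=0.5):
    """Sketch: link two regions when their series are strongly correlated.

    series:    array of shape (n_regions, n_years), one row per region.
    threshold: hypothetical cutoff on |Pearson correlation|.
    Returns a symmetric 0/1 adjacency matrix with a zero diagonal.
    """
    corr = np.corrcoef(np.asarray(series, dtype=float))
    adj = (np.abs(corr) >= threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-links
    return adj
```

The degree of each node, `adj.sum(axis=1)`, then gives a simple measure of how well connected a region is. Highly connected regions in the sense of the abstract would show up as rows with large sums.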