Detecting modules in biological networks by edge weights clustering and entropy significance
Detection of the modular structure of biological networks is of interest to researchers adopting a systems perspective for the analysis of omics data. Computational systems biology has provided a rich array of methods for network clustering. To date, the majority of approaches address this task through a network node classification based on topological or external quantifiable properties of network nodes. Conversely, numerical properties of network edges are underused, even though the information content that can be associated with network edges has increased due to steady advances in molecular biology technology over the last decade. Properly accounting for network edges in the development of clustering approaches can be crucial for improving quantitative interpretation of omics data, ultimately resulting in more biologically plausible models. In this study, we present a novel technique for network module detection, named WG-Cluster (Weighted Graph CLUSTERing). WG-Cluster's notable features, compared to current approaches, lie in: (1) the simultaneous exploitation of network node and edge weights to improve the biological interpretability of the connected components detected, (2) the assessment of their statistical significance, and (3) the identification of emerging topological properties in the detected connected components. WG-Cluster utilizes three major steps: (i) an unsupervised version of an edge-based k-means algorithm detects sub-graphs with similar edge weights, (ii) a fast-greedy algorithm detects connected components which are then scored and selected according to the statistical significance of their scores, and (iii) an analysis of the convolution between sub-graph mean edge weight and connected component score provides a summarizing view of the connected components. WG-Cluster can be applied to directed and undirected networks of different types of interacting entities and scales up to large omics data sets.
Here, we show that WG-Cluster can be successfully used in the differential analysis of physical protein-protein interaction (PPI) networks. Specifically, applying WG-Cluster to a PPI network weighted by measurements of differential gene expression makes it possible to explore the changes in network topology under two distinct (normal vs. tumor) conditions. WG-Cluster code is available at https://sites.google.com/site/paolaleccapersonalpage/
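The first two steps described in this abstract (grouping edges by weight, then extracting connected components within each group) can be illustrated with a minimal sketch. This is not the WG-Cluster code: it assumes a fixed number of clusters k chosen by hand (the abstract describes an unsupervised variant), uses plain Lloyd's k-means on scalar edge weights, and omits the scoring, significance testing, and convolution analysis entirely. All function names are hypothetical.

```python
from collections import defaultdict

def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalars (assumption: k is fixed by the caller)."""
    srt = sorted(values)
    # Initial centers are spread evenly over the sorted values.
    centers = [srt[(i * (len(srt) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def connected_components(edges):
    """Connected components of an undirected graph given as (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            comp.add(n)
            for m in adj[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        comps.append(comp)
    return comps

def edge_weight_modules(weighted_edges, k):
    """Cluster edges by weight, then list connected components per cluster."""
    labels = kmeans_1d([w for _, _, w in weighted_edges], k)
    groups = defaultdict(list)
    for (u, v, _), lab in zip(weighted_edges, labels):
        groups[lab].append((u, v))
    return {lab: connected_components(es) for lab, es in groups.items()}

# Toy data (hypothetical node names and weights).
edges = [("a", "b", 0.10), ("b", "c", 0.12), ("c", "d", 0.90),
         ("d", "e", 0.95), ("x", "y", 0.11)]
modules = edge_weight_modules(edges, k=2)
```

In the toy graph, the three low-weight edges and the two high-weight edges fall into separate clusters, and each cluster is then split into its connected pieces.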
Graph theoretic network analysis reveals protein pathways underlying cell death following neurotropic viral infection
Complex protein networks underlie any cellular function. Certain proteins play a pivotal role in many network configurations, disruption of whose expression proves fatal to the cell. An efficient method to tease out such key proteins in a network is still unavailable. Here, we used graph-theoretic measures on protein-protein interaction data (interactome) to extract biophysically relevant information about individual protein regulation and network properties such as formation of function specific modules (sub-networks) of proteins. We took 5 major proteins that are involved in neuronal apoptosis post Chandipura Virus (CHPV) infection as seed proteins in a database to create a meta-network of immediately interacting proteins (1st order network). Graph theoretic measures were employed to rank the proteins in terms of their connectivity and the degree up to which they can be organized into smaller modules (hubs). We repeated the analysis on the 2nd order interactome that includes proteins connected directly with proteins of the 1st order. FADD and Casp-3 were connected maximally to other proteins in both analyses, thus indicating their importance in neuronal apoptosis. Thus, our analysis provides a blueprint for the detection and validation of protein networks disrupted by viral infections.
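The seed-based construction of 1st and 2nd order networks and the connectivity ranking can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the interaction list is toy data with placeholder identifiers (no real protein interactions are implied), "connectivity" is taken to mean degree inside the extracted sub-network, and all function names are hypothetical.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Undirected adjacency sets from (u, v) interaction pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def order_n_nodes(adj, seeds, order):
    """All nodes within `order` hops of any seed (order=1: direct interactors)."""
    frontier, reached = set(seeds), set(seeds)
    for _ in range(order):
        frontier = {m for n in frontier for m in adj[n]} - reached
        reached |= frontier
    return reached

def rank_by_degree(adj, nodes):
    """Rank nodes by degree counted inside the sub-network induced by `nodes`."""
    deg = {n: len(adj[n] & nodes) for n in nodes}
    return sorted(deg.items(), key=lambda kv: (-kv[1], kv[0]))

# Placeholder identifiers: S1, S2 are seeds; A-D are other proteins.
interactions = [("S1", "A"), ("S1", "B"), ("S2", "A"), ("A", "C"), ("C", "D")]
adj = build_adjacency(interactions)
first_order = order_n_nodes(adj, ["S1", "S2"], order=1)
second_order = order_n_nodes(adj, ["S1", "S2"], order=2)
ranking_first = rank_by_degree(adj, first_order)
ranking_second = rank_by_degree(adj, second_order)
```

A node that tops the ranking in both the 1st and 2nd order networks, as node A does here, corresponds to the kind of maximally connected protein the abstract reports.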
Automated structure detection for distributed process optimization
The design and control of large-scale engineering systems, consisting of a number of interacting subsystems, is a heavily researched topic with relevance both for industry and academia. This paper presents two methodologies for optimal model-based decomposition, where an optimization problem is decomposed into several smaller sub-problems and subsequently solved by augmented Lagrangian decomposition methods. Large-scale and highly nonlinear problems commonly arise in process optimization, and could greatly benefit from these approaches, as they reduce the storage requirements and computational costs for global optimization. The strategy presented translates the problem into a constraint graph. The first approach uses a heuristic community detection algorithm to identify highly connected clusters in the optimization problem graph representation. The second approach uses a multilevel graph bisection algorithm to find the optimal partition, given a desired number of sub-problems. The partitioned graphs are translated back into decomposed sets of sub-problems with a minimal number of coupling constraints. Results show both of these methods can be used as efficient frameworks to decompose optimization problems in linear time, in comparison to traditional methods which require polynomial time. Author E. A. del Rio-Chanona would like to acknowledge CONACyT scholarship No. 522530 for funding this project. Author F. Fiorelli gratefully acknowledges the support from his family. The authors would also like to thank Dr Bart Hallmark, University of Cambridge, for suggesting to employ as a demonstration the chemical system in Example 7. This is the author accepted manuscript. The final version is available from Elsevier via http://dx.doi.org/10.1016/j.compchemeng.2016.03.01
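The last step mentioned in this abstract, translating a partitioned graph back into sub-problems plus coupling constraints, can be illustrated with a small sketch. The abstract does not specify the graph construction, so this assumes that a partition assigns each variable to one block and that a constraint is "coupling" when its variables span more than one block. The partitioning itself (community detection or multilevel bisection) is not implemented here; `block_of` is supplied by hand and all names are hypothetical.

```python
def split_constraints(constraints, block_of):
    """Sort constraints into per-block local lists and a coupling list.

    constraints: dict mapping constraint name -> set of variable names
                 (assumption: every constraint has at least one variable).
    block_of:    dict mapping variable name -> block id from some partitioner.
    """
    local, coupling = {}, []
    for name, variables in constraints.items():
        blocks = {block_of[v] for v in variables}
        if len(blocks) == 1:
            # All variables live in one block: the constraint stays local.
            local.setdefault(blocks.pop(), []).append(name)
        else:
            # Variables span several blocks: the constraint couples sub-problems.
            coupling.append(name)
    return local, coupling

# Toy problem: four variables, three constraints, a two-block partition.
constraints = {"c1": {"x1", "x2"}, "c2": {"x3", "x4"}, "c3": {"x2", "x3"}}
block_of = {"x1": 0, "x2": 0, "x3": 1, "x4": 1}
local, coupling = split_constraints(constraints, block_of)
```

A good partition in this sense keeps the coupling list short, which matches the abstract's stated goal of a minimal number of coupling constraints.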
A statistical network analysis of the HIV/AIDS epidemics in Cuba
The Cuban contact-tracing detection system set up in 1986 allowed the
reconstruction and analysis of the sexual network underlying the epidemic
(5,389 vertices and 4,073 edges, giant component of 2,386 nodes and 3,168
edges), shedding light onto the spread of HIV and the role of contact-tracing.
Clustering based on modularity optimization provides a better visualization and
understanding of the network, in combination with the study of covariates. The
graph has a globally low but heterogeneous density, with clusters of high
intraconnectivity but low interconnectivity. Though descriptive, our results
pave the way for incorporating structure when studying stochastic SIR epidemics
spreading on social networks.
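The "globally low" density can be checked from the counts quoted in the abstract. The sketch below assumes the standard density of a simple undirected graph, 2m / (n(n - 1)), and the mean degree 2m / n; the abstract does not state which definition the authors use, and the function names are hypothetical.

```python
def density(n_vertices, n_edges):
    """Density of a simple undirected graph: fraction of possible edges present."""
    return 2 * n_edges / (n_vertices * (n_vertices - 1))

def mean_degree(n_vertices, n_edges):
    """Average number of edges incident to a vertex."""
    return 2 * n_edges / n_vertices

# Counts taken from the abstract.
whole = density(5389, 4073)   # full network
giant = density(2386, 3168)   # giant component
print(f"whole network:   density={whole:.2e}, mean degree={mean_degree(5389, 4073):.2f}")
print(f"giant component: density={giant:.2e}, mean degree={mean_degree(2386, 3168):.2f}")
```

Under this definition both densities are well below 1%, and the giant component is denser than the network as a whole.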
Linear Time Subgraph Counting, Graph Degeneracy, and the Chasm at Size Six
We consider the problem of counting all k-vertex subgraphs in an input graph, for any constant k. This problem (denoted SUB-CNT_k) has been studied extensively in both theory and practice. In a classic result, Chiba and Nishizeki (SICOMP 85) gave linear time algorithms for clique and 4-cycle counting for bounded degeneracy graphs. This is a rich class of sparse graphs that contains, for example, all minor-free families and preferential attachment graphs. The techniques from this result have inspired a number of recent practical algorithms for SUB-CNT_k. Towards a better understanding of the limits of these techniques, we ask: for what values of k can SUB-CNT_k be solved in linear time?
We discover a chasm at k=6. Specifically, we prove that for k < 6, SUB-CNT_k can be solved in linear time. Assuming a standard conjecture in fine-grained complexity, we prove that for all k ≥ 6, SUB-CNT_k cannot be solved even in near-linear time.
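Since the results are stated for bounded degeneracy graphs, a short sketch of the degeneracy itself may help. The degeneracy is the largest minimum degree seen while repeatedly deleting a minimum-degree vertex. The version below is a deliberately simple quadratic-time peeling loop, not the linear-time bucket-based procedure and not anything from the paper; the function names are hypothetical.

```python
def adjacency_from_edges(edges):
    """Undirected adjacency sets; every endpoint appears as a key."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degeneracy(adj):
    """Degeneracy via repeated removal of a minimum-degree vertex (O(n^2) sketch)."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    k = 0
    while adj:
        u = min(adj, key=lambda n: len(adj[n]))  # current minimum-degree vertex
        k = max(k, len(adj[u]))
        for v in adj[u]:
            adj[v].discard(u)
        del adj[u]
    return k

path = adjacency_from_edges([(1, 2), (2, 3), (3, 4)])
triangle = adjacency_from_edges([(1, 2), (2, 3), (1, 3)])
k4 = adjacency_from_edges([(a, b) for a in range(4) for b in range(a + 1, 4)])
```

Trees and paths have degeneracy 1 regardless of size, while a clique on k vertices has degeneracy k - 1, which is why bounded degeneracy captures a broad class of sparse graphs.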
Multiresolution community detection for megascale networks by information-based replica correlations
We use a Potts model community detection algorithm to accurately and
quantitatively evaluate the hierarchical or multiresolution structure of a
graph. Our multiresolution algorithm calculates correlations among multiple
copies ("replicas") of the same graph over a range of resolutions. Significant
multiresolution structures are identified by strongly correlated replicas. The
average normalized mutual information, the variation of information, and other
measures in principle give a quantitative estimate of the "best" resolutions
and indicate the relative strength of the structures in the graph. Because the
method is based on information comparisons, it can in principle be used with
any community detection model that can examine multiple resolutions. Our
approach may be extended to other optimization problems. As a local measure,
our Potts model avoids the "resolution limit" that affects other popular
models. With this model, our community detection algorithm has an accuracy that
ranks among the best of currently available methods. Using it, we can examine
graphs over 40 million nodes and more than one billion edges. We further report
that the multiresolution variant of our algorithm can solve systems of at least
200000 nodes and 10 million edges on a single processor with exceptionally high
accuracy. For typical cases, we find a super-linear scaling, O(L^{1.3}) for
community detection and O(L^{1.3} log N) for the multiresolution algorithm
where L is the number of edges and N is the number of nodes in the system. Comment: 19 pages, 14 figures, published version with minor change
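The two information measures named in this abstract can be sketched for a pair of partitions (for example, two replicas' community assignments). The normalization used here, 2I(A,B) / (H(A) + H(B)), is one common choice and is an assumption rather than a statement of the paper's exact definition; the replica averaging and the Potts model solver are not shown, and the function names are hypothetical.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (in nats) of a partition given as one label per node."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two partitions of the same node set."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())

def nmi(a, b):
    """Normalized mutual information, 2I/(H(A)+H(B)); 1.0 if both are trivial."""
    h = entropy(a) + entropy(b)
    return 1.0 if h == 0 else 2 * mutual_information(a, b) / h

def variation_of_information(a, b):
    """VI = H(A) + H(B) - 2I(A,B); zero when the partitions coincide."""
    return entropy(a) + entropy(b) - 2 * mutual_information(a, b)

replica_1 = [0, 0, 1, 1]
replica_2 = [1, 1, 0, 0]   # same grouping, labels swapped
replica_3 = [0, 1, 0, 1]   # statistically independent of replica_1
```

Strongly correlated replicas give NMI near 1 and VI near 0, which is the signature the abstract associates with significant resolutions.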