A Special Structural Based Weighted Network Approach for the Analysis of Protein Complexes
The detection and analysis of protein complexes is essential for understanding functional mechanisms and cellular integrity. Recently, several techniques for detecting and analysing protein complexes from Protein–Protein Interaction (PPI) datasets have been developed. Most of those techniques have limitations: they are inefficient at detecting overlapping complexes, exclude attachment proteins from complex cores, cannot detect the inherent structures of the underlying complexes, have high false-positive rates, and lack an enrichment analysis. To address these limitations, we introduce a special structural-based weighted network approach for the analysis of protein complexes based on Weighted Edge, Core-Attachment and Local Modularity structures (WECALM). Experimental results indicate that WECALM performs relatively better than existing algorithms in terms of accuracy, computational time, and p-value. A functional enrichment analysis also shows that WECALM is able to identify a large number of biologically significant protein complexes. Overall, WECALM outperforms other approaches by striking a better balance of accuracy and efficiency in the detection of protein complexes.
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 references
A least square method based model for identifying protein complexes in protein-protein interaction network
Protein complexes, formed by groups of physically interacting proteins, play a crucial role in cell activities. Great effort has been made to computationally identify protein complexes from protein-protein interaction (PPI) networks. However, the accuracy of prediction is still far from satisfactory, because the topological structures of protein complexes in PPI networks are too complicated. This paper proposes a novel optimization framework, named PLSMC, to detect complexes from PPI networks. The method is based on the fact that if two proteins are in a common complex, they are likely to be interacting. PLSMC employs this relation to determine complexes by a penalized least squares method. PLSMC is applied to several public yeast PPI networks and compared with several state-of-the-art methods. The results indicate that PLSMC outperforms the other methods. In particular, complexes predicted by PLSMC match known complexes with higher accuracy than those predicted by other methods. Furthermore, the predicted complexes have high functional homogeneity.
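One generic way to turn the stated relation into a penalized least-squares problem is sketched below. This is an illustrative assumption, not the objective from the PLSMC paper: A is the adjacency matrix of the PPI network, H is a hypothetical nonnegative protein-by-complex membership matrix with k columns, and λ is a hypothetical penalty weight.

```latex
\min_{H \ge 0} \; \lVert A - H H^{\top} \rVert_F^{2} + \lambda \lVert H \rVert_F^{2},
\qquad A \in \{0,1\}^{n \times n}, \; H \in \mathbb{R}_{\ge 0}^{n \times k}
```

In this form, the entry (H H^T)_{ij} is large when proteins i and j share a complex, so fitting it to A_{ij} encodes the idea that co-complexed proteins tend to interact, while the penalty term discourages overly large memberships.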
Skeleton coupling: a novel interlayer mapping of community evolution in temporal networks
Dynamic community detection (DCD) in temporal networks is a complicated task
that involves the selection of an algorithm and its associated parameters. How
to choose the most appropriate algorithm generally depends on the type of
network being analyzed and the specific properties of the data that define the
network. In functional temporal networks derived from neuronal spike train
data, communities are expected to be transient, and it is common for the
network to contain multiple singleton communities. Here, we compare the
performance of different DCD algorithms on functional temporal networks built
from synthetic neuronal time series data with known community structure. We
find that, for these networks, DCD algorithms that utilize interlayer links to
perform community carryover between layers outperform other methods. However,
we also observe that algorithm performance is highly dependent on the topology
of interlayer links, especially in the presence of singleton and transient
communities. We therefore introduce a novel method for defining interlayer links
in temporal networks called skeleton coupling that is specifically designed to
enhance the linkage of communities in the network throughout time based on the
topological properties of the community history. We show that integrating
skeleton coupling with current DCD methods improves algorithm performance in
synthetic data with planted singleton and transient communities. The use of
skeleton coupling to perform DCD will therefore allow for more accurate and
interpretable results of community evolution in real-world neuronal data or in
other systems with transient structure and singleton communities. Comment: 19 pages, 8 figures
Models and Algorithms in Biological Network Evolution with Modularity
Networks are commonly used to represent key processes in biology; examples include transcriptional regulatory networks, protein-protein interaction (PPI) networks, metabolic networks, etc. Databases store many such networks, as graphs, observed or inferred. Generative models for these networks have been proposed. For PPI networks, current models are based on duplication and divergence (D&D): a node (gene) is duplicated and inherits some subset of the connections of the original node. An early finding about biological networks is modularity: a higher-level structure is prevalent, consisting of well-connected subgraphs with less substantial connectivity to other such subgraphs. While D&D models spontaneously generate modular structures, neither have these structures been compared with those in the databases nor are D&D models known to maintain and evolve them. Given that the preferred generative models are based on D&D, network inference models are also based on the same principle. We describe NEMo (Network Evolution with Modularity), a new model that embodies modularity. It consists of two layers: the lower layer is a derivation of the D&D process and is thus node-and-edge based, while the upper layer is module-aware. NEMo allows modules to appear and disappear, to fission and to merge, all driven by the underlying edge-level events using a duplication-based process. We also introduce measures to compare biological networks in terms of their modular structure. We present an extensive study of six model organisms across six public databases aimed at uncovering commonalities in network structure. We then use these commonalities as a reference against which to compare the networks generated by D&D models and by our module-aware model NEMo. We find that, by restricting our data to high-confidence interactions, a number of shared structural features can be identified among the six species and six databases.
When comparing these characteristics with those extracted from the networks produced by D&D models and our NEMo model, we further find that the networks generated by NEMo exhibit structural characteristics much closer to those of the PPI networks of the model organisms. We conclude that modularity in PPI networks takes a particular form, one that is better approximated by the module-aware NEMo model than by other current models. Finally, we draft the ideas for a module-aware network inference model that uses an altered form of our module-aware NEMo as the core component, from a parsimony perspective.
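The duplication and divergence (D&D) step described in this abstract, where a node is duplicated and the copy inherits some subset of the original's connections, can be illustrated with a minimal sketch. This is a generic D&D growth loop, not NEMo itself; the function names, the `keep_prob` parameter, and the triangle seed graph are hypothetical choices made only for illustration.

```python
import random


def duplicate_and_diverge(adj, keep_prob, rng):
    """Apply one duplication-divergence step to an undirected graph.

    adj: dict mapping integer node id -> set of neighbour ids (modified in place).
    keep_prob: probability that the copy inherits each edge of the original
               (hypothetical parameter, not taken from the abstract).
    Returns the id of the newly added node.
    """
    original = rng.choice(sorted(adj))   # gene chosen for duplication
    copy = max(adj) + 1                  # node ids are assumed to be integers
    # Divergence: the copy keeps only a random subset of the original's edges.
    inherited = {n for n in sorted(adj[original]) if rng.random() < keep_prob}
    adj[copy] = set(inherited)
    for n in inherited:
        adj[n].add(copy)
    return copy


def grow(adj, steps, keep_prob, seed=0):
    """Grow the graph by repeated duplication-divergence steps."""
    rng = random.Random(seed)
    for _ in range(steps):
        duplicate_and_diverge(adj, keep_prob, rng)
    return adj


# Start from a triangle and add 20 duplicated nodes.
seed_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
network = grow(seed_graph, steps=20, keep_prob=0.5)
```

Because every new edge mirrors an edge of an existing node, duplicated nodes tend to share neighbours with their originals, which is one intuitive reason such models produce locally dense structure.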
A Combination Method of Centrality Measures and Biological Properties to Improve Detection of Protein Complexes in Weighted PPI Networks
Introduction: In protein-protein interaction networks (PPINs), a complex is a group of proteins that allows a biological process to take place. Correct identification of complexes can improve understanding of cell function for therapeutic purposes, such as drug discovery. Clustering is one of the common methods for identifying complexes in PPINs; this study aimed to develop a new method for more accurate identification of complexes.
Method: In this study, Yeast and Human PPINs were investigated. The Yeast datasets, called DIP, MIPS, and Krogan, contain 4930 nodes and 17201 interactions, 4564 nodes and 15175 interactions, and 2675 nodes and 7084 interactions, respectively. The Human dataset contains 37437 interactions. The proposed method and well-known existing methods were run on these datasets to identify protein complexes. Predicted complexes were compared with the CYC2008 and CORUM benchmark datasets. The evaluation criteria showed that the proposed method predicts complexes in PPINs with higher efficiency.
Results: In this study, a new core-attachment method was used to detect protein complexes with high efficiency. The more precise the detection method is, the more correctly the proteins involved in a biological process can be identified. According to the evaluation criteria, the proposed method showed a significant improvement in detection compared to the other methods.
Conclusion: According to the results, the proposed method can identify a sufficient number of protein complexes with high biological significance in terms of functional cooperation between proteins.
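The abstract does not say which evaluation criteria were used to compare predicted complexes with the benchmark sets. As a hedged illustration only, the sketch below uses a set-overlap score that is common in the complex-detection literature, |A ∩ B|² / (|A| · |B|), with a hypothetical cutoff of 0.2; the function names and the toy protein identifiers are assumptions.

```python
def overlap_score(predicted, reference):
    """Overlap between two protein sets: |A & B|^2 / (|A| * |B|)."""
    shared = len(predicted & reference)
    return shared * shared / (len(predicted) * len(reference))


def matched_fraction(predicted_complexes, reference_complexes, cutoff=0.2):
    """Fraction of predicted complexes that match at least one reference complex.

    cutoff is a hypothetical threshold; the abstract does not specify one.
    """
    if not predicted_complexes:
        return 0.0
    hits = sum(
        any(overlap_score(p, r) >= cutoff for r in reference_complexes)
        for p in predicted_complexes
    )
    return hits / len(predicted_complexes)


# Toy data: the first predicted complex overlaps a reference complex, the second does not.
predicted = [{"P1", "P2", "P3"}, {"P7", "P8"}]
reference = [{"P1", "P2", "P3", "P4"}, {"P5", "P6"}]
fraction = matched_fraction(predicted, reference)
```

Swapping the roles of the two lists gives the corresponding fraction of benchmark complexes that are recovered.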
Identifying Essential Proteins in Dynamic PPI Network with Improved FOA
Identification of essential proteins plays an important role in understanding cellular life activity and development in the postgenomic era. Identification of essential proteins from protein-protein interaction (PPI) networks has become a hot topic in recent years. In this work, the fruit fly optimization algorithm (FOA) is extended for identifying essential proteins. The extended algorithm, called EPFOA, merges FOA with topological properties and biological information for essential protein identification. EPFOA has the advantage of identifying multiple essential proteins simultaneously rather than relying entirely on ranking scores to identify them individually. The performance of EPFOA is analyzed on dynamic PPI networks, which are constructed by incorporating gene expression data. The experimental results demonstrate that EPFOA is more efficient in detecting essential proteins than state-of-the-art essential protein detection methods.
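The abstract states that the dynamic PPI networks are built using gene expression data but does not give the construction rule. One simple possibility, shown here purely as an assumption, is to keep an interaction at a time point only when both proteins are expressed above a fixed threshold; the function name, the threshold value, and the toy data are hypothetical.

```python
def dynamic_ppi(edges, expression, threshold):
    """Build one active subnetwork per time point.

    edges: list of (protein_a, protein_b) pairs from a static PPI network.
    expression: dict protein -> list of expression values, one per time point.
    threshold: level above which a protein counts as active
               (hypothetical cutoff; the abstract does not give one).
    """
    n_points = len(next(iter(expression.values())))
    snapshots = []
    for t in range(n_points):
        # Proteins considered active at time point t.
        active = {p for p, levels in expression.items() if levels[t] > threshold}
        # Keep only interactions whose two endpoints are both active.
        snapshots.append([(a, b) for a, b in edges if a in active and b in active])
    return snapshots


edges = [("A", "B"), ("B", "C"), ("A", "C")]
expression = {"A": [0.9, 0.2, 0.8], "B": [0.7, 0.9, 0.1], "C": [0.1, 0.8, 0.9]}
snapshots = dynamic_ppi(edges, expression, threshold=0.5)
```

Each entry of `snapshots` is the edge list of the network at one time point, so the same static interaction can appear in some snapshots and be absent from others.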