Search CORE

146 research outputs found

Biological network comparison using graphlet degree distribution

Author: Altschul
Bader
Barabasi
Han
Ito
Jeong
Jeong
Lappe
Maslov
Milo
Milo
N. Przulj
Peri
Rual
Shen-Orr
SIMON
Stelzl
Tanaka
Uetz
von Mering
Watts
Zanzoni
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2007

Analogous to biological sequence comparison, comparing cellular networks is an important problem that could provide insight into biological understanding and therapeutics. For technical reasons, comparing large networks is computationally infeasible, and thus heuristics such as the degree distribution have been sought. It is easy to demonstrate that two networks are different by simply showing a short list of properties in which they differ. It is much harder to show that two networks are similar, as it requires demonstrating their similarity in all of their exponentially many properties. Clearly, it is computationally prohibitive to analyze all network properties, but the larger the number of constraints we impose in determining network similarity, the more likely it is that the networks will truly be similar. We introduce a new systematic measure of a network's local structure that imposes a large number of similarity constraints on networks being compared. In particular, we generalize the degree distribution, which measures the number of nodes 'touching' k edges, into distributions measuring the number of nodes 'touching' k graphlets, where graphlets are small connected non-isomorphic subgraphs of a large network. Our new measure of network local structure consists of 73 graphlet degree distributions (GDDs) of graphlets with 2-5 nodes, but it is easily extendible to a greater number of constraints (i.e. graphlets). Furthermore, we show a way to combine the 73 GDDs into a network 'agreement' measure. Based on this new network agreement measure, we show that almost all of the 14 eukaryotic PPI networks, including human, are better modeled by geometric random graphs than by Erdős-Rényi, random scale-free, or Barabási-Albert scale-free networks. Comment: Proceedings of the 2006 European Conference on Computational Biology, ECCB'06, Eilat, Israel, January 21-24, 200
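
To make the generalization in this abstract concrete, here is a minimal Python sketch. It is not the authors' implementation and covers only one of the 73 distributions: it computes the ordinary degree distribution (nodes touching k edges) and the analogous distribution for a single graphlet, the triangle (nodes touching k triangles). The function names and the toy graph are hypothetical, chosen for illustration.

```python
from collections import Counter
from itertools import combinations

def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_distribution(adj):
    """Map k -> number of nodes touching exactly k edges."""
    return Counter(len(nbrs) for nbrs in adj.values())

def triangle_degree_distribution(adj):
    """Map k -> number of nodes touching exactly k triangles."""
    touched = {}
    for v, nbrs in adj.items():
        # A triangle at v is a pair of neighbours of v that are themselves adjacent.
        touched[v] = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
    return Counter(touched.values())

# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = build_adjacency([(0, 1), (1, 2), (0, 2), (2, 3)])
deg_dist = degree_distribution(adj)
tri_dist = triangle_degree_distribution(adj)
```

The full measure described in the abstract repeats this idea for every orbit of every 2-5 node graphlet, which needs a more careful enumeration than this sketch provides.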

arXiv.org e-Print Archive

CiteSeerX

Crossref

Topological network alignment uncovers biological function and phylogeny

Author: Cook S.
Flannick J.
Kuchaiev O.
Kuchaiev O.
Memišević V.
Nataša Pržulj
Oleksii Kuchaiev
Pržulj N.
Singh R.
Singh R.
Snijders T. A.
Tijana Milenković
Vesna Memišević
Wayne Hayes
Wentz-Hunter K.
Zhang Y.
Publication date: 07/10/2009

Sequence comparison and alignment has had an enormous impact on our understanding of evolution, biology, and disease. Comparison and alignment of biological networks will likely have a similar impact. Existing network alignments use information external to the networks, such as sequence, because no good algorithm for purely topological alignment has yet been devised. In this paper, we present a novel algorithm based solely on network topology, that can be used to align any two networks. We apply it to biological networks to produce by far the most complete topological alignments of biological networks to date. We demonstrate that both species phylogeny and detailed biological function of individual proteins can be extracted from our alignments. Topology-based alignments have the potential to provide a completely new, independent source of phylogenetic information. Our alignment of the protein-protein interaction networks of two very different species--yeast and human--indicates that even distant species share a surprising amount of network topology with each other, suggesting broad similarities in internal cellular wiring across all life on Earth. Comment: Algorithm explained in more detail. Additional analysis adde

arXiv.org e-Print Archive

Crossref

PubMed Central

UCL Discovery

Graphettes: Constant-time determination of graphlet and orbit identity including (possibly disconnected) graphlets up to size 8.

Author: Chung Po-Chien
Hasan Adib
Hayes Wayne
Publication venue: eScholarship, University of California
Publication date: 01/01/2017

Graphlets are small connected induced subgraphs of a larger graph G. Graphlets are now commonly used to quantify local and global topology of networks in the field. Methods exist to exhaustively enumerate all graphlets (and their orbits) in large networks as efficiently as possible using orbit counting equations. However, the number of graphlets in G is exponential in both the number of nodes and edges in G. Enumerating them all is already unacceptably expensive on existing large networks, and the problem will only get worse as networks continue to grow in size and density. Here we introduce an efficient method designed to aid statistical sampling of graphlets up to size k = 8 from a large network. We define graphettes as the generalization of graphlets allowing for disconnected graphlets. Given a particular (undirected) graphette g, we introduce the idea of the canonical graphette [Formula: see text] as a representative member of the isomorphism group Iso(g) of g. We compute the mapping [Formula: see text], in the form of a lookup table, from all 2^(k(k - 1)/2) undirected graphettes g of size k ≤ 8 to their canonical representatives [Formula: see text], as well as the permutation that transforms g to [Formula: see text]. We also compute all automorphism orbits for each canonical graphette. Thus, given any k ≤ 8 nodes in a graph G, we can in constant time infer which graphette it is, as well as which orbit each of the k nodes belongs to. Sampling a large number N of such k-sets of nodes provides an approximation of both the distribution of graphlets and orbits across G, and the orbit degree vector at each node
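
The lookup-table idea in this abstract can be sketched for small k in Python. This is an illustration under stated assumptions, not the authors' code: each k-node graph is encoded as a bitmask over its k(k - 1)/2 possible edges, and the canonical representative is assumed here to be the smallest bitmask reachable by relabeling the nodes. The sketch omits the stored permutations and automorphism orbits the abstract also mentions, and brute-force table construction like this is only practical for small k. All function names are hypothetical.

```python
from itertools import combinations, permutations

def build_canonical_table(k):
    """Precompute, for every k-node graph bitmask, its canonical bitmask."""
    pairs = list(combinations(range(k), 2))
    bit = {p: i for i, p in enumerate(pairs)}   # node pair -> bit position
    perms = list(permutations(range(k)))
    table = []
    for g in range(1 << len(pairs)):
        best = g
        for perm in perms:
            h = 0
            for (u, v), i in bit.items():
                if (g >> i) & 1:
                    a, b = perm[u], perm[v]
                    h |= 1 << bit[(min(a, b), max(a, b))]
            best = min(best, h)                 # assumed canonical form: smallest bitmask
        table.append(best)
    return table, bit

def graphette_id(adj, nodes, table, bit):
    """Look up the canonical id of the subgraph induced on `nodes`."""
    g = 0
    for i, j in combinations(range(len(nodes)), 2):
        if nodes[j] in adj[nodes[i]]:
            g |= 1 << bit[(i, j)]
    return table[g]                             # one table access once g is built

table, bit = build_canonical_table(4)
# Toy graph (hypothetical): the path 0-1-2-3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
id_a = graphette_id(path, [0, 1, 2, 3], table, bit)
id_b = graphette_id(path, [2, 0, 3, 1], table, bit)
```

For k = 4 the table has 2^6 = 64 entries that collapse onto 11 canonical graphettes, and the two calls above return the same id because they describe the same induced subgraph with its nodes listed in a different order.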

arXiv.org e-Print Archive

Directory of Open Access Journals

eScholarship - University of California

Fitting a geometric graph to a protein-protein interaction network

Author: Barabási
Bender
Bradley
Cox
Desmond J. Higham
Erdös
Erdös
Gavin
Giot
Golub
Grindrod
Grindrod
Higham
Ho
Ito
Kaski
Khanin
Krogan
Lappe
Li
Marija Rašajski
Mewes
Milo
Morrison
Mrowka
Nataša Pržulj
Newman
Penrose
Peri
Pržulj
Pržulj
Pržulj
Pržulj
Rual
Simon
Stelzl
Taguchi
Tape
Thomas
Titz
Uetz
Vazquez
von Mering
Watts
Xenarios
Zanzoni
Zhong
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2008

Finding a good network null model for protein-protein interaction (PPI) networks is a fundamental issue. Such a model would provide insights into the interplay between network structure and biological function as well as into evolution. Also, network (graph) models are used to guide biological experiments and discover new biological features. It has been proposed that geometric random graphs are a good model for PPI networks. In a geometric random graph, nodes correspond to uniformly randomly distributed points in a metric space and edges (links) exist between pairs of nodes for which the corresponding points in the metric space are close enough according to some distance norm. Computational experiments have revealed close matches between key topological properties of PPI networks and geometric random graph models. In this work, we push the comparison further by exploiting the fact that the geometric property can be tested for directly. To this end, we develop an algorithm that takes PPI interaction data and embeds proteins into a low-dimensional Euclidean space, under the premise that connectivity information corresponds to Euclidean proximity, as in geometric random graphs. We judge the sensitivity and specificity of the fit by computing the area under the Receiver Operator Characteristic (ROC) curve. The network embedding algorithm is based on multi-dimensional scaling, with the square root of the path length in a network playing the role of the Euclidean distance in the Euclidean space. The algorithm exploits sparsity for computational efficiency, and requires only a few sparse matrix multiplications, giving a complexity of O(N^2) where N is the number of proteins. The algorithm has been verified in the sense that it successfully rediscovers the geometric structure in artificially constructed geometric networks, even when noise is added by re-wiring some links.
Applying the algorithm to 19 publicly available PPI networks of various organisms indicated that: (a) geometric effects are present and (b) two-dimensional Euclidean space is generally as effective as higher dimensional Euclidean space for explaining the connectivity. Testing on a high-confidence yeast data set produced a very strong indication of geometric structure (area under the ROC curve of 0.89), with this network being essentially indistinguishable from a noisy geometric network. Overall, the results add support to the hypothesis that PPI networks have a geometric structure
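
The geometric random graph model defined in this abstract can be sketched in a few lines of Python. This illustrates the model only, not the embedding algorithm the paper develops. It assumes points drawn uniformly from the unit cube and the Euclidean norm; the function name, the `radius` parameter and the seed are hypothetical.

```python
import math
import random

def geometric_random_graph(n, radius, dim=2, seed=0):
    """Place n points uniformly in the unit cube and link pairs within `radius`."""
    rng = random.Random(seed)
    points = [tuple(rng.random() for _ in range(dim)) for _ in range(n)]
    edges = {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= radius  # Euclidean distance (Python 3.8+)
    }
    return points, edges

points, edges = geometric_random_graph(n=50, radius=0.2)
# A radius larger than the diagonal of the unit square links every pair of points.
_, complete_edges = geometric_random_graph(n=10, radius=2.0)
```

The embedding problem described in the abstract runs in the opposite direction: given only the edges, recover point coordinates for which such a distance threshold reproduces the observed connectivity.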

CiteSeerX

Crossref

University of Strathclyde Institutional Repository

Edinburgh Research Explorer

Probabilistic graphlets capture biological function in probabilistic molecular networks

Author: Böttcher René
Doria Belenguer Sergio
Malod-Dognin Noël
Pržulj Nataša
Youssef Markus Kirolos
Publication venue: 'Oxford University Press (OUP)'
Publication date: 29/12/2020

Motivation: Molecular interactions have been successfully modeled and analyzed as networks, where nodes represent molecules and edges represent the interactions between them. These networks revealed that molecules with similar local network structure also have similar biological functions. The most sensitive measures of network structure are based on graphlets. However, graphlet-based methods thus far are only applicable to unweighted networks, whereas real-world molecular networks may have weighted edges that can represent the probability of an interaction occurring in the cell. This information is commonly discarded when applying thresholds to generate unweighted networks, which may lead to information loss. Results: We introduce probabilistic graphlets as a tool for analyzing the local wiring patterns of probabilistic networks. To assess their performance compared to unweighted graphlets, we generate synthetic networks based on different well-known random network models and edge probability distributions and demonstrate that probabilistic graphlets outperform their unweighted counterparts in distinguishing network structures. Then we model different real-world molecular interaction networks as weighted graphs with probabilities as weights on edges and we analyze them with our new weighted graphlets-based methods. We show that due to their probabilistic nature, probabilistic graphlet-based methods more robustly capture biological information in these data, while simultaneously showing a higher sensitivity to identify condition-specific functions compared to their unweighted graphlet-based method counterparts. This work was supported by the European Research Council (ERC) Consolidator Grant 770827, the Serbian Ministry of Education and Science Project III44006, the Slovenian Research Agency project J1-8155 and The Prostate Project. Peer Reviewed. Postprint (author's final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC