Analogous to biological sequence comparison, comparing cellular networks is
an important problem that could provide insight into biological understanding
and therapeutics. For technical reasons, comparing large networks is
computationally infeasible, and thus heuristics such as the degree distribution
have been sought. It is easy to demonstrate that two networks are different by
simply showing a short list of properties in which they differ. It is much
harder to show that two networks are similar, as it requires demonstrating
their similarity in all of their exponentially many properties. Clearly, it is
computationally prohibitive to analyze all network properties, but the larger
the number of constraints we impose in determining network similarity, the more
likely it is that the networks will truly be similar.
We introduce a new systematic measure of a network's local structure that
imposes a large number of similarity constraints on networks being compared. In
particular, we generalize the degree distribution, which measures the number of
nodes 'touching' k edges, into distributions measuring the number of nodes
'touching' k graphlets, where graphlets are small connected non-isomorphic
subgraphs of a large network. Our new measure of network local structure
consists of 73 graphlet degree distributions (GDDs) of graphlets with 2-5
nodes, but it is easily extendible to a greater number of constraints (i.e.
graphlets). Furthermore, we show a way to combine the 73 GDDs into a network
'agreement' measure. Based on this new network agreement measure, we show that
almost all of the 14 eukaryotic PPI networks, including human, are better
modeled by geometric random graphs than by Erdos-Reny, random scale-free, or
Barabasi-Albert scale-free networks.Comment: Proceedings of the 2006 European Conference on Computational Biology,
ECCB'06, Eilat, Israel, January 21-24, 200