37,800 research outputs found

Fingerprint for Network Topologies

Author: C. Rose
D. Chakrabarti
J. Watts
M. Newman
P. Erdös
R. Albert
R. Pastor-Satorras
R.V. Hogg
S. Bornholdt
S. Wasserman
S. Zhou
S.N. Dorogovtsev
Y. Guo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/12/2008

A network's topology can be represented as an adjacency matrix. The bitmap of sorted adjacency matrix (BOSAM) is a network visualisation tool that emphasises different network structures through reordered adjacency matrices. A BOSAM picture resembles the shape of a flower and is characterised by a series of 'leaves'. Here we show and mathematically prove that, for most networks, there is a self-similar relation between the envelopes of the BOSAM leaves. This self-similar property allows us to use a single envelope to predict all other envelopes and therefore reconstruct the outline of a network's BOSAM picture. We liken the BOSAM envelope to a human fingerprint, as they share a number of common features: both are simple, easy to obtain, and strongly characteristic, encoding essential information for identification.

Comment: 12 pages, 3 figures, in pres
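
The idea of viewing a reordered adjacency matrix as a bitmap can be illustrated with a minimal sketch. The ordering rule used here (descending node degree) is an assumption for illustration only, since the abstract does not state how BOSAM sorts nodes. The function names and the example graph are hypothetical as well.

```python
def adjacency_matrix(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = 1
        a[v][u] = 1
    return a


def reorder_by_degree(a):
    """Relabel nodes so higher-degree nodes come first.

    Assumption: degree ordering stands in for the paper's sorting rule.
    Python's sort is stable, so ties keep their original relative order.
    """
    degree = [sum(row) for row in a]
    order = sorted(range(len(a)), key=lambda i: -degree[i])
    reordered = [[a[i][j] for j in order] for i in order]
    return reordered, order


def to_bitmap(a):
    """Render the matrix as text: '#' for an edge, '.' for no edge."""
    return "\n".join("".join("#" if x else "." for x in row) for row in a)


# Hypothetical 5-node example graph.
edges = [(0, 3), (1, 3), (2, 3), (3, 4), (1, 2)]
matrix = adjacency_matrix(5, edges)
sorted_matrix, order = reorder_by_degree(matrix)
print(to_bitmap(sorted_matrix))
```

Reordering only relabels the nodes, so the graph is unchanged while the visual pattern of the bitmap shifts toward the top-left corner.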

arXiv.org e-Print Archive

Crossref

UCL Discovery

Going the distance for protein function prediction: a new distance metric for protein interaction networks

Author: Cao Mengfei
Cowen Lenore J.
Crovella Mark E.
Daniels Noah M.
Hescott Benjamin
Park Jisoo
Zhang Hao
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Due to an error introduced in the production process, the x-axes in the first panels of Figure 1 and Figure 7 are not formatted correctly. The correct Figure 1 can be viewed here: http://dx.doi.org/10.1371/annotation/343bf260-f6ff-48a2-93b2-3cc79af518a9

In protein-protein interaction (PPI) networks, functional similarity is often inferred from the function of directly interacting proteins or, more generally, from some notion of interaction network proximity among proteins in a local neighborhood. Prior methods typically measure proximity as the shortest-path distance in the network, but this has only a limited ability to capture fine-grained neighborhood distinctions, because most proteins are close to each other and there are many ties in proximity. We introduce diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks. We present a tool that, given a PPI network as input, outputs the DSD between every pair of proteins. We show that replacing the shortest-path metric with DSD improves the performance of classical function prediction methods across the board.

MC, HZ, NMD and LJC were supported in part by National Institutes of Health (NIH) R01 grant GM080330. JP was supported in part by NIH grant R01 HD058880. This material is based upon work supported by the National Science Foundation under grant numbers CNS-0905565, CNS-1018266, CNS-1012910, and CNS-1117039, and supported by the Army Research Office under grant W911NF-11-1-0227 (to MEC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
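
The contrast between shortest-path ties and a diffusion-based distance, as described in the abstract above, can be sketched as follows. This is a minimal illustration and not the authors' tool. It assumes that DSD between two nodes is the L1 distance between their vectors of expected visit counts under a k-step random walk that counts the starting node. The step count, the function names, and the example graph are all assumptions made for this sketch.

```python
def transition_matrix(adj):
    """Row-normalise a 0/1 adjacency matrix into random-walk probabilities."""
    return [[x / sum(row) for x in row] for row in adj]


def matmul(a, b):
    """Plain-Python product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expected_visits(adj, steps):
    """Entry [u][v]: expected visits to v by a walk of `steps` steps from u.

    Computed as I + P + P^2 + ... + P^steps (the start counts as a visit).
    """
    n = len(adj)
    p = transition_matrix(adj)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in power]
    for _ in range(steps):
        power = matmul(power, p)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def diffusion_state_distance(adj, steps):
    """Pairwise L1 distance between rows of the expected-visit matrix."""
    he = expected_visits(adj, steps)
    n = len(adj)
    return [[sum(abs(x - y) for x, y in zip(he[u], he[v])) for v in range(n)]
            for u in range(n)]


# Hypothetical graph: hub 0 joined to 1, 2, 3; node 3 also joined to 4.
adj = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
dsd = diffusion_state_distance(adj, steps=3)
# Nodes 2 and 3 are both two hops from node 1, a tie under shortest paths.
print(dsd[1][2], dsd[1][3])
```

In this toy graph the shortest-path distance from node 1 to node 2 and to node 3 is the same, while the diffusion-based values differ, which is the kind of finer-grained distinction the abstract refers to.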

CiteSeerX

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

PubMed Central

Uncovering the overlapping community structure of complex networks in nature and society

Author: A-L Barabási
AC Gavin
BS Everitt
C Song
DJ Watts
E Ravasz
EI Boyle
F Radicchi
Gergely Palla
I Derényi
I Xenarios
Illés Farkas
Imre Derényi
J Scott
J-P Onnela
JFF Mendes
JM Cherry
K Faust
M Blatt
M Girvan
MEJ Newman
MG Everett
R Albert
RM Shiffrin
S Knudsen
S Kosub
S Warner
T Vicsek
Tamás Vicsek
V Spirin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/06/2005

Many complex systems in nature and society can be described in terms of networks capturing the intricate web of connections among the units they are made of. A key question is how to interpret the global organization of such networks as the coexistence of their structural subunits (communities) associated with more highly interconnected parts. Identifying these a priori unknown building blocks (such as functionally related proteins, industrial sectors and groups of people) is crucial to the understanding of the structural and functional properties of networks. The existing deterministic methods used for large networks find separated communities, whereas most of the actual networks are made of highly overlapping cohesive groups of nodes. Here we introduce an approach to analysing the main statistical features of the interwoven sets of overlapping communities that makes a step towards uncovering the modular structure of complex systems. After defining a set of new characteristic quantities for the statistics of communities, we apply an efficient technique for exploring overlapping communities on a large scale. We find that overlaps are significant, and the distributions we introduce reveal universal features of networks. Our studies of collaboration, word-association and protein interaction graphs show that the web of communities has non-trivial correlations and specific scaling properties.

Comment: The free academic research software, CFinder, used for the publication is available at the website of the publication: http://angel.elte.hu/clusterin

arXiv.org e-Print Archive

Crossref

CERN Document Server

A simple yet effective baseline for non-attributed graph classification

Author: Cai Chen
Wang Yusu
Publication date: 03/05/2019

Graphs are complex objects that do not lend themselves easily to typical learning tasks. Recently, a range of approaches based on graph kernels or graph neural networks have been developed for graph classification and for representation learning on graphs in general. As the developed methodologies become more sophisticated, it is important to understand which components of the increasingly complex methods are necessary or most effective. As a first step, we develop a simple yet meaningful graph representation, and explore its effectiveness in graph classification. We test our baseline representation for the graph classification task on a range of graph datasets. Interestingly, this simple representation achieves performance similar to that of state-of-the-art graph kernels and graph neural networks for non-attributed graph classification. Its performance on classifying attributed graphs is slightly weaker, as it does not incorporate attributes. However, given its simplicity and efficiency, we believe that it still serves as an effective baseline for attributed graph classification. Our graph representation is efficient (linear-time) to compute. We also provide a simple connection to graph neural networks. Note that these observations are only for the task of graph classification, while existing methods are often designed for a broader scope including node embedding and link prediction. The results are also likely biased due to the limited number of benchmark datasets available. Nevertheless, the good performance of our simple baseline calls for the development of new, more comprehensive benchmark datasets so as to better evaluate and analyze different graph learning methods. Furthermore, given the computational efficiency of our graph summary, we believe that it is a good candidate as a baseline method for future graph classification (or even other graph learning) studies.

Comment: 13 pages. Shorter version appears at 2019 ICLR Workshop: Representation Learning on Graphs and Manifolds. arXiv admin note: text overlap with arXiv:1810.00826 by other author

arXiv.org e-Print Archive

On the Convexity of Latent Social Network Inference

Author: Leskovec Jure
Myers Seth A.
Publication date: 01/01/2010

In many real-world scenarios, it is nearly impossible to collect explicit social network data. In such cases, whole networks must be inferred from underlying observations. Here, we formulate the problem of inferring latent social networks based on network diffusion or disease propagation data. We consider contagions propagating over the edges of an unobserved social network, where we only observe the times when nodes became infected, but not who infected them. Given such node infection times, we then identify the optimal network that best explains the observed data. We present a maximum likelihood approach based on convex programming with an l1-like penalty term that encourages sparsity. Experiments on real and synthetic data reveal that our method near-perfectly recovers the underlying network structure as well as the parameters of the contagion propagation model. Moreover, our approach scales well, as it can infer optimal networks of thousands of nodes in a matter of minutes.

Comment: NIPS, 201

arXiv.org e-Print Archive

CiteSeerX

The BioGRID Interaction Database: 2011 update

Author: A. Chatr-aryamontri
A. Winter
B.-J. Breitkreutz
Behrends
Bork
Breitkreutz
C. Stark
Cline
Costanzo
Drabkin
Hertz-Fowler
Howe
J. M. Rust
J. Nixon
K. Dolinski
K. Van Auken
Kerrien
L. Boucher
Leitner
M. S. Livstone
M. Tyers
Mering
Müller
R. Oughtred
Razick
T. Reguly
Wiederkehr
X. Shi
X. Wang
Yu
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2011

The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions

CiteSeerX

Crossref

PubMed Central

Edinburgh Research Explorer

Caltech Authors