The specificity and robustness of long-distance connections in weighted, interareal connectomes
Brain areas' functional repertoires are shaped by their incoming and outgoing
structural connections. In empirically measured networks, most connections are
short, reflecting spatial and energetic constraints. Nonetheless, a small
number of connections span long distances, consistent with the notion that the
functionality of these connections must outweigh their cost. While the precise
function of these long-distance connections is not known, the leading
hypothesis is that they act to reduce the topological distance between brain
areas and facilitate efficient interareal communication. However, this
hypothesis implies a non-specificity of long-distance connections that we
contend is unlikely. Instead, we propose that long-distance connections serve
to diversify brain areas' inputs and outputs, thereby promoting complex
dynamics. Through analysis of five interareal network datasets, we show that
long-distance connections play only minor roles in reducing average interareal
topological distance. In contrast, areas' long-distance and short-range
neighbors exhibit marked differences in their connectivity profiles, suggesting
that long-distance connections enhance dissimilarity between regional inputs
and outputs. Next, we show that -- in isolation -- areas' long-distance
connectivity profiles exhibit non-random levels of similarity, suggesting that
the communication pathways formed by long connections exhibit redundancies that
may serve to promote robustness. Finally, we use a linearization of
Wilson-Cowan dynamics to simulate the covariance structure of neural activity
and show that in the absence of long-distance connections, a common measure of
functional diversity decreases. Collectively, our findings suggest that
long-distance connections are necessary for supporting diverse and complex
brain dynamics.
Comment: 18 pages, 8 figures
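The topological-distance comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy ring network, the `threshold` value, and the function names are all hypothetical, and the network is treated as unweighted, whereas the paper analyzes weighted connectomes.

```python
from collections import deque

def mean_path_length(n, edges):
    """Mean BFS hop count over all reachable ordered pairs of distinct nodes."""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total, pairs = 0, 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs

# Hypothetical toy network: 12 areas on a ring, where the physical
# distance between two areas is their separation along the ring.
n = 12

def ring_distance(u, v):
    d = abs(u - v)
    return min(d, n - d)

short_edges = [(i, (i + 1) % n) for i in range(n)]
long_edges = [(0, 6), (3, 9)]
all_edges = short_edges + long_edges

threshold = 3  # assumed cutoff separating short from long connections
kept = [(u, v) for u, v in all_edges if ring_distance(u, v) <= threshold]

full = mean_path_length(n, all_edges)   # with long-distance connections
pruned = mean_path_length(n, kept)      # long-distance connections removed
```

Comparing `full` with `pruned` gives the change in average topological distance attributable to the long connections. On empirical data the size of that change is the quantity of interest; the toy values here say nothing about real connectomes.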
Integration of molecular network data reconstructs Gene Ontology.
Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data.
Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of baker's yeast protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the first two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GI profiling.
Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/.
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
Developments in the theory of randomized shortest paths with a comparison of graph node distances
There have lately been several suggestions for parametrized distances on a
graph that generalize the shortest path distance and the commute time or
resistance distance. The need for developing such distances has risen from the
observation that the above-mentioned common distances in many situations fail
to take into account the global structure of the graph. In this article, we
develop the theory of one family of graph node distances, known as the
randomized shortest path dissimilarity, which has its foundation in statistical
physics. We show that the randomized shortest path dissimilarity can be easily
computed in closed form for all pairs of nodes of a graph. Moreover, we come up
with a new definition of a distance measure that we call the free energy
distance. The free energy distance can be seen as an upgrade of the randomized
shortest path dissimilarity as it defines a metric, in addition to which it
satisfies the graph-geodetic property. The derivation and computation of the
free energy distance are also straightforward. We then make a comparison
between a set of generalized distances that interpolate between the shortest
path distance and the commute time, or resistance distance. This comparison
focuses on the applicability of the distances in graph node clustering and
classification. The comparison, in general, shows that the parametrized
distances perform well in the tasks. In particular, we see that the results
obtained with the free energy distance are among the best in all the
experiments.
Comment: 30 pages, 4 figures, 3 tables
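The abstract gives no formulas, so the following is only a sketch of how the free energy distance is commonly written in the randomized shortest path literature, under stated assumptions: a reference random walk P_ref = D^{-1}A, a unit cost per edge, W = P_ref ∘ exp(-βC), Z = (I - W)^{-1}, directed free energies φ_ij = -(1/β) log(z_ij / z_jj), and the distance taken as the symmetrized φ. The function names and the small path graph are illustrative, not taken from the paper.

```python
import math

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def free_energy_distance(adj, beta):
    """Symmetrized free energy between all node pairs of a connected graph.

    adj  : symmetric adjacency matrix (list of lists, non-negative weights)
    beta : inverse temperature (> 0); larger values favor low-cost paths
    """
    n = len(adj)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        degree = sum(adj[i])
        for j in range(n):
            if adj[i][j] > 0:
                p_ref = adj[i][j] / degree  # reference random-walk probability
                cost = 1.0                  # assumption: unit cost per edge
                w[i][j] = p_ref * math.exp(-beta * cost)
    i_minus_w = [[(1.0 if i == j else 0.0) - w[i][j] for j in range(n)]
                 for i in range(n)]
    z = invert(i_minus_w)
    # Directed free energy from i to j, normalized by the return term z_jj
    phi = [[-math.log(z[i][j] / z[j][j]) / beta for j in range(n)]
           for i in range(n)]
    return [[0.5 * (phi[i][j] + phi[j][i]) for j in range(n)]
            for i in range(n)]

# Path graph 0 - 1 - 2 - 3
path_graph = [[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]]
dist = free_energy_distance(path_graph, beta=1.0)
```

On this path graph every route from node 0 to node 2 passes through node 1, so the graph-geodetic property mentioned in the abstract implies `dist[0][2]` equals `dist[0][1] + dist[1][2]` up to rounding.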
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Optimal recovery by the algorithm is proven analytically
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed.
Comment: 13 figures, 35 references
An introduction to spectral distances in networks (extended version)
Many functions have been recently defined to assess the similarity among
networks as tools for quantitative comparison. They stem from very different
frameworks and are tuned to deal with different situations. Here we
give an overview of the spectral distances, highlighting their behavior in some
basic cases of static and dynamic synthetic and real networks.