76 research outputs found
Topological network alignment uncovers biological function and phylogeny
Sequence comparison and alignment has had an enormous impact on our
understanding of evolution, biology, and disease. Comparison and alignment of
biological networks will likely have a similar impact. Existing network
alignments use information external to the networks, such as sequence, because
no good algorithm for purely topological alignment has yet been devised. In
this paper, we present a novel algorithm based solely on network topology that
can be used to align any two networks. We apply it to biological networks to
produce by far the most complete topological alignments of biological networks
to date. We demonstrate that both species phylogeny and detailed biological
function of individual proteins can be extracted from our alignments.
Topology-based alignments have the potential to provide a completely new,
independent source of phylogenetic information. Our alignment of the
protein-protein interaction networks of two very different species--yeast and
human--indicates that even distant species share a surprising amount of network
topology with each other, suggesting broad similarities in internal cellular
wiring across all life on Earth.
Comment: Algorithm explained in more detail. Additional analysis added.
Coarse rays
We give some characterizations of geodesic metric spaces coarsely equivalent to the ray R⁺
Comparative interactomics with Funcoup 2.0
FunCoup (http://FunCoup.sbc.su.se) is a database that maintains and visualizes global gene/protein networks of functional coupling that have been constructed by Bayesian integration of diverse high-throughput data. FunCoup achieves high coverage by orthology-based integration of data sources from different model organisms and from different platforms. Here we present release 2.0, in which the data sources have been updated and the methodology has been refined. It contains a new data type, Genetic Interaction, and three new species: chicken, dog and zebrafish. As FunCoup extensively transfers functional coupling information between species, the new input datasets have considerably improved both coverage and quality of the networks. The number of high-confidence network links has increased dramatically. For instance, the human network has more than eight times as many links above confidence 0.5 as the previous release. FunCoup provides facilities for analysing the conservation of subnetworks in multiple species. Here we explain how to do comparative interactomics on the FunCoup website.
An Introductory Guide to Aligning Networks Using SANA, the Simulated Annealing Network Aligner.
Sequence alignment has had an enormous impact on our understanding of biology, evolution, and disease. The alignment of biological networks holds similar promise. Biological networks generally model interactions between biomolecules such as proteins, genes, metabolites, or mRNAs. There is strong evidence that the network topology (the "structure" of the network) is correlated with the functions performed, so that network topology can be used to help predict or understand function. However, unlike sequence comparison and alignment, which is an essentially solved problem, network comparison and alignment is an NP-complete problem for which heuristic algorithms must be used. Here we introduce SANA, the Simulated Annealing Network Aligner. SANA is one of many algorithms proposed for biological network alignment. In the context of global network alignment, SANA stands out for its speed, memory efficiency, ease of use, and flexibility in producing alignments between two or more networks. SANA produces better alignments in minutes on a laptop than most other algorithms can produce in hours or days of CPU time on large server-class machines. We walk the user through how to use SANA for several types of biomolecular networks.
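To make the simulated-annealing idea in the abstract above concrete, here is a minimal sketch. It is not SANA's implementation: the objective (number of conserved edges), the linear cooling schedule, the swap move, and the names `conserved_edges` and `anneal_alignment` are all assumptions chosen for illustration. It also assumes the two networks have the same number of nodes and that `adj2` is a symmetric adjacency mapping.

```python
import math
import random

def conserved_edges(edges1, adj2, mapping):
    """Count edges (u, v) of network 1 whose images are adjacent in network 2."""
    return sum(1 for u, v in edges1 if mapping[v] in adj2[mapping[u]])

def anneal_alignment(edges1, nodes1, adj2, nodes2, steps=2000, t0=1.0, seed=0):
    """Search for a one-to-one node mapping that conserves many edges (illustrative only)."""
    rng = random.Random(seed)
    # Start from a random one-to-one mapping (assumes equally sized node sets).
    targets = list(nodes2)
    rng.shuffle(targets)
    mapping = dict(zip(nodes1, targets))
    score = conserved_edges(edges1, adj2, mapping)
    best, best_score = dict(mapping), score
    for step in range(steps):
        # Hypothetical linear cooling schedule; the tiny offset avoids division by zero.
        temperature = t0 * (1.0 - step / steps) + 1e-9
        a, b = rng.sample(nodes1, 2)
        mapping[a], mapping[b] = mapping[b], mapping[a]  # propose swapping two images
        new_score = conserved_edges(edges1, adj2, mapping)
        delta = new_score - score
        # Always accept improvements; accept worse moves with a temperature-dependent probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            score = new_score
            if score > best_score:
                best, best_score = dict(mapping), score
        else:
            mapping[a], mapping[b] = mapping[b], mapping[a]  # reject: undo the swap
    return best, best_score

# Example: align a 4-node path to a relabelled copy of itself.
nodes1 = [0, 1, 2, 3]
edges1 = [(0, 1), (1, 2), (2, 3)]
nodes2 = ["a", "b", "c", "d"]
adj2 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
best, best_score = anneal_alignment(edges1, nodes1, adj2, nodes2)
```

This sketch rescores the whole alignment after every swap, which is only reasonable for toy inputs; a practical aligner would update the score incrementally.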
Probabilistic Random Walk Models for Comparative Network Analysis
Graph-based systems and data analysis methods have become critical tools in many
fields as they can provide an intuitive way of representing and analyzing interactions between
variables. Due to the advances in measurement techniques, a massive amount of
labeled data that can be represented as nodes on a graph (or network) have been archived
in databases. Additionally, novel data without label information have been gradually generated
and archived. Labeling and identifying characteristics of novel data is an important
first step in utilizing the valuable data in an effective and meaningful way. Comparative
network analysis is an effective computational means to identify and predict the properties
of the unlabeled data by comparing the similarities and differences between well-studied
and less-studied networks. Comparative network analysis aims to identify the matching
nodes and conserved subnetworks across multiple networks to enable a prediction of the
properties of the nodes in the less-studied networks based on the properties of the matching
nodes in the well-studied networks (i.e., transferring knowledge between networks).
One of the fundamental and important questions in comparative network analysis is
how to accurately estimate node-to-node correspondence as it can be a critical clue in
analyzing the similarities and differences between networks. Node correspondence is a
comprehensive similarity that integrates various types of similarity measurements in a
balanced manner. However, there are several challenges in accurately estimating the node
correspondence for large-scale networks. First, the scale of the networks is a critical issue.
As networks generally include a large number of nodes, we have to examine an extremely
large space and it can pose a computational challenge due to the combinatorial nature of
the problem. Furthermore, although there are matching nodes and conserved subnetworks
in different networks, structural variations such as node insertions and deletions make it difficult to integrate a topological similarity.
In this dissertation, novel probabilistic random walk models are proposed to accurately
estimate node-to-node correspondence between networks. First, we propose a context-sensitive
random walk (CSRW) model. In the CSRW model, the random walker analyzes
the context of its current position and can switch its movement
to either a simultaneous walk on both networks or an individual walk on one
of the networks. The context-sensitive nature of the random walker enables the method
to effectively integrate different types of similarities by dealing with structural variations.
Second, we propose the CUFID (Comparative network analysis Using the steady-state
network Flow to IDentify orthologous proteins) model. In the CUFID model, we construct
an integrated network by inserting pseudo edges between potential matching nodes in
different networks. Then, we design the random walk protocol to transit more frequently
between potential matching nodes as their node similarity increases and they have more
matching neighboring nodes. We apply the proposed random walk models to comparative
network analysis problems: global network alignment and network querying. Through
extensive performance evaluations, we demonstrate that the proposed random walk models
can accurately estimate node correspondence and these can lead to improved and reliable
network comparison results.
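The pseudo-edge construction described for the CUFID model can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's method: the weighting rule (similarity times one plus the number of matching neighbor pairs), the lazy power iteration, and the names `pseudo_edge_weights` and `steady_state_flow` are hypothetical. Adjacency mappings are assumed symmetric, and the integrated network is assumed connected.

```python
from collections import defaultdict

def pseudo_edge_weights(adj1, adj2, sim):
    """Weight each candidate pair by similarity times (1 + number of matching neighbor pairs)."""
    weights = {}
    for (u, v), s in sim.items():
        shared = sum(1 for a in adj1[u] for b in adj2[v] if (a, b) in sim)
        weights[(u, v)] = s * (1 + shared)
    return weights

def steady_state_flow(adj1, adj2, sim, iterations=500):
    """Return the long-run flow of a random walker across each pseudo edge."""
    w = defaultdict(dict)
    # Edges inside each network get unit weight (an assumption for this sketch).
    for net, adj in ((1, adj1), (2, adj2)):
        for x, nbrs in adj.items():
            for y in nbrs:
                w[(net, x)][(net, y)] = 1.0
    # Pseudo edges connect candidate matching nodes across the two networks.
    pseudo = pseudo_edge_weights(adj1, adj2, sim)
    for (u, v), weight in pseudo.items():
        w[(1, u)][(2, v)] = weight
        w[(2, v)][(1, u)] = weight
    nodes = list(w)
    out = {n: sum(w[n].values()) for n in nodes}
    pi = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Lazy walk: stay put with probability 0.5 so the iteration converges.
        nxt = {n: 0.5 * pi[n] for n in nodes}
        for n in nodes:
            for m, weight in w[n].items():
                nxt[m] += 0.5 * pi[n] * weight / out[n]
        pi = nxt
    flow = {}
    for (u, v) in pseudo:
        a, b = (1, u), (2, v)
        flow[(u, v)] = pi[a] * w[a][b] / out[a] + pi[b] * w[b][a] / out[b]
    return flow

# Example: two 3-node paths with four candidate pairs of equal similarity.
adj1 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
adj2 = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
sim = {("a", "x"): 0.5, ("b", "y"): 0.5, ("c", "z"): 0.5, ("a", "y"): 0.5}
flow = steady_state_flow(adj1, adj2, sim)
```

In this simplified undirected setting the steady-state flow across an edge is proportional to its weight, so the pair with the most matching neighbors, `("b", "y")`, receives the most flow and the pair with none, `("a", "y")`, the least.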
Simultaneous Optimization of Both Node and Edge Conservation in Network Alignment via WAVE
Network alignment can be used to transfer functional knowledge between
conserved regions of different networks. Typically, existing methods use a node
cost function (NCF) to compute similarity between nodes in different networks
and an alignment strategy (AS) to find high-scoring alignments with respect to
the total NCF over all aligned nodes (or node conservation). However, they then
evaluate the quality of their alignments via some other measure that is different
from the node conservation measure used to guide the alignment construction
process. Typically, one measures the amount of conserved edges, but only after
alignments are produced. Hence, a recent attempt aimed to directly maximize the
amount of conserved edges while constructing alignments, which improved
alignment accuracy. Here, we aim to directly maximize both node and edge
conservation during alignment construction to further improve alignment
accuracy. For this, we design a novel measure of edge conservation that (unlike
existing measures that treat each conserved edge the same) weighs each
conserved edge so that edges with highly NCF-similar end nodes are favored. As
a result, we introduce a novel AS, Weighted Alignment VotEr (WAVE), which can
optimize any measures of node and edge conservation, and which can be used with
any NCF or combination of multiple NCFs. Using WAVE on top of established
state-of-the-art NCFs leads to superior alignments compared to the existing
methods that optimize only node conservation or only edge conservation or that
treat each conserved edge the same. And while we evaluate WAVE in the
computational biology domain, it is easily applicable in any domain.
Comment: 12 pages, 4 figures
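The idea of weighing each conserved edge by the NCF similarity of its end nodes can be illustrated with a short sketch. The exact formula used by WAVE is not given in the abstract, so the choice below (the average NCF similarity of the two aligned endpoint pairs) is an assumption, as are the function name `weighted_edge_conservation` and the dictionary-based inputs.

```python
def weighted_edge_conservation(edges1, adj2, mapping, ncf):
    """Sum, over conserved edges, the mean NCF similarity of their aligned end nodes.

    edges1  : iterable of (u, v) edges in the first network
    adj2    : symmetric adjacency mapping of the second network
    mapping : alignment from nodes of network 1 to nodes of network 2
    ncf     : similarity scores keyed by (node in network 1, node in network 2)
    """
    total = 0.0
    for u, v in edges1:
        fu, fv = mapping.get(u), mapping.get(v)
        # An edge is conserved when both end nodes are aligned and their images are adjacent.
        if fu is not None and fv is not None and fv in adj2.get(fu, ()):
            # Assumed weighting: average similarity of the two aligned node pairs.
            total += (ncf[(u, fu)] + ncf[(v, fv)]) / 2.0
    return total

# Example: a 3-node path aligned to another 3-node path.
edges1 = [("a", "b"), ("b", "c")]
adj2 = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
mapping = {"a": "x", "b": "y", "c": "z"}
ncf = {("a", "x"): 1.0, ("b", "y"): 0.5, ("c", "z"): 0.2}
score = weighted_edge_conservation(edges1, adj2, mapping, ncf)
```

Setting every NCF score to 1 reduces this measure to a plain count of conserved edges, which is the unweighted case the abstract contrasts against.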
AlignNemo: A Local Network Alignment Method to Integrate Homology and Topology
Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionarily related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that relate in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not correspond to specific interaction patterns, so that they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo
Bridging topological and functional information in protein interaction networks by short loops profiling
Protein-protein interaction networks (PPINs) have been employed to identify potential novel interconnections between proteins as well as crucial cellular functions. In this study we identify fundamental principles of PPIN topologies by analysing network motifs of short loops, which are small cyclic interactions of between 3 and 6 proteins. We compared 30 PPINs with corresponding randomised null models and examined the occurrence of common biological functions in loops extracted from a cross-validated high-confidence dataset of 622 human protein complexes. We demonstrate that loops are an intrinsic feature of PPINs and that specific cell functions are predominantly performed by loops of different lengths. Topologically, we find that loops are strongly related to the accuracy of PPINs and define a core of interactions with high resilience. The identification of this core and the analysis of loop composition are promising tools to assess PPIN quality and to uncover possible biases from experimental detection methods. More than 96% of loops share at least one biological function, with enrichment of cellular functions related to mRNA metabolic processing and the cell cycle. Our analyses suggest that these motifs can be used in the design of targeted experiments for functional phenotype detection. This research was supported by the Biotechnology and Biological Sciences Research Council (BB/H018409/1 to AP, ACCC and FF, and BB/J016284/1 to NSBT) and by Leukaemia & Lymphoma Research (to NSBT and FF). SSC is funded by a Leukaemia & Lymphoma Research Gordon Piller PhD Studentship.
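Short loops, as defined in the abstract above (cycles of between 3 and 6 proteins), can be counted in a small network with a depth-first search. The sketch below is one straightforward way to do this and is not taken from the study; the function name `count_short_loops` is hypothetical, node labels are assumed to be mutually comparable, and the adjacency mapping is assumed symmetric.

```python
def count_short_loops(adj, min_len=3, max_len=6):
    """Count simple cycles of each length in an undirected graph given as {node: neighbors}."""
    counts = {k: 0 for k in range(min_len, max_len + 1)}

    def extend(start, path, visited):
        last = path[-1]
        for nxt in adj[last]:
            if nxt == start and len(path) >= min_len:
                # Each cycle is reached in two directions; keep one orientation only.
                if path[1] < path[-1]:
                    counts[len(path)] += 1
            elif nxt not in visited and nxt > start and len(path) < max_len:
                # Only visit nodes larger than the start, so each cycle is
                # enumerated from its smallest node exactly once.
                extend(start, path + [nxt], visited | {nxt})

    for s in adj:
        extend(s, [s], {s})
    return counts

# Example: the complete graph on four nodes.
k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
print(count_short_loops(k4))  # {3: 4, 4: 3, 5: 0, 6: 0}
```

Exhaustive enumeration like this grows quickly with network size and degree, so it is intended for small examples rather than full interactomes.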