SANA NetGO: A combinatorial approach to using Gene Ontology (GO) terms to score network alignments

Author: Hayes Wayne B.
Mamano Nil
Publication date: 04/04/2017

Gene Ontology (GO) terms are frequently used to score alignments between protein-protein interaction (PPI) networks. Methods exist to measure the GO similarity between two proteins in isolation, but pairs of proteins in a network alignment are not isolated: each pairing is implicitly dependent upon every other pairing via the alignment itself. Current methods fail to take into account the frequency of GO terms across the networks, and attempt to account for common GO terms in an ad hoc fashion by imposing arbitrary rules on when to "allow" GO terms based on their location in the GO hierarchy, rather than using readily available frequency information in the PPI networks themselves. Here we develop a new measure, NetGO, that naturally weighs infrequent, informative GO terms more heavily than frequent, less informative GO terms, without requiring arbitrary cutoffs. In particular, NetGO down-weights the score of frequent GO terms according to their frequency in the networks being aligned. This is a global measure applicable only to alignments, independent of pairwise GO measures, in the same sense that the edge-based EC or S3 scores are global measures of topological similarity independent of pairwise topological similarities. We demonstrate the superiority of NetGO by creating alignments of predetermined quality based on homologous pairs of nodes and show that NetGO correlates with alignment quality much better than any existing GO-based alignment measures. We also demonstrate that NetGO provides a measure of taxonomic similarity between species, consistent with existing taxonomic measures--a feature not shared with existing GO-based network alignment measures. Finally, we re-score alignments produced by almost a dozen aligners from a previous study and show that NetGO does a better job than existing measures at separating good alignments from bad ones.
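
The idea of down-weighting frequent GO terms can be illustrated with a small sketch. This is not the NetGO formula from the paper, which the abstract does not state; it is a plain inverse-frequency weighting under assumed inputs. The function name `inverse_frequency_go_score` and the dictionary-based data layout are hypothetical.

```python
from collections import Counter


def inverse_frequency_go_score(alignment, go1, go2):
    """Score an alignment by shared GO terms, weighting rare terms more.

    alignment: dict mapping each aligned node of network 1 to a node of network 2.
    go1, go2:  dict mapping node -> set of GO term identifiers.
    Illustrative only: each shared term contributes 1 / (number of proteins
    annotated with that term across both networks).
    """
    # Count how many proteins carry each GO term across both networks.
    freq = Counter()
    for annotations in (go1, go2):
        for terms in annotations.values():
            freq.update(terms)

    score = 0.0
    for u, v in alignment.items():
        shared = go1.get(u, set()) & go2.get(v, set())
        # A term seen on many proteins adds little; a rare term adds a lot.
        score += sum(1.0 / freq[term] for term in shared)
    return score
```

Because the weights come from term frequencies in the two networks themselves, the score depends on the whole alignment and its inputs rather than on a fixed cutoff in the GO hierarchy, which is the property the abstract emphasizes.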

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Fair Evaluation of Global Network Aligners

Author: Crawford Joseph
Milenković Tijana
Sun Yihan
Publication date: 17/07/2014

Biological network alignment identifies topologically and functionally conserved regions between networks of different species. It encompasses two algorithmic steps: node cost function (NCF), which measures similarities between nodes in different networks, and alignment strategy (AS), which uses these similarities to rapidly identify high-scoring alignments. Different methods use both different NCFs and different ASs. Thus, it is unclear whether the superiority of a method comes from its NCF, its AS, or both. We already showed on MI-GRAAL and IsoRankN that combining NCF of one method and AS of another method can lead to a new superior method. Here, we evaluate MI-GRAAL against newer GHOST to potentially further improve alignment quality. Also, we approach several important questions that have not been asked systematically thus far. First, we ask how much of the node similarity information in NCF should come from sequence data compared to topology data. Existing methods determine this more or less arbitrarily, which could affect the resulting alignment(s). Second, when topology is used in NCF, we ask how large the size of the neighborhoods of the compared nodes should be. Existing methods assume that larger neighborhood sizes are better. We find that MI-GRAAL's NCF is superior to GHOST's NCF, while the performance of the methods' ASs is data-dependent. Thus, the combination of MI-GRAAL's NCF and GHOST's AS could be a new superior method for certain data. Also, the amount of sequence information used within NCF does not affect alignment quality, while the inclusion of topological information is crucial. Finally, larger neighborhood sizes are preferred, but often, it is the second largest size that is superior, and using this size would decrease computational complexity. Together, our results give several general recommendations for a fair evaluation of network alignment methods.
Comment: 19 pages, 10 figures. Presented at the 2014 ISMB Conference, July 13-15, Boston, M
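
The question of how much node similarity should come from topology versus sequence is often expressed as a weighted combination. The sketch below assumes a simple convex combination controlled by a parameter `alpha`; the function name `combined_node_similarity`, the parameter name, and the pair-keyed dictionaries are hypothetical and are not taken from MI-GRAAL or GHOST.

```python
def combined_node_similarity(topo_sim, seq_sim, alpha):
    """Blend topological and sequence similarity for candidate node pairs.

    topo_sim, seq_sim: dict mapping (node1, node2) -> similarity in [0, 1].
    alpha: weight on topology; (1 - alpha) is the weight on sequence.
    Pairs missing from one dictionary are treated as similarity 0 there
    (an assumption made for this illustration).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    pairs = set(topo_sim) | set(seq_sim)
    return {
        pair: alpha * topo_sim.get(pair, 0.0)
        + (1.0 - alpha) * seq_sim.get(pair, 0.0)
        for pair in pairs
    }
```

Setting `alpha=1.0` uses topology only and `alpha=0.0` uses sequence only, so sweeping `alpha` is one way to probe the trade-off the abstract describes.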

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

NatalieQ: A web server for protein-protein interaction network querying

Publication venue: BioMed Central

Springer - Publisher Connector

An Introductory Guide to Aligning Networks Using SANA, the Simulated Annealing Network Aligner.

Author: A Chatr-Aryamontri
A Hasan
A Lesk
BH Junker
C Clark
C Mering Von
C Yang
CG El Van
D Davis
DM Prescott
EH Davidson
F Alkan
FE Faisal
HT Phan
J Crawford
K Chen
K Mehlhorn
KI Smith
M Ashburner
M El-Kebir
M Kotlyar
M Malek
M Milano
M Vidal
MP Williamson
MR Garey
N Malod-Dognin
N Pržulj
N Yaveroğlu
O Fiehn
O Kuchaiev
O Sporns
R Jaenicke
S Hashemifar
SA Cook
SF Altschul
SJ Larsen
T Hočevar
T Milenković
T Tokar
TA Farazi
V Saraph
V Vijayan
Publication venue: eScholarship, University of California
Publication date: 22/11/2019

Sequence alignment has had an enormous impact on our understanding of biology, evolution, and disease. The alignment of biological networks holds similar promise. Biological networks generally model interactions between biomolecules such as proteins, genes, metabolites, or mRNAs. There is strong evidence that the network topology (the "structure" of the network) is correlated with the functions performed, so that network topology can be used to help predict or understand function. However, unlike sequence comparison and alignment, which is an essentially solved problem, network comparison and alignment is an NP-complete problem for which heuristic algorithms must be used. Here we introduce SANA, the Simulated Annealing Network Aligner. SANA is one of many algorithms proposed for biological network alignment. In the context of global network alignment, SANA stands out for its speed, memory efficiency, ease of use, and flexibility in producing alignments between two or more networks. SANA produces better alignments in minutes on a laptop than most other algorithms can produce in hours or days of CPU time on large server-class machines. We walk the user through how to use SANA for several types of biomolecular networks.
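
To give a feel for what simulated annealing over alignments looks like, here is a toy sketch. It is not SANA's implementation or interface. It assumes a swap-only move, a geometric cooling schedule, and edge correctness (EC) as the objective, and all function and parameter names are hypothetical.

```python
import math
import random


def edge_correctness(alignment, edges1, edge_set2):
    """Fraction of network-1 edges whose endpoints map onto a network-2 edge."""
    hits = sum(
        1 for u, v in edges1
        if frozenset((alignment[u], alignment[v])) in edge_set2
    )
    return hits / len(edges1)


def anneal_alignment(nodes1, nodes2, edges1, edges2,
                     steps=20000, t_start=1.0, t_end=0.001, seed=0):
    """Toy annealer; assumes 2 <= len(nodes1) <= len(nodes2), edges1 non-empty."""
    rng = random.Random(seed)
    edge_set2 = {frozenset(e) for e in edges2}
    nodes1 = list(nodes1)

    # Start from a random one-to-one mapping of network 1 into network 2.
    current = dict(zip(nodes1, rng.sample(list(nodes2), len(nodes1))))
    cur_score = edge_correctness(current, edges1, edge_set2)
    best, best_score = dict(current), cur_score

    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)

        # Propose swapping the images of two network-1 nodes.
        u, w = rng.sample(nodes1, 2)
        current[u], current[w] = current[w], current[u]
        new_score = edge_correctness(current, edges1, edge_set2)

        # Always accept improvements; accept worse moves with Boltzmann probability.
        delta = new_score - cur_score
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            cur_score = new_score
            if cur_score > best_score:
                best, best_score = dict(current), cur_score
        else:
            # Reject: undo the swap.
            current[u], current[w] = current[w], current[u]

    return best, best_score
```

The swap-only move keeps the mapping one-to-one but never brings in unused network-2 nodes, and the score is recomputed from scratch at every step; both are simplifications made for brevity.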

arXiv.org e-Print Archive

Crossref

Ezid

eScholarship - University of California

Unified Alignment of Protein-Protein Interaction Networks

Author: Ban K
Malod-Dognin N
Przulj N
Publication venue: Nature Publishing Group
Publication date: 11/04/2017

Paralleling the increasing availability of protein-protein interaction (PPI) network data, several network alignment methods have been proposed. Network alignments have been used to uncover functionally conserved network parts and to transfer annotations. However, due to the computational intractability of the network alignment problem, aligners are heuristics providing divergent solutions and no consensus exists on a gold standard, or which scoring scheme should be used to evaluate them. We comprehensively evaluate the alignment scoring schemes and global network aligners on large-scale PPI data and observe that three methods, HUBALIGN, L-GRAAL and NATALIE, regularly produce the most topologically and biologically coherent alignments. We study the collective behaviour of network aligners and observe that PPI networks are almost entirely aligned with a handful of aligners that we unify into a new tool, Ulign. Ulign enables complete alignment of two networks, which traditional global and local aligners fail to do. Also, multiple mappings of Ulign define biologically relevant soft clusterings of proteins in PPI networks, which may be used for refining the transfer of annotations across networks. Hence, PPI networks are already well investigated by current aligners, so to gain additional biological insights, a paradigm shift is needed. We propose such a shift come from aligning all available data types collectively rather than any particular data type in isolation from others.

UCL Discovery