1,030 research outputs found

Detecting highly overlapping community structure by greedy clique expansion

Author: Hurley Neil
Lee Conrad
McDaid Aaron
Reid Fergal
Publication date: 01/01/2010

In complex networks it is common for each node to belong to several communities, implying a highly overlapping community structure. Recent advances in benchmarking indicate that existing community assignment algorithms that are capable of detecting overlapping communities perform well only when the extent of community overlap is kept to modest levels. To overcome this limitation, we introduce a new community assignment algorithm called Greedy Clique Expansion (GCE). The algorithm identifies distinct cliques as seeds and expands these seeds by greedily optimizing a local fitness function. We perform extensive benchmarks on synthetic data to demonstrate that GCE's good performance is robust across diverse graph topologies. Significantly, GCE is the only algorithm to perform well on these synthetic graphs, in which every node belongs to multiple communities. Furthermore, when put to the task of identifying functional modules in protein interaction data, and college dorm assignments in Facebook friendship data, we find that GCE performs competitively.

Comment: 10 pages, 7 Figures. Implementation source and binaries available at http://sites.google.com/site/greedycliqueexpansion
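The seed-and-expand idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the fitness function, so the ratio k_in / (k_in + k_out)^alpha used here is an assumption, as are the names `fitness`, `expand_seed`, and the toy `graph`. Seed detection (finding the cliques) is skipped and a clique is passed in by hand.

```python
def fitness(graph, community, alpha=1.0):
    """Assumed local fitness: k_in / (k_in + k_out) ** alpha.

    k_in counts each internal edge twice (once per endpoint);
    k_out counts edges leaving the community.
    """
    k_in = 0
    k_out = 0
    for v in community:
        for u in graph[v]:
            if u in community:
                k_in += 1
            else:
                k_out += 1
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def expand_seed(graph, seed, alpha=1.0):
    """Greedily add the neighbouring node that most improves fitness."""
    community = set(seed)
    while True:
        # Nodes adjacent to the community but not yet in it.
        frontier = {u for v in community for u in graph[v]} - community
        best_node = None
        best_fit = fitness(graph, community, alpha)
        for u in sorted(frontier):
            candidate_fit = fitness(graph, community | {u}, alpha)
            if candidate_fit > best_fit:
                best_node, best_fit = u, candidate_fit
        if best_node is None:
            return community  # no single addition improves fitness
        community.add(best_node)


# Toy graph (hypothetical): a 4-clique {0,1,2,3} joined to a triangle {4,5,6}
# by the single edge 3-4.
graph = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
community = expand_seed(graph, {0, 1, 2})
```

Starting from the triangle {0, 1, 2}, the expansion absorbs node 3 and then stops, because adding node 4 would lower the fitness. Running the expansion from several seeds independently is what allows the resulting communities to overlap.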

arXiv.org e-Print Archive

CiteSeerX

Research Repository UCD

Irish Universities

An Algorithm for Finding Functional Modules and Protein Complexes in Protein-Protein Interaction Networks

Author: Chen Yu
Cui Guangyu
Han Kyungsook
Huang De-Shuang
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2008

Biological processes are often performed by a group of proteins rather than by individual proteins, and proteins in the same biological group form a densely connected subgraph in a protein-protein interaction network. Therefore, finding a densely connected subgraph provides useful information to predict the function or protein complex of uncharacterized proteins in the highly connected subgraph. We have developed an efficient algorithm and program for finding cliques and near-cliques in a protein-protein interaction network. Analysis of the interaction network of yeast proteins using the algorithm demonstrates that 59% of the near-cliques identified by our algorithm have at least one function shared by all the proteins within a near-clique, and that 56% of the near-cliques show good agreement with the experimentally determined protein complexes catalogued in MIPS.

Crossref

Directory of Open Access Journals

PubMed Central

Parallel Maximum Clique Algorithms with Applications to Network Analysis and Storage

Author: Assefaw H. Gebremedhin
David F. Gleich
Md. Mostofa Ali Patwary
Ryan A. Rossi
Publication date: 25/12/2013

We propose a fast, parallel maximum clique algorithm for large sparse graphs that is designed to exploit characteristics of social and information networks. The method exhibits a roughly linear runtime scaling over real-world networks ranging from 1000 to 100 million nodes. In a test on a social network with 1.8 billion edges, the algorithm finds the largest clique in about 20 minutes. Our method employs a branch and bound strategy with novel and aggressive pruning techniques. For instance, we use the core number of a vertex in combination with a good heuristic clique finder to efficiently remove the vast majority of the search space. In addition, we parallelize the exploration of the search tree. During the search, processes immediately communicate changes to upper and lower bounds on the size of maximum clique, which occasionally results in a super-linear speedup because vertices with large search spaces can be pruned by other processes. We apply the algorithm to two problems: to compute temporal strong components and to compress graphs.

Comment: 11 pages
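The core-number pruning mentioned in this abstract rests on a simple fact: every vertex of a clique of size k has core number at least k - 1. The sketch below illustrates only that bound, not the authors' parallel branch-and-bound code; the function names and the toy graph are assumptions, and the peeling loop is a simple quadratic version chosen for readability.

```python
def core_numbers(graph):
    """Core number of each vertex, computed by repeatedly peeling a
    minimum-degree vertex (simple O(n^2) version for clarity)."""
    degree = {v: len(nbrs) for v, nbrs in graph.items()}
    remaining = set(graph)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease while peeling
        core[v] = k
        remaining.remove(v)
        for u in graph[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def prune_for_clique_search(graph, lower_bound):
    """Keep only vertices that could belong to a clique larger than
    `lower_bound`, the size of a clique already found heuristically."""
    core = core_numbers(graph)
    # A clique of size lower_bound + 1 needs core number >= lower_bound.
    return {v for v in graph if core[v] >= lower_bound}


# Toy graph (hypothetical): a 4-clique {0,1,2,3} joined to a triangle {4,5,6}.
graph = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
core = core_numbers(graph)
upper_bound = max(core.values()) + 1          # maximum clique size is at most this
survivors = prune_for_clique_search(graph, 3)  # a triangle is already known
```

With a triangle as the known lower bound, only the four vertices of the 4-clique survive, so the exact search never needs to branch on vertices 4, 5, or 6.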

arXiv.org e-Print Archive

CiteSeerX

High functional coherence in k-partite protein cliques of protein interaction networks

Author: Chen Yi-Ping Phoebe
Li Jinyan
Liu Qian
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

We introduce a new topological concept called k-partite protein cliques to study protein interaction (PPI) networks. In particular, we examine functional coherence of proteins in k-partite protein cliques. A k-partite protein clique is a k-partite maximal clique comprising two or more nonoverlapping protein subsets between any two of which full interactions are exhibited. In the detection of PPI’s k-partite maximal cliques, we propose to transform PPI networks into induced K-partite graphs with proteins as vertices where edges only exist among the graph’s partites. Then, we present a k-partite maximal clique mining (MaCMik) algorithm to enumerate k-partite maximal cliques from K-partite graphs. Our MaCMik algorithm is applied to a yeast PPI network. We observe that there does exist interesting and unusually high functional coherence in k-partite protein cliques: most proteins in k-partite protein cliques, especially those in the same partites, share the same functions. Therefore, the idea of k-partite protein cliques suggests a novel approach to characterizing PPI networks, and may help function prediction for unknown proteins.

Deakin Research Online

Recent advances in clustering methods for protein interaction networks

Author: Deng Youping
Li Min
Pan Yi
Wang Jianxin
Publication venue: BioMed Central
Publication date: 01/01/2010

The increasing availability of large-scale protein-protein interaction data has made it possible to understand the basic components and organization of cell machinery from the network level. The arising challenge is how to analyze such complex interacting data to reveal the principles of cellular organization, processes and functions. Many studies have shown that clustering protein interaction networks is an effective approach for identifying protein complexes or functional modules, which has become a major research topic in systems biology. In this review, recent advances in clustering methods for protein interaction networks will be presented in detail. The predictions of protein functions and interactions based on modules will be covered. Finally, the performance of different clustering methods will be compared and the directions for future research will be discussed.

Crossref

ScholarWorks @ Georgia State University

Springer - Publisher Connector

PubMed Central

A Coevolutionary Residue Network at the Site of a Functionally Important Conformational Change in a Phosphohexomutase Enzyme Family

Author: Ashley M. Buckle
Cristina Furdui
Jacob Mick
Lesa J. Beamer
Yingying Lee
Publication venue: Public Library of Science
Publication date: 07/06/2012

Coevolution analyses identify residues that co-vary with each other during evolution, revealing sequence relationships unobservable from traditional multiple sequence alignments. Here we describe a coevolutionary analysis of phosphomannomutase/phosphoglucomutase (PMM/PGM), a widespread and diverse enzyme family involved in carbohydrate biosynthesis. Mutual information and graph theory were utilized to identify a network of highly connected residues with high significance. An examination of the most tightly connected regions of the coevolutionary network reveals that most of the involved residues are localized near an interdomain interface of this enzyme, known to be the site of a functionally important conformational change. The roles of four interface residues found in this network were examined via site-directed mutagenesis and kinetic characterization. For three of these residues, mutation to alanine reduces enzyme specificity to ∼10% or less of wild-type, while the other has ∼45% activity of wild-type enzyme. An additional mutant of an interface residue that is not densely connected in the coevolutionary network was also characterized, and shows no change in activity relative to wild-type enzyme. The results of these studies are interpreted in the context of structural and functional data on PMM/PGM. Together, they demonstrate that a network of coevolving residues links the highly conserved active site with the interdomain conformational change necessary for the multi-step catalytic reaction. This work adds to our understanding of the functional roles of coevolving residue networks, and has implications for the definition of catalytically important residues.
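As a rough illustration of the mutual information step named in this abstract, the sketch below scores how strongly two alignment columns co-vary. It uses the plain textbook definition of mutual information in bits; the abstract does not say which variant or correction the authors applied, and the function name and the toy columns are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns,
    given as equal-length sequences of residue symbols."""
    if len(col_x) != len(col_y):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    total = 0.0
    for (a, b), c in count_xy.items():
        p_xy = c / n
        total += p_xy * log2(p_xy / ((count_x[a] / n) * (count_y[b] / n)))
    return total


# Toy columns (hypothetical): four sequences, two alignment positions.
covarying = mutual_information("AACC", "GGTT")    # residues change together
independent = mutual_information("AACC", "GTGT")  # no relationship
```

Perfectly co-varying columns score 1 bit here and independent columns score 0. Linking residue pairs whose score exceeds a chosen threshold would give a graph of the kind the abstract analyses for densely connected regions.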

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Method and System for Identification of Metabolites Using Mass Spectra

Author: Carreer William J.
Flight Robert M.
Mitchell Joshua
Moseley Hunter N. B.
Publication venue: UKnowledge
Publication date: 31/03/2020

A method and system are provided for mass spectrometry for identification of a specific elemental formula for an unknown compound, which includes but is not limited to a metabolite. The method includes calculating a natural abundance probability (NAP) of a given isotopologue for isotopes of non-labelling elements of an unknown compound. Molecular fragments for a subset of isotopes identified using the NAP are created and sorted into a requisite cache data structure to be subsequently searched. Peaks are obtained from raw spectrum data from mass spectrometry for an unknown compound. Sample-specific peaks of the unknown compound are separated from various spectral artifacts in ultra-high resolution Fourier transform mass spectra. A set of possible isotope-resolved molecular formulas (IMFs) is created by iteratively searching the molecular fragment caches, combining the results with additional isotopes, and then statistically filtering them based on NAP and mass-to-charge (m/z) matching probabilities. The unknown compound and its corresponding elemental molecular formula (EMF) are identified from statistically significant caches of isotopologues with compatible IMFs.
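To make the natural abundance probability concrete, here is a minimal sketch for a single element with two isotopes. It assumes the NAP of an isotopologue can be modelled as a binomial probability, which the abstract does not confirm; the function name and the approximate 13C abundance of 0.0107 are likewise assumptions for illustration.

```python
from math import comb

# Approximate natural abundance of carbon-13 (assumed value for illustration).
ABUNDANCE_13C = 0.0107


def isotopologue_probability(n_atoms, n_heavy, p_heavy=ABUNDANCE_13C):
    """Binomial probability that exactly `n_heavy` of `n_atoms` atoms of
    one element are the heavy isotope, each independently with
    probability `p_heavy`."""
    if not 0 <= n_heavy <= n_atoms:
        raise ValueError("n_heavy must be between 0 and n_atoms")
    return (comb(n_atoms, n_heavy)
            * p_heavy ** n_heavy
            * (1.0 - p_heavy) ** (n_atoms - n_heavy))


# Probability that a six-carbon compound carries exactly one 13C atom.
one_heavy_of_six = isotopologue_probability(6, 1)
```

For a formula with several elements, the per-element probabilities would be multiplied together under the same independence assumption.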

University of Kentucky