6 research outputs found

A Statistical Performance Analysis of Graph Clustering Algorithms

Author: Anupam Biswas
Charalampos N Moschopoulos
DA Spielman
Hélio Almeida
Jörg Reichardt
L Ostroumova Prokhorenkova
Lawrence B. Holder
Liudmila Ostroumova Prokhorenkova
S Fortunato
U Brandes
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Measuring graph clustering quality remains an open problem. Here, we introduce three statistical measures to address it. We empirically explore their behavior under a number of stress-test scenarios and compare them to the commonly used modularity and conductance. Our measures are robust, immune to the resolution limit, easy to interpret intuitively, and also have a formal statistical interpretation. Our empirical stress-test results confirm that our measures compare favorably to the established ones. In particular, they are shown to be more responsive to graph structure, less sensitive to sample size and to breakdowns during numerical implementation, and less sensitive to uncertainty in connectivity. These features are especially important in the context of larger data sets or when the data may contain errors in the connectivity patterns.
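
For orientation, the two established baselines named in this abstract can be computed directly from an edge list. The sketch below is illustrative only: it uses the standard textbook definitions of Newman modularity and cut conductance for an undirected, unweighted graph, and it does not implement the three statistical measures the paper introduces. The function names and the toy graph are assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges, communities):
    """Newman modularity: Q = sum over communities of L_c/m - (d_c/(2m))^2."""
    m = len(edges)
    # Map every node to the index of its community (assumes a full partition).
    label = {node: i for i, group in enumerate(communities) for node in group}
    internal = defaultdict(int)    # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in community c
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def conductance(edges, cluster):
    """Cut conductance: cut(S, rest) / min(vol(S), vol(rest))."""
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        # Each edge endpoint adds one unit of degree to its side's volume.
        for node in (u, v):
            if node in cluster:
                vol_in += 1
            else:
                vol_out += 1
        if (u in cluster) != (v in cluster):
            cut += 1
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities = [{0, 1, 2}, {3, 4, 5}]

print(modularity(edges, communities))  # 5/14, about 0.357
print(conductance(edges, {0, 1, 2}))   # 1/7, about 0.143
```

Higher modularity and lower conductance both indicate a better-separated clustering under these definitions, which is the behavior the abstract's stress tests compare against.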

Crossref

Edinburgh Research Explorer

Queen Mary Research Online

GIBA: a clustering tool for detecting protein complexes

Author: AC Gavin
AD King
AH Tong
AJ Enright
B Snel
Charalampos N Moschopoulos
CN Moschopoulos
D Stoll
E Hartuv
E Sprinzak
GD Bader
Georgios A Pavlopoulos
I Xenarios
M Koyuturk
NJ Krogan
O Puig
P Shannon
Reinhard Schneider
RP Sear
S Brohee
SH Yook
Sophia Kossida
Spiridon D Likothanassis
T Ito
V Spirin
WG Willats
X-L Li
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: In recent years, high-throughput experimental methods have been developed that generate large datasets of protein-protein interactions (PPIs). However, due to the experimental methodologies, these datasets contain errors, mainly false positives, which reduce the quality of any derived information. Typically these datasets can be modeled as graphs, where vertices represent proteins and edges the pairwise PPIs, making it easy to apply automated clustering methods to detect protein complexes or other biologically significant functional groupings. Methods: In this paper, a clustering tool called GIBA (named after the first characters of its developers' nicknames) is presented. GIBA applies a two-step procedure to a given dataset of protein-protein interaction data. First, a clustering algorithm is applied to the interaction data; this is followed by a filtering step that generates the final candidate list of predicted complexes. Results: The efficiency of GIBA is demonstrated through the analysis of six different yeast protein interaction datasets in comparison to four other available algorithms. We compared the results of the different methods using five different performance metrics. Moreover, we examined how the parameters of the methods that constitute the filter affect the final results. Conclusion: GIBA is an effective and easy-to-use tool for the detection of protein complexes in experimentally measured protein-protein interaction networks. The results show that GIBA has higher prediction accuracy than previously published methods.

Crossref

Springer

Springer - Publisher Connector

PubMed Central

Open Repository and Bibliography - Luxembourg

Which clustering algorithm is better for predicting protein complexes?

Background: Protein-protein interactions (PPIs) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions, and of the networks they comprise, is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull-down assays and tandem affinity purification are used to detect protein interactions in an organism. Today, relatively new high-throughput methods like yeast two-hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results: In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so-called protein complexes, were then compared and benchmarked against known complexes stored in published databases. Conclusions: While results may differ depending on parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than the other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of geometrical accuracy, and it generates fewer valid clusters than any other reviewed algorithm. This article demonstrates various metrics for evaluating the accuracy of such predictions, as presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm
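
As a rough illustration of how predicted clusters might be benchmarked against known complexes, the sketch below assumes that "geometrical accuracy" refers to the geometric mean of clustering-wise sensitivity and positive predictive value, a convention common in this literature. The abstract does not define the term, so this reading is an assumption, and the function names and toy data are hypothetical.

```python
from math import sqrt


def contingency(complexes, clusters):
    """T[i][j] = number of proteins shared by known complex i and cluster j."""
    return [[len(set(cx) & set(cl)) for cl in clusters] for cx in complexes]


def sensitivity(complexes, clusters):
    """Fraction of complex members recovered by each complex's best cluster."""
    table = contingency(complexes, clusters)
    total = sum(len(set(cx)) for cx in complexes)
    return sum(max(row) for row in table) / total if total else 0.0


def positive_predictive_value(complexes, clusters):
    """Fraction of annotated cluster members that fall in the best complex."""
    table = contingency(complexes, clusters)
    columns = list(zip(*table))  # one tuple of overlaps per cluster
    total = sum(sum(col) for col in columns)
    return sum(max(col) for col in columns) / total if total else 0.0


def geometric_accuracy(complexes, clusters):
    """Geometric mean of sensitivity and PPV (assumed definition)."""
    return sqrt(sensitivity(complexes, clusters)
                * positive_predictive_value(complexes, clusters))


# Toy data (hypothetical): two known complexes, two predicted clusters.
known = [{"A", "B", "C"}, {"D", "E"}]
predicted = [{"A", "B"}, {"C", "D", "E"}]

print(sensitivity(known, predicted))                # 4/5
print(positive_predictive_value(known, predicted))  # 4/5
print(geometric_accuracy(known, predicted))         # about 0.8
```

Taking the geometric mean penalizes a method that scores well on only one of the two components, for example one that merges everything into a single large cluster.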

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

EUR Research Repository

Open Repository and Bibliography - Luxembourg

University of Thessaly Institutional Repository

Using graph theory to analyze biological networks

Author: A Finney
A Mazurie
A Paccanaro
A Sandelin
A Zanzoni
A Özgür
A-L Barabási
AC Gavin
AD King
AD Perkins
AH Tong
AI Saeed
AJ Enright
AK Jain
AM Feist
B MacQueen
BH Junker
BioPAX Working group
BJ Frey
Björn H Junker
BN Kholodenko
C Bron
C Lefebvre
C von Mering
Carninci Pea
CD Michener
CH Schilling
Charalampos N Moschopoulos
CM Lloyd
CN Moschopoulos
D Koschützki
D Stoll
DJ Watts
E Burgos
E Estrada
E Ravasz
E van Nimwegen
E Wingender
E Zotenko
EW Dijkstra
F Nisbach
F Picard
FCS Diella
G Lima-Mendez
GA Pavlopoulos
GD Bader
Georgios A Pavlopoulos
Glenn W Milligan
H Hermjakob
H Jeong
H Ma
H Salgado
H Zhang
H-J Schulz
HG Vikis
HK Lee
I Lozada-Chavez
I Xenarios
J Berg
J Gagneur
J Quackenbush
J Seo
J Vlasblom
J Yu
Jan Aerts
JC Rain
K Han
K Raman
K Tamura
Kim Sneppen
L Gao
L Giot
LdF Costa
LE Ulrich
Linding Rea
M Baur
M Hahn
M Hucka
M Kitsak
M Krull
M Madan Babu
Maria Secrier
MEJ Newman
MP Joy
MR da Silva
N Guelzim
N Saitou
NJ Krogan
O Gascuel
O Lassila
O Puig
P Erdös
P Holme
P Murray-Rust
P Shannon
P Uetz
Pantelis G Bagos
PD Karp
PE Hodges
PHA Sneath
PJ Ingram
R Albert
R D'andrade
R Milo
R Yoshida
RC Gentleman
RD Leclerc
Reinhard Schneider
RO Duda
RP Murray
RW Floyd
S Brohee
S Chavali
S Kumar
S Li
S Redner
S Schuster
S Shen-Orr
S van Dogen
SC Johnson
SD Hooper
Sophia Kossida
SR Paladugu
T Ito
T Yamada
TH Cormen
Theodoros G Soldatos
TI Lee
US Bhalla
V Batagelj
V Matys
W Huber
W Zhong
WG Willats
X Li
Y Lu
Z Hu
ZHL Rong
Publication venue: BioMed Central
Publication date: 01/01/2011

Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need emerges to investigate a system not only as individual components but as a whole. This can be done by examining the elementary constituents individually and then how they are connected. The myriad components of a system and their interactions are best characterized as networks, which are mainly represented as graphs where thousands of nodes are connected by thousands of edges. In this article we demonstrate approaches, models and methods from graph theory, and we discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling, combined with knowledge extraction, will help us to better understand the biological significance of the system.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Open Repository and Bibliography - Luxembourg

University of Thessaly Institutional Repository