
Metabolomics approach for determining growth-specific metabolites based on Fourier transform ion cyclotron resonance mass spectrometry

Author: A Aharoni
A Bairoch
A Oikawa
AG Marshall
AL Boulesteix
Daisaku Ohta
DL Wheeler
DW Grogan
ER Vimr
H Suzuki
Hiroki Takahashi
J Laskin
JP Merlie
JR Laeter De
JW Gauthier
K Magnuson
Ken Kurokawa
Kenichi Tanaka
Kosuke Kai
L Stein
M Altaf-Ul-Amin
M Ishinaga
M Kanehisa
M Yano
Md. Altaf-Ul-Amin
MJ Brauer
MY Hirai
MY Hirai
Naotake Ogasawara
O Fiehn
RD Hall
S Goto
S Kanaya
SE Polakis
SG Villas-Boas
Shigehiko Kanaya
ST Ali
T Abe
T Kind
T Kind
T Soga
T Tohge
Taku Oshima
V Luca De
Y Nakamura
Yoko Shinbo
YY Chang
Publication venue: Springer-Verlag
Publication date: 01/01/2008

Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR/MS) is the best MS technology for exact mass measurement owing to its high resolution and accuracy, and several outstanding FT-ICR/MS-based metabolomics approaches have been reported. Direct-infusion FT-ICR/MS metabolic profiling, however, requires a reliable annotation scheme. Correlation analyses help not only to uncover relations between ions but also to annotate ions originating from the same metabolite (metabolite derivative ions). In the present study, we propose a procedure for metabolite annotation in direct-infusion FT-ICR/MS that classifies metabolite-derived ions using correlation analyses. Integrated analysis of isotope relations, fragmentation patterns from MS/MS analysis, co-occurring metabolites, and database searches (KNApSAcK and KEGG) makes it possible to annotate ions as metabolites and to estimate cellular conditions from metabolite composition. A total of 220 detected ions were classified into 174 metabolite derivative groups, and 72 ions were assigned to candidate metabolites. Metabolic profiling combined with PCA distinguished between the growth stages. A PLS regression model of OD600 values as a function of metabolic profiles identifies the degree to which each ion contributes to the growth stages. Ten phospholipids that strongly influence the constructed model are highly abundant in the cells. Our analyses reveal that global modification of those phospholipids occurs as E. coli enters the stationary phase. Thus, the integrated approach involving correlation analyses, metabolic profiling, and database searching is efficient for high-throughput metabolomics.
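
The idea of grouping ions that derive from the same metabolite by correlation can be illustrated with a minimal sketch. It assumes that each ion has an intensity profile across samples and that ions whose Pearson correlation reaches a cutoff belong to one group. The cutoff of 0.9, the function names, and the m/z labels are hypothetical and are not taken from the paper, which also uses isotope relations, MS/MS data, and database searches.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def group_ions(profiles, threshold=0.9):
    """Group ions linked by correlation >= threshold (hypothetical cutoff)."""
    parent = {ion: ion for ion in profiles}

    def find(ion):
        # Union-find lookup with path halving.
        while parent[ion] != ion:
            parent[ion] = parent[parent[ion]]
            ion = parent[ion]
        return ion

    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for ion in profiles:
        groups.setdefault(find(ion), []).append(ion)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical intensities of three ions over four samples.
profiles = {
    "m/z 180.063": [1.0, 2.0, 3.0, 4.0],
    "m/z 181.066": [0.1, 0.2, 0.3, 0.4],  # proportional to the first ion
    "m/z 255.233": [4.0, 1.0, 3.0, 2.0],
}
groups = group_ions(profiles)
```

Here the first two ions fall into one group because their profiles are proportional, while the third stays alone. A profile with constant intensity would make the correlation undefined and needs to be filtered out beforehand.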

Efficient and accurate greedy search methods for mining functional modules in protein interaction networks

Author: A Gavin
B Adamcsek
Baoliu Ye
BS Everitt
C Brun
Chaojun Li
DJ Watts
F Luo
F Radicchi
G Palla
GD Bader
H Jeong
H Leung
HW Mewes
I Xenarios
J Wang
J Wang
J Wang
Jieyue He
L Gao
LF Wu
M Altaf-Ul-Amin
M Girvan
M Li
M Li
M Wu
MEJ Newman
SH Jung
SS Dwight
V Spirin
Wei Zhong
X Li
YR Cho
Z Dezso
Publication venue: BioMed Central
Publication date: 01/06/2012

Background: Most computational algorithms focus on detecting highly connected subgraphs in PPI networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain core-attachment structures. Methods: In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves the prediction accuracy compared to other similar module detection approaches; however, it is computationally expensive. Many module detection approaches are based on traditional hierarchical methods, which are also computationally inefficient because the hierarchical tree structure produced by these approaches cannot provide adequate information to identify whether a network belongs to a module structure or not. To speed up the computation, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge-weight-based GSM-FC method uses a greedy procedure that traverses all edges just once to separate the network into a suitable set of modules. Results: The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match the known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate than other competing algorithms. Conclusions: Based on the new edge weight definition, the proposed algorithm takes advantage of the greedy search procedure to separate the network into a suitable set of modules. Experimental analysis shows that the identified modules are statistically significant. The algorithm can reduce the computational time significantly while keeping high prediction accuracy.
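
A single greedy pass over weighted edges can be sketched as follows. The abstract does not state the edge weight definition or the merge criterion of GSM-FC, so this sketch substitutes a plain rule: edges are visited once in order of decreasing weight, and the modules of the two endpoints are merged when the weight reaches `min_weight`. The function name, the threshold, and the example weights are hypothetical.

```python
def greedy_modules(weighted_edges, min_weight=0.5):
    """Visit each edge once, heaviest first, merging modules greedily.

    The merge rule (weight >= min_weight) is a stand-in, not the
    criterion used by GSM-FC.
    """
    module_of = {}  # node -> module label
    members = {}    # module label -> set of nodes
    for u, v, w in sorted(weighted_edges, key=lambda e: -e[2]):
        for node in (u, v):
            if node not in module_of:
                module_of[node] = node
                members[node] = {node}
        mu, mv = module_of[u], module_of[v]
        if w >= min_weight and mu != mv:
            # Merge the smaller module into the larger one.
            if len(members[mu]) < len(members[mv]):
                mu, mv = mv, mu
            for node in members[mv]:
                module_of[node] = mu
            members[mu] |= members.pop(mv)
    return sorted(sorted(m) for m in members.values())


# Hypothetical weighted interactions.
edges = [("A", "B", 0.9), ("B", "C", 0.8), ("C", "D", 0.2), ("D", "E", 0.7)]
modules = greedy_modules(edges)
```

With these weights the weak C-D edge does not trigger a merge, so the pass yields the modules A-B-C and D-E. Merging the smaller module into the larger keeps the relabelling work low, which matches the spirit of a fast one-pass procedure.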

Coexpression Analysis of Tomato Genes and Experimental Verification of Coordinated Expression of Genes Found in a Functionally Enriched Coexpression Module

Author: A. Kurabayashi
Altaf-Ul-Amin
Borden
C. Konishi
Chiu
Chomczynski
Craigon
D. Shibata
Edgar
Girvan
Goda
Iijima
K. Aoki
K. Suda
Kilian
M. Yamazaki
Matsukura
Mitsuhara
N. Yamamoto
Quevillon
Rocca-Serra
S. Bunsupa
S. Inai
S. Ozaki
Schmid
Sherlock
T. Fujii
T. Suzuki
T. Tsugane
Tohge
Toufighi
Y. Iijima
Y. Ogata
Zimmermann
Publication venue: Oxford University Press

Gene-to-gene coexpression analysis is a powerful approach to infer the function of uncharacterized genes. Here, we report comprehensive identification of coexpression gene modules of tomato (Solanum lycopersicum) and experimental verification of coordinated expression of module member genes. On the basis of the gene-to-gene correlation coefficient calculated from 67 microarray hybridization data points, we performed a network-based analysis. This facilitated the identification of 199 coexpression modules. A gene ontology annotation search revealed that 75 out of the 199 modules are enriched with genes associated with common functional categories. To verify the coexpression relationships between module member genes, we focused on one module enriched with genes associated with the flavonoid biosynthetic pathway. A non-enzyme, non-transcription factor gene encoding a zinc finger protein in this module was overexpressed in S. lycopersicum cultivar Micro-Tom, and expression levels of flavonoid pathway genes were investigated. Flavonoid pathway genes included in the module were up-regulated in the plant overexpressing the zinc finger gene. This result demonstrates that coexpression modules, at least the ones identified in this study, represent actual transcriptional coordination between genes, and can facilitate the inference of tomato gene function

Public Library of Science (PLOS)

Jerarca: Efficient Analysis of Complex Networks Using Hierarchical Clustering

Author: A Clauset
A Marco
A Marco
AD King
AL Barabási
AM Yip
AW Rives
BJ Breitkreutz
C Brun
Carl Kingsford
DJ Watts
GD Bader
H Lu
Ignacio Marín
JB Pereira-Leal
JI Lucas
JS Farris
K Tamura
M Altaf-Ul-Amin
M Nei
MEJ Newman
MS Cline
N Pržulj
N Saitou
R Sharan
Rodrigo Aldecoa
RR Sokal
SY Pu
T Aittokallio
V Arnau
V Spirin
Publication venue: Public Library of Science
Publication date: 01/01/2010

Background: Extracting useful information from complex biological networks is a major goal in many fields, especially in genomics and proteomics. We have shown in several works that iterative hierarchical clustering, as implemented in the UVCluster program, is a powerful tool to analyze many of those networks. However, the amount of computation time required to perform UVCluster analyses imposed significant limitations on its use. Methodology/Principal Findings: We describe the suite Jerarca, designed to efficiently convert networks of interacting units into dendrograms by means of iterative hierarchical clustering. Jerarca is divided into three main sections. First, weighted distances among units are computed using up to three different approaches: a more efficient version of UVCluster and two new, related algorithms called RCluster and SCluster. Second, Jerarca builds dendrograms based on those distances, using well-known phylogenetic algorithms, such as UPGMA or Neighbor-Joining. Finally, Jerarca provides optimal partitions of the trees using statistical criteria based on the distribution of intra- and intercluster connections. Outputs compatible with the phylogenetic software MEGA and the Cytoscape package are generated, allowing the results to be easily visualized. Conclusions/Significance: The four main advantages of Jerarca with respect to UVCluster are: 1) Improved speed of a novel UVCluster algorithm; 2) Additional, alternative strategies to perform iterative hierarchical clustering; 3) Automatic evaluatio

CiteSeerX

arXiv.org e-Print Archive

Digital.CSIC

Characterization of complex networks: A survey of measurements

Author: Altaf-Ul-Amin M
Anderberg MR
Arenas A
Baker WE
Baker WE
Baldi P
Bar-Yam Y
Barabási A-L
Barabási A-L
Batagelj V
Ben-Naim E
Benkler Y
Boccara N
Boguñá M
Bollobás B
Bollobás B
Bornholdt S
Brillouin L
Buchanan M
Bunde A
Bunde A
Carrington PJ
Castells M
Codenotti B
Costa L DA F
Csermely P
Danon L
Dawson Ross
di Bernardo M
Diestel R
Dodge M
Dodge M
Dorogovtsev SN
Duda RO
Edwards AL
Erdős P
Erdős P
F. A. Rodrigues
Fiedler M
Freeman LC
Fukunaga K
G. Travieso
Garrido PL
Hair JF
Hayes B
Hayes B
Huberman BA
Jain AK
Johnson RA
Kochen M
L. da F. Costa
McLachlan GJ
McNeill RR
Mehta ML
Messner D
Milgram S
Monasson R
Monge PR
Newman MEJ
Newman MEJ
Newman MEJ
P. R. Villas Boas
Pastor-Satorras R
Reichl LE
Reif F
Romesburg HC
Schlosser G
Scott JP
Shannon CE
Stauffer D
Stoyan D
Strogatz S
Tyler JR
Wasserman S
Watts DJ
Watts DJ
West DB
Westland C
Publication venue: Informa UK Limited
Publication date: 01/01/2005

Each complex network (or class of networks) presents specific topological features which characterize its connectivity and highly influence the dynamics of processes executed on the network. The analysis, discrimination, and synthesis of complex networks therefore rely on the use of measurements capable of expressing the most relevant topological features. This article presents a survey of such measurements. It includes general considerations about complex network characterization, a brief review of the principal models, and the presentation of the main existing measurements. Important related issues covered in this work comprise the representation of the evolution of complex networks in terms of trajectories in several measurement spaces, the analysis of the correlations between some of the most traditional measurements, perturbation analysis, as well as the use of multivariate statistics for feature selection and network classification. Depending on the network and the analysis task one has in mind, a specific set of features may be chosen. It is hoped that the present survey will help the proper application and interpretation of measurements. Comment: A working manuscript with 78 pages, 32 figures. Suggestions of measurements for inclusion are welcomed by the author.
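
Three of the most traditional topological measurements (node degree, local clustering coefficient, and average shortest path length) can be computed on a small undirected graph as in the sketch below. The adjacency-dictionary representation, the function names, and the example graph are assumptions made for illustration; the survey itself covers many more measurements.

```python
from collections import deque


def degrees(adj):
    """Number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def clustering_coefficient(adj, v):
    """Fraction of pairs of neighbours of v that are themselves linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2 * links / (k * (k - 1))


def average_shortest_path(adj):
    """Mean hop distance over all ordered pairs of mutually reachable nodes.

    Assumes a graph with at least one edge.
    """
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical graph: a triangle A-B-C with a pendant node D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
deg = degrees(adj)
cc_a = clustering_coefficient(adj, "A")
cc_c = clustering_coefficient(adj, "C")
avg_path = average_shortest_path(adj)
```

In this graph node A sits in a closed triangle and has clustering coefficient 1, while the hub C has only one linked pair among its three neighbours.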

CiteSeerX

Identification of functional hubs and modules by converting interactome networks into hierarchical ordering of proteins

Author: A Chatr-aryamontri
A-L Barabasi
Aidong Zhang
AW Rives
B-J Breitkreutz
C Brun
DJ Watts
E Banks
F Luo
G Palla
GD Bader
H Jeong
HB Fraser
HW Mewes
J Demeter
J-DJ Han
JR Parrish
L Salwinski
M Altaf-Ul-Amin
NN Batada
R Aebersold
R Dunn
R Saeed
R Sharan
S Kerrien
The Gene Ontology Consortium
V Spirin
W Li
X He
Y Chen
Y-R Cho
Y-R Cho
Young-Rae Cho
Z Wang
Publication venue: BioMed Central
Publication date: 01/01/2010

Evaluation of clustering algorithms for protein-protein interaction networks

Author: A Vazquez
AC Gavin
AC Gavin
AD King
AJ Enright
AW Rives
BJ Breitkreutz
C Brun
C Brun
C Ding
C Friedrich
C von Mering
DS Goldberg
E Ravasz
E Ravasz
E Sprinzak
GD Bader
H Jeong
H Lu
J Gagneur
Jacques van Helden
JB Pereira-Leal
JDJ Han
JF Poyatos
JP Miller
M Altaf-Ul-Amin
M Blatt
M Middendorf
MR Said
N Simonis
NJ Krogan
P Shannon
P Uetz
R Development Core Team
R Dunn
S Bandyopadhyay
S Van Dongen
SH Yook
Sylvain Brohée
T Ito
V Arnau
V Spirin
Y Ho
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Protein interactions are crucial components of all cellular processes. Recently, high-throughput methods have been developed to obtain a global description of the interactome (the whole network of protein interactions for a given organism). In 2002, the yeast interactome was estimated to contain up to 80,000 potential interactions. This estimate is based on the integration of data sets obtained by various methods (mass spectrometry, two-hybrid methods, genetic studies). High-throughput methods are known, however, to yield a non-negligible rate of false positives, and to miss a fraction of existing interactions. The interactome can be represented as a graph where nodes correspond with proteins and edges with pairwise interactions. In recent years clustering methods have been developed and applied in order to extract relevant modules from such graphs. These algorithms require the specification of parameters that may drastically affect the results. In this paper we present a comparative assessment of four algorithms: Markov Clustering (MCL), Restricted Neighborhood Search Clustering (RNSC), Super Paramagnetic Clustering (SPC), and Molecular Complex Detection (MCODE). RESULTS: A test graph was built on the basis of 220 complexes annotated in the MIPS database. To evaluate the robustness to false positives and false negatives, we derived 41 altered graphs by randomly removing edges from or adding edges to the test graph in various proportions. Each clustering algorithm was applied to these graphs with various parameter settings, and the clusters were compared with the annotated complexes. We analyzed the sensitivity of the algorithms to the parameters and determined their optimal parameter values. We also evaluated their robustness to alterations of the test graph. We then applied the four algorithms to six graphs obtained from high-throughput experiments and compared the resulting clusters with the annotated complexes. 
CONCLUSION: This analysis shows that MCL is remarkably robust to graph alterations. In the tests of robustness, RNSC is more sensitive to edge deletion but less sensitive to the use of suboptimal parameter values. The other two algorithms are clearly weaker under most conditions. The analysis of high-throughput data supports the superiority of MCL for the extraction of complexes from interaction networks.
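
The construction of altered graphs, in which edges are randomly removed from or added to a test graph in given proportions, can be sketched as below. The function name, the parameters, the fixed seed, and the example proportions are hypothetical; the abstract does not list the proportions used for the 41 altered graphs.

```python
import random
from itertools import combinations


def alter_graph(nodes, edges, add_fraction=0.0, remove_fraction=0.0, seed=0):
    """Return a copy of an undirected graph with random edge alterations.

    Both fractions are relative to the number of original edges. Removed
    edges simulate false negatives, added edges simulate false positives.
    """
    rng = random.Random(seed)
    original = sorted({tuple(sorted(e)) for e in edges})
    n_remove = round(remove_fraction * len(original))
    n_add = round(add_fraction * len(original))

    kept = rng.sample(original, len(original) - n_remove)

    present = set(original)
    non_edges = [p for p in combinations(sorted(nodes), 2) if p not in present]
    added = rng.sample(non_edges, n_add)  # raises ValueError if too few non-edges

    return sorted(kept + added)


# Hypothetical test graph: a path over five proteins.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
altered = alter_graph(nodes, edges, add_fraction=0.5, remove_fraction=0.25, seed=42)
```

With four original edges, this call removes one edge and adds two edges that were absent from the test graph, so the altered graph has five edges. Running a clustering algorithm on a series of such graphs shows how quickly its output degrades as noise increases.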

HAL AMU

Public Library of Science (PLOS)

DI-fusion

Local Network Topology in Human Protein Interaction Data Predicts Functional Association

Author: A Hildebrand
A Vazquez
AD King
AJ Enright
AJ Walhout
AK Ramani
B Lehner
B Schwikowski
BA Hocevar
BJ Breitkreutz
C Stark
CT Chien
DS Goldberg
E Nabieva
F Li
GD Bader
H Jeong
Hua Li
J Geisler-Lee
J Laurikkala
J Rual
JA McMahon
JC Rain
JD Han
JG Abreu
L Giot
M Altaf-Ul-Amin
M Ashburner
M Deng
M Girvan
M Kanehisa
MP Samanta
N Przulj
P D'haeseleer
PK Datta
PK Datta
R Llewellyn
R Mazzarella
R Sharan
R Sharan
S Li
Shoudan Liang
Sridhar Hannenhalli
T Ito
TKB Gandhi
U Karaoz
U Stelzl
V Spirin
W Chen
Y Benjamini
Y Ikeda
Y Sun
Publication venue: Public Library of Science
Publication date: 29/07/2009

The use of high-throughput techniques to generate large volumes of protein-protein interaction (PPI) data has increased the need for methods that systematically and automatically suggest functional relationships among proteins. In a yeast PPI network, previous work has shown that the local connection topology, particularly for two proteins sharing an unusually large number of neighbors, can predict functional association. In this study we improved the prediction scheme by developing a new algorithm and applied it to a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting function-associated protein pairs. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as benchmarks to compare and evaluate the functional relevance. The application of our algorithms to human PPI data yielded 4,233 significant functional associations among 1,754 proteins. Further functional comparisons between them allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and made functional inferences from detailed analysis of one subcluster highly enriched in the TGF-β signaling pathway (P<10^−50). Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigation. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotation in this post-genomic era.
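
Whether two proteins share an "unusually large" number of neighbors can be judged with a hypergeometric tail probability, as in the sketch below. This is a common way to score shared neighbors and is used here as an assumption; it is not the new algorithm of the study, and it does not reproduce the hub correction. The function name, the `n_proteins` parameter, and the toy network are hypothetical.

```python
from math import comb


def shared_neighbor_pvalue(adj, a, b, n_proteins):
    """Count shared neighbours of a and b and score the count.

    The p-value is the hypergeometric probability of seeing at least that
    many shared neighbours if the neighbours of b were drawn at random
    from n_proteins candidate proteins.
    """
    nbrs_a = adj[a] - {b}
    nbrs_b = adj[b] - {a}
    shared = len(nbrs_a & nbrs_b)
    n1, n2 = len(nbrs_a), len(nbrs_b)
    tail = sum(
        comb(n1, k) * comb(n_proteins - n1, n2 - k)
        for k in range(shared, min(n1, n2) + 1)
    )
    return shared, tail / comb(n_proteins, n2)


# Hypothetical network: P1 and P2 do not interact but share all neighbours.
adj = {
    "P1": {"X", "Y", "Z"},
    "P2": {"X", "Y", "Z"},
    "X": {"P1", "P2"},
    "Y": {"P1", "P2"},
    "Z": {"P1", "P2"},
}
shared, p_value = shared_neighbor_pvalue(adj, "P1", "P2", n_proteins=10)
```

For this toy pair all three neighbors are shared, and with ten candidate proteins the chance of that happening at random is 1 in 120. Note that a hub with many neighbors produces large raw counts for trivial reasons, which is why the study emphasizes reducing the influence of hub proteins.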

Which clustering algorithm is better for predicting protein complexes?

Background: Protein-protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and of the networks they comprise is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull-down assays and tandem affinity purification are used to detect protein interactions in an organism. Today, relatively new high-throughput methods like yeast two-hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results: In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so-called protein complexes, were then compared and benchmarked with already known complexes stored in published databases. Conclusions: While results may differ upon parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than the other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of geometrical accuracy, and it generates fewer valid clusters than any other reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions, which are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm
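
Benchmarking predicted clusters against known complexes can be illustrated with a sketch of sensitivity, positive predictive value (PPV), and geometric accuracy computed from a complex-by-cluster overlap table. The definitions below follow a formulation that is common in the PPI clustering literature and are assumed here; the article may define its metrics differently. The function name and the example sets are hypothetical.

```python
from math import sqrt


def clustering_accuracy(complexes, clusters):
    """Return (sensitivity, ppv, geometric accuracy).

    table[i][j] is the number of proteins shared by complex i and
    cluster j. Assumes at least one protein is shared overall.
    """
    table = [[len(cx & cl) for cl in clusters] for cx in complexes]

    # Sensitivity: best-covering cluster per complex, relative to complex sizes.
    sn = sum(max(row) for row in table) / sum(len(cx) for cx in complexes)

    # PPV: best-matching complex per cluster, relative to annotated cluster members.
    n_clusters = len(clusters)
    col_max = [max(row[j] for row in table) for j in range(n_clusters)]
    col_total = [sum(row[j] for row in table) for j in range(n_clusters)]
    ppv = sum(col_max) / sum(col_total)

    return sn, ppv, sqrt(sn * ppv)


# Hypothetical reference complexes and predicted clusters.
complexes = [{"A", "B", "C"}, {"D", "E"}]
clusters = [{"A", "B"}, {"C", "D", "E"}]
sn, ppv, acc = clustering_accuracy(complexes, clusters)
```

In this example one protein of the first complex is placed in the wrong cluster, which lowers both sensitivity and PPV to 0.8. Taking the geometric mean penalizes trivial solutions, such as one giant cluster (high sensitivity, low PPV) or all-singleton clusters (high PPV, low sensitivity).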