
45 research outputs found

Mapping Iranian patents based on International Patent Classification (IPC), from 1976 to 2011

Author: Alireza Noruzi
BL Basberg
D Archibugi
D Archibugi
D-Z Chen
FM Abbott
H Grupp
J Guan
J List
J Schmookler
J Schmookler
J Schmookler
K Pavitt
L Leydesdorff
M Bregonje
M Meyer
Mohammadhiwa Abdekhoda
P Ganguli
S Bhattacharya
SV Ramani
SV Ramani
WS Comanor
Y-CJ Wu
Z Griliches
Publication venue: 'Springer Science and Business Media LLC'

Predicting protein linkages in bacteria: Which method is best depends on task

Author: A Karimpour-Fard
A Karimpour-Fard
A Karimpour-Fard
AJ Enright
AK Ramani
Anis Karimpour-Fard
B Rost
BP Westover
C von Mering
CM Fraser
D Barker
D Eisenberg
DJ Watts
E Nabieva
EM Marcotte
G Kolesov
G Moreno-Hagelsieb
G Moreno-Hagelsieb
H Salgado
H Salgado
I Shah
I Yanai
J Bockhorst
J Bockhorst
J Sun
J Sun
JC Mellor
L Wang
Lawrence E Hunter
M Craven
M Huynen
M Pellegrini
M Strong
MA Huynen
MD Ermolaeva
OG Troyanskaya
OX Cordero
P Shannon
PD Karp
PM Bowers
PR Romero
R Jansen
R Jothi
R Overbeek
R Overbeek
RL Tatusov
Ryan T Gill
S Leach
S Tsoka
SC Janga
Sonia M Leach
SV Date
T Dandekar
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract

Background: Applications of computational methods for predicting protein functional linkages are increasing. In recent years, several bacteria-specific methods for predicting linkages have been developed. The four major genomic context methods are Gene cluster, Gene neighbor, Rosetta Stone, and Phylogenetic profiles. These methods have been shown to be powerful tools, and this paper provides guidelines for when each method is appropriate by exploring different features of each method and the potential improvements offered by their combination. We also review many previous treatments of these prediction methods, use the latest available annotations, and offer a number of new observations.

Results: Using Escherichia coli K12 and Bacillus subtilis, linkage predictions made by each of these methods were evaluated against three benchmarks: functional categories defined by COG and KEGG, known pathways listed in EcoCyc, and known operons listed in RegulonDB. Each evaluated method had strengths and weaknesses, with no one method dominating all aspects of predictive ability studied. For functional categories, as previous studies have shown, the Rosetta Stone method was individually best at detecting linkages and predicting functions among proteins with shared KEGG categories, while the Phylogenetic profile method was best for linkage detection and function prediction among proteins with common COG functions. Differences in performance under COG versus KEGG may be attributable to the presence of paralogs. Function prediction was better with a weighted combination of linkages based on reliability than with a simple unweighted union of the linkage sets. For pathway reconstruction, 99 complete metabolic pathways in E. coli K12 (out of the 209 known, non-trivial pathways) and 193 pathways with 50% of their proteins were covered by linkages from at least one method. Gene neighbor was most effective individually on pathway reconstruction, with 48 complete pathways reconstructed. For operon prediction, Gene cluster completely predicted 59% of the known operons in E. coli K12 and 88% (333/418) in B. subtilis. Comparing two versions of the E. coli K12 operon database, many of the unannotated predictions in the earlier version were updated to true predictions in the later version. Using only linkages found by both Gene cluster and Gene neighbor improved the precision of operon predictions. Additionally, as previous studies have shown, combining features based on intergenic region and protein function improved the specificity of operon prediction.

Conclusion: A common problem for computational methods is the generation of a large number of false positives, which might be caused by an incomplete source of validation. By comparing two versions of a database, we demonstrated dramatic differences in the reported results. We used several benchmarks to show the comparative effectiveness of each prediction method and provided guidelines as to which method is most appropriate for a given prediction task.
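The observation that keeping only linkages found by both Gene cluster and Gene neighbor improves precision can be illustrated with a small sketch. The gene names, linkage sets, and benchmark below are invented toy data, and the `precision` and `pair` helpers are hypothetical; nothing here reproduces the paper's data or pipeline.

```python
def pair(a, b):
    """Represent an unordered linkage between two genes."""
    return frozenset((a, b))


def precision(predicted, known):
    """Fraction of predicted linkages that appear in the benchmark set."""
    if not predicted:
        return 0.0
    return len(predicted & known) / len(predicted)


# Toy predictions from two methods (hypothetical values).
gene_cluster = {pair("g1", "g2"), pair("g2", "g3"), pair("g4", "g5")}
gene_neighbor = {pair("g1", "g2"), pair("g2", "g3"), pair("g6", "g7")}

# Toy benchmark standing in for known same-operon pairs.
known_operon_pairs = {pair("g1", "g2"), pair("g2", "g3")}

union = gene_cluster | gene_neighbor   # linkages from either method
both = gene_cluster & gene_neighbor    # linkages supported by both methods

print(precision(union, known_operon_pairs))  # 0.5
print(precision(both, known_operon_pairs))   # 1.0
```

In this toy case the intersection drops the two unsupported predictions, so precision rises while fewer linkages are reported overall; real data would show the same trade-off only to the extent that the methods' errors are independent.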

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Bayesian Inference for Genomic Data Integration Reduces Misclassification Rate in Predicting Protein-Protein Interactions

Author: A Elefsinioti
A Valencia
AJ Enright
AK Ramani
AL Hopkins
BA Shoemaker
C von Mering
C von Mering
CC Wu
Christos A. Ouzounis
Chuanhua Xing
CS Goh
David B. Dunson
DB Dunson
DR Rhodes
EC Butcher
EM Marcotte
F Browne
F Pazos
GT Hart
H Huang
H Ishwaran
H Yu
I Lee
IW Taylor
J Saric
J Sun
JS Bader
L Hakes
L Hood
L Lu
LJ Jensen
LJ Lu
LV Zhang
M Huang
M Persico
MA Yildirim
MP Brown
MS Scott
N Lin
OG Troyanskaya
P Aloy
P Bork
P Pagel
P Sham
R Chowdhary
R Jansen
R Malik
R Mrowka
S Dolma
S Kim
S Tsoka
SV Date
Y Qi
Y Qi
Publication venue: Public Library of Science
Publication date: 01/07/2011

Protein-protein interactions (PPIs) are essential to most fundamental cellular processes. There has been increasing interest in reconstructing PPI networks. However, several critical difficulties exist in obtaining reliable predictions. Notably, false positive rates can exceed 80%. Correcting errors at each generating source can be both time-consuming and inefficient, because a single test can hardly cover the errors introduced at multiple levels of data processing. We propose a novel Bayesian integration method, termed nonparametric Bayes ensemble learning (NBEL), to lower the misclassification rate (both false positives and false negatives) by automatically up-weighting the most informative data sources while down-weighting less informative and biased sources. Extensive studies indicate that NBEL is significantly more robust than classic naïve Bayes to unreliable, error-prone and contaminated data. On a large human data set, our NBEL approach predicts many more PPIs than naïve Bayes. This suggests that previous studies may contain large numbers of not only false positives but also false negatives. Validation on two high-quality human PPI datasets supports our observations. Our experiments demonstrate that it is feasible to predict high-throughput PPIs computationally with substantially reduced false positives and false negatives. The ability to predict large numbers of PPIs both reliably and automatically may inspire people to use computational approaches to correct data errors in general, and may speed up high-quality PPI prediction. Such reliable prediction may provide a solid platform for other studies, such as protein function prediction and the roles of PPIs in disease susceptibility.
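For orientation, the classic naïve Bayes baseline that the abstract compares against can be sketched as follows. This is not NBEL; it is the textbook rule that multiplies prior odds by one likelihood ratio per evidence source, assuming the sources are conditionally independent. The function name and the example numbers are hypothetical.

```python
import math


def naive_bayes_posterior(prior, likelihood_ratios):
    """Posterior probability of an interaction under naive Bayes.

    prior: prior probability that a protein pair interacts (0 < prior < 1).
    likelihood_ratios: one ratio P(evidence | interacts) / P(evidence | not)
        per data source, treated as conditionally independent.
    """
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical pair: low prior, two supporting evidence sources.
posterior = naive_bayes_posterior(0.01, [10.0, 5.0])
print(round(posterior, 4))
```

Because every source contributes its ratio at full weight, one biased source can dominate the result; the abstract describes NBEL as addressing this by weighting sources according to how informative they are.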

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Discovering functional linkages and uncharacterized cellular pathways using phylogenetic profile comparisons: a comprehensive assessment

Author: AC Gavin
AJ Enright
AK Ramani
B Snel
BG Mirkin
C von Mering
C von Mering
CS Goh
CS Goh
D Barker
D Vallenet
E Kolker
EM Marcotte
EM Marcotte
ES Snitkin
F Pazos
F Pazos
G Butland
GV Glazko
H Li
H Rachman
H Tettelin
H Wu
H Wu
HB Fraser
I Lee
I Tirosh
I Uchiyama
J De Las Rivas
J Gertz
J Sun
J Tamames
J Wu
J Wu
JB Pereira-Leal
JC Mellor
JC Rain
JF Rual
JM Peregrin-Alvarez
K Jim
K Tan
L Aravind
L Giot
M Campillos
M Levesque
M Pellegrini
M Strong
M Strong
M Wu
MA Huynen
MG Kann
MJ Martin
ML Green
MY Galperin
N Lopez-Bigas
NJ Krogan
NS Baliga
NS Baliga
P Pagel
P Shannon
P Ternes
P Uetz
PM Bowers
PM Bowers
PM Bowers
R Bonneau
R Jothi
R Jothi
R Overbeek
RA Gutierrez
RA Gutierrez
Raja Jothi
RL Tatusov
SB Hedges
SF Altschul
SV Date
SV Date
T Dandekar
T Gaasterland
T Ito
T Sato
T Wang
T Yamada
Teresa M Przytycka
TF Deluca
TS Mikkelsen
U Stelzl
V Kunin
Y Kim
Y Kim
Y Ye
Y Zheng
Y Zhou
Z Su
ZI Johnson
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract

Background: A widely used approach for discovering functional and physical interactions among proteins involves phylogenetic profile comparisons (PPCs). Here, proteins with similar profiles are inferred to be functionally related under the assumption that proteins involved in the same metabolic pathway or cellular system are likely to have been co-inherited during evolution.

Results: Our experimentation with E. coli and yeast proteins, using 16 different carefully composed reference sets of genomes, revealed that the phyletic patterns of proteins in prokaryotes alone could be adequate to make reasonably accurate functional linkage predictions. A slight improvement in performance is observed on adding a few eukaryotes to the reference set, but a noticeable drop-off in performance is observed as the number of eukaryotes increases. Inclusion of most parasitic, pathogenic or vertebrate genomes and of multiple strains of the same species in the reference set does not necessarily contribute to improved sensitivity or accuracy. Interestingly, we also found that the evolutionary histories of individual pathways have a significant effect on the performance of the PPC approach with respect to a particular reference set. For example, to accurately predict functional links in carbohydrate or lipid metabolism, a reference set composed solely of prokaryotic (or bacterial) genomes performed among the best compared to one composed of genomes from all three super-kingdoms; this is in contrast to predicting functional links in translation, for which a reference set composed of prokaryotic (or bacterial) genomes performed the worst. We also demonstrate that the widely used random null model for quantifying the statistical significance of profile similarity is incomplete, which could result in an increased number of false positives.

Conclusion: Contrary to previous proposals, it is not merely the number of genomes but a careful selection of informative genomes in the reference set that influences the prediction accuracy of the PPC approach. We note that the predictive power of the PPC approach, especially in eukaryotes, is heavily influenced by the primary endosymbiosis and subsequent bacterial contributions. The over-representation of parasitic unicellular eukaryotes and vertebrates additionally makes eukaryotes less useful in the reference sets. Reference sets composed of a highly non-redundant set of genomes from all three super-kingdoms fare better with pathways showing considerable vertical inheritance and strong conservation (e.g. translation apparatus), while reference sets composed solely of prokaryotic genomes fare better for more variable pathways such as carbohydrate metabolism. The differential performance of the PPC approach on various pathways, and a weak positive correlation between functional and profile similarities, suggest that caution should be exercised when interpreting functional linkages inferred from genome-wide large-scale profile comparisons using a single reference set.
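A phylogenetic profile comparison can be sketched as follows: each protein gets a presence/absence vector over the reference genomes, and proteins with similar vectors are candidates for a functional link. The Jaccard coefficient used here is an assumption chosen for simplicity (the abstract does not name the similarity measure), and the protein names and profiles are invented toy data.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles of equal length."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


# Each position is one reference genome: 1 = homolog present, 0 = absent.
profiles = {
    "protA": [1, 1, 0, 1, 0, 0],
    "protB": [1, 1, 0, 1, 0, 1],
    "protC": [0, 0, 1, 0, 1, 0],
}

print(jaccard(profiles["protA"], profiles["protB"]))  # 0.75
print(jaccard(profiles["protA"], profiles["protC"]))  # 0.0
```

Changing which genomes make up the vector positions changes every score, which is why the choice of reference set matters so much in the results described above.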

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central