Search CORE

3,303 research outputs found

Integration of molecular network data reconstructs Gene Ontology.

Author: Gligorijević V
Janjić V
Pržulj N
Publication venue: 'Oxford University Press (OUP)'
Publication date: 22/08/2014
Field of study

Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data. Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in a matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of bakers’ yeasts protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GIs profiling. Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online

PubMed Central

Spiral - Imperial College Digital Repository

NET-GE: a novel NETwork-based Gene Enrichment for detecting biological processes associated to Mendelian diseases

Author: Casadio Rita
Di Lena Pietro
Fariselli Piero
Martelli Pier Luigi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015
Field of study

Enrichment analysis is a widely applied procedure for shedding light on the molecular mechanisms and functions at the basis of phenotypes, for enlarging the dataset of possibly related genes/proteins and for helping interpretation and prioritization of newly determined variations. Several standard and Network-based enrichment methods are available. Both approaches rely on the annotations that characterize the genes/proteins included in the input set; network based ones also include in different ways physical and functional relationships among different genes or proteins that can be extracted from the available biological networks of interactions

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Archivio istituzionale della ricerca - Università di Padova

Cluster-based assessment of protein-protein interaction confidence

Author: Grossmann A.
Herwig R.
Kamburov A.
Stelzl U.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/10/2012
Field of study

Background: Protein-protein interaction networks are key to a systems-level understanding of cellular biology. However, interaction data can contain a considerable fraction of false positives. Several methods have been proposed to assess the confidence of individual interactions. Most of them require the integration of additional data like protein expression and interaction homology information. While being certainly useful, such additional data are not always available and may introduce additional bias and ambiguity. Results: We propose a novel, network topology based interaction confidence assessment method called CAPPIC (cluster-based assessment of protein-protein interaction confidence). It exploits the network’s inherent modular architecture for assessing the confidence of individual interactions. Our method determines algorithmic parameters intrinsically and does not require any parameter input or reference sets for confidence scoring. Conclusions: On the basis of five yeast and two human physical interactome maps inferred using different techniques, we show that CAPPIC reliably assesses interaction confidence and its performance compares well to other approaches that are also based on network topology. The confidence score correlates with the agreement in localization and biological process annotations of interacting proteins. Moreover, it corroborates experimental evidence of physical interactions. Our method is not limited to physical interactome maps as we exemplify with a large yeast genetic interaction network. An implementation of CAPPIC is available at http://intscore.molgen.mpg.d

Springer - Publisher Connector

PubMed Central

MPG.PuRe

Data integration for the analysis of uncharacterized proteins in Mycobacterium tuberculosis

Author: Mazandu Gaston Kuzamunu
Publication venue: Institute of Infectious Disease and Molecular Medicine
Publication date: 01/01/2010
Field of study

Includes abstract.Includes bibliographical references (leaves 126-150).Mycobacterium tuberculosis is a bacterial pathogen that causes tuberculosis, a leading cause of human death worldwide from infectious diseases, especially in Africa. Despite enormous advances achieved in recent years in controlling the disease, tuberculosis remains a public health challenge. The contribution of existing drugs is of immense value, but the deadly synergy of the disease with Human Immunodeficiency Virus (HIV) or Acquired Immunodeficiency Syndrome (AIDS) and the emergence of drug resistant strains are threatening to compromise gains in tuberculosis control. In fact, the development of active tuberculosis is the outcome of the delicate balance between bacterial virulence and host resistance, which constitute two distinct and independent components. Significant progress has been made in understanding the evolution of the bacterial pathogen and its interaction with the host. The end point of these efforts is the identification of virulence factors and drug targets within the bacterium in order to develop new drugs and vaccines for the eradication of the disease

Cape Town University OpenUCT

Evaluation of GO-based functional similarity measures using S. cerevisiae protein interaction and expression profile data

Author: Du LinFang
Xu Tao
Zhou Yan
Publication venue: BioMed Central
Publication date: 01/11/2008
Field of study

Abstract Background Researchers interested in analysing the expression patterns of functionally related genes usually hope to improve the accuracy of their results beyond the boundaries of currently available experimental data. Gene ontology (GO) data provides a novel way to measure the functional relationship between gene products. Many approaches have been reported for calculating the similarities between two GO terms, known as semantic similarities. However, biologists are more interested in the relationship between gene products than in the scores linking the GO terms. To highlight the relationships among genes, recent studies have focused on functional similarities. Results In this study, we evaluated five functional similarity methods using both protein-protein interaction (PPI) and expression data of <it>S. cerevisiae</it>. The receiver operating characteristics (ROC) and correlation coefficient analysis of these methods showed that the maximum method outperformed the other methods. Statistical comparison of multiple- and single-term annotated proteins in biological process ontology indicated that genes with multiple GO terms may be more reliable for separating true positives from noise. Conclusion This study demonstrated the reliability of current approaches that elevate the similarity of GO terms to the similarity of proteins. Suggestions for further improvements in functional similarity analysis are also provided.</p

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A probabilistic framework to predict protein function from interaction data integrated with semantic knowledge

Author: A Ruepp
A Valencia
A Vazquez
Aidong Zhang
B Schwikowski
B-J Breitkreutz
DS Goldberg
E Nabieva
E Sprinzak
EM Marcotte
H Hishigaki
H Lee
HN Chua
HW Mewes
I Friedberg
International Human Genome Sequencing Consortium
JBL Bard
JR Parrish
JZ Wang
L Salwinski
Lei Shi
M Deng
M Kirac
M Pellegrini
MB Eisen
Murali Ramanathan
P Resnik
PW Lord
R Aebersold
R Overbeek
SF Altschul
The Gene Ontology Consortium
U Karaoz
WR Pearson
X Guo
X Wu
Y Tao
Y-R Cho
Young-Rae Cho
Publication venue: BioMed Central
Publication date: 01/01/2008
Field of study

Abstract Background The functional characterization of newly discovered proteins has been a challenge in the post-genomic era. Protein-protein interactions provide insights into the functional analysis because the function of unknown proteins can be postulated on the basis of their interaction evidence with known proteins. The protein-protein interaction data sets have been enriched by high-throughput experimental methods. However, the functional analysis using the interaction data has a limitation in accuracy because of the presence of the false positive data experimentally generated and the interactions that are a lack of functional linkage. Results Protein-protein interaction data can be integrated with the functional knowledge existing in the Gene Ontology (GO) database. We apply similarity measures to assess the functional similarity between interacting proteins. We present a probabilistic framework for predicting functions of unknown proteins based on the functional similarity. We use the leave-one-out cross validation to compare the performance. The experimental results demonstrate that our algorithm performs better than other competing methods in terms of prediction accuracy. In particular, it handles the high false positive rates of current interaction data well. Conclusion The experimentally determined protein-protein interactions are erroneous to uncover the functional associations among proteins. The performance of function prediction for uncharacterized proteins can be enhanced by the integration of multiple data sources available.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Novel semantic similarity measure improves an integrative approach to predicting gene functional associations

Author
Publication venue: BioMed Central
Publication date: 14/03/2013
Field of study

Springer - Publisher Connector

An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology

Author: A del Pozo
A Hofer
A Patil
A Schlicker
AC Gavin
AJ Faller
C Pesquita
C Pesquita
C Pesquita
C Pesquita
CM Deane
D Li
D Lin
D Warde-Farley
DP Eisinger
DR Rhodes
F Azuaje
FM Couto
G Alterovitz
G van Rossum
G Yu
Gary D Bader
H Wu
H Yu
H Yu
I Xenarios
J Cheng
JJ Jiang
JL Sevilla
JZ Wang
K Strasser
K Xia
LJ Jensen
M Ashburner
M West
N Nariai
NJ Krogan
P Resnik
P Uetz
P Zhang
R Gentleman
R Shen
S Chavez
S Razick
Shobhit Jain
T Xu
U Stelzl
X Guo
Y Chen
Y Tao
Z Lei
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background Semantic similarity measures are useful to assess the physiological relevance of protein-protein interactions (PPIs). They quantify similarity between proteins based on their function using annotation systems like the Gene Ontology (GO). Proteins that interact in the cell are likely to be in similar locations or involved in similar biological processes compared to proteins that do not interact. Thus the more semantically similar the gene function annotations are among the interacting proteins, more likely the interaction is physiologically relevant. However, most semantic similarity measures used for PPI confidence assessment do not consider the unequal depth of term hierarchies in different classes of cellular location, molecular function, and biological process ontologies of GO and thus may over-or under-estimate similarity. Results We describe an improved algorithm, Topological Clustering Semantic Similarity (TCSS), to compute semantic similarity between GO terms annotated to proteins in interaction datasets. Our algorithm, considers unequal depth of biological knowledge representation in different branches of the GO graph. The central idea is to divide the GO graph into sub-graphs and score PPIs higher if participating proteins belong to the same sub-graph as compared to if they belong to different sub-graphs. Conclusions The TCSS algorithm performs better than other semantic similarity measurement techniques that we evaluated in terms of their performance on distinguishing true from false protein interactions, and correlation with gene expression and protein families. We show an average improvement of 4.6 times the <it>F</it>1 score over Resnik, the next best method, on our <it>Saccharomyces cerevisiae </it>PPI dataset and 2 times on our <it>Homo sapiens </it>PPI dataset using cellular component, biological process and molecular function GO annotations.</p

University of Toronto Research Repository

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central