19,047 research outputs found

Combining multiple positive training sets to generate confidence scores for protein–protein interactions

Author: Asthana
Bader
Beuming
Bowers
Chatr-aryamontri
Crosby
Deane
Deng
Ewing
Formstecher
Gavin
Giot
Guldener
Ideker
Jingkai Yu
Kanehisa
Kelley
Kerrien
Krogan
Lee
Mishra
Mrowka
Myers
O'Brien
Pacifico
Parrish
Qi
Russell L. Finley
Scott
Sharan
Sprinzak
Stanyon
Stark
Stein
Stelzl
Suthram
Tarassov
The Gene Ontology Consortium
Titz
Tong
Uetz
Vastrik
von Mering
Wong
Yamanishi
Yu
Publication venue: Oxford University Press

Motivation: High-throughput experimental and computational methods are generating a wealth of protein–protein interaction data for a variety of organisms. However, data produced by current state-of-the-art methods include many false positives, which can hinder the analyses needed to derive biological insights. One way to address this problem is to assign confidence scores that reflect the reliability and biological significance of each interaction. Most previously described scoring methods use a set of likely true positives to train a model to score all interactions in a dataset. A single positive training set, however, may be biased and not representative of true interaction space

Crossref

PubMed Central

PTOMSM: A modified version of Topological Overlap Measure used for predicting Protein-Protein Interaction Network

Author: Xun Huang
Publication date: 25/12/2009

A variety of methods have been developed to integrate diverse biological data to predict novel interactions between proteins. However, traditional integration can only generate protein interaction pairs within existing relationships. Therefore, we propose a modified version of the Topological Overlap Measure to identify not only existing direct PPI links, but also novel protein interactions that can be inferred indirectly from various relationships between proteins. Our method is more powerful than a naïve Bayesian-network-based integration in PPI prediction, and could generate more reliable candidate PPIs. Furthermore, we examined the influence of the sizes of training and test datasets on prediction, and further demonstrated the effectiveness of PTOMSM in predicting PPIs. More importantly, this method can be extended naturally to predict other types of biological networks, and may be combined with Bayesian methods to further improve the prediction.
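As a point of reference for the abstract above, here is a minimal sketch of the standard, unmodified topological overlap measure on an unweighted network. It is not the PTOMSM modification, whose details the abstract does not give. The function name `topological_overlap` and the four-protein toy network are illustrative assumptions. For nodes i and j with adjacency a_ij and degrees k_i and k_j, the score is (shared neighbours + a_ij) / (min(k_i, k_j) + 1 - a_ij).

```python
def topological_overlap(adj):
    """Standard topological overlap for a symmetric 0/1 adjacency matrix.

    `adj` is a list of lists with a zero diagonal. This is the textbook
    measure, not the modified version proposed in the paper.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                tom[i][j] = 1.0  # a node fully overlaps with itself
                continue
            # Number of neighbours shared by i and j (excluding i and j).
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            # The denominator is at least 1, so no division by zero.
            denom = min(degree[i], degree[j]) + 1 - adj[i][j]
            tom[i][j] = (shared + adj[i][j]) / denom
    return tom


# Hypothetical toy network: proteins A, B, C, D with edges A-C, A-D, B-C, B-D.
toy_adj = [
    [0, 0, 1, 1],  # A
    [0, 0, 1, 1],  # B
    [1, 1, 0, 0],  # C
    [1, 1, 0, 0],  # D
]
toy_tom = topological_overlap(toy_adj)
```

In the toy network, A and B are not directly linked but share both of their neighbours, so their overlap is high. This is the kind of indirect evidence the abstract refers to.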

Crossref

Nature Precedings

Automated data integration for developmental biological research

Author: Sternberg Paul W.
Zhong Weiwei
Publication venue: The Company of Biologists
Publication date: 15/09/2007

In an era exploding with genome-scale data, a major challenge for developmental biologists is how to extract significant clues from these publicly available data to benefit our studies of individual genes, and how to use them to improve our understanding of development at a systems level. Several studies have successfully demonstrated new approaches to classic developmental questions by computationally integrating various genome-wide data sets. Such computational approaches have shown great potential for facilitating research: instead of testing 20,000 genes, researchers might test 200 to the same effect. We discuss the nature and state of this art as it applies to developmental research

Caltech Authors

Adapting a relation extraction pipeline for the BioCreAtIvE II task

Author: Grover Claire
Haddow Barry
Klein Ewan
Matthews Michael
Nielsen Leif Arda
Tobin Richard
Wang Xinglong
Publication date: 01/01/2007

Edinburgh Research Explorer

DroID: the Drosophila Interactions Database, a comprehensive resource for annotated gene and protein interactions

Author: AC Gavin
AC Gingras
B Deplancke
B Lehner
BA Shoemaker
C Boone
C von Mering
CA Stanyon
CY Lin
E Formstecher
Guozhen Liu
H Yu
I Lee
I Vastrik
Jingkai Yu
JR Parrish
JS Bader
L Giot
M Deng
M Persico
M Vidal
MN Arbeitman
P Shannon
P Tomancak
Russell L Finley
S Fields
S Mathivanan
S Mukherjee
S Pacifico
S Suthram
SD Hooper
Svetlana Pacifico
T Beuming
T Reguly
T Sandmann
TI Lee
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Charting the interactions among genes and among their protein products is essential for understanding biological systems. A flood of interaction data is emerging from high-throughput technologies, computational approaches, and literature mining methods. Quick and efficient access to this data has become a critical issue for biologists. Several excellent multi-organism databases for gene and protein interactions are available, yet most of these have understandable difficulty maintaining comprehensive information for any one organism. No single database, for example, includes all available interactions, integrated gene expression data, and comprehensive and searchable gene information for the important model organism, Drosophila melanogaster.
Description: DroID, the Drosophila Interactions Database, is a comprehensive interactions database designed specifically for Drosophila. DroID houses published physical protein interactions, genetic interactions, and computationally predicted interactions, including interologs based on data for other model organisms and humans. All interactions are annotated with original experimental data and source information. DroID can be searched and filtered based on interaction information or a comprehensive set of gene attributes from FlyBase. DroID also contains gene expression and expression correlation data that can be searched and used to filter datasets, for example, to focus a study on sub-networks of co-expressed genes. To address the inherent noise in interaction data, DroID employs an updatable confidence scoring system that assigns a score to each physical interaction based on the likelihood that it represents a biologically significant link.
Conclusion: DroID is the most comprehensive interactions database available for Drosophila. To facilitate downstream analyses, interactions are annotated with original experimental information, gene expression data, and confidence scores. All data in DroID are freely available and can be searched, explored, and downloaded through three different interfaces, including a text-based web site, a Java applet with dynamic graphing capabilities (IM Browser), and a Cytoscape plug-in. DroID is available at http://www.droidb.org.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Commons@Wayne State University

Classifying pairs with trees for supervised biological network inference

Author: Babu M. Madan
Geurts Pierre
Schrynemackers Marie
Wehenkel Louis
Publication date: 24/04/2014

Networks are ubiquitous in biology, and computational approaches have been widely investigated for their inference. In particular, supervised machine learning methods can be used to complete a partially known network by integrating various measurements. Two main supervised frameworks have been proposed: the local approach, which trains a separate model for each network node, and the global approach, which trains a single model over pairs of nodes. Here, we systematically investigate, theoretically and empirically, the exploitation of tree-based ensemble methods in the context of these two approaches for biological network inference. We first formalize the problem of network inference as classification of pairs, unifying in the process homogeneous and bipartite graphs and discussing two main sampling schemes. We then present the global and the local approaches, extending the latter for the prediction of interactions between two unseen network nodes, and discuss their specializations to tree-based ensemble methods, highlighting their interpretability and drawing links with clustering techniques. Extensive computational experiments are carried out with these methods on various biological networks that clearly highlight that these methods are competitive with existing methods. Comment: 22 pages
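The difference between the two frameworks named in this abstract lies in how the training data are laid out, which the sketch below tries to make concrete. It only builds the training sets and fits no model. The function names, the toy feature vectors, and the choice to concatenate the two nodes' features for a pair are all assumptions for illustration, not details taken from the paper.

```python
from itertools import combinations


def global_training_set(features, known_edges, nodes):
    """Global approach: a single dataset whose rows are node pairs."""
    rows = []
    for u, v in combinations(sorted(nodes), 2):
        # Pair representation by concatenation (an assumption; note that it
        # depends on the order of u and v).
        x = features[u] + features[v]
        y = 1 if frozenset((u, v)) in known_edges else 0
        rows.append((x, y))
    return rows


def local_training_sets(features, known_edges, nodes):
    """Local approach: one dataset per node; rows are the other nodes."""
    datasets = {}
    for u in sorted(nodes):
        datasets[u] = [
            (features[v], 1 if frozenset((u, v)) in known_edges else 0)
            for v in sorted(nodes)
            if v != u
        ]
    return datasets


# Hypothetical toy data: three nodes with one feature each, one known edge.
toy_features = {"a": [1.0], "b": [2.0], "c": [3.0]}
toy_edges = {frozenset(("a", "b"))}

global_rows = global_training_set(toy_features, toy_edges, toy_features)
local_sets = local_training_sets(toy_features, toy_edges, toy_features)
```

With this layout, the global approach would train one classifier on `global_rows`, while the local approach would train one classifier per key of `local_sets`.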

arXiv.org e-Print Archive

CiteSeerX