An integrative approach for measuring semantic similarities using gene ontology

Author: Hongxiang Li
Jiajie Peng
Jin Chen
Qinghua Jiang
Yadong Wang
Publication venue: Springer Nature
Publication date: 01/01/2014

GORouter: an RDF model for providing semantic query and inference services for Gene Ontology and its associations

Author: Li Yixue
Lu Qiang
Luo Qingming
Shi Yixiang
Xu Qingwei
Zhang Guoqing
Publication venue: BioMed Central
Publication date: 01/01/2008

Qucosa - Publikationsserver der Universität Leipzig

Instance-Based Matching of Large Life Science Ontologies

Author: Kirsten Toralf
Rahm Erhard
Thor Andreas
Publication date: 06/02/2019

Ontologies are heavily used in the life sciences, so there is increasing value in matching different ontologies to determine related conceptual categories. We propose a simple yet powerful methodology for instance-based ontology matching which utilizes the associations between molecular-biological objects and ontologies. The approach can build on many existing ontology associations for instance objects such as sequences and proteins, and thus makes heavy use of available domain knowledge. Furthermore, the approach is flexible and extensible, since each instance source with associations to the ontologies of interest can contribute to the ontology mapping. We study several approaches to determine the instance-based similarity of ontology categories. We perform an extensive experimental evaluation using protein associations for different species to match between subontologies of the Gene Ontology and OMIM. We also provide a comparison with metadata-based ontology matching.
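The instance-based idea described in this abstract can be illustrated with a minimal sketch. It assumes that each ontology category is associated with a set of instance identifiers (for example, proteins) and scores category pairs by the Dice coefficient of their shared instances. The Dice measure, the threshold, the function names, and the toy identifiers are all illustrative assumptions; the abstract does not state which similarity measures the authors evaluated.

```python
from itertools import product


def dice(a, b):
    """Dice coefficient of two instance sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def match_categories(assoc1, assoc2, threshold=0.5):
    """Compare every category pair across two ontologies.

    assoc1 / assoc2 map a category id to the set of instance ids
    associated with it. Pairs scoring at least `threshold` are kept.
    """
    matches = []
    for (c1, inst1), (c2, inst2) in product(assoc1.items(), assoc2.items()):
        score = dice(inst1, inst2)
        if score >= threshold:
            matches.append((c1, c2, score))
    # Highest-scoring correspondences first
    return sorted(matches, key=lambda m: -m[2])


# Toy data: category and protein ids are made up for illustration.
ontology_1 = {"GO:A": {"P1", "P2", "P3"}, "GO:B": {"P4"}}
ontology_2 = {"GO:X": {"P1", "P2"}, "GO:Y": {"P4", "P5"}}
result = match_categories(ontology_1, ontology_2)
```

Because the score depends only on shared instances, any additional instance source with associations to both ontologies could be merged into the input sets without changing the matching code.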

Recommended from our members

Integrating Ontological Knowledge and Textual Evidence in Estimating Gene and Gene Product Similarity

Author: Gopalan Banu
Gregory Michelle L.
Posse Christian
Sanfilippo Antonio P.
Tratz Stephen C.
Publication venue: Pacific Northwest National Laboratory (U.S.)
Publication date: 08/06/2006

With the rising influence of the Gene Ontology, new approaches have emerged where the similarity between genes or gene products is obtained by comparing Gene Ontology code annotations associated with them. So far, these approaches have solely relied on the knowledge encoded in the Gene Ontology and the gene annotations associated with the Gene Ontology database. The goal of this paper is to demonstrate that improvements to these approaches can be obtained by integrating textual evidence extracted from relevant biomedical literature.

UNT Digital Library

Gene analogue finder: a GRID solution for finding functionally analogous gene products

Author: Donvito Giacinto
Gisel Andreas
Licciulli Flavio
Maggi Giorgio
Tulipano Angelica
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: To date, more than 2.1 million gene products from more than 100,000 different species have been described, specifying their function, the processes they are involved in and their cellular localization, using a well-defined and structured vocabulary, the Gene Ontology (GO). Such vast, well-defined knowledge opens the possibility of comparing gene products at the level of functionality, finding gene products which have a similar function or are involved in similar biological processes without relying on the conventional sequence similarity approach. Comparisons within such a large space of knowledge are highly data and computing intensive. For this reason this project was based upon the use of the computational GRID, a technology offering large computing and storage resources. RESULTS: We have developed a tool, GENe AnaloGue FINdEr (ENGINE), that parallelizes the search process and distributes the calculation and data over the computational GRID, splitting the process into many sub-processes and joining the calculation and the data on the same machine, thereby completing the whole search in about 3 days instead of occupying one single machine for more than 5 CPU years. The results of the functional comparison contain potential functional analogues for more than 79,000 gene products from the most important species. 46% of the analyzed gene products are described well enough for such an analysis to identify functional analogues, such as well-known members of the same gene family, or gene products with similar functions which would never have been associated by standard methods. CONCLUSION: ENGINE has produced a list of potential functionally analogous relations between gene products within and between species using, in place of the sequence, the gene description of the GO, thus demonstrating the potential of the GO. However, the current limiting factor is the quality of the associations of many gene products from non-model organisms, which often have only electronic associations since experimental information is missing. With future improvements of the GO, this limit will be reduced. ENGINE will manifest its power when it is applied to the whole GODB of more than 2.1 million gene products from more than 100,000 organisms. The data produced by this search is planned to be available as a supplement to the GO database as soon as we are able to provide regular updates.

A transversal approach to predict gene product networks from ontology-based similarity

Author: A Budanitsky
A Schlicker
A Singhal
Anita Burgun
C Wolting
D Lin
DS Harris
E Agirre
E Camon
E Levy
EB Camon
F Azuaje
FD Gibbons
FJ Field
G Rigau
G Salton
GO Consortium
H Bedrine-Ferran
H Sun
H Wang
IG Wool
J Chabalier
J Jiang
Jean Mosser
JH Chiang
JM Mariadason
Julie Chabalier
M Gerstein
M Kanehisa
MB Eisen
MD Weiss
ME Brosnan
O Bodenreider
P Joseph
P Khatri
P Resnik
PW Lord
R Baeza-Yates
R Rada
RC Gentleman
T Barrett
T Nakajima
T Yamamoto
TK Jenssen
X Mao
Y Quentin
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Interpretation of transcriptomic data is usually made through a "standard" approach which consists in clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to highlight functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and expression data. RESULTS: The transversal approach presented in this paper consists in computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks that proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. CONCLUSION: Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity with expression data.
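The Vector Space Model step described in this abstract can be sketched as follows. The sketch represents each gene product as a vector of weighted GO terms and compares vectors with cosine similarity. The IDF-style weight (rarer terms count more) is an assumption standing in for the unspecified "weighting scheme over the annotations"; the function names and the toy gene and term identifiers are hypothetical.

```python
import math
from collections import Counter


def idf_weights(annotations):
    """Assumed weighting: terms annotating fewer gene products weigh more."""
    n = len(annotations)
    df = Counter(t for terms in annotations.values() for t in set(terms))
    return {t: math.log(n / d) + 1.0 for t, d in df.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(annotations):
    """Return {(gene_a, gene_b): similarity} for all gene product pairs."""
    weights = idf_weights(annotations)
    vectors = {g: {t: weights[t] for t in set(terms)}
               for g, terms in annotations.items()}
    genes = sorted(vectors)
    return {(a, b): cosine(vectors[a], vectors[b]) for a in genes for b in genes}


# Toy annotations: gene and term ids are made up for illustration.
annotations = {
    "geneA": ["T1", "T2"],
    "geneB": ["T1", "T2"],
    "geneC": ["T3"],
}
sim = similarity_matrix(annotations)
```

Thresholding the resulting matrix would give candidate edges for a gene product network, which is roughly the role the similarity matrix plays before it is combined with expression data.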

A literature-based similarity metric for biological processes

Author: A Hyvarinen
A Tanay
AA Petti
AB Maxfield
AG Fraser
AH Tong
Alberto Pascual-Montano
CD Powell
Concha Gil
D Chaussabel
D Lin
D Martin
DD Lee
DE Levin
DM Blei
E Ravasz
EA Adie
G Weeks
H Shatkay
HS Carr
J Tuikkala
Jose M Carazo
L Giot
LH Hartwell
M Ashburner
M Chagoyen
M Vidal
MF Porter
Monica Chagoyen
NJ Krogan
O Bodenreider
P Glenisson
P Khatri
P Pehkonen
P Resnik
Pedro Carmona-Saez
PV Ogren
PW Lord
R Homayouni
RB Cattell
S Deerwester
S Myhre
T Hofmann
T Sekito
T Yu
U Alon
VL Boyartchuk
X Wu
Z Bar-Joseph
ZN Oltvai
Publication venue: BioMed Central
Publication date: 01/07/2006

BACKGROUND: Recent analyses in systems biology pursue the discovery of functional modules within the cell. Recognition of such modules requires the integrative analysis of genome-wide experimental data together with available functional schemes. To this end, methods are required to bridge the gap between the abstract definitions of cellular processes in current schemes and the interlinked nature of biological networks. RESULTS: This work explores the use of the scientific literature to establish potential relationships among cellular processes. To this end we have used a document-based similarity method to compute pair-wise similarities of the biological processes described in the Gene Ontology (GO). The method has been applied to the biological processes annotated for the Saccharomyces cerevisiae genome. We compared our results with similarities obtained with two ontology-based metrics, as well as with gene product annotation relationships. We show that the literature-based metric conserves most direct ontological relationships, while revealing biologically sound similarities that are not obtained using ontology-based metrics and/or genome annotation. CONCLUSION: The scientific literature is a valuable source of information from which to compute similarities among biological processes. The associations discovered by literature analysis are a valuable complement to those encoded in existing functional schemes, and those that arise by genome annotation. These similarities can be used to conveniently map the interlinked structure of cellular processes in a particular organism.

Digital.CSIC

Combining evidence, biomedical literature and statistical dependence: new insights for functional annotation of gene sets

Author: Aubry Marc
Burgun Anita
Chicault Celine
de Tayrac Marie
Galibert Marie-Dominique
Monnier Annabelle
Mosser Jean
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Large-scale genomic studies based on transcriptome technologies provide clusters of genes that need to be functionally annotated. The Gene Ontology (GO) implements a controlled vocabulary organised into three hierarchies: cellular components, molecular functions and biological processes. This terminology allows a coherent and consistent description of the knowledge about gene functions. The GO terms related to genes come primarily from semi-automatic annotations made by trained biologists (annotation based on evidence) or text-mining of the published scientific literature (literature profiling). RESULTS: We report an original functional annotation method based on a combination of evidence and literature that overcomes the weaknesses and the limitations of each approach. It relies on the Gene Ontology Annotation database (GOA Human) and the PubGene biomedical literature index. We support these annotations with statistically associated GO terms and retrieve associative relations across the three GO hierarchies to emphasise the major pathways in which a gene cluster is involved. Both annotation methods and associative relations were quantitatively evaluated with a reference set of 7397 genes and a multi-cluster study of 14 clusters. We also validated the biological appropriateness of our hybrid method with the annotation of a single gene (cdc2) and that of a down-regulated cluster of 37 genes identified by a transcriptome study of an in vitro enterocyte differentiation model (CaCo-2 cells). CONCLUSION: The combination of both approaches is more informative than either approach alone: literature mining can enrich an annotation based only on evidence. Text-mining of the literature can also find valuable associated MEDLINE references that confirm the relevance of the annotation. Finally, GO term networks can be built with associative relations in order to highlight cooperative and competitive pathways and their connected molecular functions.