
1,434 research outputs found

Measuring semantic similarities by combining gene ontology annotations and gene co-function networks

Author: A Franceschini
A Schlicker
AD Gordon
B Szappanos
C Brun
C Pesquita
CA Joslyn
D Binns
DB Allison
FP Guengerich
HY Yu
I Lee
J Jin
J O’Madadhain
J Wang
Jiajie Peng
Jin Chen
JL Chen
JL Riechmann
JM Cherry
JZ Wang
K Verspoor
L Chae
L Hubert
M Ashburner
M Mizutani
MZ Zhu
P Kemmeren
P Lamesch
P Pagel
P Resnik
P Romero
P Zhang
PD Karp
PF Zhang
R Caspi
R Mani
S Falcon
S Romano
Sahra Uygun
Seung Y Rhee
SF Altschul
SY Rhee
T Hawkins
Taehyong Kim
X Wu
Yadong Wang
Z Teng
Publication venue: Springer Science and Business Media LLC

Ligand Similarity Complements Sequence, Physical Interaction, and Co-Expression for Gene Function Prediction

Author: Ballouz S.
Gillis J.
O'Meara M. J.
Shoichet B. K.
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2016

The expansion of protein-ligand annotation databases has enabled large-scale networking of proteins by ligand similarity. These ligand-based protein networks, which implicitly predict the ability of neighboring proteins to bind related ligands, may complement biologically oriented gene networks, which are used to predict functional or disease relevance. To quantify the degree to which such ligand-based protein associations might complement functional genomic associations, including sequence similarity, physical protein-protein interactions, co-expression, and disease gene annotations, we calculated a network based on the Similarity Ensemble Approach (SEA: sea.docking.org), where protein neighbors reflect the similarity of their ligands. We also measured the similarity with functional genomic networks over a common set of 1,131 genes, and found that the networks had only small overlaps, which were significant only due to the large scale of the data. Consistent with the view that the networks contain different information, combining them substantially improved Molecular Function prediction within GO (from AUROC~0.63-0.75 for the individual data modalities to AUROC~0.8 in the aggregate). We investigated the boost in guilt-by-association gene function prediction when the networks are combined and describe underlying properties that can be further exploited.
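As a rough illustration of the guilt-by-association idea in this abstract, the sketch below scores genes by neighbor voting on single and combined networks and compares the results by AUROC. It is not the authors' pipeline: the gene names, edge weights, the simple weight-summing aggregation, and the function names `combine_networks`, `neighbor_vote`, and `auroc` are all assumptions made for the example.

```python
from collections import defaultdict

def combine_networks(*networks):
    """Sum edge weights across networks given as {(gene_a, gene_b): weight}."""
    combined = defaultdict(float)
    for net in networks:
        for (a, b), w in net.items():
            combined[tuple(sorted((a, b)))] += w  # undirected edge key
    return dict(combined)

def neighbor_vote(network, genes, positives):
    """Score each gene by the weighted fraction of its neighbors that are annotated.

    A gene's own label is never used (no self-loops are assumed), so the
    scores behave like a leave-one-out prediction.
    """
    to_pos = defaultdict(float)
    total = defaultdict(float)
    for (a, b), w in network.items():
        for gene, neighbor in ((a, b), (b, a)):
            total[gene] += w
            if neighbor in positives:
                to_pos[gene] += w
    return {g: (to_pos[g] / total[g] if total[g] else 0.0) for g in genes}

def auroc(scores, positives):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: six genes, three of which carry the GO term of interest.
GENES = ["A", "B", "C", "D", "E", "F"]
POSITIVES = {"A", "B", "C"}
COEXPRESSION = {("A", "B"): 1.0, ("C", "D"): 1.0, ("E", "F"): 1.0}
LIGAND = {("B", "C"): 1.0, ("D", "E"): 1.0, ("A", "B"): 0.5}

combined = combine_networks(COEXPRESSION, LIGAND)
print("co-expression only:", auroc(neighbor_vote(COEXPRESSION, GENES, POSITIVES), POSITIVES))
print("combined:", auroc(neighbor_vote(combined, GENES, POSITIVES), POSITIVES))
```

Summing weights is only one way to aggregate networks; the example uses it because it needs no extra parameters, not because the source prescribes it.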

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

UNSWorks

FigShare

Exploiting biomedical web resources: a case study

Author: DESSI NICOLETTA
PES BARBARA
Publication venue: Elsevier BV
Publication date: 01/01/2016

An increasing number of web resources are used extensively by healthcare operators to obtain more accurate diagnostic results. In particular, health care is reaping the benefits of technological advances in genomics to meet the demand for genetic tests that allow a better understanding of diagnostic results. Within this context, Gene Ontology (GO) is a popular and effective means of extracting knowledge from a list of genes and evaluating their semantic similarity. This paper investigates the potential and the limits of GO as support for capturing information about a set of genes that are supposed to play a significant role in a pathological condition. In particular, we present a case study that exploits some biomedical web resources to devise several groups of functionally coherent genes, and we report experiments evaluating their semantic similarity over GO. Owing to the structure and content of GO, the results reveal limitations that do not affect the evaluation of semantic similarity when genes exhibit simple correlations but do influence the estimation of the relatedness of genes belonging to complex organizations.

Elsevier - Publisher Connector

Archivio istituzionale della ricerca - Università di Cagliari

Inferring gene ontologies from pairwise similarity data.

Author: Bafna Vineet
Dutkowski Janusz
Ideker Trey
Kramer Michael
Yu Michael
Publication venue: eScholarship, University of California
Publication date: 01/06/2014

Motivation: While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recognizing that ontologies are a directed acyclic graph (DAG) of terms and hierarchical relations, algorithms are needed that: analyze a full matrix of gene-gene pairwise similarities from -omics data; infer true hierarchical structure in these data rather than enforcing hierarchy as a computational artifact; and respect biological pleiotropy, by which a term in the hierarchy can relate to multiple higher level terms. Methods addressing these requirements are just beginning to emerge; none has been evaluated for GO inference. Methods: We consider two algorithms [Clique Extracted Ontology (CliXO), LocalFitness] that uniquely satisfy these requirements, compared with methods including standard clustering. CliXO is a new approach that finds maximal cliques in a network induced by progressive thresholding of a similarity matrix. We evaluate each method's ability to reconstruct the GO biological process ontology from a similarity matrix based on (a) semantic similarities for GO itself or (b) three -omics datasets for yeast. Results: For task (a) using semantic similarity, CliXO accurately reconstructs GO (>99% precision, recall) and outperforms other approaches (<20% precision, <20% recall). For task (b) using -omics data, CliXO outperforms other methods using two -omics datasets and achieves ∼30% precision and recall using YeastNet v3, similar to an earlier approach (Network Extracted Ontology) and better than LocalFitness or standard clustering (20-25% precision, recall). Conclusion: This study provides algorithmic foundation for building gene ontologies by capturing hierarchical and pleiotropic structure embedded in biomolecular data.
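The phrase "maximal cliques in a network induced by progressive thresholding of a similarity matrix" can be made concrete with a small sketch. This is not the published CliXO algorithm, which the abstract does not specify in detail; it only shows the two ingredients named there, using a plain Bron-Kerbosch enumeration. The gene names, similarity values, thresholds, and the function names `maximal_cliques` and `cliques_by_threshold` are hypothetical.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting).

    adj maps each node to the set of its neighbors.
    """
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found

def cliques_by_threshold(sim, thresholds):
    """Lower the similarity threshold step by step and report newly formed cliques.

    sim maps (gene_a, gene_b) to a similarity score. For each threshold, from
    strictest to loosest, return the maximal cliques of two or more genes that
    did not already appear at a stricter threshold.
    """
    genes = sorted({g for pair in sim for g in pair})
    seen = set()
    levels = []
    for t in sorted(thresholds, reverse=True):
        adj = {g: set() for g in genes}
        for (a, b), s in sim.items():
            if s >= t:
                adj[a].add(b)
                adj[b].add(a)
        new = [c for c in maximal_cliques(adj) if len(c) > 1 and c not in seen]
        seen.update(new)
        levels.append((t, new))
    return levels

# Hypothetical pairwise similarities among four genes.
SIM = {
    ("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.85,
    ("c", "d"): 0.5, ("a", "d"): 0.2, ("b", "d"): 0.2,
}
levels = cliques_by_threshold(SIM, [0.8, 0.5])
for threshold, cliques in levels:
    print(threshold, [sorted(c) for c in cliques])
```

In this toy input, gene `c` ends up in one clique at the strict threshold and in a different one at the loose threshold, which is the kind of multiple membership the abstract refers to as pleiotropy.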

PubMed Central

eScholarship - University of California

Identifying disease genes using machine learning and gene functional similarities, assessed through Gene Ontology

Author: Asif M.
Couto F.M.
Martiniano H.F.M.C.M.
Vicente A.M.
Publication venue: Public Library of Science (PLoS)
Publication date: 10/12/2018

Identifying disease genes from a vast amount of genetic data is one of the most challenging tasks in the post-genomic era. Moreover, complex diseases present highly heterogeneous genotypes, which makes biological marker identification difficult. Machine learning methods are widely used to identify these markers, but their performance is highly dependent upon the size and quality of available data. In this study, we demonstrated that machine learning classifiers trained on gene functional similarities, using Gene Ontology (GO), can improve the identification of genes involved in complex diseases. For this purpose, we developed a supervised machine learning methodology to predict complex disease genes. The proposed pipeline was assessed using Autism Spectrum Disorder (ASD) candidate genes. A quantitative measure of gene functional similarities was obtained by employing different semantic similarity measures. To infer the hidden functional similarities between ASD genes, various types of machine learning classifiers were built on quantitative semantic similarity matrices of ASD and non-ASD genes. The classifiers trained and tested on ASD and non-ASD gene functional similarities outperformed previously reported ASD classifiers. For example, a Random Forest (RF) classifier achieved an AUC of 0.80 for predicting new ASD genes, which was higher than the reported classifier (0.73). Additionally, this classifier was able to predict 73 novel ASD candidate genes that were enriched for core ASD phenotypes, such as autism and obsessive-compulsive behavior. In addition, predicted genes were also enriched for ASD co-occurring conditions, including Attention Deficit Hyperactivity Disorder (ADHD). We also developed a KNIME workflow with the proposed methodology which allows users to configure and execute it without requiring machine learning and programming skills.
Machine learning is an effective and reliable technique for deciphering ASD mechanisms by identifying novel disease genes, and this study further demonstrated that its performance can be improved by incorporating a quantitative measure of gene functional similarities. Source code and the workflow of the proposed methodology are available at https://github.com/Muh-Asif/ASD-genes-prediction. This work was supported by the Portuguese Fundação para a Ciência e Tecnologia (SFRH/BD/52485/2014 to MA and DeST: Deep Semantic Tagger PTDC/CCI-BIO/28685/2017).

Repositório Científico do Instituto Nacional de Saúde

Novel semantic similarity measure improves an integrative approach to predicting gene functional associations

Publication venue: BioMed Central
Publication date: 14/03/2013

Springer - Publisher Connector

Finding disease similarity based on implicit semantic similarity

Author: Dinakarpandian Deendayal
Mathur Sachin
Publication venue: Elsevier Inc.
Publication date: 30/04/2012

Genomics has contributed to a growing collection of gene–function and gene–disease annotations that can be exploited by informatics to study similarity between diseases. This can yield insight into disease etiology, reveal common pathophysiology and/or suggest treatment that can be appropriated from one disease to another. Estimating disease similarity solely on the basis of shared genes can be misleading as variable combinations of genes may be associated with similar diseases, especially for complex diseases. This deficiency can be potentially overcome by looking for common biological processes rather than only explicit gene matches between diseases. The use of semantic similarity between biological processes to estimate disease similarity could enhance the identification and characterization of disease similarity. We present functions to measure similarity between terms in an ontology, and between entities annotated with terms drawn from the ontology, based on both co-occurrence and information content. The similarity measure is shown to outperform other measures used to detect similarity. A manually curated dataset with known disease similarities was used as a benchmark to compare the estimation of disease similarity based on gene-based and Gene Ontology (GO) process-based comparisons. The detection of disease similarity based on semantic similarity between GO Processes (Recall=55%, Precision=60%) performed better than using exact matches between GO Processes (Recall=29%, Precision=58%) or gene overlap (Recall=88% and Precision=16%). The GO-Process based disease similarity scores on an external test set show statistically significant Pearson correlation (0.73) with numeric scores provided by medical residents. GO-Processes associated with similar diseases were found to be significantly regulated in gene expression microarray datasets of related diseases.
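To make "similarity between terms based on information content" and "similarity between entities annotated with terms" concrete, the sketch below uses the well-known Resnik measure and a best-match average. These are stand-ins chosen for illustration; the abstract does not define the paper's own measure, so none of this should be read as its method. The toy ontology, the annotation counts, and all function names are hypothetical.

```python
import math

# Hypothetical toy ontology: each term maps to the set of its parents.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid metabolism": {"metabolism"},
    "glucose metabolism": {"metabolism"},
}

# Hypothetical number of entities annotated directly to each term.
ANNOTATIONS = {
    "root": 0,
    "metabolism": 2,
    "signaling": 4,
    "lipid metabolism": 2,
    "glucose metabolism": 2,
}

def ancestors(term):
    """Return the term together with all of its ancestors."""
    result = {term}
    for parent in PARENTS[term]:
        result |= ancestors(parent)
    return result

def information_content(term):
    """IC(t) = -log p(t), where p(t) counts annotations to t and its descendants."""
    total = sum(ANNOTATIONS.values())
    freq = sum(n for t, n in ANNOTATIONS.items() if term in ancestors(t))
    return -math.log(freq / total)

def resnik(term_a, term_b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)

def best_match_average(terms_a, terms_b):
    """Entity-level similarity: average of each term's best match in the other set."""
    rows = [max(resnik(a, b) for b in terms_b) for a in terms_a]
    cols = [max(resnik(a, b) for a in terms_a) for b in terms_b]
    return (sum(rows) + sum(cols)) / (len(rows) + len(cols))

print(resnik("lipid metabolism", "glucose metabolism"))
print(resnik("lipid metabolism", "signaling"))
print(best_match_average({"lipid metabolism"}, {"glucose metabolism", "signaling"}))
```

Two entities with no term in common can still receive a nonzero score here because their terms share an informative ancestor, which is the advantage over exact matching that the abstract reports.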

Elsevier - Publisher Connector

Biological Process Linkage Networks

Author: A Battle
A Schlicker
A Vazquez
AC Gavin
AH Tong
AJ Butte
Avraham A. Melkman
B Schwikowski
C Stark
D Finley
D Lin
D Segre
DA Stavreva
Dikla Dotan-Cohen
E Formstecher
E Segal
E Unal
EM Marcotte
F Luo
H Hishigaki
H Jeong
I Xenarios
JL Lu
JM Stuart
KR Brown
L Giot
LA Amaral
LF Wu
M Larochelle
MA Harris
MA Huynen
P Bork
PT Spellman
PW Lord
R Kelley
R Sharan
Rodolfo Aramayo
Simon Kasif
SL Wong
Stan Letovsky
TR Hughes
U de Lichtenberg
U Karaoz
X Guo
Z Lubovac
Publication venue: Public Library of Science
Publication date: 23/04/2009

BACKGROUND. The traditional approach to studying complex biological networks is based on the identification of interactions between internal components of signaling or metabolic pathways. By comparison, little is known about interactions between higher order biological systems, such as biological pathways and processes. We propose a methodology for gleaning patterns of interactions between biological processes by analyzing protein-protein interactions, transcriptional co-expression and genetic interactions. At the heart of the methodology are the concept of Linked Processes and the resultant network of biological processes, the Process Linkage Network (PLN). RESULTS. We construct, catalogue, and analyze different types of PLNs derived from different data sources and different species. When applied to the Gene Ontology, many of the resulting links connect processes that are distant from each other in the hierarchy, even though the connection makes eminent sense biologically. Some others, however, carry an element of surprise and may reflect mechanisms that are unique to the organism under investigation. In this aspect our method complements the link structure between processes inherent in the Gene Ontology, which by its very nature is species-independent. As a practical application of the linkage of processes we demonstrate that it can be effectively used in protein function prediction, having the power to increase both the coverage and the accuracy of predictions, when carefully integrated into prediction methods. CONCLUSIONS. 
Our approach constitutes a promising new direction towards understanding the higher levels of organization of the cell as a system which should help current efforts to re-engineer ontologies and improve our ability to predict which proteins are involved in specific biological processes. Lynn and William Frankel Center for Computer Science; the Paul Ivanier center for robotics research and production; National Science Foundation (ITR-048715); National Human Genome Research Institute (1R33HG002850-01A1, R01 HG003367-01A1); National Institute of Health (U54 LM008748

Public Library of Science (PLOS)

Crossref

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

PubMed Central

An Integrative Disease Information Network Approach to Similar Disease detection

Author: Duan Lei
Jiang Weipeng
Li-Ling Jesse
Qin Ruiqi
Wang Tingting
Xu Wuli
Zhang Yidan
Zheng Huiru
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 03/09/2021

Ulster University's Research Portal