Edinburgh Research Archive

Edinburgh Research Explorer

ToppGene Suite for gene list enrichment analysis and candidate gene prioritization

Author: A. G. Jegga
Adie
Aerts
B. J. Aronow
Bader
Barrett
Berger
Chen
Chen
Clarke
Dhandapany
E. E. Bardes
Fisher
Franke
Freudenberg
Hunt
J. Chen
Jimenez-Sanchez
Junker
Kohler
Peri
Rual
Stelzl
Thornblad
Tiffin
Tiffin
Turner
van Bokhoven
Villani
Wu
Zhu
Publication venue: Oxford University Press
Publication date: 01/01/2009

ToppGene Suite (http://toppgene.cchmc.org; this web site is free and open to all users and does not require a login to access) is a one-stop portal for (i) gene list functional enrichment, (ii) candidate gene prioritization using either functional annotations or network analysis and (iii) identification and prioritization of novel disease candidate genes in the interactome. Functional annotation-based disease candidate gene prioritization uses a fuzzy-based similarity measure to compute the similarity between any two genes based on semantic annotations. The similarity scores from individual features are combined into an overall score using statistical meta-analysis. A P-value for each annotation of a test gene is derived by random sampling of the whole genome. The protein–protein interaction network (PPIN)-based disease candidate gene prioritization uses social and Web network analysis algorithms (extended versions of the PageRank and HITS algorithms, and the K-Step Markov method). We demonstrate the utility of ToppGene Suite using 20 recently reported GWAS-based gene–disease associations (including novel disease genes) representing five diseases. ToppGene ranked 19 of 20 (95%) candidate genes within the top 20%, while ToppNet ranked 12 of 16 (75%) candidate genes among the top 20%.
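
The PPIN-based prioritization described in this abstract builds on PageRank-style random walks. As a rough illustration only, the following Python sketch runs a personalized PageRank (random walk with restart) on a toy undirected network. It is not ToppNet's extended algorithm; the function name, the damping value of 0.85, and the node labels G1 to G5 are assumptions made for the example, and every seed is assumed to be present in the network.

```python
def personalized_pagerank(edges, seeds, damping=0.85, iterations=100):
    """Score nodes of an undirected graph by a random walk that restarts at the seeds.

    edges: iterable of (node, node) pairs; seeds: set of known disease genes.
    Assumes every seed appears in at least one edge.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)

    # Restart distribution: uniform over the seed genes, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(restart)

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor m passes on an equal share of its current score.
            incoming = sum(score[m] / len(neighbors[m]) for m in neighbors[n])
            updated[n] = (1 - damping) * restart[n] + damping * incoming
        score = updated
    return score


# Toy chain network G1 - G2 - G3 - G4 - G5 with G1 as the only known gene.
toy_edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5")]
toy_scores = personalized_pagerank(toy_edges, seeds={"G1"})
# Candidates are ranked by descending score, excluding the seed itself.
toy_ranking = sorted((g for g in toy_scores if g != "G1"),
                     key=lambda g: -toy_scores[g])
```

In this toy setting, candidates closer to the seed receive higher scores, which is the intuition behind ranking candidate genes by network proximity to known disease genes.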

CiteSeerX

Gene prioritization in Type 2 Diabetes using domain interactions and network analysis

Author: Bharadwaj Dwaipayan
Chavali Sreenivas
Sharma Amitabh
Tabassum Rubina
Tandon Nikhil
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Identification of disease genes for Type 2 Diabetes (T2D) by traditional methods has yielded limited success. Based on our previous observation that T2D may result from disturbed protein-protein interactions caused by disruption of modular domain interactions, we have designed an approach to rank the candidates in the T2D-linked genomic regions as plausible disease genes. Results: Our approach integrates a Weight value (Wv) method followed by prioritization using clustering coefficients derived from a domain interaction network. The Wv for each candidate is calculated on the assumption that disease genes might be functionally related, mainly through interactions among domains of the interacting proteins. Benchmarking on a test dataset comprising both known T2D genes and non-T2D genes revealed that the Wv method had a sensitivity of 0.74 and a specificity of 0.96, with 9-fold enrichment. Candidate genes with a Wv > 0.5 were called High Weight Elements (HWEs). We then ranked the HWEs using a network property, the clustering coefficient (Ci). Each HWE with a Ci < 0.015 was prioritized as a plausible disease candidate (HWEc), as previous studies indicate that disease genes tend to avoid dense clustering (with an average Ci of 0.015). This step further prioritized the identified disease genes with a sensitivity of 0.32 and a specificity of 0.98 and enriched the candidate list 6.8-fold. Thus, from the dataset of 4052 positional candidates, the method ranked 435 as the most likely disease candidates. Gene ontology sharing among the candidates showed higher representation of metabolic and signaling processes. The approach also captured genes with unknown functions, which were characterized by network motif analysis. Conclusions: Prioritization of positional candidates is essential for cost-effective and expedited discovery of disease genes. Here, we demonstrate a novel approach for disease candidate prioritization from numerous loci linked to T2D.
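
The two-stage filter in this abstract (Wv > 0.5, then Ci < 0.015) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the abstract does not say how Wv is computed, so the Wv values are taken as given inputs, and the standard local clustering coefficient Ci = 2e / (k(k - 1)) is assumed, where k is the number of neighbors of a node and e the number of links among those neighbors. The function names and the toy gene labels are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(neighbors, node):
    """Local clustering coefficient: fraction of neighbor pairs that are linked."""
    nbrs = neighbors.get(node, set()) - {node}
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to be linked
    links = sum(1 for u, v in combinations(sorted(nbrs), 2)
                if v in neighbors.get(u, set()))
    return 2.0 * links / (k * (k - 1))


def prioritize(wv, neighbors, wv_cutoff=0.5, ci_cutoff=0.015):
    """Keep High Weight Elements (Wv > cutoff) whose Ci is below the cutoff."""
    hwe = [gene for gene, weight in wv.items() if weight > wv_cutoff]
    return sorted(g for g in hwe
                  if clustering_coefficient(neighbors, g) < ci_cutoff)


# Toy network: g1, g2, g3 form a triangle; g4 hangs off g3.
toy_neighbors = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3"},
}
toy_wv = {"g1": 0.9, "g3": 0.4, "g4": 0.7}  # hypothetical Wv values
toy_candidates = prioritize(toy_wv, toy_neighbors)
```

Here g1 passes the Wv cutoff but sits in a fully connected neighborhood (Ci = 1), g3 fails the Wv cutoff, and only g4 survives both filters.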

Springer - Publisher Connector

Public Library of Science (PLOS)

An algorithm for network-based gene prioritization that encodes knowledge both in nodes and in links

Author: A Madi
A Madi
AG Randolph
Chad Kimmel
D Nitsch
EA Adie
EA Adie
Eshel Ben-Jacob
G Gonzalez
G Ivan
J Chen
JM Kleinberg
JY Chen
JZ Wang
KG Becker
KR Brown
KR Brown
L Franke
M Ashburner
M Oti
MD McDowall
MS Scott
P Kauppi
S Aerts
S Havlin
S Kohler
Shyam Visweswaran
U Ala
X Wu
Y Chen
Y Moreau
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

Background: Candidate gene prioritization aims to identify promising new genes associated with a disease or a biological process from a larger set of candidate genes. In recent years, network-based methods - which utilize a knowledge network derived from biological knowledge - have been utilized for gene prioritization. Biological knowledge can be encoded either through the network's links or nodes. Current network-based methods can only encode knowledge through links. This paper describes a new network-based method that can encode knowledge in links as well as in nodes. Results: We developed a new network inference algorithm called the Knowledge Network Gene Prioritization (KNGP) algorithm which can incorporate both link and node knowledge. The performance of the KNGP algorithm was evaluated on both synthetic networks and on networks incorporating biological knowledge. The results showed that the combination of link knowledge and node knowledge provided a significant benefit across 19 experimental diseases over using link knowledge alone or node knowledge alone. Conclusions: The KNGP algorithm provides an advance over current network-based algorithms, because the algorithm can encode both link and node knowledge. We hope the algorithm will aid researchers with gene prioritization. © 2013 Kimmel, Visweswaran
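
To make the distinction between link knowledge and node knowledge concrete, the sketch below uses a weighted random walk in which edge weights carry the link knowledge and a per-node prior carries the node knowledge. This is an assumption-laden illustration, not the published KNGP algorithm, whose details are not given in the abstract; the function name, the damping value, and the toy weights and priors are all hypothetical. Weights are assumed to be positive and at least one prior is assumed to be non-zero.

```python
def rank_with_node_and_link_knowledge(link_weights, node_prior,
                                      damping=0.85, iterations=200):
    """Weighted random walk with restart.

    link_weights: {(a, b): positive weight} for undirected links (link knowledge).
    node_prior:   {node: non-negative prior} (node knowledge); missing nodes get 0.
    """
    weight = {}
    for (a, b), w in link_weights.items():
        weight.setdefault(a, {})[b] = w
        weight.setdefault(b, {})[a] = w
    nodes = sorted(weight)

    # Normalize the node priors into a restart distribution.
    total = sum(node_prior.get(n, 0.0) for n in nodes)
    prior = {n: node_prior.get(n, 0.0) / total for n in nodes}
    # Total outgoing weight per node, used to normalize transitions.
    strength = {n: sum(weight[n].values()) for n in nodes}

    score = dict(prior)
    for _ in range(iterations):
        score = {
            n: (1 - damping) * prior[n]
            + damping * sum(score[m] * weight[m][n] / strength[m]
                            for m in weight[n])
            for n in nodes
        }
    return score


# Toy triangle with equal link weights; node A carries all the prior knowledge.
toy_links = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.0}
toy_result = rank_with_node_and_link_knowledge(toy_links, {"A": 1.0})
```

With uniform priors and equal weights the three nodes tie; concentrating the prior on A raises its score, which shows how node knowledge shifts the ranking independently of the links.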

CiteSeerX

D-Scholarship@Pitt

FigShare

Inference of gene-phenotype associations via protein-protein interaction and orthology

Author: Lai WF
Li J
Lovell-Badge R
Wang JJ
Wang P
Xu F
Yalamanchili HK
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

CiteSeerX

INRIA a CCSD electronic archive server

HKU Scholars Hub

Gene–disease relationship discovery based on model-driven data integration and database view definition

Author: Bicep C.
Devignes M.D.
Jonveaux P.
Pierron L.
Smaïl-Tabbone M.
Yilmaz S.
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Computational methods are widely used to discover gene–disease relationships hidden in vast masses of available genomic and post-genomic data. In most current methods, a similarity measure is calculated between gene annotations and known disease genes or disease descriptions. However, more explicit gene–disease relationships are required for better insights into the molecular bases of diseases, especially for complex multi-gene diseases

Springer - Publisher Connector

Improving disease gene prioritization using the semantic similarity of Gene Ontology terms

Author: Adie
Adie
Aerts
Ala
Altshuler
Andreas Schlicker
Ashburner
Berglund
Blake
Chatr-Aryamontri
Chen
Chen
Cho
Cordell
Feldman
Franke
Freudenberg
Gibson
Goh
Hubbard
Ideker
Jimenez-Sanchez
Kann
Kann
Kelso
Kerrien
Lage
Lee
Lin
Lowe
Mario Albrecht
Navlakha
O'Connor
Ortutay
Oti
Ozgür
Perez-Iratxeta
Perez-Iratxeta
Prasad
Reference Genome Group of the Gene Ontology Consortium
Robinson
Ruepp
Salwinski
Schlicker
Schlicker
Schreiber
Shriner
Smith
Teare
Thomas Lengauer
Tiffin
Tranchevent
Turner
UniProt Consortium
van Driel
van Driel
Velankar
Wu
Yilmaz
Yu
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Many hereditary human diseases are polygenic, resulting from sequence alterations in multiple genes. Genomic linkage and association studies are commonly performed for identifying disease-related genes. Such studies often yield lists of up to several hundred candidate genes, which have to be prioritized and validated further. Recent studies discovered that genes involved in phenotypically similar diseases are often functionally related on the molecular level

Improved human disease candidate gene prioritization using mouse phenotype

Author: Aronow Bruce J
Chen Jing
Jegga Anil G
Xu Huan
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: The majority of common diseases are multi-factorial and modified by genetically and mechanistically complex polygenic interactions and environmental factors. High-throughput genome-wide studies like linkage analysis and gene expression profiling tend to be most useful for classification and characterization but do not provide sufficient information to identify or prioritize specific disease causal genes. Results: Extending an earlier hypothesis that the majority of genes that impact or cause disease share membership in any of several functional relationships, we, for the first time, show the utility of mouse phenotype data in human disease gene prioritization. We study the effect of different data integration methods and, based on the validation studies, show that our approach, ToppGene (http://toppgene.cchmc.org), outperforms two of the existing candidate gene prioritization methods, SUSPECTS and ENDEAVOUR. Conclusion: The incorporation of phenotype information for mouse orthologs of human genes greatly improves human disease candidate gene analysis and prioritization.

Disease Gene Prioritization

Author: Carlos Roberto Arias
Hsiang-Yuan Yeh
Von-Wun Soo
Publication venue: IntechOpen
Publication date: 21/10/2011

IntechOpen

Endeavour update: a web resource for gene prioritization in multiple species

Author: Adie
Aerts
Aerts
Ashburner
B. Coessens
B. De Moor
Bader
Ebermann
Elbers
Gasteiger
Glenisson
Hamosh
Hovatta
Hristovski
Jimenez-Sanchez
L.-C. Tranchevent
Lopez-Bigas
Mewes
Mulder
Oti
P. Van Loo
Perez-Iratxeta
Peri
R. Barriot
Rossi
S. Aerts
S. Van Vooren
S. Yu
Salwinski
Smith
Son
Stark
Tiffin
Turner
van Driel
Walker
Xia
Y. Moreau
Ye
Zhu
Publication venue: Oxford University Press
Publication date

Endeavour (http://www.esat.kuleuven.be/endeavourweb; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes. Using a training set of genes known to be involved in a biological process of interest, our approach consists of (i) inferring several models (based on various genomic data sources), (ii) applying each model to the candidate genes to rank those candidates against the profile of the known genes and (iii) merging the several rankings into a global ranking of the candidate genes. In the present article, we describe the latest developments of Endeavour. First, we provide a web-based user interface, besides our Java client, to make Endeavour more universally accessible. Second, we support multiple species: in addition to Homo sapiens, we now provide gene prioritization for three major model organisms: Mus musculus, Rattus norvegicus and Caenorhabditis elegans. Third, Endeavour makes use of additional data sources and now includes numerous databases: ontologies and annotations, protein–protein interactions, cis-regulatory information, gene expression data sets, sequence information and text-mining data. We tested the novel version of Endeavour on 32 recent disease gene associations from the literature. Additionally, we describe a number of recent independent studies that made use of Endeavour to prioritize candidate genes for obesity and Type II diabetes, cleft lip and cleft palate, and pulmonary fibrosis.
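
Step (iii), merging several per-source rankings into one global ranking, can be illustrated with a deliberately simple stand-in. The Python sketch below averages each candidate's rank ratio (position divided by list length) across the rankings in which it appears. The abstract does not state which fusion method Endeavour uses, so this mean-rank-ratio rule, the function name, and the toy gene labels are assumptions for illustration only.

```python
def merge_rankings(rankings):
    """Merge ranked candidate lists (best first) by mean rank ratio.

    rankings: list of lists; a candidate may be absent from some lists,
    in which case only the lists that contain it contribute.
    """
    ratios = {}
    for ranking in rankings:
        size = len(ranking)
        for position, gene in enumerate(ranking, start=1):
            ratios.setdefault(gene, []).append(position / size)
    mean_ratio = {gene: sum(r) / len(r) for gene, r in ratios.items()}
    # Lower mean ratio is better; ties are broken alphabetically.
    return sorted(mean_ratio, key=lambda gene: (mean_ratio[gene], gene))


# Three hypothetical per-source rankings of the same candidates.
toy_rankings = [
    ["g1", "g2", "g3"],
    ["g2", "g1", "g3"],
    ["g1", "g3", "g2"],
]
toy_global = merge_rankings(toy_rankings)
```

Using rank ratios rather than raw positions keeps lists of different lengths comparable, which matters when a data source has no information for some candidates.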