1,444 research outputs found

DADA: Degree-Aware Algorithms for Network-Based Disease Gene Prioritization

Author: AM Edwards
AM Glazier
D Maglott
D Masotti
DS Goldberg
E Adie
E Nabieva
F Turner
G Bebek
Gurkan Bebek
H Tong
HG Brunner
J Chen
K Lage
K Macropol
KI Goh
L Lovász
M Oti
MA van Driel
Mehmet Koyutürk
MS Erten
O Vanunu
RA George
Rob M Ewing
S Aerts
S Brin
S Köhler
S Navlakha
Sinan Erten
T Ideker
VN Patel
WK Huh
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: High-throughput molecular interaction data have been used effectively to prioritize candidate genes that are linked to a disease, based on the observation that the products of genes associated with similar diseases are likely to interact with each other heavily in a network of protein-protein interactions (PPIs). An important challenge for these applications, however, is the incomplete and noisy nature of PPI data. Information flow based methods alleviate these problems to a certain extent by considering indirect interactions and multiplicity of paths. Results: We demonstrate that existing methods are likely to favor highly connected genes, making prioritization sensitive to the skewed degree distribution of PPI networks, as well as to ascertainment bias in available interaction and disease association data. Motivated by this observation, we propose several statistical adjustment methods to account for the degree distribution of known disease and candidate genes, using a PPI network with associated confidence scores for interactions. We show that the proposed methods can detect loosely connected disease genes that are missed by existing approaches; however, this improvement might come at the price of more false negatives for highly connected genes. Consequently, we develop a suite called DADA, which includes different uniform prioritization methods that effectively integrate existing approaches with the proposed statistical adjustment strategies. Comprehensive experimental results on the Online Mendelian Inheritance in Man (OMIM) database show that DADA outperforms existing methods in prioritizing candidate disease genes. Conclusions: These results demonstrate the importance of employing accurate statistical models and associated adjustment methods in network-based disease gene prioritization, as well as other network-based functional inference applications.
DADA is implemented in Matlab and is freely available at http://compbio.case.edu/dada/.
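
The degree bias described in this abstract can be illustrated with a small sketch. The code below is not DADA and does not reproduce its statistical adjustments; it is a minimal pure-Python random walk with restart on a toy network, followed by one simple normalization chosen here as an assumption: each score is divided by the node's share of total weighted degree, which is the stationary distribution of a walk without restart on an undirected graph. All function names, the restart value of 0.3, and the toy edges are hypothetical.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Propagate score from seed genes over a weighted, undirected network.

    adj   : dict mapping node -> {neighbor: confidence weight}
    seeds : set of known disease genes (restart targets)
    Assumes every node has at least one edge, so no probability mass is lost.
    """
    nodes = list(adj)
    strength = {u: sum(adj[u].values()) for u in nodes}  # weighted degree
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart mass goes back to the seeds; the rest flows along edges.
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / strength[u]
            for v, w in adj[u].items():
                nxt[v] += share * w
        change = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if change < tol:
            break
    return p


def degree_adjusted(scores, adj):
    """Divide each score by the node's share of total weighted degree.

    This is an illustrative normalization (an assumption), not DADA's method.
    """
    strength = {u: sum(adj[u].values()) for u in adj}
    total = sum(strength.values())
    return {u: scores[u] / (strength[u] / total) for u in adj}


# Toy network: hub H has five neighbours, t is attached to the seed s only.
edges = [("s", "H"), ("s", "t"), ("H", "a"), ("H", "b"), ("H", "c"), ("H", "d")]
toy = {}
for u, v in edges:
    toy.setdefault(u, {})[v] = 1.0
    toy.setdefault(v, {})[u] = 1.0

raw = random_walk_with_restart(toy, {"s"})
adjusted = degree_adjusted(raw, toy)
```

On this toy network the raw scores rank the hub H above the loosely connected node t, while the adjusted scores reverse that order, which mirrors the trade-off the abstract describes between loosely and highly connected genes.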

Southampton (e-Prints Soton)

Springer - Publisher Connector

Directory of Open Access Journals

Disease Gene Prioritization

Author: Carlos Roberto Arias
Hsiang-Yuan Yeh
Von-Wun Soo
Publication venue: IntechOpen
Publication date: 21/10/2011

Benchmarking network-based gene prioritization methods for cerebral small vessel disease

Author: Amy F
Cathie S
Grant R
Huayu Z
Keith S
Kristiina R
Muchen J
Teng Z
Wu H
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2021

Network-based gene prioritization algorithms are designed to prioritize disease-associated genes based on known ones using biological networks of protein interactions, gene-disease associations and other relationships between biological entities. Various algorithms have been developed based on different mechanisms, but it is not obvious which algorithm is optimal for a specific disease. To address this issue, we benchmarked multiple algorithms for their application in cerebral small vessel disease (cSVD). We curated protein-gene interactions (PGI) and gene-disease associations (GDA) from databases and assembled PGI networks and disease-gene heterogeneous networks. A screening of algorithms resulted in seven representative algorithms to be benchmarked. Performance of the algorithms was assessed using both leave-one-out cross-validation (LOOCV) and external validation with the MEGASTROKE genome-wide association study (GWAS). We found that random walk with restart on the heterogeneous network (RWRH) showed the best LOOCV performance, with a median LOOCV rediscovery rank of 185.5 (out of 19,463 genes). The GenePanda algorithm had the most GWAS-confirmable genes in its top 200 predictions, while RWRH had the best ranks for small vessel stroke associated genes confirmed in GWAS. In conclusion, RWRH has overall better performance for application in cSVD despite its susceptibility to bias caused by degree centrality. The choice of algorithm should be determined before applying it to a specific disease. Current pure network-based gene prioritization algorithms are unlikely to find novel disease-associated genes that are not associated with known ones. The tools for implementing and benchmarking algorithms have been made available and can be generalized for other diseases.
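
The LOOCV rediscovery rank mentioned in this abstract can be sketched as follows. This is a hedged illustration of the general procedure, not the authors' benchmarking tool: each known gene is held out in turn, the remaining known genes act as seeds, and the rank of the held-out gene among all non-seed genes is recorded. The scorer used here (counting seed neighbours) is a deliberately simple stand-in, not RWRH, and the function names, tie rule, and toy network are assumptions.

```python
from statistics import median


def neighbor_count_score(adj, seeds):
    """Toy scorer (an assumption, not RWRH): number of seed neighbours per gene."""
    return {g: sum(1 for n in adj[g] if n in seeds) for g in adj}


def loocv_rediscovery_ranks(adj, known_genes, score_fn):
    """Hold out each known gene and rank it among all non-seed genes.

    Ties are resolved optimistically: the rank is one plus the number of
    candidates with a strictly higher score.
    """
    ranks = {}
    for held_out in known_genes:
        seeds = set(known_genes) - {held_out}
        scores = score_fn(adj, seeds)
        candidates = [g for g in adj if g not in seeds]
        better = sum(1 for g in candidates if scores[g] > scores[held_out])
        ranks[held_out] = 1 + better
    return ranks


# Toy network as an adjacency dict of sets; gene names are hypothetical.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}
known = ["A", "B", "C", "F"]

ranks = loocv_rediscovery_ranks(toy, known, neighbor_count_score)
median_rank = median(ranks.values())
```

Passing the scorer as `score_fn` keeps the cross-validation loop independent of the algorithm under test, so different prioritization methods can be compared on the same held-out genes.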

An integrated network of Arabidopsis growth regulators and its use for gene prioritization

Author: Drebert Zuzanna
Inzé Dirk
Sabaghian Ehsan
Saeys Yvan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Elucidating the molecular mechanisms that govern plant growth has been an important topic in plant research, and current advances in large-scale data generation call for computational tools that efficiently combine these different data sources to generate novel hypotheses. In this work, we present a novel, integrated network that combines multiple large-scale data sources to characterize growth regulatory genes in Arabidopsis, one of the main plant model organisms. The contributions of this work are twofold: first, we characterized a set of carefully selected growth regulators with respect to their connectivity patterns in the integrated network, and, subsequently, we explored to what extent these connectivity patterns can be used to suggest new growth regulators. Using a large-scale comparative study, we designed new supervised machine learning methods to prioritize growth regulators. Our results show that these methods significantly improve on current state-of-the-art prioritization techniques and are able to suggest meaningful new growth regulators. In addition, the integrated network is made available to the scientific community, providing a rich data source that will be useful for many biological processes, not necessarily restricted to plant growth.

Identification of candidate disease genes by integrating Gene Ontologies and protein-interaction networks: case study of primary immunodeficiencies

Author: Adie
Aerts
Albert
Alexa
Amaral
Ashburner
Barabasi
Beissbarth
Bohn
Bustamante
Ceresa
Csaba Ortutay
Csardi
Doherty
Eppig
Estrada
Feldman
Fischer
Freeman
Gaj
George
Goh
Gol’dshtein
Higgins
Huber
Ideker
Jeger
Kohler
Lage
Latora
Liu
Lombard
Marodi
Mathivanan
Mauno Vihinen
Middendorf
Minegishi
Morimoto
Ochs
Ortutay
Oti
Perez-Iratxeta
Perocchi
Pieroni
Plenge
Radivojac
Rhodes
Samarghitean
Shriner
Smith
Sultan
Teo
Thornblad
Uetz
Wu
Xavier
Xulvi-Brunet
Publication venue: Oxford University Press
Publication date: 01/01/2009

Disease gene identification is still a challenge despite modern high-throughput methods. Many diseases are very rare or lethal and thus cannot be investigated with traditional methods. Several in silico methods have been developed, but they have some limitations. We introduce a new method that combines information about protein-interaction network properties and Gene Ontology terms. Genes with high calculated network scores and statistically significant Gene Ontology terms based on known diseases are prioritized as candidate genes. The method was applied to identify novel primary immunodeficiency-related genes, 26 of which were found. The investigation uses the protein-interaction network for all essential immunome human genes available in the Immunome Knowledge Base and an analysis of their enriched Gene Ontology annotations. The identified disease gene candidates are mainly involved in cellular signaling, including receptors, protein kinases, and adaptor and binding proteins, as well as enzymes. The method can be generalized for any disease group with sufficient information.
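
The combination of network scores with statistically significant Gene Ontology terms can be sketched roughly as below. The abstract does not state which statistical test or thresholds were used, so the hypergeometric upper-tail test, the 0.05 cutoff, the absence of multiple-testing correction, and every name and value in this example are assumptions made purely for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N and K of the N carry the term."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def enriched_terms(term_to_genes, known, background, alpha=0.05):
    """Terms over-represented among known disease genes (no multiple-testing
    correction is applied in this sketch). Assumes known is a subset of background."""
    N, n = len(background), len(known)
    found = set()
    for term, genes in term_to_genes.items():
        K = len(genes & background)
        k = len(genes & known)
        if k and hypergeom_upper_tail(k, K, n, N) < alpha:
            found.add(term)
    return found


def prioritize(network_score, gene_terms, significant, score_cutoff):
    """Keep candidates with a high network score AND at least one enriched term."""
    kept = [
        g for g, s in network_score.items()
        if s >= score_cutoff and gene_terms.get(g, set()) & significant
    ]
    return sorted(kept, key=lambda g: -network_score[g])


# Hypothetical toy data: ten genes, two terms, three known disease genes.
background = {f"g{i}" for i in range(1, 11)}
known = {"g1", "g2", "g3"}
term_to_genes = {
    "T1": {"g1", "g2", "g3", "g8"},
    "T2": {"g1", "g4", "g5", "g6", "g7"},
}
gene_terms = {"g8": {"T1"}, "g4": {"T2"}, "g9": set()}
network_score = {"g8": 0.9, "g4": 0.95, "g9": 0.8}

significant = enriched_terms(term_to_genes, known, background)
candidates = prioritize(network_score, gene_terms, significant, score_cutoff=0.5)
```

In this toy case only T1 passes the enrichment test, so g8 is the single prioritized candidate even though g4 has a higher network score.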

Lund University Publications

Gene Prioritization through Consensus Strategy, Enrichment Methodologies Analysis, and Networking for Osteosarcoma Pathogenesis

Author: Cabrera-Andrade Alejandro
González-Díaz Humberto
Jaramillo-Koupermann Gabriela
López-Cortés Andrés
Munteanu Cristian-Robert
Paz-y-Miño César
Pazos A.
Pérez-Castillo Yunierkis
Tejera Eduardo
Publication venue: MDPI AG
Publication date: 01/01/2020

Osteosarcoma is the most common subtype of primary bone cancer, affecting mostly adolescents. In recent years, several studies have focused on elucidating the molecular mechanisms of this sarcoma; however, its molecular etiology has still not been determined with precision. Therefore, we applied a consensus strategy with the use of several bioinformatics tools to prioritize genes involved in its pathogenesis. Subsequently, we assessed the physical interactions of the previously selected genes and applied a communality analysis to this protein–protein interaction network. The consensus strategy prioritized a total list of 553 genes. Our enrichment analysis validates several studies that describe the signaling pathways PI3K/AKT and MAPK/ERK as pathogenic. The gene ontology described TP53 as a principal signal transducer that chiefly mediates processes associated with cell cycle and DNA damage response. It is interesting to note that the communality analysis clusters several members involved in metastasis events, such as MMP2 and MMP9, and genes associated with DNA repair complexes, like ATM, ATR, CHEK1, and RAD51. In this study, we have identified well-known pathogenic genes for osteosarcoma and prioritized genes that need to be further explored. Instituto Carlos III; PI17/01826. Xunta de Galicia; ED431C 2018/49. Xunta de Galicia; ED431G/0

RUNA - Repositorio de Saúde

PANDA: prioritization of autism-genes using network-based deep-learning approach

Author: Zhang Yu
Publication venue: Memorial University of Newfoundland
Publication date: 01/08/2019

Autism is a neuropsychiatric disorder characterized by impairments in reciprocal social interaction and communication, and the presence of restricted and repetitive behaviours. Autism is predominantly heritable, but the underlying genetic associations are still largely unknown. Understanding the genetic background of complex diseases, such as autism, plays an essential role in the promising field of precision medicine. The evaluation of candidate genes, however, requires time-consuming and expensive experiments given the large number of possibilities. Thus, computational methods have seen increasing application in predicting gene-disease associations. In this thesis, we proposed a bioinformatics framework, Prioritization of Autism-genes using Network-based Deep-learning Approach (PANDA). Our approach aims to identify autism-genes across the human genome based on patterns of gene-gene interactions and topological similarity of genes in the interaction network. PANDA trains a graph deep learning classifier using the human molecular interaction network (HMIN) as input and predicts and ranks the probability of autism association for every node (gene) in the network. PANDA achieved a classification accuracy of 89%, outperforming three other commonly used machine learning algorithms. Moreover, the gene prioritization ranking list produced by PANDA was evaluated and validated using a large-scale independent exome-sequencing study. The top decile (top 10%) of PANDA-ranked genes was found to be significantly enriched for autism association.

Memorial University Research Repository