14 research outputs found

One for all and all for One: Improving replication of genetic studies through network diffusion

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)
Publication date: 01/04/2018

Improving accuracy in genetic studies would greatly accelerate understanding of the genetic basis of complex diseases. One way to achieve such an improvement for risk variants identified by genome-wide association studies (GWAS) is to incorporate previously known biology when screening variants across the genome. We developed a simple approach for improving the prioritization of candidate disease genes that incorporates a network diffusion of scores from known disease genes using a protein network and a novel integration with GWAS risk scores, and tested this approach on a large Alzheimer disease (AD) GWAS dataset. Using a statistical bootstrap approach, we cross-validated the method and for the first time showed that a network approach improves the expected replication rates in GWAS. Several novel AD genes were predicted, including CR2, SHARPIN, and PTPN2. Our re-prioritized results are enriched for established AD-associated biological pathways, including inflammation, immune response, and metabolism, whereas standard non-prioritized results were not. Our findings support a strategy of considering network information when investigating genetic risk factors.

Crossref

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

University of Miami: Scholarship Miami

FigShare

Support vector machine training to predict GWAS and network Z-score weights.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

Selection of genes with a high replication rate (> 0.7, blue points) and low replication rate (< 0.1, red points) yielded a balanced number of genes in each replication class (high/low). A linear SVM model was trained to predict replication class using the GWAS and network Z-scores of each gene. Genes represented as X's were used as support vectors for the training of the SVM, whereas genes represented as O's were not. Both network and GWAS Z-scores contributed to the decision boundary, as demonstrated by the significance of their predicted coefficients using logistic regression (GWAS: p < 2.0×10−16, Network: p = 0.0016).
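The caption above describes a linear classifier over two features per gene. A minimal sketch of that idea follows, assuming a hinge-loss linear SVM fitted by plain subgradient descent; the solver, the hyperparameters, and every data point are illustrative assumptions, not the authors' implementation or data.

```python
# Toy linear SVM on (GWAS Z-score, network Z-score) pairs.
# All values below are invented for illustration only.

def train_linear_svm(points, labels, lam=0.01, epochs=2000, lr=0.1):
    """Minimise mean hinge loss + (lam/2)*|w|^2 by full-batch subgradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [lam * w[0], lam * w[1]]
        grad_b = 0.0
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1.0:  # point violates the margin: hinge term is active
                grad_w[0] -= y * x1 / n
                grad_w[1] -= y * x2 / n
                grad_b -= y / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b

def decision(w, b, point):
    """Signed score; positive means the 'high replication' class."""
    return w[0] * point[0] + w[1] * point[1] + b

# Hypothetical genes: label +1 = high replication rate, -1 = low replication rate.
points = [(3.0, 2.0), (2.5, 1.0), (3.5, 0.5), (2.0, 2.5),
          (0.5, -0.5), (1.0, 0.0), (0.0, 0.5), (0.8, -1.0)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
predictions = [1 if decision(w, b, p) > 0 else -1 for p in points]
print("weights (GWAS, network):", w, "bias:", b)
```

The two entries of `w` play the role of relative weights for the GWAS and network Z-scores, which is one plausible reading of how such a model could inform a combined score.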

FigShare

RAD genes and the type of study that identified them.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

FigShare

GSEA results after ranking genes by combined Z-scores.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

FigShare

Proximity of non-RAD hub genes to RAD genes.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

FigShare

Proximity between RAD genes in PPI network.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

Each RAD gene was ranked (in comparison to the other 19,972 genes in the network) based upon its degree (number of interactions in network), its ASP distance to the RAD genes, and total diffusion distance from the RAD genes. The average ranking of the RAD genes was 7,949 using ASP (60th percentile, t-test p = 0.015) and 6,959 for diffusion (65th percentile, t-test p = 0.00054).
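The percentiles quoted above can be reproduced from the average ranks with simple arithmetic. The sketch below assumes that rank 1 is the closest gene and that the total is 19,972; the caption says "the other 19,972 genes", so the exact denominator may differ by a small amount without changing the rounded result.

```python
def rank_to_percentile(rank, total):
    """Share of genes ranked worse than the given rank, as a percentage."""
    return 100.0 * (1.0 - rank / total)

TOTAL_GENES = 19972  # assumption taken from the caption's gene count

asp_percentile = rank_to_percentile(7949, TOTAL_GENES)
diffusion_percentile = rank_to_percentile(6959, TOTAL_GENES)
print(round(asp_percentile), round(diffusion_percentile))
```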

FigShare

Comparison of GWAS and network Z-scores.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

A. Transformed Z-scores are uncorrelated. B. Genes with high network scores had higher replication rates compared to those with low network scores, as further visualized and confirmed statistically as shown in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007306#pgen.1007306.g004" target="_blank">Fig 4</a>. Reprate = replication rate.

FigShare

Top predicted AD genes using combination approach.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

FigShare

Summary of analysis steps.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

A set of AD genes that are reproducible (RAD genes) across different genetic studies was assembled through literature curation. The RAD genes were assigned a high initial risk score, and graph theoretical diffusion was employed to derive network diffusion scores for the rest of the genes in the network. Scores obtained from genetic screens and network diffusion were integrated to derive a new prioritization.
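The diffusion step summarized above can be illustrated with a random walk with restart, one common way to spread seed scores over a network. This is a sketch under assumptions: the listing does not state which diffusion kernel the authors used, and the graph, gene labels, and restart probability here are hypothetical.

```python
def diffuse(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart from seed nodes on an undirected graph.

    Assumes every node has at least one neighbour.
    """
    nodes = sorted(adjacency)
    # Seed genes share the initial score mass equally; all others start at zero.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # A fraction `restart` of the mass jumps back to the seeds each step.
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # The remaining mass is split evenly among the node's neighbours.
            share = (1.0 - restart) * p[n] / len(adjacency[n])
            for m in adjacency[n]:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical path-shaped network: SEED - G1 - G2 - G3
toy_network = {
    "SEED": ["G1"],
    "G1": ["SEED", "G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2"],
}
scores = diffuse(toy_network, seeds={"SEED"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

Genes closer to the seed receive higher diffusion scores, which is the property the prioritization relies on; how those scores are then integrated with GWAS scores is not shown here.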

FigShare

Filtering on network score improves replication rate.

Author: Adam Naj (3233193)
Daniel Lancour (5131244)
Gerard D. Schellenberg (160086)
Jonathan L. Haines (160079)
Lindsay A. Farrer (137611)
Margaret A. Pericak-Vance (160082)
Mark Crovella (5131247)
Richard Mayeux (113225)
Simon Kasif (2952)

The replication rate was computed for all genes surpassing the significance threshold for each GWAS. This procedure was repeated in each bootstrapped dataset and the average replication rate was determined (purple). This process was repeated using increasingly strict filters on the network diffusion scores. The baseline replication rate without utilizing network scores (naïve method) is represented by the purple points. The strictest network filter (red) has a consistently higher replication rate than the naïve method.
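The procedure in this caption can be sketched as follows, assuming that a gene "replicates" when it is significant in both a discovery set and a replication set within the same bootstrap. That definition, the function names, and all data are illustrative assumptions rather than the authors' exact protocol.

```python
def replication_rate(discovery_hits, replication_hits):
    """Fraction of discovery-significant genes that are also replication-significant."""
    if not discovery_hits:
        return None  # undefined when nothing passes the discovery threshold
    return len(discovery_hits & replication_hits) / len(discovery_hits)

def mean_filtered_replication(bootstraps, network_score, min_score):
    """Average replication rate over bootstraps, keeping genes with score >= min_score."""
    rates = []
    for discovery_hits, replication_hits in bootstraps:
        kept = {g for g in discovery_hits if network_score[g] >= min_score}
        rate = replication_rate(kept, replication_hits)
        if rate is not None:
            rates.append(rate)
    return sum(rates) / len(rates) if rates else None

# Invented toy data: (discovery-significant genes, replication-significant genes)
bootstraps = [
    ({"A", "B", "C", "D"}, {"A", "B"}),
    ({"A", "C", "D"}, {"A", "C"}),
]
network_score = {"A": 2.0, "B": 1.5, "C": 0.2, "D": -1.0}

naive = mean_filtered_replication(bootstraps, network_score, float("-inf"))
strict = mean_filtered_replication(bootstraps, network_score, 1.0)
print("naive:", naive, "strict filter:", strict)
```

Passing negative infinity as the threshold disables the filter, so the same function yields the naïve baseline and each increasingly strict network filter.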

FigShare