Evaluation of calling algorithms for array-CGH

Author: Alison Motsinger Reif
Siddharth Roy
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2013

Copy number variation (CNV) detection has become an integral part of many genetic studies, and new technologies promise to revolutionize our ability to detect CNVs and link them to disease. However, recent studies highlight discrepancies in the genome-wide CNV profile when measured by different technologies and even by the same technology. Furthermore, the change point algorithms used to call CNVs can disagree substantially on the same data set. We focus this article on comparative genomic hybridization (CGH) arrays because this platform lends itself well to accurate statistical modeling. We describe some newer methodological developments in local statistics that are well suited for CNV detection and calling on CGH arrays. Then we use both simulation studies and public data to compare these new local methods with the global methods that currently dominate the literature. These results offer suggestions for choosing a particular method and provide insight into the lack of reproducibility that has been seen in the field so far.
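
To make the term "change point" concrete, here is a minimal Python sketch of a single mean-shift search by least squares. It is a generic textbook illustration, not one of the local or global methods the article compares; the function name and the log2-ratio values are hypothetical.

```python
from typing import List, Tuple


def best_single_change_point(log_ratios: List[float]) -> Tuple[int, float]:
    """Return (k, cost): the split index k (1 <= k < n) that minimizes the
    summed squared deviation from the mean in segments [0:k] and [k:n]."""
    n = len(log_ratios)
    if n < 2:
        raise ValueError("need at least two probes")

    def sse(values: List[float]) -> float:
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_k, best_cost = 1, float("inf")
    for k in range(1, n):
        cost = sse(log_ratios[:k]) + sse(log_ratios[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost


# Hypothetical log2 ratios: six probes near 0, then four probes near 0.6
probes = [0.02, -0.05, 0.01, 0.03, -0.02, 0.00, 0.55, 0.60, 0.58, 0.62]
k, cost = best_single_change_point(probes)
print(k)  # 6: the shift begins at the seventh probe
```

Real callers must also decide how many change points exist and whether each segment is a gain or a loss. Those decisions are one place where different algorithms can reasonably disagree on the same data.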

Frontiers - Publisher Connector

Multifactor Dimensionality Reduction as a Filter-Based Approach for Genome Wide Association Studies

Author: Motsinger-Reif Alison A.
Oki Noffisat O.
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2011

Advances in genotyping technology and the multitude of genetic data available now provide a vast amount of data that is proving to be useful in the quest for a better understanding of human genetic diseases through the study of genetic variation. This has led to the development of approaches such as genome wide association studies (GWAS) designed specifically for interrogating variants across the genome for association with disease, typically by testing single locus, univariate associations. More recently it has been accepted that epistatic (interaction) effects may also be great contributors to these genetic effects, and GWAS methods are now being applied to find epistatic effects. The challenge for these methods still remains in the prioritization and interpretation of results, as it has also become standard for initial findings to be independently investigated in replication cohorts or functional studies. This is motivating the development and implementation of filter-based approaches to prioritize variants found to be significant in a discovery stage for follow-up in replication. Such filters must be able to detect both univariate and interactive effects. In the current study we present and evaluate the use of multifactor dimensionality reduction (MDR) as such a filter, with simulated data and a wide range of effect sizes. Additionally, we compare the performance of the MDR filter to a similar filter approach using logistic regression (LR), the more traditional approach used in GWAS analysis, as well as evaporative cooling (EC), another prominent machine learning filtering method. The results of our simulation study show that MDR is an effective method for such prioritization, and that it can detect main effects, and interactions with or without marginal effects. Importantly, it performed as well as EC and LR for main effect models. It also significantly outperformed LR for various two-locus epistatic models, while its results were equivalent to those of EC for the epistatic models. The results of this study demonstrate the potential of MDR as a filter to detect gene–gene interactions in GWAS.

Crossref

Frontiers - Publisher Connector

An R package implementation of multifactor dimensionality reduction

Author: Motsinger-Reif Alison A
Winham Stacey J
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: A breadth of high-dimensional data is now available with unprecedented numbers of genetic markers, and data-mining approaches to variable selection are increasingly being utilized to uncover associations, including potential gene-gene and gene-environment interactions. One of the most commonly used data-mining methods for case-control data is Multifactor Dimensionality Reduction (MDR), which has displayed success in both simulations and real data applications. Additional software applications in alternative programming languages can improve the availability and usefulness of the method for a broader range of users. Results: We introduce a package for the R statistical language to implement the Multifactor Dimensionality Reduction (MDR) method for nonparametric variable selection of interactions. This package is designed to provide an alternative implementation for R users, with great flexibility and utility for both data analysis and research. The 'MDR' package is freely available online at http://www.r-project.org/. We also provide data examples to illustrate the use and functionality of the package. Conclusions: MDR is a frequently-used data-mining method to identify potential gene-gene interactions, and alternative implementations will further increase this usage. We introduce a flexible software package for R users.
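
For readers unfamiliar with MDR, the sketch below illustrates its central pooling step in Python. It is not the API of the R package described above; the function name, the genotype coding, and the toy data are hypothetical. Each two-locus genotype cell is labelled high or low risk by comparing its case:control ratio with the overall ratio, and the resulting one-dimensional attribute is scored by balanced accuracy. Labelling ties as low risk is an assumption of this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def mdr_classify(genotypes: List[Cell], status: List[int]) -> Tuple[Dict[Cell, bool], float]:
    """Pool two-locus genotype cells into high/low risk and score the pooling.

    genotypes: per-subject (locus_a, locus_b) genotype codes, e.g. 0/1/2.
    status:    1 = case, 0 = control.
    """
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    threshold = n_cases / n_controls  # overall case:control ratio

    counts: Dict[Cell, List[int]] = defaultdict(lambda: [0, 0])
    for cell, s in zip(genotypes, status):
        counts[cell][s] += 1  # index 0 counts controls, index 1 counts cases

    # High risk when cases/controls exceeds the threshold; written as a product
    # so that a cell with cases but no controls counts as high risk.
    high_risk = {cell: cases > threshold * controls
                 for cell, (controls, cases) in counts.items()}

    tp = sum(1 for g, s in zip(genotypes, status) if s == 1 and high_risk[g])
    tn = sum(1 for g, s in zip(genotypes, status) if s == 0 and not high_risk[g])
    balanced_accuracy = 0.5 * (tp / n_cases + tn / n_controls)
    return high_risk, balanced_accuracy


# Hypothetical XOR-like pattern: neither locus alone separates cases from controls
toy_genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
toy_status = [0, 0, 1, 1, 1, 1, 0, 0]
labels, accuracy = mdr_classify(toy_genotypes, toy_status)
print(labels[(0, 1)], accuracy)  # True 1.0
```

A full MDR analysis repeats this step over every candidate combination of loci and uses internal validation to choose among them; the sketch omits both.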

arXiv.org e-Print Archive

Bayesian neural networks for detecting epistasis in genetic association studies

Author: Beam Andrew L
Doyle Jon
Motsinger-Reif Alison
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/04/2014

Background: Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions. Results: A non-parametric Bayesian approach in the form of a Bayesian neural network is proposed for use in analyzing genetic association studies. Demonstrations on synthetic and real data reveal that these networks are able to efficiently and accurately determine which variants are involved in determining case-control status. By using graphics processing units (GPUs) the time needed to build these models is decreased by several orders of magnitude. In comparison with commonly used approaches for detecting interactions, Bayesian neural networks perform very well across a broad spectrum of possible genetic relationships. Conclusions: The proposed framework is shown to be a powerful method for detecting causal SNPs while being computationally efficient enough to handle large datasets. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-014-0368-0) contains supplementary material, which is available to authorized users.

Harvard University - DASH

Carolina Digital Repository

Carboplatin/taxane-induced gastrointestinal toxicity: a pharmacogenomics study on the SCOTROC1 trial

Author: Brown
Glass
He
Hoskins
McLeod
Motsinger-Reif
Paul
Winham
Publication date: 01/01/2016

Carboplatin/taxane combination is first-line therapy for ovarian cancer. However, patients can encounter treatment delays, impaired quality of life, and even death because of chemotherapy-induced gastrointestinal (GI) toxicity. A candidate gene study was conducted to assess potential associations of genetic variants with GI toxicity in 808 patients who received carboplatin/taxane in the Scottish Randomized Trial in Ovarian Cancer 1 (SCOTROC1). Patients were randomized into discovery and validation cohorts consisting of 404 patients each. Clinical covariates and genetic variants associated with grade III/IV GI toxicity in the discovery cohort were evaluated in the replication cohort. Chemotherapy-induced GI toxicity was significantly associated with seven single-nucleotide polymorphisms in the ATP7B, GSR, VEGFA and SCN10A genes. Patients with risk genotypes had 1.53- to 18.01-fold higher odds of developing carboplatin/taxane-induced GI toxicity (P<0.01). Variants in the VEGF gene were marginally associated with survival time. Our data provide potential targets for modulation/inhibition of GI toxicity in ovarian cancer patients.

Carboplatin/taxane-induced gastrointestinal toxicity: a pharmacogenomics study on the SCOTROC1 trial

Author: Brown R.
Glass S.
He Y.J.
Hoskins J.M.
McLeod H.L.
Motsinger-Reif A.
Paul J.
Winham S.J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2016

Enlighten

A comparison of internal validation techniques for multifactor dimensionality reduction

Author: Motsinger-Reif Alison A
Slater Andrew J
Winham Stacey J
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: It is hypothesized that common, complex diseases may be due to complex interactions between genetic and environmental factors, which are difficult to detect in high-dimensional data using traditional statistical approaches. Multifactor Dimensionality Reduction (MDR) is the most commonly used data-mining method to detect epistatic interactions. In all data-mining methods, it is important to consider internal validation procedures to obtain prediction estimates to prevent model over-fitting and reduce potential false positive findings. Currently, MDR utilizes cross-validation for internal validation. In this study, we incorporate the use of a three-way split (3WS) of the data in combination with a post-hoc pruning procedure as an alternative to cross-validation for internal model validation to reduce computation time without impairing performance. We compare the power to detect true disease-causing loci using MDR with both 5- and 10-fold cross-validation to MDR with 3WS for a range of single-locus and epistatic disease models. Additionally, we analyze a dataset in HIV immunogenetics to demonstrate the results of the two strategies on real data. Results: MDR with 3WS is computationally approximately five times faster than 5-fold cross-validation. The power to find the exact true disease loci without detecting false positive loci is higher with 5-fold cross-validation than with 3WS before pruning. However, the power to find the true disease-causing loci in addition to false positive loci is equivalent to the 3WS. With the incorporation of a pruning procedure after the 3WS, the power of the 3WS approach to detect only the exact disease loci is equivalent to that of MDR with cross-validation. In the real data application, the cross-validation and 3WS analyses indicate the same two-locus model. Conclusions: Our results reveal that the performance of the two internal validation methods is equivalent with the use of pruning procedures. The specific pruning procedure should be chosen with an understanding of the trade-off between identifying all relevant genetic effects at the cost of including false positives, and missing important genetic factors. This implies 3WS may be a powerful and computationally efficient approach to screen for epistatic effects, and could be used to identify candidate interactions in large-scale genetic studies.
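
The two internal validation schemes partition the data differently, which a short Python sketch can show. It assumes the three-way split yields training, testing, and validation sets; the 50/25/25 proportions, the function names, and the seed are hypothetical, because the abstract does not give them. The pruning procedure is not shown.

```python
import random
from typing import List, Tuple


def three_way_split(n: int, fractions: Tuple[float, float, float] = (0.5, 0.25, 0.25),
                    seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices 0..n-1 once and cut them into training/testing/validation sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * fractions[0])
    n_test = int(n * fractions[1])
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


def k_fold_splits(n: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, held_out) index pairs; every index is held out exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [
        ([j for f, fold in enumerate(folds) if f != i for j in fold], folds[i])
        for i in range(k)
    ]


train, test, valid = three_way_split(100)
print(len(train), len(test), len(valid))  # 50 25 25

splits = k_fold_splits(100, 5)
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 5 80 20
```

With k-fold cross-validation the model search runs once per fold, whereas a single split runs it once. That is one plausible reading of the reported speedup, offered here as an interpretation and not as a statement from the abstract.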

Crossref

A Comparison of Association Methods for Cytotoxicity Mapping in Pharmacogenomics

Author: Brown Chad
Everitt Lorraine
Havener Tammy M.
McLeod Howard
Motsinger-Reif Alison A.
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2011

Cytotoxicity assays of immortalized lymphoblastoid cell lines (LCLs) represent a promising new in vitro approach in pharmacogenomics research. However, previous studies employing LCLs in gene mapping have used simple association methods, which may not adequately capture the true differences in non-linear response profiles between genotypes. Two common approaches summarize each dose-response curve with either the IC50 or the slope parameter estimates from a Hill slope fit and treat these estimates as the response in a linear model. The current study investigates these two methods, as well as four novel methods, and compares their power to detect differences between the response profiles of genotypes under a variety of different alternatives. The four novel methods include two methods that summarize each dose-response curve by its area under the curve, one method based on an analysis of variance (ANOVA) design, and one method that compares Hill slope fits for all individuals of each genotype. The power of each method was found to depend not only on the choice of alternative, but also on the choice of the set of dosages used in cytotoxicity measurements. The ANOVA-based method was found to be the most robust across alternatives and dosage sets for power in detecting differences between genotypes.
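
The curve summaries named above can be illustrated with a small Python sketch. It uses the standard four-parameter Hill curve and a trapezoidal area under the curve over log10(dose). The log-dose scale, the parameter values, and the dosage set are assumptions made for illustration; the abstract does not say how the study's two AUC methods are defined.

```python
import math
from typing import List


def hill_response(dose: float, ic50: float, slope: float,
                  top: float = 1.0, bottom: float = 0.0) -> float:
    """Four-parameter Hill curve: response falls from `top` to `bottom` as dose rises,
    passing through the midpoint at dose == ic50."""
    return bottom + (top - bottom) / (1.0 + (dose / ic50) ** slope)


def auc_trapezoid(doses: List[float], responses: List[float]) -> float:
    """Area under the response curve over log10(dose), by the trapezoidal rule."""
    x = [math.log10(d) for d in doses]
    return sum(
        0.5 * (responses[i] + responses[i + 1]) * (x[i + 1] - x[i])
        for i in range(len(x) - 1)
    )


# Hypothetical dosage set and curve parameters
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
viability = [hill_response(d, ic50=1.0, slope=1.0) for d in doses]
auc = auc_trapezoid(doses, viability)
print(round(viability[2], 3), round(auc, 3))  # 0.5 2.0
```

Because the trapezoidal sum only sees the curve at the measured doses, the AUC changes with the dosage set even when the underlying curve does not. This fits the abstract's finding that power depends on the dosages chosen.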