    The effect of alternative permutation testing strategies on the performance of multifactor dimensionality reduction

    Background: Multifactor Dimensionality Reduction (MDR) is a novel method developed to detect gene-gene interactions in case-control association analysis by exhaustively searching multi-locus combinations. While the end-goal of analysis is hypothesis generation, significance testing is employed to indicate statistical interest in a resulting model. Because the underlying distribution for the null hypothesis of no association is unknown, non-parametric permutation testing is used. Lately, there has been more emphasis on selecting all statistically significant models at the end of MDR analysis in order to avoid missing a true signal. This approach opens up questions about the permutation testing procedure. Traditionally omnibus permutation testing is used, where one permutation distribution is generated for all models. An alternative is n-locus permutation testing, where a separate distribution is created for each n-level of interaction tested. Findings: In this study, we show that the false positive rate for the MDR method is at or below a selected alpha level, and demonstrate the conservative nature of omnibus testing. We compare the power and false positive rates of both permutation approaches and find omnibus permutation testing optimal for preserving power while protecting against false positives. Conclusion: Omnibus permutation testing should be used with the MDR method.
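
    The two permutation strategies described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the scoring function is a toy stand-in that uses training balanced accuracy rather than MDR's cross-validated accuracy, and all function names are hypothetical.

```python
import random
from itertools import combinations


def balanced_accuracy(genotypes, labels, loci):
    """Toy stand-in for an MDR model score (assumption: no cross-validation).

    Each multi-locus genotype cell is called high risk when its case:control
    ratio exceeds the overall ratio; the score is the balanced accuracy of
    that classification on the same data.
    """
    cells = {}
    for row, y in zip(genotypes, labels):
        key = tuple(row[i] for i in loci)
        counts = cells.setdefault(key, [0, 0])  # [controls, cases]
        counts[y] += 1
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    true_pos = true_neg = 0
    for ctrl, case in cells.values():
        if case * n_ctrl > ctrl * n_case:
            true_pos += case
        else:
            true_neg += ctrl
    return 0.5 * (true_pos / n_case + true_neg / n_ctrl)


def best_scores_by_level(genotypes, labels, max_order):
    """Best score among all models at each interaction level n = 1..max_order."""
    n_loci = len(genotypes[0])
    return {
        n: max(balanced_accuracy(genotypes, labels, loci)
               for loci in combinations(range(n_loci), n))
        for n in range(1, max_order + 1)
    }


def permutation_nulls(genotypes, labels, max_order, n_perm, seed=0):
    """Return (omnibus null, {n: n-locus null}) from shuffled case/control labels."""
    rng = random.Random(seed)
    per_level = {n: [] for n in range(1, max_order + 1)}
    omnibus = []
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        best = best_scores_by_level(genotypes, shuffled, max_order)
        for n, score in best.items():
            per_level[n].append(score)      # separate distribution per level
        omnibus.append(max(best.values()))  # one distribution for all models
    return omnibus, per_level


def p_value(observed, null):
    """Empirical p-value with the usual +1 correction."""
    return (1 + sum(s >= observed for s in null)) / (1 + len(null))
```

    Because each omnibus null value is the maximum over all levels for that permutation, an omnibus p-value in this sketch can never be smaller than the corresponding n-locus p-value. That matches the conservative behaviour of omnibus testing that the abstract reports.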

    Evaluation of calling algorithms for array-CGH

    Copy number variation (CNV) detection has become an integral part of many genetic studies, and new technologies promise to revolutionize our ability to detect CNVs and link them to disease. However, recent studies highlight discrepancies in the genome-wide CNV profile when measured by different technologies and even by the same technology. Furthermore, the change point algorithms used to call CNVs can disagree substantially on the same data set. We focus this article on comparative genomic hybridization (CGH) arrays because this platform lends itself well to accurate statistical modeling. We describe some newer methodological developments in local statistics that are well suited for CNV detection and calling on CGH arrays. Then we use both simulation studies and public data to compare these new local methods with the global methods that currently dominate the literature. These results offer suggestions for choosing a particular method and provide insight into the lack of reproducibility that has been seen in the field so far.

    Multifactor Dimensionality Reduction as a Filter-Based Approach for Genome Wide Association Studies

    Advances in genotyping technology now provide a vast amount of genetic data that is proving useful in the quest for a better understanding of human genetic diseases through the study of genetic variation. This has led to the development of approaches such as genome-wide association studies (GWAS), designed specifically for interrogating variants across the genome for association with disease, typically by testing single-locus, univariate associations. More recently it has been accepted that epistatic (interaction) effects may also be major contributors to these genetic effects, and GWAS methods are now being applied to find epistatic effects. The challenge for these methods still remains in the prioritization and interpretation of results, as it has also become standard for initial findings to be independently investigated in replication cohorts or functional studies. This is motivating the development and implementation of filter-based approaches that prioritize variants found to be significant in a discovery stage for follow-up in replication. Such filters must be able to detect both univariate and interactive effects. In the current study we present and evaluate the use of multifactor dimensionality reduction (MDR) as such a filter, with simulated data and a wide range of effect sizes. Additionally, we compare the performance of the MDR filter to a similar filter approach using logistic regression (LR), the more traditional approach used in GWAS analysis, as well as evaporative cooling (EC), another prominent machine learning filtering method. The results of our simulation study show that MDR is an effective method for such prioritization, and that it can detect main effects, and interactions with or without marginal effects. Importantly, it performed as well as EC and LR for main effect models. It also significantly outperforms LR for various two-locus epistatic models, while its results are equivalent to those of EC for the epistatic models. The results of this study demonstrate the potential of MDR as a filter to detect gene-gene interactions in GWAS.

    An R package implementation of multifactor dimensionality reduction

    Background: A breadth of high-dimensional data is now available with unprecedented numbers of genetic markers and data-mining approaches to variable selection are increasingly being utilized to uncover associations, including potential gene-gene and gene-environment interactions. One of the most commonly used data-mining methods for case-control data is Multifactor Dimensionality Reduction (MDR), which has displayed success in both simulations and real data applications. Additional software applications in alternative programming languages can improve the availability and usefulness of the method for a broader range of users. Results: We introduce a package for the R statistical language to implement the Multifactor Dimensionality Reduction (MDR) method for nonparametric variable selection of interactions. This package is designed to provide an alternative implementation for R users, with great flexibility and utility for both data analysis and research. The 'MDR' package is freely available online at http://www.r-project.org/. We also provide data examples to illustrate the use and functionality of the package. Conclusions: MDR is a frequently-used data-mining method to identify potential gene-gene interactions, and alternative implementations will further increase this usage. We introduce a flexible software package for R users.

    Bayesian neural networks for detecting epistasis in genetic association studies

    Background: Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions. Results: A non-parametric Bayesian approach in the form of a Bayesian neural network is proposed for use in analyzing genetic association studies. Demonstrations on synthetic and real data reveal that such networks are able to efficiently and accurately determine which variants are involved in determining case-control status. By using graphics processing units (GPUs) the time needed to build these models is decreased by several orders of magnitude. In comparison with commonly used approaches for detecting interactions, Bayesian neural networks perform very well across a broad spectrum of possible genetic relationships. Conclusions: The proposed framework is shown to be a powerful method for detecting causal SNPs while being computationally efficient enough to handle large datasets. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-014-0368-0) contains supplementary material, which is available to authorized users.

    Carboplatin/taxane-induced gastrointestinal toxicity: a pharmacogenomics study on the SCOTROC1 trial

    Carboplatin/taxane combination is first-line therapy for ovarian cancer. However, patients can encounter treatment delays, impaired quality of life, and even death because of chemotherapy-induced gastrointestinal (GI) toxicity. A candidate gene study was conducted to assess potential association of genetic variants with GI toxicity in 808 patients who received carboplatin/taxane in the Scottish Randomized Trial in Ovarian Cancer 1 (SCOTROC1). Patients were randomized into discovery and validation cohorts consisting of 404 patients each. Clinical covariates and genetic variants associated with grade III/IV GI toxicity in the discovery cohort were evaluated in the validation cohort. Chemotherapy-induced GI toxicity was significantly associated with seven single-nucleotide polymorphisms in the ATP7B, GSR, VEGFA and SCN10A genes. Patients with risk genotypes had 1.53- to 18.01-fold higher odds of developing carboplatin/taxane-induced GI toxicity (P<0.01). Variants in the VEGF gene were marginally associated with survival time. Our data provide potential targets for modulation/inhibition of GI toxicity in ovarian cancer patients.

    A comparison of internal validation techniques for multifactor dimensionality reduction

    Background: It is hypothesized that common, complex diseases may be due to complex interactions between genetic and environmental factors, which are difficult to detect in high-dimensional data using traditional statistical approaches. Multifactor Dimensionality Reduction (MDR) is the most commonly used data-mining method to detect epistatic interactions. In all data-mining methods, it is important to consider internal validation procedures to obtain prediction estimates, prevent model over-fitting, and reduce potential false positive findings. Currently, MDR utilizes cross-validation for internal validation. In this study, we incorporate a three-way split (3WS) of the data in combination with a post-hoc pruning procedure as an alternative to cross-validation for internal model validation, to reduce computation time without impairing performance. We compare the power to detect true disease-causing loci using MDR with both 5- and 10-fold cross-validation to MDR with 3WS for a range of single-locus and epistatic disease models. Additionally, we analyze a dataset in HIV immunogenetics to demonstrate the results of the two strategies on real data. Results: MDR with 3WS is computationally approximately five times faster than 5-fold cross-validation. The power to find the exact true disease loci without detecting false positive loci is higher with 5-fold cross-validation than with 3WS before pruning. However, the power of cross-validation to find the true disease-causing loci in addition to false positive loci is equivalent to that of the 3WS. With the incorporation of a pruning procedure after the 3WS, the power of the 3WS approach to detect only the exact disease loci is equivalent to that of MDR with cross-validation. In the real data application, the cross-validation and 3WS analyses indicate the same two-locus model. Conclusions: Our results reveal that the performance of the two internal validation methods is equivalent with the use of pruning procedures. The specific pruning procedure should be chosen with an understanding of the trade-off between identifying all relevant genetic effects at the cost of including false positives and missing important genetic factors. This implies 3WS may be a powerful and computationally efficient approach to screen for epistatic effects, and could be used to identify candidate interactions in large-scale genetic studies.
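
    The difference between the two internal validation schemes can be illustrated by how each one partitions the samples. The sketch below is a hedged illustration only: the function names are hypothetical, and the 40/40/20 split proportions are an assumption chosen for the example, since the abstract does not state them.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Return k (train, test) index pairs; each fold is held out exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for held_out in range(k):
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        splits.append((sorted(train), sorted(folds[held_out])))
    return splits


def three_way_split(n_samples, fractions=(0.4, 0.4, 0.2), seed=0):
    """Return one (train, test, validation) partition.

    The default fractions are illustrative assumptions, not values taken
    from the abstract.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n_samples)
    n_test = round(fractions[1] * n_samples)
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    valid = idx[n_train + n_test:]
    return sorted(train), sorted(test), sorted(valid)
```

    With k-fold cross-validation the exhaustive model search is repeated on k different training sets, whereas a three-way split searches a single training set once. That is consistent with the roughly five-fold speed-up over 5-fold cross-validation reported in the abstract.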

    A Comparison of Association Methods for Cytotoxicity Mapping in Pharmacogenomics

    Cytotoxicity assays of immortalized lymphoblastoid cell lines (LCLs) represent a promising new in vitro approach in pharmacogenomics research. However, previous studies employing LCLs in gene mapping have used simple association methods, which may not adequately capture the true differences in non-linear response profiles between genotypes. Two common approaches summarize each dose-response curve with either the IC50 or the slope parameter estimate from a Hill slope fit and treat these estimates as the response in a linear model. The current study investigates these two methods, as well as four novel methods, and compares their power to detect differences between the response profiles of genotypes under a variety of alternatives. The four novel methods include two methods that summarize each dose-response curve by its area under the curve, one method based on an analysis of variance (ANOVA) design, and one method that compares Hill slope fits for all individuals of each genotype. The power of each method was found to depend not only on the choice of alternative, but also on the choice of the set of dosages used in cytotoxicity measurements. The ANOVA-based method was found to be the most robust across alternatives and dosage sets for power in detecting differences between genotypes.
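
    The curve summaries mentioned in this abstract (IC50, Hill slope, area under the curve) can be illustrated with a small sketch. The two-parameter Hill form and the trapezoidal area over log10 dose used here are assumptions made for illustration; the abstract does not specify the exact parameterization or how the area was computed.

```python
import math


def hill_response(dose, ic50, slope):
    """Fraction of cells surviving at a dose under a two-parameter Hill curve.

    Assumed form: 1 / (1 + (dose / ic50) ** slope), which equals 0.5 at dose == ic50.
    """
    return 1.0 / (1.0 + (dose / ic50) ** slope)


def auc_trapezoid(doses, responses):
    """Area under the dose-response curve on a log10 dose axis (trapezoid rule)."""
    xs = [math.log10(d) for d in doses]
    return sum(
        0.5 * (responses[i] + responses[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


# Example: a more sensitive cell line (lower IC50) has a smaller area under the curve.
doses = [1.0, 10.0, 100.0]
sensitive = [hill_response(d, ic50=5.0, slope=1.5) for d in doses]
resistant = [hill_response(d, ic50=50.0, slope=1.5) for d in doses]
print(auc_trapezoid(doses, sensitive) < auc_trapezoid(doses, resistant))
```

    Because the area depends on which doses are measured, a summary like this changes with the dosage set, which fits the abstract's finding that power depends on the choice of dosages.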