
230 research outputs found

An R package implementation of multifactor dimensionality reduction

Author: Motsinger-Reif Alison A
Winham Stacey J
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: A breadth of high-dimensional data is now available with unprecedented numbers of genetic markers, and data-mining approaches to variable selection are increasingly being used to uncover associations, including potential gene-gene and gene-environment interactions. One of the most commonly used data-mining methods for case-control data is Multifactor Dimensionality Reduction (MDR), which has displayed success in both simulations and real data applications. Additional software applications in alternative programming languages can improve the availability and usefulness of the method for a broader range of users. Results: We introduce a package for the R statistical language to implement the MDR method for nonparametric variable selection of interactions. This package is designed to provide an alternative implementation for R users, with great flexibility and utility for both data analysis and research. The 'MDR' package is freely available online at http://www.r-project.org/. We also provide data examples to illustrate the use and functionality of the package. Conclusions: MDR is a frequently used data-mining method to identify potential gene-gene interactions, and alternative implementations will further increase this usage. We introduce a flexible software package for R users.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A comparison of internal validation techniques for multifactor dimensionality reduction

Author: Motsinger-Reif Alison A
Slater Andrew J
Winham Stacey J
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: It is hypothesized that common, complex diseases may be due to complex interactions between genetic and environmental factors, which are difficult to detect in high-dimensional data using traditional statistical approaches. Multifactor Dimensionality Reduction (MDR) is the most commonly used data-mining method to detect epistatic interactions. In all data-mining methods, it is important to consider internal validation procedures to obtain prediction estimates, prevent model over-fitting, and reduce potential false positive findings. Currently, MDR uses cross-validation for internal validation. In this study, we incorporate a three-way split (3WS) of the data in combination with a post-hoc pruning procedure as an alternative to cross-validation for internal model validation, to reduce computation time without impairing performance. We compare the power to detect true disease-causing loci using MDR with both 5- and 10-fold cross-validation to MDR with 3WS for a range of single-locus and epistatic disease models. Additionally, we analyze a dataset in HIV immunogenetics to demonstrate the results of the two strategies on real data. Results: MDR with 3WS is computationally approximately five times faster than 5-fold cross-validation. The power to find the exact true disease loci without detecting false positive loci is higher with 5-fold cross-validation than with 3WS before pruning. However, the power of cross-validation to find the true disease-causing loci in addition to false positive loci is equivalent to that of the 3WS. With the incorporation of a pruning procedure after the 3WS, the power of the 3WS approach to detect only the exact disease loci is equivalent to that of MDR with cross-validation. In the real data application, the cross-validation and 3WS analyses indicate the same two-locus model. Conclusions: Our results reveal that the performance of the two internal validation methods is equivalent with the use of pruning procedures. The specific pruning procedure should be chosen with an understanding of the trade-off between identifying all relevant genetic effects while including false positives, and missing important genetic factors. This implies 3WS may be a powerful and computationally efficient approach to screen for epistatic effects, and could be used to identify candidate interactions in large-scale genetic studies.
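To make the contrast between the two validation schemes in this abstract concrete, the following is a minimal Python sketch of how sample indices might be partitioned under a three-way split versus k-fold cross-validation. It shows only the partitioning, not the MDR search or the pruning step. The function names and the 2:2:1 training/testing/validation ratio are illustrative assumptions; the abstract does not state the proportions used.

```python
import random


def three_way_split(n_samples, ratio=(2, 2, 1), seed=0):
    """Shuffle indices once and cut them into training, testing, and validation sets.

    The default ratio (2:2:1) is an assumption for illustration only.
    """
    total = sum(ratio)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * ratio[0] // total
    n_test = n_samples * ratio[1] // total
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


def k_fold_splits(n_samples, k, seed=0):
    """Return k (training, testing) index pairs; each fold is held out once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        held_out = folds[i]
        training = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((training, held_out))
    return splits


train, test, validation = three_way_split(100)
cv_splits = k_fold_splits(100, k=5)
```

Under this sketch, the three-way split produces a single training set to search, whereas 5-fold cross-validation produces five. That difference in the number of training passes is one plausible reading of the roughly fivefold speed-up the abstract reports.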

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Methylation of Leukocyte DNA and Ovarian Cancer: Relationships with Disease Status and Outcome

Author: Armasu Sebastian M
Cicek Mine S
Fridley Brooke L
Kalli Kimberly R
Koestler Devin C
Larson Melissa C
Wang Chen
Winham Stacey J
Publication venue: Dartmouth Digital Commons
Publication date: 28/04/2014

Genome-wide interrogation of DNA methylation (DNAm) in blood-derived leukocytes has become feasible with the advent of CpG genotyping arrays. In epithelial ovarian cancer (EOC), one report found substantial DNAm differences between cases and controls; however, many of these disease-associated CpGs were attributed to differences in white blood cell type distributions. We examined blood-based DNAm in 336 EOC cases and 398 controls; we included only high-quality CpG loci that did not show evidence of association with white blood cell type distributions to evaluate association with case status and overall survival.

Dartmouth Digital Commons (Dartmouth College)

A Targeted Genetic Association Study of Epithelial Ovarian Cancer Susceptibility

Author: Anton-Culver Hoda
Bandera Elisa V
Berchuck Andrew
Chien Jeremy
Cook Linda S
Cramer Daniel
Doherty Jennifer A
Earp Madalene
Larson Nicholas
Permuth Jennifer B
Sicotte Hugues
Winham Stacey J
Publication venue: Dartmouth Digital Commons
Publication date: 01/02/2016

BACKGROUND: Genome-wide association studies have identified several common susceptibility alleles for epithelial ovarian cancer (EOC). To further understand EOC susceptibility, we examined previously ungenotyped candidate variants, including uncommon variants and those residing within known susceptibility loci. RESULTS: At nine of eleven previously published EOC susceptibility regions (2q31, 3q25, 5p15, 8q21, 8q24, 10p12, 17q12, 17q21.31, and 19p13), novel variants were identified that were more strongly associated with risk than previously reported variants. Beyond known susceptibility regions, no variants were found to be associated with EOC risk at genome-wide statistical significance (p < 5x10^-8), nor were any significant after Bonferroni correction for 17,000 variants (p < 3x10^-6). METHODS: A customized genotyping array was used to assess over 17,000 variants in coding, non-coding, regulatory, and known susceptibility regions in 4,973 EOC cases and 5,640 controls from 13 independent studies. Susceptibility for EOC overall and for select histotypes was evaluated using logistic regression adjusted for age, study site, and population substructure. CONCLUSION: Given the novel variants identified within the 2q31, 3q25, 5p15, 8q21, 8q24, 10p12, 17q12, 17q21.31, and 19p13 regions, larger follow-up genotyping studies, using imputation where necessary, are needed for fine-mapping and confirmation of low frequency variants that fall below statistical significance.
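The Bonferroni threshold quoted in this abstract can be checked with a short calculation. The Python sketch below assumes a family-wise significance level of 0.05, which the abstract does not state explicitly, and a hypothetical helper name. Dividing that level by the 17,000 variants tested gives a per-variant threshold close to the reported 3x10^-6.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# alpha = 0.05 is an assumed family-wise error rate; 17,000 is from the abstract.
threshold = bonferroni_threshold(0.05, 17_000)
print(f"{threshold:.2e}")  # prints 2.94e-06
```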

Dartmouth Digital Commons (Dartmouth College)

Bipolar disorder with binge eating behavior: a genome-wide association study implicates PRR5-ARHGAP8

Author: Biernacka Joanna M.
Colby Colin L.
Crow Scott
Cuellar Barboza Alfredo B.
Frye Mark A.
Ho Ada Man-Choi
Larrabee Beth R.
McElroy Susan L.
Sicotte Hugues
Winham Stacey J.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/02/2018

Bipolar disorder (BD) is associated with binge eating behavior (BE), and both conditions are heritable. Previously, using data from the Genetic Association Information Network (GAIN) study of BD, we performed genome-wide association (GWA) analyses of BD with BE comorbidity. Here, utilizing data from the Mayo Clinic BD Biobank (969 BD cases, 777 controls), we performed a GWA analysis of a BD subtype defined by BE, and case-only analysis comparing BD subjects with and without BE. We then performed a meta-analysis of the Mayo and GAIN results. The meta-analysis provided genome-wide significant evidence of association between single nucleotide polymorphisms (SNPs) in PRR5-ARHGAP8 and BE in BD cases (rs726170 OR=1.91, P=3.05E-08). In the meta-analysis comparing cases with BD with comorbid BE vs. non-BD controls, a genome-wide significant association was observed at SNP rs111940429 in an intergenic region near PPP1R2P5 (p=1.21E-08). PRR5-ARHGAP8 is a read-through transcript resulting in a fusion protein of PRR5 and ARHGAP8. PRR5 encodes a subunit of mTORC2, a serine/threonine kinase that participates in food intake regulation, while ARHGAP8 encodes a member of the RhoGAP family of proteins that mediate cross-talk between Rho GTPases and other signaling pathways. Without BE information in controls, it is not possible to determine whether the observed association reflects a risk factor for BE in general, risk for BE in individuals with BD, or risk of a subtype of BD with BE. The effect of PRR5-ARHGAP8 on BE risk thus warrants further investigation.

Repositorio Academico Digital UANL

Directory of Open Access Journals

CYP2C8*3 predicts benefit/risk profile in breast cancer patients receiving neoadjuvant paclitaxel

Author: Carey Lisa A.
Dees E. Claire
Drobish Amy
Hertz Daniel L.
McLeod Howard L.
Motsinger-Reif Alison A.
Winham Stacey J.
Publication date: 01/01/2012

Paclitaxel is one of the most frequently used chemotherapeutic agents for the treatment of breast cancer patients. Using a candidate gene approach, we hypothesized that polymorphisms in genes relevant to the metabolism and transport of paclitaxel are associated with treatment efficacy and toxicity. Patient and tumor characteristics and treatment outcomes were collected prospectively for breast cancer patients treated with paclitaxel-containing regimens in the neoadjuvant setting. Treatment response was measured before and after each phase of treatment by clinical tumor measurement and categorized according to RECIST criteria, while toxicity data were collected from physician notes. The primary endpoint was achievement of clinical complete response (cCR) and secondary endpoints included clinical response rate (complete response + partial response) and grade 3+ peripheral neuropathy. The genotypes and haplotypes assessed were CYP1B1*3, CYP2C8*3, CYP3A4*1B/CYP3A5*3C, and ABCB1*2. A total of 111 patients were included in this study. Overall, cCR was 30.1% to the paclitaxel component. CYP2C8*3 carriers (23/111, 20.7%) had higher rates of cCR (55% vs. 23%; OR = 3.92 [95% CI: 1.46–10.48], corrected p = 0.046). In the secondary toxicity analysis, we observed a trend toward greater risk of severe neuropathy (22% vs. 8%; OR = 3.13 [95% CI: 0.89–11.01], uncorrected p = 0.075) in subjects carrying the CYP2C8*3 variant. Other polymorphisms interrogated were not significantly associated with response or toxicity. Patients carrying CYP2C8*3 are more likely to achieve clinical complete response from neoadjuvant paclitaxel treatment, but may also be at increased risk of experiencing severe peripheral neurotoxicity.

PubMed Central

Carolina Digital Repository

DNA Methylation Profiles of Ovarian Clear Cell Carcinoma

Author: Armasu Sebastian M
Bolton Kelly L.
Brand Alison H
Brenton James D.
Chiew Yoke-Eng
Churchman Michael
Cunningham Julie M
DeFazio Anna
Drapkin Ronny
Elias Kevin M.
Elishaev Esther
Fu Zhuxuan
Goode Ellen L
Gourley Charlie
Huntsman David G
Karlan Beth Y.
Kennedy Catherine J.
Köbel Martin
Konner Jason
Laslavic Angela
Lawrenson Kate
Lester Jenny
McCauley Bryan M
Modugno Francesmary
Papaemmanuil Elli
Piskorz Anna
Sekowska Magdalena
Wang Chen
Weigelt Britta
Winham Stacey J.
Publication venue: American Association for Cancer Research (AACR)
Publication date: 25/10/2021

BACKGROUND: Ovarian clear cell carcinoma (OCCC) is a rare ovarian cancer histotype that tends to be resistant to standard platinum-based chemotherapeutics. We sought to better understand the role of DNA methylation in clinical and biological subclassification of OCCC. METHODS: We interrogated genome-wide methylation using DNA from fresh frozen tumors from 271 cases, applied non-smooth non-negative matrix factorization (nsNMF) clustering, and evaluated clinical associations and biological pathways. RESULTS: Two approximately equally sized clusters that associated with several clinical features were identified. Compared to Cluster 2 (N=137), Cluster 1 cases (N=134) presented at a more advanced stage, were less likely to be of Asian ancestry, and tended to have poorer outcomes including macroscopic residual disease following primary debulking surgery (p-values < 0.10). Subset analyses of targeted tumor sequencing and immunohistochemical data revealed that Cluster 1 tumors showed TP53 mutation and abnormal p53 expression, and Cluster 2 tumors showed aneuploidy and ARID1A/PIK3CA mutation (p-values < 0.05). Cluster-defining CpGs included 1,388 CpGs residing within 200 bp of the transcription start sites of 977 genes; 38% of these genes (N=369 genes) were differentially expressed across clusters in transcriptomic subset analysis (p-values < 10^-4). Differentially expressed genes were enriched for six immune-related pathways, including interferon alpha and gamma responses (p-values < 10^-6). CONCLUSIONS: DNA methylation clusters in OCCC correlate with disease features and gene expression patterns among immune pathways. IMPACT: This work serves as a foundation for integrative analyses to better understand the complex biology of OCCC in an effort to improve potential for development of targeted therapeutics.

PubMed Central

Edinburgh Research Explorer

eScholarship - University of California

Grammatical evolution decision trees for detecting gene-gene interactions

Author: Alison A Motsinger-Reif
Nicholas E Hardison
Stacey J Winham
Sushamna Deodhar
Publication venue: BioMed Central
Publication date: 01/11/2010

Background: A fundamental goal of human genetics is the discovery of polymorphisms that predict common, complex diseases. It is hypothesized that complex diseases are due to a myriad of factors including environmental exposures and complex genetic risk models, including gene-gene interactions. Such epistatic models present an important analytical challenge, requiring that methods perform not only statistical modeling, but also variable selection to generate testable genetic model hypotheses. This challenge is amplified by recent advances in genotyping technology, as the number of potential predictor variables is rapidly increasing. Methods: Decision trees are a highly successful, easily interpretable data-mining method that is typically optimized with a hierarchical model building approach, which limits their potential to identify interacting effects. To overcome this limitation, we use evolutionary computation, specifically grammatical evolution, to build decision trees to detect and model gene-gene interactions. In the current study, we introduce the Grammatical Evolution Decision Trees (GEDT) method and software and evaluate this approach on simulated data representing gene-gene interaction models of a range of effect sizes. We compare the performance of the method to a traditional decision tree algorithm and a random search approach and demonstrate the improved performance of the method to detect purely epistatic interactions. Results: The results of our simulations demonstrate that GEDT has high power to detect even very moderate genetic risk models. GEDT has high power to detect interactions with and without main effects. Conclusions: GEDT, while still in its initial stages of development, is a promising new approach for identifying gene-gene interactions in genetic association studies.

Crossref

Directory of Open Access Journals

PubMed Central

SNP interaction detection with Random Forests in high-dimensional genetic data

Author: Colin L Colby
Joanna M Biernacka
Marianne Huebner
Mariza de Andrade
Robert R Freimuth
Stacey J Winham
Xin Wang
Publication venue: Springer Science and Business Media LLC

Crossref