211 research outputs found

    An R package implementation of multifactor dimensionality reduction

    BACKGROUND: A breadth of high-dimensional data is now available with unprecedented numbers of genetic markers and data-mining approaches to variable selection are increasingly being utilized to uncover associations, including potential gene-gene and gene-environment interactions. One of the most commonly used data-mining methods for case-control data is Multifactor Dimensionality Reduction (MDR), which has displayed success in both simulations and real data applications. Additional software applications in alternative programming languages can improve the availability and usefulness of the method for a broader range of users. RESULTS: We introduce a package for the R statistical language to implement the Multifactor Dimensionality Reduction (MDR) method for nonparametric variable selection of interactions. This package is designed to provide an alternative implementation for R users, with great flexibility and utility for both data analysis and research. The 'MDR' package is freely available online at http://www.r-project.org/. We also provide data examples to illustrate the use and functionality of the package. CONCLUSIONS: MDR is a frequently-used data-mining method to identify potential gene-gene interactions, and alternative implementations will further increase this usage. We introduce a flexible software package for R users.
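
As a rough illustration of the idea behind MDR, the sketch below pools multilocus genotype combinations into "high-risk" and "low-risk" groups by comparing each combination's case:control ratio with a threshold, then scores the resulting one-dimensional classifier. It is written in Python rather than R and is not the API of the 'MDR' package described above; the function names, the default threshold (the overall case:control ratio), the use of balanced accuracy as the score, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def mdr_high_risk_cells(genotypes, status, threshold=None):
    """Return the set of genotype combinations labeled high risk.

    genotypes: list of tuples, one tuple of genotype codes per subject
    status:    list of 1 (case) / 0 (control), same length as genotypes
    threshold: case:control ratio cutoff (assumed default: overall ratio;
               requires at least one control)
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    if threshold is None:
        threshold = sum(cases.values()) / sum(controls.values())
    high_risk = set()
    for cell in set(cases) | set(controls):
        # Equivalent to cases/controls >= threshold, without dividing by zero.
        if cases[cell] >= threshold * controls[cell]:
            high_risk.add(cell)
    return high_risk


def balanced_accuracy(genotypes, status, high_risk):
    """Mean of sensitivity and specificity for the high/low-risk classifier."""
    tp = fn = tn = fp = 0
    for g, s in zip(genotypes, status):
        predicted_case = g in high_risk
        if s == 1 and predicted_case:
            tp += 1
        elif s == 1:
            fn += 1
        elif predicted_case:
            fp += 1
        else:
            tn += 1
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Hypothetical two-locus toy data with an XOR-like (purely interactive) pattern.
toy_genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
toy_status = [0, 0, 1, 1, 1, 1, 0, 0]

toy_high_risk = mdr_high_risk_cells(toy_genotypes, toy_status)
toy_score = balanced_accuracy(toy_genotypes, toy_status, toy_high_risk)
```

In a full analysis this labeling and scoring would be repeated for every candidate combination of loci, with the score estimated on held-out data rather than on the data used to label the cells.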

    A comparison of internal validation techniques for multifactor dimensionality reduction

    BACKGROUND: It is hypothesized that common, complex diseases may be due to complex interactions between genetic and environmental factors, which are difficult to detect in high-dimensional data using traditional statistical approaches. Multifactor Dimensionality Reduction (MDR) is the most commonly used data-mining method to detect epistatic interactions. In all data-mining methods, it is important to consider internal validation procedures to obtain prediction estimates to prevent model over-fitting and reduce potential false positive findings. Currently, MDR utilizes cross-validation for internal validation. In this study, we incorporate the use of a three-way split (3WS) of the data in combination with a post-hoc pruning procedure as an alternative to cross-validation for internal model validation to reduce computation time without impairing performance. We compare the power to detect true disease causing loci using MDR with both 5- and 10-fold cross-validation to MDR with 3WS for a range of single-locus and epistatic disease models. Additionally, we analyze a dataset in HIV immunogenetics to demonstrate the results of the two strategies on real data. RESULTS: MDR with 3WS is computationally approximately five times faster than 5-fold cross-validation. The power to find the exact true disease loci without detecting false positive loci is higher with 5-fold cross-validation than with 3WS before pruning. However, the power to find the true disease causing loci in addition to false positive loci is equivalent to the 3WS. With the incorporation of a pruning procedure after the 3WS, the power of the 3WS approach to detect only the exact disease loci is equivalent to that of MDR with cross-validation. In the real data application, the cross-validation and 3WS analyses indicate the same two-locus model. CONCLUSIONS: Our results reveal that the performance of the two internal validation methods is equivalent with the use of pruning procedures. The specific pruning procedure should be chosen understanding the trade-off between identifying all relevant genetic effects but including false positives and missing important genetic factors. This implies 3WS may be a powerful and computationally efficient approach to screen for epistatic effects, and could be used to identify candidate interactions in large-scale genetic studies.
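
To make the contrast between the two validation schemes concrete, the following Python sketch partitions subject indices either once into three sets (3WS) or into k rotating folds (cross-validation). The equal-thirds proportions, the function names, and the seed handling are assumptions; the abstract does not state how the three sets were sized.

```python
import random


def three_way_split(n_subjects, seed=0):
    """Partition subject indices once into training, testing and validation sets.

    Equal thirds are an assumption; any remainder goes to the validation set.
    """
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)
    third = n_subjects // 3
    training = indices[:third]
    testing = indices[third:2 * third]
    validation = indices[2 * third:]
    return training, testing, validation


def k_fold_splits(n_subjects, k, seed=0):
    """Yield (training, held_out) index lists for each of k folds."""
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, held_out


train_idx, test_idx, valid_idx = three_way_split(10)
cv_splits = list(k_fold_splits(10, 5))
```

With k-fold cross-validation the model search is repeated on k different training sets, whereas a three-way split searches a single training set and then narrows the candidates on the testing and validation sets. That difference is one intuition for the reported speed-up, though the sketch does not measure it.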

    Methylation of Leukocyte DNA and Ovarian Cancer: Relationships with Disease Status and Outcome

    Genome-wide interrogation of DNA methylation (DNAm) in blood-derived leukocytes has become feasible with the advent of CpG genotyping arrays. In epithelial ovarian cancer (EOC), one report found substantial DNAm differences between cases and controls; however, many of these disease-associated CpGs were attributed to differences in white blood cell type distributions. We examined blood-based DNAm in 336 EOC cases and 398 controls; we included only high-quality CpG loci that did not show evidence of association with white blood cell type distributions to evaluate association with case status and overall survival

    A Targeted Genetic Association Study of Epithelial Ovarian Cancer Susceptibility

    BACKGROUND: Genome-wide association studies have identified several common susceptibility alleles for epithelial ovarian cancer (EOC). To further understand EOC susceptibility, we examined previously ungenotyped candidate variants, including uncommon variants and those residing within known susceptibility loci. RESULTS: At nine of eleven previously published EOC susceptibility regions (2q31, 3q25, 5p15, 8q21, 8q24, 10p12, 17q12, 17q21.31, and 19p13), novel variants were identified that were more strongly associated with risk than previously reported variants. Beyond known susceptibility regions, no variants were found to be associated with EOC risk at genome-wide statistical significance (p < 5×10⁻⁸), nor were any significant after Bonferroni correction for 17,000 variants (p < 3×10⁻⁶). METHODS: A customized genotyping array was used to assess over 17,000 variants in coding, non-coding, regulatory, and known susceptibility regions in 4,973 EOC cases and 5,640 controls from 13 independent studies. Susceptibility for EOC overall and for select histotypes was evaluated using logistic regression adjusted for age, study site, and population substructure. CONCLUSION: Given the novel variants identified within the 2q31, 3q25, 5p15, 8q21, 8q24, 10p12, 17q12, 17q21.31, and 19p13 regions, larger follow-up genotyping studies, using imputation where necessary, are needed for fine-mapping and confirmation of low frequency variants that fall below statistical significance
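
The Bonferroni threshold quoted in this abstract can be checked with a line of arithmetic. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state explicitly; only the number of variants (17,000) and the resulting threshold (about 3×10⁻⁶) come from the text.

```python
# Assumed family-wise significance level; not stated in the abstract.
alpha = 0.05
# Number of variants tested, as reported in the abstract.
n_variants = 17_000

# Bonferroni correction: divide the overall alpha by the number of tests.
bonferroni_threshold = alpha / n_variants
print(f"{bonferroni_threshold:.2e}")  # prints 2.94e-06
```

Under that assumption the per-variant threshold is roughly 2.9×10⁻⁶, consistent with the rounded value in the abstract and far less strict than the conventional genome-wide threshold of 5×10⁻⁸.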

    Bipolar disorder with binge eating behavior: a genome-wide association study implicates PRR5-ARHGAP8

    Bipolar disorder (BD) is associated with binge eating behavior (BE), and both conditions are heritable. Previously, using data from the Genetic Association Information Network (GAIN) study of BD, we performed genome-wide association (GWA) analyses of BD with BE comorbidity. Here, utilizing data from the Mayo Clinic BD Biobank (969 BD cases, 777 controls), we performed a GWA analysis of a BD subtype defined by BE, and case-only analysis comparing BD subjects with and without BE. We then performed a meta-analysis of the Mayo and GAIN results. The meta-analysis provided genome-wide significant evidence of association between single nucleotide polymorphisms (SNPs) in PRR5-ARHGAP8 and BE in BD cases (rs726170 OR=1.91, P=3.05E-08). In the meta-analysis comparing cases with BD with comorbid BE vs. non-BD controls, a genome-wide significant association was observed at SNP rs111940429 in an intergenic region near PPP1R2P5 (p=1.21E-08). PRR5-ARHGAP8 is a read-through transcript resulting in a fusion protein of PRR5 and ARHGAP8. PRR5 encodes a subunit of mTORC2, a serine/threonine kinase that participates in food intake regulation, while ARHGAP8 encodes a member of the RhoGAP family of proteins that mediate cross-talk between Rho GTPases and other signaling pathways. Without BE information in controls, it is not possible to determine whether the observed association reflects a risk factor for BE in general, risk for BE in individuals with BD, or risk of a subtype of BD with BE. The effect of PRR5-ARHGAP8 on BE risk thus warrants further investigation

    CYP2C8*3 predicts benefit/risk profile in breast cancer patients receiving neoadjuvant paclitaxel

    Paclitaxel is one of the most frequently used chemotherapeutic agents for the treatment of breast cancer patients. Using a candidate gene approach, we hypothesized that polymorphisms in genes relevant to the metabolism and transport of paclitaxel are associated with treatment efficacy and toxicity. Patient and tumor characteristics and treatment outcomes were collected prospectively for breast cancer patients treated with paclitaxel-containing regimens in the neoadjuvant setting. Treatment response was measured before and after each phase of treatment by clinical tumor measurement and categorized according to RECIST criteria, while toxicity data were collected from physician notes. The primary endpoint was achievement of clinical complete response (cCR) and secondary endpoints included clinical response rate (complete response + partial response) and grade 3+ peripheral neuropathy. The genotypes and haplotypes assessed were CYP1B1*3, CYP2C8*3, CYP3A4*1B/CYP3A5*3C, and ABCB1*2. A total of 111 patients were included in this study. Overall, cCR was 30.1 % to the paclitaxel component. CYP2C8*3 carriers (23/111, 20.7 %) had higher rates of cCR (55 % vs. 23 %; OR = 3.92 [95 % CI: 1.46–10.48], corrected p = 0.046). In the secondary toxicity analysis, we observed a trend toward greater risk of severe neuropathy (22 % vs. 8 %; OR = 3.13 [95 % CI: 0.89–11.01], uncorrected p = 0.075) in subjects carrying the CYP2C8*3 variant. Other polymorphisms interrogated were not significantly associated with response or toxicity. Patients carrying CYP2C8*3 are more likely to achieve clinical complete response from neoadjuvant paclitaxel treatment, but may also be at increased risk of experiencing severe peripheral neurotoxicity
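
For readers who want to relate the reported percentages to the odds ratios, here is a small Python sketch that computes an odds ratio from two proportions. It uses only the rounded percentages printed in the abstract, so it approximates rather than reproduces the published estimates (3.92 and 3.13), which were presumably computed from the underlying patient counts; the function name is hypothetical.

```python
def odds_ratio(p_group_a, p_group_b):
    """Odds ratio comparing the event probability in group A with group B."""
    odds_a = p_group_a / (1 - p_group_a)
    odds_b = p_group_b / (1 - p_group_b)
    return odds_a / odds_b


# Clinical complete response: 55 % in CYP2C8*3 carriers vs. 23 % in non-carriers.
ccr_or = odds_ratio(0.55, 0.23)

# Severe neuropathy: 22 % in carriers vs. 8 % in non-carriers.
neuropathy_or = odds_ratio(0.22, 0.08)

print(round(ccr_or, 2), round(neuropathy_or, 2))
```

The values from rounded percentages come out near 4.1 and 3.2, close to the reported odds ratios; the remaining gap is what one would expect from rounding and from not having the exact counts.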

    DNA Methylation Profiles of Ovarian Clear Cell Carcinoma

    BACKGROUND: Ovarian clear cell carcinoma (OCCC) is a rare ovarian cancer histotype that tends to be resistant to standard platinum-based chemotherapeutics. We sought to better understand the role of DNA methylation in clinical and biological subclassification of OCCC. METHODS: We interrogated genome-wide methylation using DNA from fresh frozen tumors from 271 cases, applied non-smooth non-negative matrix factorization (nsNMF) clustering, and evaluated clinical associations and biological pathways. RESULTS: Two approximately equally sized clusters that associated with several clinical features were identified. Compared to Cluster 2 (N=137), Cluster 1 cases (N=134) presented at a more advanced stage, were less likely to be of Asian ancestry, and tended to have poorer outcomes including macroscopic residual disease following primary debulking surgery (p-values < 0.10). Subset analyses of targeted tumor sequencing and immunohistochemical data revealed that Cluster 1 tumors showed TP53 mutation and abnormal p53 expression, and Cluster 2 tumors showed aneuploidy and ARID1A/PIK3CA mutation (p-values < 0.05). Cluster-defining CpGs included 1,388 CpGs residing within 200 bp of the transcription start sites of 977 genes; 38% of these genes (N=369 genes) were differentially expressed across clusters in transcriptomic subset analysis (p-values < 10⁻⁴). Differentially expressed genes were enriched for six immune-related pathways, including interferon alpha and gamma responses (p-values < 10⁻⁶). CONCLUSIONS: DNA methylation clusters in OCCC correlate with disease features and gene expression patterns among immune pathways. IMPACT: This work serves as a foundation for integrative analyses that better understand the complex biology of OCCC in an effort to improve potential for development of targeted therapeutics

    Grammatical evolution decision trees for detecting gene-gene interactions

    BACKGROUND: A fundamental goal of human genetics is the discovery of polymorphisms that predict common, complex diseases. It is hypothesized that complex diseases are due to a myriad of factors including environmental exposures and complex genetic risk models, including gene-gene interactions. Such epistatic models present an important analytical challenge, requiring that methods perform not only statistical modeling, but also variable selection to generate testable genetic model hypotheses. This challenge is amplified by recent advances in genotyping technology, as the number of potential predictor variables is rapidly increasing. METHODS: Decision trees are a highly successful, easily interpretable data-mining method that are typically optimized with a hierarchical model building approach, which limits their potential to identify interacting effects. To overcome this limitation, we utilize evolutionary computation, specifically grammatical evolution, to build decision trees to detect and model gene-gene interactions. In the current study, we introduce the Grammatical Evolution Decision Trees (GEDT) method and software and evaluate this approach on simulated data representing gene-gene interaction models of a range of effect sizes. We compare the performance of the method to a traditional decision tree algorithm and a random search approach and demonstrate the improved performance of the method to detect purely epistatic interactions. RESULTS: The results of our simulations demonstrate that GEDT has high power to detect even very moderate genetic risk models. GEDT has high power to detect interactions with and without main effects. CONCLUSIONS: GEDT, while still in its initial stages of development, is a promising new approach for identifying gene-gene interactions in genetic association studies.

    Analyses of germline variants associated with ovarian cancer survival identify functional candidates at the 1q22 and 19p12 outcome loci.

    We previously identified associations with ovarian cancer outcome at five genetic loci. To identify putatively causal genetic variants and target genes, we prioritized two ovarian outcome loci (1q22 and 19p12) for further study. Bioinformatic and functional genetic analyses indicated that MEF2D and ZNF100 are targets of candidate outcome variants at 1q22 and 19p12, respectively. At 19p12, the chromatin interaction of a putative regulatory element with the ZNF100 promoter region correlated with candidate outcome variants. At 1q22, putative regulatory elements enhanced MEF2D promoter activity and haplotypes containing candidate outcome variants modulated these effects. In a public dataset, MEF2D and ZNF100 expression were both associated with ovarian cancer progression-free or overall survival time. In an extended set of 6,162 epithelial ovarian cancer patients, we found that functional candidates at the 1q22 and 19p12 loci, as well as other regional variants, were nominally associated with patient outcome; however, no associations reached our threshold for statistical significance (p < 1×10⁻⁵). Larger patient numbers will be needed to convincingly identify any true associations at these loci. The OCAC Oncoarray genotyping project was funded through grants from the U.S. National Institutes of Health (NIH) (CA1X01HG007491-01, U19-CA148112, R01-CA149429 and R01-CA058598); Canadian Institutes of Health Research (MOP-86727) and the Ovarian Cancer Research Fund (OCRF). 
Funding for the iCOGS infrastructure came from: the European Community’s Seventh Framework Programme under grant agreement n° 223175 (HEALTH-F2-2009-223175) (COGS), Cancer Research UK (C1287/A10118, C1287/A 10710, C12292/A11174, C1281/A12014, C5047/A8384, C5047/A15007, C5047/A10692, C8197/A16565), the National Institutes of Health (CA128978) and Post-Cancer GWAS initiative (1U19 CA148537, 1U19 CA148065 and 1U19 CA148112 - the GAME-ON initiative), the Department of Defence (W81XWH-10-1-0341), the Canadian Institutes of Health Research (CIHR) for the CIHR Team in Familial Risks of Breast Cancer, Komen Foundation for the Cure, the Breast Cancer Research Foundation, and the Ovarian Cancer Research Fund. AUS studies (Australian Ovarian Cancer Study and the Australian Cancer Study) were funded by the U.S. Army Medical Research and Materiel Command (DAMD17-01-1-0729), National Health & Medical Research Council of Australia (199600 and 400281), Cancer Councils of New South Wales, Victoria, Queensland, South Australia and Tasmania, Cancer Foundation of Western Australia (Multi-State Application Numbers 191, 211 and 182). The Bavarian study (BAV) was supported by ELAN Funds of the University of Erlangen-Nuremberg. The Belgian study (BEL) was funded by Nationaal Kankerplan. The BVU study was funded by Vanderbilt CTSA grant from the National Institutes of Health (NIH)/National Center for Advancing Translational Sciences (NCATS) (ULTR000445). The CNIO Ovarian Cancer Study (CNI) was supported by Instituto de Salud Carlos III (PI 12/01319); Ministerio de Economía y Competitividad (SAF2012). The Hawaii Ovarian Cancer Study (HAW) was supported by the U.S. National Institutes of Health (R01-CA58598, N01-CN-55424 and N01-PC-67001). The Hannover-Jena Ovarian Cancer Study (HJO) was funded by intramural funding through the Rudolf-Bartling Foundation. 
The Hormones and Ovarian Cancer Prediction study (HOP) was supported by US National Cancer Institute: K07-CA80668; R01CA095023; P50-CA159981; R01-CA126841; US Army Medical Research and Materiel Command: DAMD17-02-1-0669; NIH/National Center for Research Resources/General Clinical Research Center grant MO1-RR000056. The Women’s Cancer Program (LAX) was supported by the American Cancer Society Early Detection Professorship (120950-SIOP-06-258-06-COUN) and the National Center for Advancing Translational Sciences (NCATS), Grant UL1TR000124. The Mayo Clinic Case-Only Ovarian Cancer Study (MAC) and the Mayo Clinic Ovarian Cancer Case-Control Study (MAY) were funded by the National Institutes of Health (R01-CA122443, P30-CA15083, P50-CA136393); Mayo Foundation; Minnesota Ovarian Cancer Alliance; Fred C. and Katherine B. Andersen Foundation; Fraternal Order of Eagles. The MALOVA study (MAL) was funded by research grant R01-CA61107 from the National Cancer Institute, Bethesda, Md; research grant 94 222 52 from the Danish Cancer Society, Copenhagen, Denmark; and the Mermaid I project. The North Carolina Ovarian Cancer Study (NCO) was funded by the National Institutes of Health (R01-CA76016) and the Department of Defense (DAMD17-02-1-0666). The New England-based Case-Control Study of Ovarian Cancer (NEC) was supported by NIH grants R01 CA 054419-10 and P50 CA105009, and Department of Defense CDMRP grant W81XWH-10-1-0280. The University of Bergen, Haukeland University Hospital, Norway study (NOR) was funded by Helse Vest, The Norwegian Cancer Society, The Research Council of Norway. The Oregon study (ORE) was funded by the Sherie Hildreth Ovarian Cancer Research Fund and the OHSU Foundation. The Ovarian Cancer Prognosis and Lifestyle Study (OPL) was funded by National Health and Medical Research Council (NHMRC) of Australia (APP1025142) and Brisbane Women’s Club. 
The Danish Pelvic Mass Study (PVD) was funded by Herlev Hospitals Forskningsråd, Direktør Jacob Madsens og Hustru Olga Madsens fond, Arvid Nilssons fond, Gangsted fonden, Herlev Hospitals Forskningsråd and Danish Cancer Society. The Royal Brisbane Hospital (RBH) study was funded by the National Health and Medical Research Council of Australia. The Scottish Randomised Trial in Ovarian Cancer study (SRO) was funded by Cancer Research UK (C536/A13086, C536/A6689) and Imperial Experimental Cancer Research Centre (C1312/A15589). The Princess Margaret Cancer Centre study (UHN) was funded by Princess Margaret Cancer Centre Foundation-Bridge for the Cure. The Gynaecological Oncology Biobank at Westmead (WMH) is a member of the Australasian Biospecimen Network-Oncology group, funded by the Australian National Health and Medical Research Council Enabling Grants ID 310670 & ID 628903 and the Cancer Institute NSW Grants ID 12/RIG/1-17 and 15/RIG/1-16. OVCARE Gynecologic Tissue Bank and Outcomes Unit (VAN) study was funded by BC Cancer Foundation, VGH & UBC Hospital Foundation. Stuart MacGregor acknowledges funding from an Australian Research Council Future Fellowship and an Australian National Health and Medical Research Council project grant (APP1051698). Anna deFazio was funded by the University of Sydney Cancer Research Fund and the Cancer Institute NSW through the Sydney West-Translational Cancer Research Centre. Dr. Beth Y. Karlan is supported by American Cancer Society Early Detection Professorship (SIOP-06-258-01-COUN) and the National Center for Advancing Translational Sciences (NCATS), Grant UL1TR000124. Irene Orlow was supported by NCI CCSG award (P30-CA008748). GCT, PW and TO’M were funded by NHMRC Fellowships