
    Scientific Workflow Integrity with Pegasus

    IU Booth presentation at SC16, November 15, 2016

    Scientific Data Integrity Challenges to be addressed in Pegasus: SWIP Project Update

    Lightning presentation at the Advancing Research Computing on Campuses workshop, held at PEARC17

    Enabling End-to-end Experiment Sharing and Reuse with Workflows via Jupyter Notebooks

    Scientific workflows are a mainstream solution for processing large-scale modeling, simulation, and data analytics computations in distributed systems, and they have supported both traditional and breakthrough research across several domains. While scientific workflows have enabled large-scale scientific computations and data analysis, and have lowered the barriers to experiment sharing, preservation (including provenance), and reuse between heterogeneous platforms (HTC and HPC), the reproducibility of an end-to-end scientific experiment is hindered by the lack of methodologies for capturing pre- and post-analysis steps performed outside the scope of the workflow execution. Online notebook technologies (e.g., Jupyter Notebook) have emerged as open-source web applications that allow scientists to create and share documents containing live code, equations, visualizations, and explanatory text. Jupyter Notebooks have strong potential to narrow the gap between researchers and the complex knowledge required to run large-scale scientific workflows, by offering a programmatic high-level interface for accessing and managing workflow capabilities. This poster describes our approach to integrating the Pegasus workflow management system with Jupyter to foster ease of use, reproducibility (all the information needed to run an experiment is in a single place), and reuse (notebooks are portable when run in equivalent environments). Since Pegasus 4.8, a Python API for declaring and managing Pegasus workflows via Jupyter has been provided. The user can create a notebook and declare a workflow application using the Pegasus DAX API, which allows scientists to specify data or control dependencies between computational jobs. This API encapsulates most Pegasus commands (e.g., plan, run, and statistics) and also allows workflow creation, execution, and monitoring. Additionally, the API provides mechanisms to define Pegasus catalogs (sites, replica, and transformation), as well as to generate tutorial example workflows.
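To make the idea of declaring data and control dependencies between jobs concrete, here is a minimal, self-contained sketch in plain Python. It is not the Pegasus DAX API: the `Workflow` class, its methods, and the job and file names are all hypothetical, chosen only to illustrate how a dependency graph can be declared and turned into a valid execution order.

```python
from collections import deque


class Workflow:
    """Toy workflow description (hypothetical; NOT the Pegasus DAX API)."""

    def __init__(self, name):
        self.name = name
        self.jobs = {}     # job name -> {"inputs": set, "outputs": set}
        self.parents = {}  # job name -> set of parent job names

    def add_job(self, name, inputs=(), outputs=()):
        self.jobs[name] = {"inputs": set(inputs), "outputs": set(outputs)}
        self.parents.setdefault(name, set())

    def depends(self, child, parent):
        """Declare an explicit control dependency: child runs after parent."""
        self.parents[child].add(parent)

    def infer_data_dependencies(self):
        """Add an edge wherever a job consumes a file that another job produces."""
        producers = {f: j for j, spec in self.jobs.items() for f in spec["outputs"]}
        for job, spec in self.jobs.items():
            for f in spec["inputs"]:
                if f in producers and producers[f] != job:
                    self.parents[job].add(producers[f])

    def execution_order(self):
        """Return one valid run order using Kahn's topological sort."""
        remaining = {j: set(p) for j, p in self.parents.items()}
        ready = deque(sorted(j for j, p in remaining.items() if not p))
        order = []
        while ready:
            job = ready.popleft()
            order.append(job)
            for other in sorted(remaining):
                if job in remaining[other]:
                    remaining[other].discard(job)
                    if not remaining[other]:
                        ready.append(other)
        if len(order) != len(self.jobs):
            raise ValueError("dependency cycle detected")
        return order


# Hypothetical three-job experiment.
wf = Workflow("demo")
wf.add_job("preprocess", inputs=["raw.txt"], outputs=["clean.txt"])
wf.add_job("analyze", inputs=["clean.txt"], outputs=["result.txt"])
wf.add_job("report")
wf.depends(child="report", parent="analyze")  # control dependency
wf.infer_data_dependencies()                  # data dependency via clean.txt
order = wf.execution_order()
print(order)
```

A real workflow system would additionally plan the jobs onto execution sites and track provenance; the sketch covers only the dependency declaration step.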

    Association between HNF1B variants and endometrial cancer.

    1 Odds ratio per allele obtained from logistic regression adjusting for age (continuous), 4 ancestry principal components, and BMI (<25, 25-<30, ≥30 kg/m²).
    2 P interaction with race/ethnicity in the MEC ≥0.63; P interaction with race/ethnicity in the WHI ≥0.21.
    3 Combined ORs were calculated using a fixed effects model.

    Association between HNF1B variants and endometrial cancer by diabetes status.

    1 Odds ratio per allele obtained from logistic regression adjusting for age (continuous), 4 ancestry principal components, and BMI.
    2 Combined ORs were calculated using a fixed effects model.
    Test for interaction was assessed using log-likelihood test statistics comparing models with and without the interaction term.
    P interaction for rs4430796 was 0.028 (WHI) and 0.93 (MEC); P interaction for rs7501939 was 0.054 (WHI) and 0.58 (MEC).
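The interaction test described above compares the log-likelihoods of two nested models. A minimal sketch of that comparison follows, assuming a single interaction term (1 degree of freedom), which the source does not state explicitly. The function name and the log-likelihood values in the usage lines are hypothetical, not results from the study.

```python
import math


def likelihood_ratio_test_1df(loglik_without, loglik_with):
    """Likelihood ratio test for one added term (assumed 1 degree of freedom).

    loglik_without: maximized log-likelihood of the model without the interaction
    loglik_with:    maximized log-likelihood of the model with the interaction
    Returns (test statistic, p-value).
    """
    stat = max(2.0 * (loglik_with - loglik_without), 0.0)
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p_value = likelihood_ratio_test_1df(-1250.0, -1247.6)
print(f"LR statistic = {stat:.2f}, p = {p_value:.3f}")
```

With more than one interaction term the reference distribution has more degrees of freedom, and a general chi-square survival function would be needed instead of the closed form used here.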

    Pleiotropy of Cancer Susceptibility Variants on the Risk of Non-Hodgkin Lymphoma: The PAGE Consortium

    Background: Risk of non-Hodgkin lymphoma (NHL) is higher among individuals with a family history or a prior diagnosis of other cancers. Genome-wide association studies (GWAS) have suggested that some genetic susceptibility variants are associated with multiple complex traits (pleiotropy).
    Objective: We investigated whether common risk variants identified in cancer GWAS may also increase the risk of developing NHL as the first primary cancer.
    Methods: As part of the Population Architecture using Genomics and Epidemiology (PAGE) consortium, 113 cancer risk variants were analyzed in 1,441 NHL cases and 24,183 controls from three studies (BioVU, Multiethnic Cohort Study, Women's Health Initiative) for their association with the risk of overall NHL and common subtypes [diffuse large B-cell lymphoma (DLBCL), follicular lymphoma (FL), chronic lymphocytic leukemia or small lymphocytic lymphoma (CLL/SLL)] using an additive genetic model adjusted for age, sex and ethnicity. Study-specific results for each variant were meta-analyzed across studies.
    Results: The analysis of NHL subtype-specific GWAS SNPs and overall NHL suggested a shared genetic susceptibility between FL and DLBCL, particularly involving variants in the major histocompatibility complex region (rs6457327 in 6p21.33: FL OR = 1.29, p = 0.013; DLBCL OR = 1.23, p = 0.013; NHL OR = 1.22, p = 5.9E-05). In the pleiotropy analysis, six risk variants for other cancers were associated with NHL risk, including variants for lung (rs401681 in TERT: OR per C allele = 0.89, p = 3.7E-03; rs4975616 in TERT: OR per A allele = 0.90, p = 0.01; rs3131379 in MSH5: OR per T allele = 1.16, p = 0.03), prostate (rs7679673 in TET2: OR per C allele = 0.89, p = 5.7E-03; rs10993994 in MSMB: OR per T allele = 1.09, p = 0.04), and breast (rs3817198 in LSP1: OR per C allele = 1.12, p = 0.01) cancers, but none of these associations remained significant after multiple test correction.
    Conclusion: This study does not support strong pleiotropic effects of non-NHL cancer risk variants in NHL etiology; however, larger studies are warranted.

    Pleiotropic association of selected cancer susceptibility variants with the risk of overall non-Hodgkin lymphoma (NHL).

    * ORs and 95% CIs in individual studies were estimated in unconditional logistic regression models that were adjusted for age, sex (in BioVU and MEC) and ethnicity (ancestry informative markers). Summary ORs and 95% CIs were estimated in a meta-analysis of fixed-effects models.
    † The Bonferroni corrected p-value for 53 SNPs/tests is 4.4E-04.
    Abbreviations: p-het. (p-values for heterogeneity across studies measured in Cochran's Q statistic); BioVU (the biorepository of the Vanderbilt University), MEC (the Multiethnic Cohort Study), WHI (the Women's Health Initiative).
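As an illustration of how study-specific odds ratios are typically combined under a fixed-effects model, the sketch below applies standard inverse-variance weighting on the log-OR scale and computes Cochran's Q for heterogeneity. It is a generic textbook formulation, not the authors' analysis code. The function name is hypothetical, the inputs in the usage lines are invented, and standard errors are recovered from 95% CIs on the assumption that the intervals are symmetric on the log scale.

```python
import math


def fixed_effects_meta(odds_ratios, ci_lowers, ci_uppers):
    """Inverse-variance fixed-effects pooling of per-study odds ratios.

    Standard errors are back-calculated from 95% confidence limits,
    assuming the intervals are symmetric on the log-OR scale.
    """
    log_ors = [math.log(o) for o in odds_ratios]
    ses = [(math.log(u) - math.log(l)) / (2 * 1.96)
           for l, u in zip(ci_lowers, ci_uppers)]
    weights = [1.0 / se ** 2 for se in ses]

    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)

    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))

    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * pooled_se),
               math.exp(pooled + 1.96 * pooled_se)),
        "q": q,
        "df": len(odds_ratios) - 1,
    }


# Invented per-study estimates, for illustration only.
result = fixed_effects_meta([1.10, 1.25, 0.95],
                            [0.90, 1.00, 0.70],
                            [1.34, 1.56, 1.29])
print(result)
```

Under the null hypothesis of homogeneity, Q is compared against a chi-square distribution with `df` degrees of freedom to obtain the heterogeneity p-value.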

    Associations between a risk score (RS) for 53 GWAS-identified cancer risk variants and the overall and subtype-specific risks of NHL.

    * ORs and 95% CIs in individual studies were estimated per risk allele in unconditional logistic regression models that were adjusted for age, sex (in BioVU and MEC) and ethnicity. Summary odds ratios (ORs) and 95% confidence intervals (CIs) were estimated in a meta-analysis of fixed effects models.
    Abbreviations: p-het. (p-values for heterogeneity across studies measured in Cochran's Q statistic); BioVU (the biorepository of Vanderbilt University), MEC (the Multiethnic Cohort Study), WHI (the Women's Health Initiative).