Comparing FutureGrid, Amazon EC2, and Open Science Grid for Scientific Workflows
Scientists have a number of computing infrastructures available to conduct their research, including grids and public or
private clouds. This paper explores the use of these cyberinfrastructures to execute scientific workflows, an important
class of scientific applications. It examines the benefits and drawbacks of cloud and grid systems using the case study
of an astronomy application. The application analyzes data from the NASA Kepler mission in order to compute
periodograms, which help astronomers detect the periodic dips in the intensity of starlight caused by exoplanets as they
transit their host star. In this paper we describe our experiences modeling the periodogram application as a scientific
workflow using Pegasus, and deploying it on the FutureGrid scientific cloud testbed, the Amazon EC2 commercial
cloud, and the Open Science Grid. We compare and contrast the infrastructures in terms of setup, usability, cost,
resource availability, and performance.
HNF1B and Endometrial Cancer Risk: Results from the PAGE study
We examined the association between HNF1B variants identified in a recent genome-wide association study and endometrial cancer in two large case-control studies nested in prospective cohorts: the Multiethnic Cohort Study (MEC) and the Women's Health Initiative (WHI), as part of the Population Architecture using Genomics and Epidemiology (PAGE) study. A total of 1,357 incident cases of invasive endometrial cancer and 7,609 controls were included in the analysis (MEC: 426 cases/3,854 controls; WHI: 931 cases/3,755 controls). The majority of women in the WHI were European American, while the MEC included sizable numbers of African Americans, Japanese and Latinos. We estimated the odds ratios (ORs) per allele and 95% confidence intervals (CIs) of each SNP using unconditional logistic regression adjusting for age, body mass index, and four principal components of ancestry-informative markers. The combined ORs were estimated using fixed-effect models. The SNPs rs4430796 and rs7501939 were associated with endometrial cancer risk in MEC and WHI, with no heterogeneity observed across racial/ethnic groups (P≥0.21) or between studies (P≥0.70). The OR per allele was 0.82 (95% CI: 0.75, 0.89; P = 5.63×10⁻⁶) for rs4430796 (G allele) and 0.79 (95% CI: 0.73, 0.87; P = 3.77×10⁻⁷) for rs7501939 (A allele). The associations with the risk of Type I and Type II tumors were similar (P≥0.19). Adjustment for additional endometrial cancer risk factors such as parity, oral contraceptive use, menopausal hormone use, and smoking status had little effect on the results. In conclusion, HNF1B SNPs are associated with risk of endometrial cancer, and the associated relative risks are similar for Type I and Type II tumors.
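The abstract reports per-allele ORs with 95% CIs and says the combined ORs came from fixed-effect models, but it does not spell out the arithmetic. The sketch below illustrates one common reading, assuming standard inverse-variance pooling on the log-odds scale and a Wald test; it is not the authors' code. The function names are hypothetical, and the per-study inputs in the second example are made up for illustration, since the abstract gives only combined estimates. Because the reported CI is rounded to two decimals, the recomputed P value for rs4430796 should only roughly match the published one.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(or_est, ci_low, ci_high):
    """Recover (log OR, standard error) from an OR and its 95% CI."""
    log_or = math.log(or_est)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return log_or, se


def two_sided_p(log_or, se):
    """Two-sided Wald p-value for a log odds ratio."""
    z = log_or / se
    return math.erfc(abs(z) / math.sqrt(2))


def fixed_effect_pool(estimates):
    """Inverse-variance fixed-effect pooling of (log OR, SE) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Combined result for rs4430796 (G allele) as reported in the abstract.
b, se = log_or_and_se(0.82, 0.75, 0.89)
p_rs4430796 = two_sided_p(b, se)
print(f"rs4430796: SE(log OR) = {se:.4f}, approximate P = {p_rs4430796:.2e}")

# HYPOTHETICAL per-study estimates (not from the paper), only to show pooling.
study_a = (math.log(0.80), 0.06)
study_b = (math.log(0.85), 0.05)
pooled_b, pooled_se = fixed_effect_pool([study_a, study_b])
print(f"pooled OR = {math.exp(pooled_b):.3f}, SE(log OR) = {pooled_se:.4f}")
```

Each study is weighted by the inverse of its variance, so the more precise study pulls the pooled OR toward its own estimate, and the pooled standard error is smaller than either input.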
Association of Genetic Variants and Incident Coronary Heart Disease in Multiethnic Cohorts: The PAGE Study
Genome-wide association studies have identified several single nucleotide polymorphisms (SNPs) associated with prevalent coronary heart disease (CHD), but less is known about associations with incident CHD. The association of thirteen published CHD SNPs was examined in five ancestry groups of four large US prospective cohorts.
The application of cloud computing to scientific workflows: a study of cost and performance
The current model of transferring data from data centres to desktops for analysis will soon be rendered impractical by the accelerating growth in the volume of science datasets. Processing will instead often take place on high-performance servers co-located with data. Evaluations of how new technologies such as cloud computing would support such a new distributed computing model are urgently needed. Cloud computing is a new way of purchasing computing and storage resources on demand through virtualization technologies. We report here the results of investigations of the applicability of commercial cloud computing to scientific computing, with an emphasis on astronomy, including investigations of what types of applications can be run cheaply and efficiently on the cloud, and an example of an application well suited to the cloud: processing a large dataset to create a new science product.
Experiences using cloud computing for a scientific workflow application
Clouds are rapidly becoming an important platform for scientific applications. In this paper we describe our experiences running a scientific workflow application in the cloud. The application was developed to process astronomy data released by the Kepler project, a NASA mission to search for Earth-like planets orbiting other stars. This workflow was deployed across multiple clouds using the Pegasus Workflow Management System. The clouds used include several sites within the FutureGrid, NERSC's Magellan cloud, and Amazon EC2. We describe how the application was deployed, evaluate its performance executing in different clouds (based on Nimbus, Eucalyptus, and EC2), and discuss the challenges of deploying and executing workflows in a cloud environment. We also demonstrate how Pegasus was able to support sky computing by executing a single workflow across multiple cloud infrastructures simultaneously.
Characteristics of Cases and Controls in the Multiethnic Cohort Study (MEC) and the Women's Health Initiative Study (WHI).
1. Age at diagnosis for cases and age at blood draw for controls in the MEC; age at baseline for cases and controls in the WHI.
2. Japanese in the MEC; approximately 25% Chinese, 50% Japanese, and 25% other groups in the WHI.
Association between HNF1B variants and endometrial cancer.
1. Odds ratio per allele obtained from logistic regression adjusting for age (continuous), 4 ancestry principal components, and BMI (<25, 25 to <30, ≥30 kg/m²).
2. P for interaction with race/ethnicity in the MEC ≥0.63; P for interaction with race/ethnicity in the WHI ≥0.21.
3. Combined ORs were calculated using a fixed effects model.