Search CORE

6 research outputs found

Reusable, extensible, and modifiable R scripts and Kepler workflows for comprehensive single set ChIP-seq analysis

Author: Mark Bieda
Nathan Cormier
Tyler Kolisnik
Publication venue: Springer Nature
Publication date: 01/01/2016

BACKGROUND: There has been an enormous expansion of use of chromatin immunoprecipitation followed by sequencing (ChIP-seq) technologies. Analysis of large-scale ChIP-seq datasets involves a complex series of steps and production of several specialized graphical outputs. A number of systems have emphasized custom development of ChIP-seq pipelines. These systems are primarily based on custom programming of a single, complex pipeline or supply libraries of modules and do not produce the full range of outputs commonly produced for ChIP-seq datasets. It is desirable to have more comprehensive pipelines, in particular ones addressing common metadata tasks, such as pathway analysis, and pipelines producing standard complex graphical outputs. It is advantageous if these are highly modular systems, available as both turnkey pipelines and individual modules, that are easily comprehensible, modifiable and extensible to allow rapid alteration in response to new analysis developments in this growing area. Furthermore, it is advantageous if these pipelines allow data provenance tracking. RESULTS: We present a set of 20 ChIP-seq analysis software modules implemented in the Kepler workflow system; most (18/20) were also implemented as standalone, fully functional R scripts. The set consists of four full turnkey pipelines and 16 component modules. The turnkey pipelines in Kepler allow data provenance tracking. Implementation emphasized use of common R packages and widely-used external tools (e.g., MACS for peak finding), along with custom programming. This software presents comprehensive solutions and easily repurposed code blocks for ChIP-seq analysis and pipeline creation. Tasks include mapping raw reads, peakfinding via MACS, summary statistics, peak location statistics, summary plots centered on the transcription start site (TSS), gene ontology, pathway analysis, and de novo motif finding, among others. 
CONCLUSIONS: These pipelines range from those performing a single task to those performing full analyses of ChIP-seq data. The pipelines are supplied as both Kepler workflows, which allow data provenance tracking, and, in the majority of cases, as standalone R scripts. These pipelines are designed for ease of modification and repurposing.

Springer - Publisher Connector

PubMed Central

Scipedia

Reusable, extensible, and modifiable R scripts and Kepler workflows for comprehensive single set ChIP-seq analysis

Author: A Barski
AF Bardet
AR Quinlan
B Langmead
B Ludäscher
C Zang
E Mercier
F Leisch
H Ji
H Li
H Xing
H Yan
I Barozzi
I Kouskoumvekaki
J Goecks
J Wang
J Wang
JD Phillips
KR Blahnik
L Shen
LJ Zhu
M Bieda
M Yu
Mark Bieda
Nathan Cormier
R Gentleman
RD Peng
S Falcon
S John
S Roy
S Wang
S Yoo
T Bailey
T Liu
T Stropp
T Ye
TL Bailey
Tyler Kolisnik
W Luo
W Luo
W Ma
Y Zhang
Publication venue: Springer Science and Business Media LLC

Crossref

YesWorkflow: A User-Oriented, Language-Independent Tool for Recovering Workflow Information from Scripts

Author: Bertram Ludäscher
Christopher Jones
Christopher Schwalm
David Koop
Fernando Chirigati
James A. Macklin
James Cheney
James Hanken
Juliana Freire
Keith W. Kintigh
Khalid Belhajjame
Mark Bieda
Mark Schildhauer
Paolo Missier
R. Kyle Bocinsky
Saumen Dey
Steve Aulenbach
Tianhong Song
Timothy A. Kohler
Timothy McPhillips
Tyler Kolisnik
Yang Cao
Yaxing Wei
Publication venue: Edinburgh University Library
Publication date: 01/01/2015

Scientific workflow management systems offer features for composing complex computational pipelines from modular building blocks, for executing the resulting automated workflows, and for recording the provenance of data products resulting from workflow runs. Despite the advantages such features provide, many automated workflows continue to be implemented and executed outside of scientific workflow systems due to the convenience and familiarity of scripting languages (such as Perl, Python, R, and MATLAB), and to the high productivity many scientists experience when using these languages. YesWorkflow is a set of software tools that aim to provide such users of scripting languages with many of the benefits of scientific workflow systems. YesWorkflow requires neither the use of a workflow engine nor the overhead of adapting code to run effectively in such a system. Instead, YesWorkflow enables scientists to annotate existing scripts with special comments that reveal the computational modules and dataflows otherwise implicit in these scripts. YesWorkflow tools extract and analyze these comments, represent the scripts in terms of entities based on the typical scientific workflow model, and provide graphical renderings of this workflow-like view of the scripts. Future versions of YesWorkflow also will allow the prospective provenance of the data products of these scripts to be queried in ways similar to those available to users of scientific workflow systems.
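The annotate-then-extract idea described in this abstract can be illustrated with a minimal sketch. The `@begin`, `@in`, `@out`, and `@end` keywords follow YesWorkflow's comment convention; everything else here (the step names, the `extract_blocks` and `dataflow_edges` helpers, and the regex-based parsing) is a hypothetical illustration and not the YesWorkflow implementation. It assumes flat, non-nested blocks.

```python
import re

# Hypothetical annotated script: the step and data names are invented for
# illustration; only the @begin/@in/@out/@end keywords come from YesWorkflow.
ANNOTATED_SCRIPT = '''
# @begin clean_reads
# @in raw_reads
# @out filtered_reads
filtered_reads = [r for r in raw_reads if len(r) >= 20]
# @end clean_reads

# @begin count_reads
# @in filtered_reads
# @out read_count
read_count = len(filtered_reads)
# @end count_reads
'''

# Matches comment lines such as "# @in raw_reads".
TAG = re.compile(r"^\s*#\s*@(begin|in|out|end)\s+(\S+)")


def extract_blocks(source):
    """Return {block_name: {"in": [...], "out": [...]}} from annotation comments."""
    blocks, current = {}, None
    for line in source.splitlines():
        match = TAG.match(line)
        if not match:
            continue  # ordinary code or comment line, ignored by the extractor
        keyword, name = match.groups()
        if keyword == "begin":
            current = name
            blocks[current] = {"in": [], "out": []}
        elif keyword == "end":
            current = None
        elif current is not None:
            blocks[current][keyword].append(name)
    return blocks


def dataflow_edges(blocks):
    """Link each producing block to every block that consumes the same data name."""
    edges = []
    for producer, p in blocks.items():
        for consumer, c in blocks.items():
            for name in p["out"]:
                if name in c["in"]:
                    edges.append((producer, name, consumer))
    return edges


blocks = extract_blocks(ANNOTATED_SCRIPT)
edges = dataflow_edges(blocks)
```

The script itself is never executed; only its comments are read, which is why the approach does not depend on the scripting language. The resulting edge list is the kind of structure a graph renderer could draw as a workflow diagram.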

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer

International Journal of Digital Curation

Low PCA3 expression is a marker of poor differentiation in localized prostate tumors: exploratory analysis from 12,076 patients.

Author: Alshalalfa Mohammed
Chelissery Jijumon
Davicioni Elai
Den Robert B
Erho Nicholas
Freedland Stephen J
Gibb Ewan A
Jordan Jennifer
Karnes R Jeffrey
Klein Eric A
Kolisnik Tyler
Lam Lucia L C
Lotan Tamara T
Ross Ashley E
Santiago-Jiménez Maria
Schaeffer Edward M
Schalken Jack A
Seiler Roland
Verhaegh Gerald W
Yousefi Kasra
Publication venue: Impact Journals, LLC
Publication date: 07/02/2017

BACKGROUND Prostate cancer antigen 3 (PCA3) is a prostate cancer diagnostic biomarker that has been clinically validated. The limitations of the diagnostic role of PCA3 in initial biopsy and the prognostic role are not well established. Here, we elucidate the limitations of tissue PCA3 to predict high grade tumors in initial biopsy. RESULTS PCA3 has a bimodal distribution in both biopsy and radical prostatectomy (RP) tissues, where low PCA3 expression was significantly associated with high grade disease (p<0.001). PCA3 performed poorly in predicting high grade disease in initial biopsy (GS≥8), with 55% sensitivity and high false negative rates; 42% of high Gleason (≥8) samples had low PCA3. In RP, low PCA3 is associated with adverse pathological features, clinical recurrence outcome and greater probability of metastatic progression (p<0.001). MATERIALS AND METHODS A total of 1,694 expression profiles from biopsy and 10,382 from RP patients with high risk tumors were obtained from the Decipher Genomic Resource Information Database (GRID™) prostate cancer database. The primary clinical endpoint was distant metastasis-free survival for RP and high Gleason grade for biopsy. Logistic regression analyses and Cox proportional hazards models were used to evaluate the association of PCA3 with clinical variables and risk of metastasis. CONCLUSIONS There is a high prevalence of high grade tumors with low PCA3 expression in the biopsy setting. Therefore, urologists should be warned that using PCA3 as a stand-alone test may lead to a high rate of under-diagnosis of high grade disease in the initial biopsy setting.

Crossref

Bern Open Repository and Information System (BORIS)

Ability of a Genomic Classifier to Predict Metastasis and Prostate Cancer-specific Mortality after Radiation or Surgery based on Needle Biopsy Specimens

Author: Aranes Maria
Buerki Christine
Chelliserry Jijumon
Choeurng Voleak
Davicioni Elai
Davis John W
Deheshi Samineh
Feng Felix Y
Haddad Zaid
Kane Christopher J
Klein Eric A
Kolisnik Tyler
Lam Lucia L.C
Lotan Tamara L
Margrave Jennifer
Martin Neil E
Nguyen Paul L
Ong Kaye
Pollack Alan
Punnen Sanoj
Ross Ashley E
Spratt Daniel E
Stoyanova Radka S
Tosoian Jeffrey J
Trock Bruce J
Yousefi Kasra
Publication venue: Elsevier B.V.
Publication date: 01/11/2017

Decipher is a validated genomic classifier developed to determine the biological potential for metastasis after radical prostatectomy (RP). The aim was to evaluate the ability of biopsy Decipher to predict metastasis and prostate cancer-specific mortality (PCSM) in primarily intermediate- to high-risk patients treated with RP or radiation therapy (RT). The study included two hundred and thirty-five patients from seven tertiary referral centers, treated with either RP (n=105) or RT±androgen deprivation therapy (n=130), with available genomic expression profiles generated from diagnostic biopsy specimens. The highest-grade core was sampled and Decipher was calculated based on a locked random forest model. Metastasis and PCSM were the primary and secondary outcomes of the study, respectively. Cox analysis and c-index were used to evaluate the performance of Decipher. With a median follow-up of 6 yr among censored patients, 34 patients developed metastases and 11 died of prostate cancer. On multivariable analysis, biopsy Decipher remained a significant predictor of metastasis (hazard ratio: 1.37 per 10% increase in score, 95% confidence interval [CI]: 1.06–1.78, p=0.018) after adjusting for clinical variables. For predicting metastasis 5-yr post-biopsy, Cancer of the Prostate Risk Assessment score had a c-index of 0.60 (95% CI: 0.50–0.69), while Cancer of the Prostate Risk Assessment plus biopsy Decipher had a c-index of 0.71 (95% CI: 0.60–0.82). National Comprehensive Cancer Network risk group had a c-index of 0.66 (95% CI: 0.53–0.77), while National Comprehensive Cancer Network plus biopsy Decipher had a c-index of 0.74 (95% CI: 0.66–0.82). Biopsy Decipher was a significant predictor of PCSM (hazard ratio: 1.57 per 10% increase in score, 95% CI: 1.03–2.48, p=0.037), with a 5-yr PCSM rate of 0%, 0%, and 9.4% for Decipher low, intermediate, and high, respectively.
Biopsy Decipher predicted metastasis and PCSM from diagnostic biopsy specimens of primarily intermediate- and high-risk men treated with first-line RT or RP. Biopsy Decipher predicted metastasis and prostate cancer-specific mortality risk from diagnostic biopsy specimens. Biopsy Decipher was able to predict metastasis and prostate cancer-specific mortality from diagnostic biopsy specimens in a cohort of primarily intermediate- and high-risk men regardless of type of first-line treatment.
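The c-index values quoted in this abstract measure how often a risk score ranks patients in the same order as their observed outcomes. The sketch below shows one common definition, Harrell's concordance index for right-censored data, on invented toy values. It is an illustration only: the abstract does not state which estimator the study used, and the function name and data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's c-index for right-censored data.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. metastasis) was observed, 0 if censored
    scores : risk score, where a higher score should mean an earlier event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if patient i had an observed event
            # strictly before patient j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0   # higher risk, earlier event: correct order
                elif scores[i] == scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not data from the study).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.7, 0.5, 0.1]

perfect = concordance_index(toy_times, toy_events, toy_scores)
reversed_order = concordance_index(toy_times, toy_events, toy_scores[::-1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading figures such as 0.60 versus 0.71 above.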

Crossref

University of Miami: Scholarship Miami

Development and Validation of a Novel Integrated Clinical-Genomic Risk Group Classification for Localized Prostate Cancer.

Author: Abdollah Firas
Alter Jason
Aranes Maria
Buerki Christine
Carroll Peter R
Chelliserry Jijumon
Choeurng Voleak
Cole Adam
Davicioni Elai
Davis John W
Den Robert B
Dess Robert T
Dicker Adam P
du Plessis Marguerite
Feng Felix Y
Glass Andrew G
Haddad Zaid
Jordan Jennifer
Kane Christopher J
Karnes R Jeffrey
Klein Eric A
Kolisnik Tyler
Lam Lucia LC
Loeb Stacy
Margrave Jennifer
Mehra Rohit
Nguyen Hao
Nguyen Paul L
Pollack Alan
Randall Josh M
Ross Ashley E
Santiago-Jiménez María
Schaeffer Edward M
Spratt Daniel E
Stoyanova Radka
Tewari Ashutosh
Trabulsi Edouard J
Uchio Edward
Weinmann Sheila
Yousefi Kasra
Zhang Jingbin
Zhao Shuang G
Publication venue: Henry Ford Health System Scholarly Commons
Publication date: 20/02/2018

Purpose It is clinically challenging to integrate genomic-classifier results that report a numeric risk of recurrence into treatment recommendations for localized prostate cancer, which are founded in the framework of risk groups. We aimed to develop a novel clinical-genomic risk grouping system that can readily be incorporated into treatment guidelines for localized prostate cancer. Materials and Methods Two multicenter cohorts (n = 991) were used for training and validation of the clinical-genomic risk groups, and two additional cohorts (n = 5,937) were used for reclassification analyses. Competing risks analysis was used to estimate the risk of distant metastasis. Time-dependent c-indices were constructed to compare clinicopathologic risk models with the clinical-genomic risk groups. Results With a median follow-up of 8 years for patients in the training cohort, 10-year distant metastasis rates for National Comprehensive Cancer Network (NCCN) low, favorable-intermediate, unfavorable-intermediate, and high-risk were 7.3%, 9.2%, 38.0%, and 39.5%, respectively. In contrast, the three-tier clinical-genomic risk groups had 10-year distant metastasis rates of 3.5%, 29.4%, and 54.6%, for low-, intermediate-, and high-risk, respectively, which were consistent in the validation cohort (0%, 25.9%, and 55.2%, respectively). C-indices for the clinical-genomic risk grouping system (0.84; 95% CI, 0.61 to 0.93) were improved over NCCN (0.73; 95% CI, 0.60 to 0.86) and Cancer of the Prostate Risk Assessment (0.74; 95% CI, 0.65 to 0.84), and 30% of patients using NCCN low/intermediate/high would be reclassified by the new three-tier system and 67% of patients would be reclassified from NCCN six-tier (very-low- to very-high-risk) by the new six-tier system. 
Conclusion A commercially available genomic classifier in combination with standard clinicopathologic variables can generate a simple-to-use clinical-genomic risk grouping that more accurately identifies patients at low, intermediate, and high risk for metastasis and can be easily incorporated into current guidelines to better risk-stratify patients.

Crossref

Henry Ford Health System Scholarly Commons

University of Miami: Scholarship Miami

eScholarship - University of California