Search CORE

6 research outputs found

MAAMD: a workflow to standardize meta-analyses and comparison of affymetrix microarray data

Author: Alexander C Zambon
Andrew D McCulloch
Gabriel G Haddad
Ilkay Altintas
Jennifer C Stowe
Jianwu Wang
Nathan Salomonis
Zhuohui Gan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

Reusable, extensible, and modifiable R scripts and Kepler workflows for comprehensive single set ChIP-seq analysis

Author: Mark Bieda
Nathan Cormier
Tyler Kolisnik
Publication venue: Springer Nature
Publication date: 01/01/2016
Field of study

BACKGROUND: There has been an enormous expansion of use of chromatin immunoprecipitation followed by sequencing (ChIP-seq) technologies. Analysis of large-scale ChIP-seq datasets involves a complex series of steps and production of several specialized graphical outputs. A number of systems have emphasized custom development of ChIP-seq pipelines. These systems are primarily based on custom programming of a single, complex pipeline or supply libraries of modules and do not produce the full range of outputs commonly produced for ChIP-seq datasets. It is desirable to have more comprehensive pipelines, in particular ones addressing common metadata tasks, such as pathway analysis, and pipelines producing standard complex graphical outputs. It is advantageous if these are highly modular systems, available as both turnkey pipelines and individual modules, that are easily comprehensible, modifiable and extensible to allow rapid alteration in response to new analysis developments in this growing area. Furthermore, it is advantageous if these pipelines allow data provenance tracking. RESULTS: We present a set of 20 ChIP-seq analysis software modules implemented in the Kepler workflow system; most (18/20) were also implemented as standalone, fully functional R scripts. The set consists of four full turnkey pipelines and 16 component modules. The turnkey pipelines in Kepler allow data provenance tracking. Implementation emphasized use of common R packages and widely-used external tools (e.g., MACS for peak finding), along with custom programming. This software presents comprehensive solutions and easily repurposed code blocks for ChIP-seq analysis and pipeline creation. Tasks include mapping raw reads, peakfinding via MACS, summary statistics, peak location statistics, summary plots centered on the transcription start site (TSS), gene ontology, pathway analysis, and de novo motif finding, among others. CONCLUSIONS: These pipelines range from those performing a single task to those performing full analyses of ChIP-seq data. The pipelines are supplied as both Kepler workflows, which allow data provenance tracking, and, in the majority of cases, as standalone R scripts. These pipelines are designed for ease of modification and repurposing

Springer - Publisher Connector

PubMed Central

Scipedia

Reusable, extensible, and modifiable R scripts and Kepler workflows for comprehensive single set ChIP-seq analysis

Author: A Barski
AF Bardet
AR Quinlan
B Langmead
B Ludäscher
C Zang
E Mercier
F Leisch
H Ji
H Li
H Xing
H Yan
I Barozzi
I Kouskoumvekaki
J Goecks
J Wang
J Wang
JD Phillips
KR Blahnik
L Shen
LJ Zhu
M Bieda
M Yu
Mark Bieda
Nathan Cormier
R Gentleman
RD Peng
S Falcon
S John
S Roy
S Wang
S Yoo
T Bailey
T Liu
T Stropp
T Ye
TL Bailey
Tyler Kolisnik
W Luo
W Luo
W Ma
Y Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Differences among brain tumor stem cell types and fetal neural stem cells in focal regions of histone modifications and DNA methylation, broad regions of modifications, and bivalent promoters

Author: Mark C Bieda
Sally Yoo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

Crossref

Workflows for microarray data processing in the Kepler environment

Author: Bieda Mark
Ludäscher Bertram
McPhillips Timothy
Stropp Thomas
Publication venue: BMC
Publication date: 01/01/2012
Field of study

Abstract Background Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. Results We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R/BioConductor scripting approaches to pipeline design. Finally, we suggest that microarray data processing task workflows may provide a basis for future example-based comparison of different workflow systems. Conclusions We provide a set of tools and complete workflows for microarray data analysis in the Kepler environment, which has the advantages of offering graphical, clear display of conceptual steps and parameters and the ability to easily integrate other resources such as remote data and web services.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

PRISM: University of Calgary Digital Repository