RACS: Rapid Analysis of ChIP-Seq data for contig based genomes

Author: Fillingham Jeffrey
Nabeel-Shah Syed
Ponce Marcelo
Saettone Alejandro
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/05/2019

Background: Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely used technique to investigate the function of chromatin-related proteins in a genome-wide manner. ChIP-Seq generates large quantities of data which can be difficult to process and analyse, particularly for organisms with contig-based genomes. Contig-based genomes often have poor annotations for cis-elements, for example enhancers, that are important for gene expression. Poorly annotated genomes make a comprehensive analysis of ChIP-Seq data difficult, and as such standardized analysis pipelines are lacking. Methods: We report a computational pipeline that utilizes traditional High-Performance Computing techniques and open source tools for processing and analysing data obtained from ChIP-Seq. We applied our computational pipeline "Rapid Analysis of ChIP-Seq data" (RACS) to ChIP-Seq data that was generated in the model organism Tetrahymena thermophila, an example of an organism with a genome that is available in contigs. Results: To test the performance and efficiency of RACS, we performed control ChIP-Seq experiments allowing us to rapidly eliminate false positives when analyzing our previously published data set. Our pipeline segregates the found read accumulations between genic and intergenic regions and is highly efficient for rapid downstream analyses. Conclusions: Altogether, the computational pipeline presented in this report is an efficient and highly reliable tool to analyze genome-wide ChIP-Seq data generated in model organisms with contig-based genomes. RACS is an open source computational pipeline available to download from https://bitbucket.org/mjponce/racs or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS
Comment: Submitted to BMC Bioinformatics. Computational pipeline available at https://bitbucket.org/mjponce/rac
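
The genic/intergenic segregation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the RACS implementation: the function name `classify_regions`, the tuple layout, the half-open coordinate convention, and the contig names are all assumptions made for illustration.

```python
from collections import defaultdict


def classify_regions(regions, genes):
    """Label each (contig, start, end) region as 'genic' or 'intergenic'.

    Both inputs are lists of (contig, start, end) tuples using half-open
    coordinates [start, end). A region counts as genic when it overlaps
    at least one gene on the same contig (an assumed rule for this sketch).
    """
    # Group gene intervals by contig so each region is only compared
    # against genes on its own contig.
    genes_by_contig = defaultdict(list)
    for contig, start, end in genes:
        genes_by_contig[contig].append((start, end))

    labelled = []
    for contig, start, end in regions:
        overlaps = any(
            start < g_end and g_start < end
            for g_start, g_end in genes_by_contig[contig]
        )
        label = "genic" if overlaps else "intergenic"
        labelled.append((contig, start, end, label))
    return labelled


# Hypothetical contigs and coordinates, for illustration only.
genes = [("scf_001", 100, 500), ("scf_002", 50, 80)]
regions = [("scf_001", 450, 600), ("scf_001", 600, 700), ("scf_003", 10, 20)]
result = classify_regions(regions, genes)
```

A contig without any annotated gene (here `scf_003`) yields only intergenic regions, which is the situation a poorly annotated contig-based genome would produce under this rule.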

TSpace (University of Toronto)

arXiv.org e-Print Archive

Shape-based peak identification for ChIP-Seq

Author: A Barski
AA Bhinge
B Wold
EG Wilbanks
ET Wang
G Carlsson
G Robertson
GR Grimmett
J Rozowsky
Lior Pachter
M Lupien
MB Noyes
PJ Park
R Development Core Team
RK Bradley
S Bhamidi
S Evans
S MacArthur
S Pepke
SN Evans
Steven N Evans
T Barrett
T Laajala
Valerie Hower
WJ Kent
Y Benjamini
Y Zhang
Publication date: 05/05/2010

We present a new algorithm for the identification of bound regions from ChIP-seq experiments. Our method for identifying statistically significant peaks from read coverage is inspired by the notion of persistence in topological data analysis and provides a non-parametric approach that is robust to noise in experiments. Specifically, our method reduces the peak calling problem to the study of tree-based statistics derived from the data. We demonstrate the accuracy of our method on existing datasets, and we show that it can discover previously missed regions and can more clearly discriminate between multiple binding events. The software T-PIC (Tree shape Peak Identification for ChIP-Seq) is available at http://math.berkeley.edu/~vhower/tpic.html
Comment: 12 pages, 6 figures
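
The notion of persistence that the abstract above cites as inspiration can be sketched on a one-dimensional coverage profile. This is not the T-PIC algorithm, which relies on tree-based statistics. It is a generic illustration of 0-dimensional persistence of superlevel sets, and the function name `peak_persistence` and the toy coverage values are assumptions.

```python
def peak_persistence(coverage):
    """Return (peak_index, persistence) pairs for a 1D coverage list.

    Positions are added from highest to lowest coverage. Each local
    maximum starts a component. When two components meet, the one with
    the lower peak ends, and its persistence is its peak height minus
    the height at which the merge happens. The global maximum is
    assigned its height above the minimum coverage.
    """
    n = len(coverage)
    order = sorted(range(n), key=lambda i: (-coverage[i], i))
    parent = [None] * n  # None marks positions not yet added
    peak = {}            # component root -> index of its highest position

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        peak[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Keep the component whose peak is higher.
                if coverage[peak[ri]] >= coverage[peak[rj]]:
                    keep, drop = ri, rj
                else:
                    keep, drop = rj, ri
                persistence = coverage[peak[drop]] - coverage[i]
                if persistence > 0:
                    pairs.append((peak[drop], persistence))
                parent[drop] = keep
    if n:
        top = order[0]
        pairs.append((top, coverage[top] - min(coverage)))
    return sorted(pairs, key=lambda p: (-p[1], p[0]))


# Toy coverage profile with three local maxima (illustrative values).
profile = [0, 3, 1, 5, 2, 4, 0]
peaks = peak_persistence(profile)
```

Peaks with large persistence stand well above the surrounding coverage, while noise tends to produce many peaks with small persistence, which is why persistence is a natural non-parametric ranking for candidate peaks.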

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Caltech Authors

CR Cistrome: a ChIP-Seq database for chromatin regulators and histone modification linkages in human and mouse

Author: Huang Jinyan
Liu Jing
Liu X. Shirley
Mei Shenglin
Qin Qian
Sun Hanfei
Wang Juan
Wang Qian
Wang Qixuan
Yang Xiaoqin
Zhang Yong
Zhao Chengchen
Publication venue: 'Oxford University Press (OUP)'
Publication date: 11/04/2014

Diversified histone modifications (HMs) are essential epigenetic features. They play important roles in fundamental biological processes including transcription, DNA repair and DNA replication. Chromatin regulators (CRs), which are indispensable in epigenetics, can mediate HMs to adjust chromatin structures and functions. With the development of ChIP-Seq technology, there is an opportunity to study CR and HM profiles at the whole-genome scale. However, no specific resource for the integration of CR ChIP-Seq data or CR-HM ChIP-Seq linkage pairs is currently available. Therefore, we constructed the CR Cistrome database, available online at http://compbio.tongji.edu.cn/cr and http://cistrome.org/cr/, to further elucidate CR functions and CR-HM linkages. Within this database, we collected all publicly available ChIP-Seq data on CRs in human and mouse and categorized the data into four cohorts: the reader, writer, eraser and remodeler cohorts, together with curated introductions and ChIP-Seq data analysis results. For the HM readers, writers and erasers, we provided further ChIP-Seq analysis data for the targeted HMs and schematized the relationships between them. We believe CR Cistrome is a valuable resource for the epigenetics community.

Harvard University - DASH

Evaluation of experimental design and computational parameter choices affecting analyses of ChIP-seq and RNA-seq data in undomesticated poplar trees.

Author: Filkov Vladimir
Groover Andrew
Liu Lijun
Missirian Victor
Zinkgraf Matthew
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

Background: One of the great advantages of next generation sequencing is the ability to generate large genomic datasets for virtually all species, including non-model organisms. It should be possible, in turn, to apply advanced computational approaches to these datasets to develop models of biological processes. In a practical sense, working with non-model organisms presents unique challenges. In this paper we discuss some of these challenges for ChIP-seq and RNA-seq experiments using the undomesticated tree species of the genus Populus. Results: We describe specific challenges associated with experimental design in Populus, including selection of optimal genotypes for different technical approaches and development of antibodies against Populus transcription factors. Execution of the experimental design included the generation and analysis of Chromatin immunoprecipitation-sequencing (ChIP-seq) data for RNA polymerase II and transcription factors involved in wood formation. We discuss criteria for analyzing the resulting datasets, determination of appropriate control sequencing libraries, evaluation of sequencing coverage needs, and optimization of parameters. We also describe the evaluation of ChIP-seq data from Populus, and discuss the comparison between ChIP-seq and RNA-seq data and biological interpretations of these comparisons. Conclusions: These and other "lessons learned" highlight the challenges but also the potential insights to be gained from extending next generation sequencing-supported network analyses to undomesticated non-model species.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Joint modeling of ChIP-seq data via a Markov random field model

Author: Dhavala
E. Wit
Ernst
Ji
P. A. C. 't Hoen
Shao
Spyrou
V. Vinciotti
Wang
Y. Bao
Publication venue: 'Oxford University Press (OUP)'
Publication date: 19/06/2013

Chromatin ImmunoPrecipitation-sequencing (ChIP-seq) experiments have now become routine in biology for the detection of protein-binding sites. In this paper, we present a Markov random field model for the joint analysis of multiple ChIP-seq experiments. The proposed model naturally accounts for spatial dependencies in the data, by assuming first-order Markov dependence and, for the large proportion of zero counts, by using zero-inflated mixture distributions. In contrast to all other available implementations, the model allows for the joint modeling of multiple experiments, by incorporating key aspects of the experimental design. In particular, the model uses the information about replicates and about the different antibodies used in the experiments. An extensive simulation study shows a lower false non-discovery rate for the proposed method, compared with existing methods, at the same false discovery rate. Finally, we present an analysis on real data for the detection of histone modifications of two chromatin modifiers from eight ChIP-seq experiments, including technical replicates with different IP efficiencies.
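
The idea of a zero-inflated count distribution mentioned in the abstract above can be illustrated with a short sketch. The abstract does not say which count distribution is inflated, so the Poisson component here is an assumption, as are the function name `zip_pmf` and the parameter values. The Markov random field itself is not modelled.

```python
import math


def zip_pmf(k, pi_zero, lam):
    """Probability of count k under a zero-inflated Poisson.

    With probability pi_zero the count is a structural zero. Otherwise
    it is drawn from a Poisson distribution with mean lam. The Poisson
    choice is an assumption made for this sketch.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    point_mass = pi_zero if k == 0 else 0.0
    return point_mass + (1.0 - pi_zero) * poisson


# Illustrative parameters: 60% structural zeros, Poisson mean of 2 reads.
p_zero = zip_pmf(0, 0.6, 2.0)
total = sum(zip_pmf(k, 0.6, 2.0) for k in range(50))
```

Compared with a plain Poisson of the same mean, the extra point mass raises the probability of observing zero reads in a bin, which matches the large proportion of zero counts the abstract refers to.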

arXiv.org e-Print Archive

University of Essex Research Repository

Crossref

University of Groningen

ARTS repository - University of Groningen

Leiden University Scholary Publications

Brunel University Research Archive

ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia

Author: A. J. Hartemink
A. Kundaje
A. Milosavljevic
A. Sidow
B. E. Bernstein
B. J. Wold
C. Epstein
D. Raha
F. Pauli
G. DeSalvo
G. Euskirchen
G. K. Marinov
J. A. Stamatoyannopoulos
J. B. Brown
J. D. Lieb
J. Gertz
J. Rozowsky
K. I. Fisher-Aylor
K. P. White
L. Ma
M. D. Perry
M. Gerstein
M. J. Pazin
M. Kellis
M. M. Hoffman
M. Slattery
M. Snyder
M. Y. Tolstorukov
N. Shoresh
P. Bickel
P. Cayting
P. J. Farnham
P. J. Park
P. Kheradpour
P. V. Kharchenko
Q. Li
R. M. Myers
S. Batzoglou
S. G. Landt
S. Karmakar
S. Xi
T. E. Reddy
T. Liu
V. R. Iyer
X. S. Liu
Y. Chen
Y. L. Jung
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/12/2011

Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.

DSpace@MIT

Crossref

Caltech Authors

Phylogenetic Analysis of Cell Types using Histone Modifications

Author: Bucher Philipp
Lin Yu
Moret Bernard M. E.
Nair Nishanth Ulhas
Publication date: 01/01/2013

In cell differentiation, a cell of a less specialized type becomes one of a more specialized type, even though all cells have the same genome. Transcription factors and epigenetic marks like histone modifications can play a significant role in the differentiation process. In this paper, we present a simple analysis of cell types and differentiation paths using phylogenetic inference based on ChIP-Seq histone modification data. We propose new data representation techniques and new distance measures for ChIP-Seq data and use these together with standard phylogenetic inference methods to build biologically meaningful trees that indicate how diverse types of cells are related. We demonstrate our approach on H3K4me3 and H3K27me3 data for 37 and 13 types of cells respectively, using the dataset to explore various issues surrounding replicate data, variability between cells of the same type, and robustness. The promising results we obtain point the way to a new approach to the study of cell differentiation.
Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)
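
Distance-based phylogenetic inference, as described in the abstract above, starts from a pairwise distance matrix between cell types. The sketch below shows one simple way such a matrix could be built. It does not reproduce the representations or distance measures proposed in the paper: the binary per-bin peak profiles, the Hamming distance, the function name, and the cell-type labels are all assumptions.

```python
def hamming_distance_matrix(profiles):
    """Build a pairwise Hamming distance matrix between binary profiles.

    profiles maps a cell-type name to a list of 0/1 values, one per
    genomic bin, where 1 means a histone-mark peak is present. All
    profiles are assumed to have the same length.
    """
    names = sorted(profiles)
    matrix = {}
    for a in names:
        for b in names:
            # Count the bins where the two cell types disagree.
            matrix[a, b] = sum(
                x != y for x, y in zip(profiles[a], profiles[b])
            )
    return names, matrix


# Hypothetical peak-presence profiles over four genomic bins.
profiles = {
    "ESC": [1, 1, 0, 0],
    "NPC": [1, 0, 0, 1],
    "Tcell": [0, 0, 1, 1],
}
names, distances = hamming_distance_matrix(profiles)
```

A matrix of this form is the usual input to standard distance-based tree-building methods such as neighbor joining.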

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

De novo ChIP-seq analysis

Author: Bar-Joseph Ziv
Cicek A. Ercument
He Xin
Le Hai-Son
Schulz Marcel H.
Wang Yuhao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Methods for the analysis of chromatin immunoprecipitation sequencing (ChIP-seq) data start by aligning the short reads to a reference genome. While often successful, they are not appropriate for cases where a reference genome is not available. Here we develop methods for de novo analysis of ChIP-seq data. Our methods combine de novo assembly with statistical tests enabling motif discovery without the use of a reference genome. We validate the performance of our method using human and mouse data. Analysis of fly data indicates that our method outperforms alignment-based methods that utilize closely related species.

DSpace@MIT

Crossref

Bilkent University Institutional Repository

Springer - Publisher Connector

PubMed Central

MPG.PuRe