eScholarship - University of California

Caltech Authors

Retrotransposons are specified as DNA replication origins in the gene-poor regions of Arabidopsis heterochromatin

Author: Casacuberta i Suñer Josep M.
Costas Celina
Gutierrez Crisanto
Hénaff Elizabeth
Morata Jordi
Peiró Ramón
Sequeira-Mendes Joana
Vergara Zaida
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2017

Genomic stability depends on faithful genome replication. This is achieved by the concerted activity of thousands of DNA replication origins (ORIs) scattered throughout the genome. The DNA and chromatin features determining ORI specification are not presently known. We have generated a high-resolution genome-wide map of 3230 ORIs in cultured Arabidopsis thaliana cells. Here, we focused on defining the features associated with ORIs in heterochromatin. In pericentromeric gene-poor domains, ORIs associate almost exclusively with the retrotransposon class of transposable elements (TEs), in particular of the Gypsy family. ORI activity in retrotransposons occurs independently of TE expression and while maintaining high levels of H3K9me2 and H3K27me1, typical marks of repressed heterochromatin. ORI-TEs largely colocalize with chromatin signatures defining GC-rich heterochromatin. Importantly, TEs with active ORIs have a higher local GC content than TEs lacking them. Our results lead us to conclude that ORI colocalization with retrotransposons is determined by their transposition mechanism based on transcription, and a specific chromatin landscape. Our detailed analysis of ORIs responsible for heterochromatin replication has implications for the mechanisms of ORI specification in other multicellular organisms in which retrotransposons are major components of heterochromatin and of the entire genome.
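The local GC content comparison mentioned in this abstract can be illustrated with a minimal sketch. The function names, the window size and the handling of ambiguous bases below are assumptions made for illustration; the abstract does not describe how the authors computed GC content.

```python
def gc_content(seq):
    """Fraction of G and C among called bases (A, C, G, T); 0.0 if none are called."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def local_gc(seq, window=100, step=50):
    """GC content in sliding windows along a sequence (window and step are illustrative choices)."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Comparing the window values for TEs that carry an ORI against those that do not would then be a matter of applying local_gc to each element's sequence and contrasting the two distributions.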

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Diposit Digital de Documents de la UAB

Digital.CSIC

Peak shape clustering reveals biological insights

Author: Gaetano I. Dellino
Laura M. Sangalli
Laura Riva
Marzia A. Cremona
Pier Giuseppe Pelicci
Piercesare Secchi
Simone Vantini
Publication venue: Springer Nature
Publication date: 01/01/2015

Background: ChIP-seq experiments are widely used to detect and study DNA-protein interactions, such as transcription factor binding and chromatin modifications. However, downstream analysis of ChIP-seq data is currently restricted to the evaluation of signal intensity and the detection of enriched regions (peaks) in the genome. Other features of peak shape are almost always neglected, despite the remarkable differences shown by ChIP-seq for different proteins, as well as by distinct regions in a single experiment. Results: We hypothesize that statistically significant differences in peak shape might have a functional role and a biological meaning. Thus, we design five indices able to summarize peak shapes and we employ multivariate clustering techniques to divide peaks into groups according to both their complexity and the intensity of their coverage function. In addition, our novel analysis pipeline employs a range of statistical and bioinformatics techniques to relate the obtained peak shapes to several independent genomic datasets, including other genome-wide protein-DNA maps and gene expression experiments. To clarify the meaning of peak shape, we apply our methodology to the study of the erythroid transcription factor GATA-1 in the K562 cell line and in megakaryocytes. Conclusions: Our study demonstrates that ChIP-seq profiles include information regarding the binding of other proteins besides the one used for precipitation. In particular, peak shape provides new insights into cooperative transcriptional regulation and is correlated to gene expression.
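The idea of summarizing a peak's coverage function with a handful of shape indices can be sketched as follows. The abstract does not name its five indices, so the four computed here (height, area, width at half maximum, number of local maxima) and the function name shape_summary are illustrative assumptions rather than the authors' definitions. The input is assumed to be a non-empty list of per-base read depths for a single peak.

```python
def shape_summary(coverage):
    """Summarize one peak's coverage profile with simple, illustrative shape indices."""
    height = max(coverage)
    area = sum(coverage)

    # Width of the region spanned by positions at or above half the maximum height.
    half = height / 2
    above = [i for i, depth in enumerate(coverage) if depth >= half]
    width_half_max = above[-1] - above[0] + 1

    # Collapse plateaus, then count points strictly higher than both neighbours.
    collapsed = [d for i, d in enumerate(coverage) if i == 0 or d != coverage[i - 1]]
    padded = [float("-inf")] + collapsed + [float("-inf")]
    n_local_maxima = sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    )

    return {
        "height": height,
        "area": area,
        "width_half_max": width_half_max,
        "n_local_maxima": n_local_maxima,
    }
```

Vectors of such indices, one per peak, are the kind of input a multivariate clustering method could group by complexity and intensity.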

Archivio istituzionale della ricerca - Politecnico di Milano

AIR Universita degli studi di Milano

Peak shape clustering reveals biological insights

Author: G.I. Dellino
L. Riva
L.M. Sangalli
M.A. Cremona
P. Secchi
P.G. Pelicci
S. Vantini
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

An integrated ChIP-seq analysis platform with customizable workflows

Background: Chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) enables unbiased and genome-wide mapping of protein-DNA interactions and epigenetic marks. The first step in ChIP-seq data analysis involves the identification of peaks (i.e., genomic locations with high density of mapped sequence reads). The next step consists of interpreting the biological meaning of the peaks through their association with known genes, pathways, regulatory elements, and integration with other experiments. Although several programs have been published for the analysis of ChIP-seq data, they often focus on the peak detection step and are usually not well suited for thorough, integrative analysis of the detected peaks. Results: To address the peak interpretation challenge, we have developed ChIPseeqer, an integrative, comprehensive, fast and user-friendly computational framework for in-depth analysis of ChIP-seq datasets. The novelty of our approach is the capability to combine several computational tools in order to create easily customized workflows that can be adapted to the user's needs and objectives. In this paper, we describe the main components of the ChIPseeqer framework, and also demonstrate the utility and diversity of the analyses offered, by analyzing a published ChIP-seq dataset. Conclusions: ChIPseeqer facilitates ChIP-seq data analysis by offering a flexible and powerful set of computational tools that can be used in combination with one another. The framework is freely available as a user-friendly GUI application, but all programs are also executable from the command line, thus providing flexibility and automatability for advanced users.

Directory of Open Access Journals

Identifying peaks in *-seq data using shape information

Author: Francesco Strino
Michael Lappe
Publication venue: Springer Science and Business Media LLC

Detection and classification of peaks in 5' cap RNA sequencing data

Author: Armstrong N.J.
Strbenac D.
Yang J.Y.H.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Background: The large-scale sequencing of 5' cap enriched cDNA promises to reveal the diversity of transcription initiation across entire genomes. The process of transcription is noisy, and there is often no single, exact start site. This creates the need for a fast and simple method of identifying transcription start peaks based on this type of data. Due to both biological and technical noise, many of the peaks seen are not real transcription initiation events. Classification of the observed peaks is an essential filtering step in the discovery of genuine initiation locations. Results: We develop a two-stage approach consisting of a fast and simple algorithm based on a sliding window with Poisson null distribution for detecting the genomic locations of peaks, followed by a linear support vector machine classifier to distinguish between peaks which represent the initiation of transcription and peaks that do not. Comparison of classification performance to the best existing method based on whole genome segmentation showed comparable precision and improved recall. Internal features, which are intrinsic to the data and require no further experiments, had high precision and recall rates. Addition of pooled external data or matched RNA sequencing data resulted in gains of recall with equivalent precision. Conclusions: The Poisson sliding window model is an effective and fast way of taking the peak neighbourhood into account, and finding statistically significant peaks over a range of transcript expression values. It is orders of magnitude faster than doing whole genome segmentation. The support vector classification scheme has better precision and recall than existing methods. Integrating additional datasets is shown to provide minor gains in recall, in comparison to using only the cap-sequencing data.
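The first stage described in this abstract, a sliding window tested against a Poisson null distribution, might look roughly like the sketch below. It is not the authors' implementation: the window size, the significance threshold, the use of the overall mean count as the null rate, and the names poisson_sf and call_peaks are all assumptions. The second stage (the linear support vector machine classifier) is not shown.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via one minus the lower tail.

    Suitable for small rates only; exp(-lam) underflows for very large lam.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_peaks(counts, window=5, alpha=1e-3):
    """Return (start, end, total, p_value) for each window whose read total is
    unexpectedly high under a Poisson null with the overall mean rate."""
    n = len(counts)
    if n < window:
        return []
    lam = sum(counts) / n * window  # expected reads per window under the null
    peaks = []
    total = sum(counts[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one position: add the new base, drop the old one.
            total += counts[start + window - 1] - counts[start - 1]
        p_value = poisson_sf(total, lam)
        if p_value < alpha:
            peaks.append((start, start + window, total, p_value))
    return peaks
```

Overlapping significant windows would normally be merged into a single peak, and the threshold corrected for the number of windows tested; both steps are left out here to keep the sketch short.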

Research Repository

Characterising ChIP-seq binding patterns by model-based peak shape deconvolution

Author: Gronemeyer Hinrich
Mendoza-Parra Marco Antonio
Nowicka M.
Van Gool W.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

BACKGROUND: Chromatin immunoprecipitation combined with massively parallel sequencing (ChIP-seq) is widely used to study protein-chromatin interactions or chromatin modifications at genome-wide level. Sequence reads that accumulate locally in the genome (peaks) reveal loci of selectively modified chromatin or specific sites of chromatin-binding factors. Computational approaches (peak callers) have been developed to identify the global pattern of these sites, most of which assess the deviation from background by applying distribution statistics. RESULTS: We have implemented MeDiChISeq, a regression-based approach, which, by following a learning process, defines a representative binding pattern from the investigated ChIP-seq dataset. Using this model, MeDiChISeq identifies significant genome-wide patterns of chromatin-bound factors or chromatin modification. MeDiChISeq has been validated for various publicly available ChIP-seq datasets and extensively compared with other peak callers. CONCLUSIONS: MeDiChISeq has a high resolution when identifying binding events, a high degree of peak-assessment reproducibility in biological replicates, a low level of false calls and a high true discovery rate when evaluated in the context of gold-standard benchmark datasets. Importantly, this approach can be applied not only to 'sharp' binding patterns, like those retrieved for transcription factors (TFs), but also to the broad binding patterns seen for several histone modifications. Notably, we show that at high sequencing depths, MeDiChISeq outperforms other algorithms due to its powerful peak shape recognition capacity, which facilitates discerning significant binding events from spurious background enrichment patterns that are enhanced with increased sequencing depths.

univOAK

Improving ChIP-seq peak-calling for functional co-regulator binding by integrating multiple sources of biological information

Author: Hartmaier RJ
Lu X
Oesterreich S
Osmanbeyoglu HU
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Background: Chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) is increasingly being applied to study genome-wide binding sites of transcription factors. There is increasing interest in understanding the mechanism of action of co-regulator proteins, which do not bind DNA directly, but exert their effects by binding to transcription factors such as the estrogen receptor (ER). However, because these protein-DNA interactions are indirect, ChIP-seq signals from co-regulators can be relatively weak, and thus biologically meaningful interactions remain difficult to identify.

D-Scholarship@Pitt

Capture‐C reveals preformed chromatin interactions between HIF

Author: Bindra RS
David R Mole
Hani Choudhry
James L Platt
James OJ Davies
James Smythies
Jim R Hughes
Peter J Ratcliffe
Rafik Salama
Publication venue: EMBO