70 research outputs found

Recommended from our members

Analysis of optimized DNase-seq reveals intrinsic bias in transcription factor footprint identification

Author: Brown Myles
Chen Mei-Wei
Fei Teng
He Housheng Hansen
Hu Sheng'en Shawn
Liu X. Shirley
Liu Yin
Long Henry
Meyer Clifford A.
Rao Prakash K.
Xu Han
Zang Chongzhi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/08/2014

DNase-seq is a powerful technique for identifying cis-regulatory elements across the genome. We studied the key experimental parameters to optimize the performance of DNase-seq. We found that sequencing short 50-100 bp fragments that accumulate in long inter-nucleosome linker regions is more efficient for identifying transcription factor binding sites than using longer fragments. We also assessed the potential of DNase-seq to predict transcription factor occupancy through the generation of nucleotide-resolution transcription factor footprints. In modeling the sequence-specific DNaseI cutting bias we found a surprisingly strong effect that varied over more than two orders of magnitude. This confounds DNaseI footprint analysis to the extent that the nucleotide resolution cleavage patterns at most transcription factor binding sites are derived from intrinsic DNaseI cleavage bias rather than from specific protein-DNA interactions. In contrast, quantitative comparison of DNaseI hypersensitivity between states can predict transcription factor occupancy associated with particular biological perturbations.

Harvard University - DASH

MethylPurify: tumor purity deconvolution and differential methylation detection from single tumor DNA methylomes

Author: Chongzhi Zang
Clifford A Meyer
Fuqiang Li
Haiyun Wang
Han Xu
Hua-Jun Wu
Jianxing He
Jun Wang
Jun Wang
Peng Jiang
Peng Zhang
Qian Alvin Qin
Qian Zhao
Wei Li
Xiaole Shirley Liu
Xiaoqi Zheng
Yong Hou
Yong Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

We propose a statistical algorithm MethylPurify that uses regions with bisulfite reads showing discordant methylation levels to infer tumor purity from tumor samples alone. MethylPurify can identify differentially methylated regions (DMRs) from individual tumor methylome samples, without genomic variation information or prior knowledge from other datasets. In simulations with mixed bisulfite reads from cancer and normal cell lines, MethylPurify correctly inferred tumor purity and identified over 96% of the DMRs. From patient data, MethylPurify gave satisfactory DMR calls from tumor methylome samples alone, and revealed potential missed DMRs by tumor to normal comparison due to tumor heterogeneity. Electronic supplementary material: the online version of this article (doi:10.1186/s13059-014-0419-x) contains supplementary material, which is available to authorized users.

Crossref

Harvard University - DASH

Springer - Publisher Connector

Copenhagen University Research Information System

PubMed Central

Integrative single-cell meta-analysis reveals disease-relevant vascular cell states and markers in human atherosclerosis

Coronary artery disease (CAD) is characterized by atherosclerotic plaque formation in the arterial wall. CAD progression involves complex interactions and phenotypic plasticity among vascular and immune cell lineages. Single-cell RNA-seq (scRNA-seq) studies have highlighted lineage-specific transcriptomic signatures, but human cell phenotypes remain controversial. Here, we perform an integrated meta-analysis of 22 scRNA-seq libraries to generate a comprehensive map of human atherosclerosis with 118,578 cells. Besides characterizing granular cell-type diversity and communication, we leverage this atlas to provide insights into smooth muscle cell (SMC) modulation. We integrate genome-wide association study data and uncover a critical role for modulated SMC phenotypes in CAD, myocardial infarction, and coronary calcification. Finally, we identify fibromyocyte/fibrochondrogenic SMC markers (LTBP1 and CRTAC1) as proxies of atherosclerosis progression and validate these through omics and spatial imaging analyses. Altogether, we create a unified atlas of human atherosclerosis informing cell state-specific mechanistic and translational studies of cardiovascular diseases.

EUR Research Repository

Utrecht University Repository

Transcriptional Regulation of Rod Photoreceptor Homeostasis Revealed by In Vivo NRL Targetome Analysis

Author: A Barski
A Onishi
A Rehemtulla
A Swaroop
AF Wright
AH Peters
AJ Mears
AK Hennig
AN Bramall
Anand Swaroop
B Venkatesan
Bernward Klocke
Chongzhi Zang
D Mustafi
DG Luo
Douglas S. Kim
DS Johnson
EC Oh
F Su
FJ Livesey
GH Peng
GR Jackson
Gregory S. Barsh
H Cheng
H Hao
H Zipper
Hong Hao
I Solovei
IS Peter
J Chen
J Rister
J Secombe
J Shendure
J Smith
Janina Gregorski
JC Corbo
JD Hughes
JH Morrison
JM Vaquerizas
K Cartharius
K Quandt
K Yamane
Kairong Cui
Keji Zhao
Kory R. Johnson
KP Mitton
L Jia
L Ng
L Xie
L Xu
LD Carter-Dawson
LE Lerner
Linn Gieser
M Agathocleous
M Akimoto
M Levine
M Ptashne
M Srinivas
MA Kautzmann
Martin Seifert
MB Ramocki
MC Hollander
MJ Brooks
NB Haider
Norimoto Gotoh
PJ Park
R Krishnakumar
R Kumar
S Ben-Tabou de-Leon
S Chen
S Hardy
S Yoshida
SA Boyadjiev
SJ Pittler
T Fujiwara
T Furukawa
T Matsuda
TD Lamb
TH Hsiau
WC Nichols
Weiqun Peng
Y Muranishi
Y Zhang
Yang Fann
Z Wang
Publication venue: Public Library of Science
Publication date: 12/04/2012

A stringent control of homeostasis is critical for functional maintenance and survival of neurons. In the mammalian retina, the basic motif leucine zipper transcription factor NRL determines rod versus cone photoreceptor cell fate and activates the expression of many rod-specific genes. Here, we report an integrated analysis of NRL-centered gene regulatory network by coupling chromatin immunoprecipitation followed by high-throughput sequencing (ChIP–Seq) data from Illumina and ABI platforms with global expression profiling and in vivo knockdown studies. We identified approximately 300 direct NRL target genes. Of these, 22 NRL targets are associated with human retinal dystrophies, whereas 95 mapped to regions of as yet uncloned retinal disease loci. In silico analysis of NRL ChIP–Seq peak sequences revealed an enrichment of distinct sets of transcription factor binding sites. Specifically, we discovered that genes involved in photoreceptor function include binding sites for both NRL and homeodomain protein CRX. Evaluation of 26 ChIP–Seq regions validated their enhancer functions in reporter assays. In vivo knockdown of 16 NRL target genes resulted in death or abnormal morphology of rod photoreceptors, suggesting their importance in maintaining retinal function. We also identified histone demethylase Kdm5b as a novel secondary node in NRL transcriptional hierarchy. Exon array analysis of flow-sorted photoreceptors in which Kdm5b was knocked down by shRNA indicated its role in regulating rod-expressed genes. Our studies identify candidate genes for retinal dystrophies, define cis-regulatory module(s) for photoreceptor-expressed genes and provide a framework for decoding transcriptional regulatory networks that dictate rod homeostasis.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Partitioning Heritability of Regulatory and Cell-Type-Specific Variants across 11 Common Diseases

Regulatory and coding variants are known to be enriched with associations identified by genome-wide association studies (GWASs) of complex disease, but their contributions to trait heritability are currently unknown. We applied variance-component methods to imputed genotype data for 11 common diseases to partition the heritability explained by genotyped SNPs ($h_g^2$) across functional categories (while accounting for shared variance due to linkage disequilibrium). Extensive simulations showed that in contrast to current estimates from GWAS summary statistics, the variance-component approach partitions heritability accurately under a wide range of complex-disease architectures. Across the 11 diseases DNaseI hypersensitivity sites (DHSs) from 217 cell types spanned 16% of imputed SNPs (and 24% of genotyped SNPs) but explained an average of 79% (SE = 8%) of $h_g^2$ from imputed SNPs (5.1× enrichment; $p = 3.7 \times 10^{-17}$) and 38% (SE = 4%) of $h_g^2$ from genotyped SNPs (1.6× enrichment, $p = 1.0 \times 10^{-4}$). Further enrichment was observed at enhancer DHSs and cell-type-specific DHSs. In contrast, coding variants, which span 1% of the genome, explained <10% of $h_g^2$ despite having the highest enrichment. We replicated these findings but found no significant contribution from rare coding variants in independent schizophrenia cohorts genotyped on GWAS and exome chips. Our results highlight the value of analyzing components of heritability to unravel the functional architecture of common disease.

Carolina Digital Repository

Global alteration of CTCF binding in the cancer genome

Author: Fang Celestia
Martinez Carlos A.
Ntziachristos Panagiotis
Wang Zhenjia
Zang Chongzhi
Publication venue: 'American Association for Cancer Research (AACR)'
Publication date: 01/01/2019

Crossref

Ghent University Academic Bibliography

Active enhancers are delineated de novo during hematopoiesis, with limited lineage fidelity among specified primary blood cells

Author: Liu Xiaole Shirley
Luyten Annouck
Shivdasani Ramesh Arjun
Zang Chongzhi
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 08/01/2018

Tissues may adopt diverse strategies to establish specific transcriptional programs in daughter lineages. In intestinal crypts, enhancers for genes expressed in both major cell types appear broadly permissive in stem and specified progenitor cells. In blood, another self-renewing tissue, it is unclear when chromatin becomes permissive for transcription of genes expressed in distinct terminal lineages. Using chromatin immunoprecipitation (ChIP) combined with deep sequencing (ChIP-seq) to profile activating histone marks, we studied enhancer dynamics in primary mouse blood stem, progenitor, and specified cells. Stem and multipotent progenitor cells show scant H3K4me2 marking at enhancers bound by specific transcription factors in their committed progeny. Rather, enhancers are modulated dynamically and serially, with substantial loss and gain of H3K4me2, at each cellular transition. Quantitative analysis of these dynamics accurately modeled hematopoiesis according to Waddington’s notion of epigenotypes. Delineation of enhancers in terminal blood lineages coincides with cell specification, and enhancers active in single lineages show well-positioned H3K4me2- and H3K27ac-marked nucleosomes and DNaseI hypersensitivity in other cell types, revealing limited lineage fidelity. These findings demonstrate that enhancer chronology in blood cells differs markedly from that in intestinal crypts. Chromatin dynamics in hematopoiesis provide a useful foundation to consider classical observations such as cellular reprogramming and multilineage locus priming.

Harvard University - DASH

NF-E2, FLI1 and RUNX1 collaborate at areas of dynamic chromatin to activate transcription in mature mouse megakaryocytes

Author: Chen Justina
Liu X. Shirley
Luyten Annouck
Shivdasani Ramesh A.
Zang Chongzhi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/10/2016

Mutations in mouse and human Nfe2, Fli1 and Runx1 cause thrombocytopenia. We applied genome-wide chromatin dynamics and ChIP-seq to determine these transcription factors’ (TFs) activities in terminal megakaryocyte (MK) maturation. Enhancers with H3K4me2-marked nucleosome pairs were most enriched for NF-E2, FLI and RUNX sequence motifs, suggesting that this TF triad controls much of the late MK program. ChIP-seq revealed NF-E2 occupancy near previously implicated target genes, whose expression is compromised in Nfe2-null cells, and many other genes that become active late in MK differentiation. FLI and RUNX were also the motifs most enriched near NF-E2 binding sites and ChIP-seq implicated FLI1 and RUNX1 in activation of late MK, including NF-E2-dependent, genes. Histones showed limited activation in regions of single TF binding, while enhancers that bind NF-E2 and either RUNX1, FLI1 or both TFs gave the highest signals for TF occupancy and H3K4me2; these enhancers associated best with genes activated late in MK maturation. Thus, three essential TFs co-occupy late-acting cis-elements and show evidence for additive activity at genes responsible for platelet assembly and release. These findings provide a rich dataset of TF and chromatin dynamics in primary MK and explain why individual TF losses cause thrombocytopenia.

Harvard University - DASH

DARDN: A Deep-Learning Approach for CTCF Binding Sequence Classification and Oncogenic Regulatory Feature Discovery

Author: Aidong Zhang
Chongzhi Zang
Hyun Jae Cho
Stefan Bekiranov
Yidan Cong
Zhenjia Wang
Publication venue: MDPI AG
Publication date: 01/01/2024

Characterization of gene regulatory mechanisms in cancer is a key task in cancer genomics. CCCTC-binding factor (CTCF), a DNA binding protein, exhibits specific binding patterns in the genome of cancer cells and has a non-canonical function to facilitate oncogenic transcription programs by cooperating with transcription factors bound at flanking distal regions. Identification of DNA sequence features from a broad genomic region that distinguish cancer-specific CTCF binding sites from regular CTCF binding sites can help find oncogenic transcription factors in a cancer type. However, the presence of long DNA sequences without localization information makes it difficult to perform conventional motif analysis. Here, we present DNAResDualNet (DARDN), a computational method that utilizes convolutional neural networks (CNNs) for predicting cancer-specific CTCF binding sites from long DNA sequences and employs DeepLIFT, a method for interpretability of deep learning models that explains the model’s output in terms of the contributions of its input features. The method is used for identifying DNA sequence features associated with cancer-specific CTCF binding. Evaluation on DNA sequences associated with CTCF binding sites in T-cell acute lymphoblastic leukemia (T-ALL) and other cancer types demonstrates DARDN’s ability in classifying DNA sequences surrounding cancer-specific CTCF binding from control constitutive CTCF binding and identifying sequence motifs for transcription factors potentially active in each specific cancer type. We identify potential oncogenic transcription factors in T-ALL, acute myeloid leukemia (AML), breast cancer (BRCA), colorectal cancer (CRC), lung adenocarcinoma (LUAD), and prostate cancer (PRAD). Our work demonstrates the power of advanced machine learning and feature discovery approach in finding biologically meaningful information from complex high-throughput sequencing data.

Directory of Open Access Journals