32 research outputs found

Controlling for conservation in genome-wide DNA methylation studies

Author: Pachter Lior
Singer Meromit
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/05/2015

BACKGROUND: A commonplace analysis in high-throughput DNA methylation studies is the comparison of methylation extent between different functional regions, computed by averaging methylation states within region types and then comparing averages between regions. For example, it has been reported that methylation is more prevalent in coding regions than in their neighboring introns or UTRs, leading to hypotheses about novel forms of epigenetic regulation. RESULTS: We have identified and characterized a bias present in these seemingly straightforward comparisons that results in the false detection of differences in methylation intensities across region types. This bias arises from differences in conservation rates, rather than methylation rates, and is broadly present in the published literature. When controlling for conservation at coding start sites, the differences in DNA methylation rates disappear. Moreover, a re-evaluation of methylation rates at intron-exon junctions reveals that the magnitude of previously reported differences is greatly exaggerated. We introduce two correction methods to address this bias, an inference-based matrix completion algorithm and an averaging approach, tailored to address different underlying biological questions. We evaluate how analysis using these corrections affects the detection of differences in DNA methylation across functional boundaries. CONCLUSIONS: We report here on a bias in DNA methylation comparative studies that originates in conservation rate differences and manifests itself in the false discovery of differences in DNA methylation intensities and their extents. We have characterized this bias and its broad implications, and show how to control for it so as to enable the study of a variety of biological questions.
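The kind of bias described in this abstract can be illustrated with a toy calculation. The sketch below is not the authors' correction method. It assumes hypothetical data in which each site carries a region type, a conservation class, and a methylation level, and in which methylation depends only on the conservation class. Pooled region averages then differ even though the averages within each conservation class are identical. All names and numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (region_type, conservation_class, methylation_level).
# Methylation depends only on the conservation class, never on the region type.
sites = (
    [("coding", "conserved", 0.8)] * 8 + [("coding", "variable", 0.4)] * 2
    + [("intron", "conserved", 0.8)] * 2 + [("intron", "variable", 0.4)] * 8
)

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def pooled_means(sites):
    """Average methylation per region type, ignoring conservation."""
    by_region = defaultdict(list)
    for region, _conservation, level in sites:
        by_region[region].append(level)
    return {region: mean(levels) for region, levels in by_region.items()}

def stratified_means(sites):
    """Average methylation per (region type, conservation class) stratum."""
    by_stratum = defaultdict(list)
    for region, conservation, level in sites:
        by_stratum[(region, conservation)].append(level)
    return {stratum: mean(levels) for stratum, levels in by_stratum.items()}

pooled = pooled_means(sites)
stratified = stratified_means(sites)
print("pooled:", pooled)
print("stratified:", stratified)
```

In this toy setup the pooled averages suggest that coding sites are more methylated than intronic sites, while the stratified averages show no region effect once conservation class is held fixed.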

Springer - Publisher Connector

PubMed Central

Caltech Authors

Identification and correction of systematic error in high-throughput sequence data

Author: Dario Boffelli
David I. K. Martin
Frazer Meacham
Joseph Dhahbi
Lior Pachter
Meromit Singer
Publication date: 01/01/2011

A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application-specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed “next-gen” sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error-prone than previous technologies. Both position-specific (depending on the location in the read) and sequence-specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of _systematic_ error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations. We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that quality scores at systematic error sites do not account for the extent of errors. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq). Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low-coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments.
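The phrase "statistically unlikely accumulations of errors" can be made concrete with a simple binomial tail test. The sketch below is not SysCall and does not reproduce its classifier. It assumes a hypothetical pileup that maps each position to a (mismatches, depth) pair, an assumed independent per-base error rate, and an arbitrary significance threshold. A site flagged this way could still be a true heterozygous site, which is the distinction the classifier in the abstract is meant to draw.

```python
from math import comb

def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def flag_sites(pileup, error_rate=0.01, alpha=1e-6):
    """Return positions whose mismatch count is unlikely under random errors.

    pileup maps position -> (mismatches, depth). The error_rate and alpha
    defaults are illustrative assumptions, not values from the paper.
    """
    flagged = []
    for position, (mismatches, depth) in sorted(pileup.items()):
        if binomial_tail(mismatches, depth, error_rate) < alpha:
            flagged.append(position)
    return flagged

# Hypothetical pileup: position -> (mismatching reads, total reads).
pileup = {101: (1, 100), 250: (12, 100), 407: (0, 80)}
flagged = flag_sites(pileup)
print(flagged)
```

With these invented counts, only position 250 shows far more mismatches than independent base-call errors at the assumed rate would explain.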

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Caltech Authors

Nature Precedings

A diverse epigenetic landscape at human exons with implication for expression

Author: Kosti Idit
Mandel-Gutfreund Yael
Pachter Lior
Singer Meromit
Publication venue: 'Oxford University Press (OUP)'
Publication date: 12/03/2015

DNA methylation is an important epigenetic marker associated with gene expression regulation in eukaryotes. While promoter methylation is relatively well characterized, the role of intragenic DNA methylation remains unclear. Here, we investigated the relationship of DNA methylation at exons and flanking introns with gene expression and histone modifications generated from a human fibroblast cell line and primary B cells. Consistent with previous work, we found that intragenic methylation is positively correlated with gene expression and that exons are more highly methylated than their neighboring intronic environment. Intriguingly, in this study we identified a unique subset of hypomethylated exons that demonstrate significantly lower methylation levels than their surrounding introns. Furthermore, we observed a negative correlation between exon methylation and the density of the majority of histone modifications. Specifically, we demonstrate that hypomethylated exons at highly expressed genes are associated with open chromatin and have a characteristic histone code composed of significantly high levels of histone markings. Overall, our comprehensive analysis of the human exome supports the presence of regulatory hypomethylated exons in protein-coding genes. In particular, our results reveal a previously unrecognized, diverse and complex role of the epigenetic landscape within the gene body.

PubMed Central

Caltech Authors

MetMap Enables Genome-Scale Methyltyping for Determining Methylation States in Populations

Author: Boffelli Dario
Dhahbi Joseph
Martin David I. K.
Pachter Lior
Schroth Gary P.
Schönhuth Alexander
Singer Meromit
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/08/2010

The ability to assay genome-scale methylation patterns using high-throughput sequencing makes it possible to carry out association studies to determine the relationship between epigenetic variation and phenotype. While bisulfite sequencing can determine a methylome at high resolution, cost inhibits its use in comparative and population studies. MethylSeq, based on sequencing of fragment ends produced by a methylation-sensitive restriction enzyme, is a method for methyltyping (survey of methylation states) and is a site-specific and cost-effective alternative to whole-genome bisulfite sequencing. Despite its advantages, the use of MethylSeq has been restricted by biases in MethylSeq data that complicate the determination of methyltypes. Here we introduce a statistical method, MetMap, that produces corrected site-specific methylation states from MethylSeq experiments and annotates unmethylated islands across the genome. MetMap integrates genome sequence information with experimental data, in a statistically sound and cohesive Bayesian Network. It infers the extent of methylation at individual CGs and across regions, and serves as a framework for comparative methylation analysis within and among species. We validated MetMap's inferences with direct bisulfite sequencing, showing that the methylation status of sites and islands is accurately inferred. We used MetMap to analyze MethylSeq data from four human neutrophil samples, identifying novel, highly unmethylated islands that are invisible to sequence-based annotation strategies. The combination of MethylSeq and MetMap is a powerful and cost-effective tool for determining genome-scale methyltypes suitable for comparative and association studies

Directory of Open Access Journals

PubMed Central

Caltech Authors

A Distinct Gene Module for Dysfunction Uncoupled from Activation in Tumor-Infiltrating T Cells

Author: Anderson Ana C.
Cong Le
Gennert David
Herbst Rebecca H.
Kowalczyk Monika S.
Kuchroo Vijay K.
Kurtulus Sema
Kwon John Y.H.
Marjanovic Nemanja
Nevin James
Nyman Jackson
Regev Aviv
Rozenblatt-Rosen Orit
Sakuishi Kaori
Singer Meromit
Wang Chao, 1972-
Xia Junrong
Yanai Itai
Zhang Huiyuan
Publication venue: 'Elsevier BV'
Publication date: 03/07/2018

Reversing the dysfunctional T cell state that arises in cancer and chronic viral infections is the focus of therapeutic interventions; however, current therapies are effective in only some patients and some tumor types. To gain a deeper molecular understanding of the dysfunctional T cell state, we analyzed population and single-cell RNA profiles of CD8+ tumor-infiltrating lymphocytes (TILs) and used genetic perturbations to identify a distinct gene module for T cell dysfunction that can be uncoupled from T cell activation. This distinct dysfunction module is downstream of intracellular metallothioneins that regulate zinc metabolism and can be identified at single-cell resolution. We further identify Gata-3, a zinc-finger transcription factor in the dysfunctional module, as a regulator of dysfunction, and we use CRISPR-Cas9 genome editing to show that it drives a dysfunctional phenotype in CD8+ TILs. Our results open novel avenues for targeting dysfunctional T cell states while leaving activation programs intact.

DSpace@MIT

Purine synthesis promotes maintenance of brain tumor initiating cells in glioma

Author: Bao Shideng
Fang Xiaoguang
Flavahan William A
Huang Zhi
Hubert Christopher G
Hwang Tae Hyun
Kim Leo J Y
Liu Xiaojing
Locasale Jason W
Mack Stephen C
Miller Tyler E
Prager Briana C
Regev Aviv
Rich Jeremy N
Shi Yu
Singer Meromit
Suvà Mario L
Wang Xiuxing
Wu Qiulian
Xie Qi
Yang Kailin
Zhou Wenchao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/07/2018

Brain tumor initiating cells (BTICs), also known as cancer stem cells, hijack high-affinity glucose uptake active normally in neurons to maintain energy demands. Here we link metabolic dysregulation in human BTICs to a nexus between MYC and de novo purine synthesis, mediating glucose-sustained anabolic metabolism. Inhibiting purine synthesis abrogated BTIC growth, self-renewal and in vivo tumor formation by depleting intracellular pools of purine nucleotides, supporting purine synthesis as a potential therapeutic point of fragility. In contrast, differentiated glioma cells were unaffected by the targeting of purine biosynthetic enzymes, suggesting selective dependence of BTICs. MYC coordinated the control of purine synthetic enzymes, supporting its role in metabolic reprogramming. Elevated expression of purine synthetic enzymes correlated with poor prognosis in glioblastoma patients. Collectively, our results suggest that stem-like glioma cells reprogram their metabolism to self-renew and fuel the tumor hierarchy, revealing potential BTIC cancer dependencies amenable to targeted therapy

DSpace@MIT

Induction and transcriptional regulation of the co-inhibitory gene module in T cells

Author: Acharya Nandini
Anderson Ana C.
Buckley Christopher
Buenrostro Jason D.
Burkett Patrick R.
Chihara Norio
Etminan Yasaman
Kondo Takaaki
Kowalczyk Monika S.
Kuchroo Vijay K.
Kurtulus Sema
Law Travis
Madi Asaf
Marjanovic Nemanja D.
Nevin James
Nyman Jackson
Regev Aviv
Rozenblatt-Rosen Orit
Singer Meromit
Wang Chao
Zhang Huiyuan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Expression of co-inhibitory receptors, such as CTLA-4 and PD-1, on effector T cells is a key mechanism for ensuring immune homeostasis. Dysregulated co-inhibitory receptor expression on CD4+ T cells promotes autoimmunity while sustained overexpression on CD8+ T cells promotes T cell dysfunction or exhaustion, leading to impaired ability to clear chronic viral infections and cancer. Here, we used RNA and protein expression profiling at single-cell resolution to identify a module of co-inhibitory receptors that includes not only several known co-inhibitory receptors (PD-1, Tim-3, Lag-3, and TIGIT), but also a number of novel surface receptors. We functionally validated two novel co-inhibitory receptors, Activated protein C receptor (Procr) and Podoplanin (Pdpn). The module of co-inhibitory receptors is co-expressed in both CD4+ and CD8+ T cells and is part of a larger co-inhibitory gene program that is shared by non-responsive T cells in multiple physiological contexts and is driven by the immunoregulatory cytokine IL-27. Computational analysis identified the transcription factors Prdm1 and c-Maf as cooperative regulators of the co-inhibitory module, which we validated experimentally. This molecular circuit underlies the co-expression of co-inhibitory receptors in T cells and identifies novel regulators of T cell function with the potential to regulate autoimmunity and tumor immunity.

Crossref

University of Birmingham Research Portal

Oxford University Research Archive

Recommended from our members

Statistical algorithms in the study of mammalian DNA methylation

Author: Singer Meromit
Publication venue: eScholarship, University of California
Publication date: 01/01/2012

DNA methylation is a dynamic chemical modification that is abundant on DNA sequences and plays a central role in the regulatory mechanisms of cells. This modification can be inherited across cell divisions and generations, providing a "memory mechanism" for regulatory programs that is more flexible than that coded in the DNA sequence. In recent years, high-throughput sequencing technologies have enabled genome-wide annotation of DNA methylation. Coupled with novel computational machinery, these developments have enabled unprecedented insight into the characteristics, biological function and disease association of this phenomenon. The collaborations between experimental and computational researchers who take part in these efforts have been closer than ever before due to the need to involve computational methodologies throughout the entire research pipeline, from experimental design through bias correction to the analysis of large datasets. In the first part of this thesis we present contributions to the field of high-throughput DNA methylation. We introduce statistically sound criteria for the detection of methylation signatures in DNA sequence, and present an algorithm for the annotation of an informative non-overlapping subset of such regions that is optimal under biologically motivated assumptions. Our method outputs a sequence-generated list of regions that are of interest with respect to their methylation states. We then present a Bayesian network to infer corrected site-specific methylation states from a favorable but biased experimental method, and describe its incorporation in a software package. Along with site-specific methylation calls, our package annotates experiment-specific regions of interest by considering both the methylation state inferences and the genomic sequence. These regions can serve as a basis for comparative methylation studies. In the last chapter of this section we bring results from a genome-scale comparative study conducted on humans, chimpanzees and an orangutan, providing evidence of DNA methylation differences that propagate through generations and distinguish these closely related species. The second part of this thesis concerns error correction in high-throughput sequencing datasets. In the course of studying DNA methylation with high-throughput sequencing we discovered a systematic error that results in false-positive variant detection and can significantly affect biological inferences in a variety of genomic studies. We present a classifier to correct for such errors and show that it performs very well with respect to both sensitivity and specificity.

eScholarship - University of California