Integrative Model-based clustering of microarray methylation and expression data
In many fields, researchers are interested in large and complex biological
processes. Two important examples are gene expression and DNA methylation in
genetics. One key problem is to identify aberrant patterns of these processes
and discover biologically distinct groups. In this article we develop a
model-based method for clustering such data. The basis of our method involves
the construction of a likelihood for any given partition of the subjects. We
introduce cluster specific latent indicators that, along with some standard
assumptions, impose a specific mixture distribution on each cluster. Estimation
is carried out using the EM algorithm. The methods extend naturally to multiple
data types of a similar nature, which leads to an integrated analysis over
multiple data platforms, resulting in higher discriminating power. Comment:
Published at http://dx.doi.org/10.1214/11-AOAS533 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
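The abstract above says estimation is carried out with the EM algorithm. As a rough illustration of what one EM fit looks like, here is a minimal sketch for a generic two-component, one-dimensional Gaussian mixture. This is not the authors' model (which builds a likelihood over partitions with cluster-specific latent indicators); the function name, the quartile-based initialisation, and the stopping rule are all assumptions made for the example.

```python
import numpy as np

def em_gaussian_mixture(x, n_iter=200, tol=1e-8):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: component means at the lower and upper quartiles.
    mu = np.percentile(x, [25, 75])
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: weighted component densities and responsibilities.
        z = (x[:, None] - mu) / sigma
        dens = pi * np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        # Log-likelihood under the parameters used in this E-step.
        ll = np.log(total).sum()
        # M-step: update weights, means, and standard deviations.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.maximum(np.sqrt(var), 1e-6)  # floor guards against collapse
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, sigma, resp
```

Each row of the returned responsibility matrix gives soft cluster memberships for one observation; taking the row-wise argmax yields a hard partition.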
Supervised Classification Using Sparse Fisher's LDA
It is well known that in a supervised classification setting when the number
of features is smaller than the number of observations, Fisher's linear
discriminant rule is asymptotically Bayes. However, there are numerous modern
applications where classification is needed in the high-dimensional setting.
Naive implementation of Fisher's rule in this case fails to provide good
results because the sample covariance matrix is singular. Moreover, by
constructing a classifier that relies on all features the interpretation of the
results is challenging. Our goal is to provide robust classification that
relies only on a small subset of important features and accounts for the
underlying correlation structure. We apply a lasso-type penalty to the
discriminant vector to ensure sparsity of the solution and use a shrinkage type
estimator for the covariance matrix. The resulting optimization problem is
solved using an iterative coordinate ascent algorithm. Furthermore, we analyze
the effect of nonconvexity on the sparsity level of the solution and highlight
the difference between the penalized and the constrained versions of the
problem. The simulation results show that the proposed method performs
favorably in comparison to alternatives. The method is used to classify
leukemia patients based on DNA methylation features.
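To make the ingredients named in the abstract above concrete (a lasso-type penalty on the discriminant vector plus a shrinkage covariance estimator), the sketch below solves a simpler convex surrogate: minimise 0.5 v'Σv − δ'v + λ‖v‖₁ by coordinate descent, with Σ shrunk toward its diagonal. This is a swapped-in formulation chosen for brevity, not the paper's nonconvex problem or its coordinate ascent algorithm; the function names and the parameters `lam` and `alpha` are assumptions.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator used by lasso-type coordinate updates."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_discriminant(X, y, lam=0.1, alpha=0.5, n_sweeps=200, tol=1e-8):
    """Sparse two-class discriminant vector (illustrative convex surrogate).

    Assumes labels are 0/1, 0 < alpha <= 1, and no feature is constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    delta = X1.mean(axis=0) - X0.mean(axis=0)  # difference of class means
    # Pooled within-class sample covariance (singular when p > n).
    centered = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
    S = centered.T @ centered / (len(X) - 2)
    # Shrink toward the diagonal so the estimate is positive definite.
    Sigma = (1.0 - alpha) * S + alpha * np.diag(np.diag(S))
    v = np.zeros(X.shape[1])
    for _ in range(n_sweeps):
        v_old = v.copy()
        for j in range(len(v)):
            # Partial residual: delta_j minus the contribution of the other coordinates.
            r = delta[j] - Sigma[j] @ v + Sigma[j, j] * v[j]
            v[j] = soft_threshold(r, lam) / Sigma[j, j]
        if np.max(np.abs(v - v_old)) < tol:
            break
    return v
```

A new observation x could then be assigned to class 1 when (x − m)·v > 0, where m is the midpoint of the two class means; features with a zero coefficient in v play no role in the rule.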
Cell Type of Origin Influences the Molecular and Functional Properties of Mouse Induced Pluripotent Stem Cells
Induced pluripotent stem cells (iPSCs) have been derived from various somatic cell populations through ectopic expression of defined factors. It remains unclear whether iPSCs generated from different cell types are molecularly and functionally similar. Here we show that iPSCs obtained from mouse fibroblasts, hematopoietic and myogenic cells exhibit distinct transcriptional and epigenetic patterns. Moreover, we demonstrate that cellular origin influences the in vitro differentiation potentials of iPSCs into embryoid bodies and different hematopoietic cell types. Notably, continuous passaging of iPSCs largely attenuates these differences. Our results suggest that early-passage iPSCs retain a transient epigenetic memory of their somatic cells of origin, which manifests as differential gene expression and altered differentiation capacity. These observations may influence ongoing attempts to use iPSCs for disease modeling and could also be exploited in potential therapeutic applications to enhance differentiation into desired cell lineages.
Stem Cell and Regenerative Biology
Cytosine Methylation Dysregulation in Neonates Following Intrauterine Growth Restriction
Perturbations of the intrauterine environment can affect fetal development during critical periods of plasticity, and can increase susceptibility to a number of age-related diseases (e.g., type 2 diabetes mellitus; T2DM), manifesting as late as decades later. We hypothesized that this biological memory is mediated by permanent alterations of the epigenome in stem cell populations, and focused our studies specifically on DNA methylation in CD34+ hematopoietic stem and progenitor cells from cord blood from neonates with intrauterine growth restriction (IUGR) and control subjects. Our epigenomic assays utilized a two-stage design involving genome-wide discovery followed by quantitative, single-locus validation. We found that changes in cytosine methylation occur in response to IUGR, are moderate in degree, and involve a restricted number of loci. We also identify specific loci that are targeted for dysregulation of DNA methylation, in particular the hepatocyte nuclear factor 4alpha (HNF4A) gene, a well-known diabetes candidate gene not previously associated with growth restriction in utero, and other loci encoding HNF4A-interacting proteins. Our results give insights into the potential contribution of epigenomic dysregulation in mediating the long-term consequences of IUGR, and demonstrate the value of this approach to studies of the fetal origin of adult disease.
High-resolution genome-wide cytosine methylation profiling with simultaneous copy number analysis and optimization for limited cell numbers
Many genome-wide assays involve the generation of a subset (or representation) of the genome following restriction enzyme digestion. The use of enzymes sensitive to cytosine methylation allows high-throughput analysis of this epigenetic regulatory process. We show that a dual-adapter approach allows us to generate genomic representations that include fragments of <200 bp in size, which was not previously possible with the standard single-adapter approach. By expanding the representation to smaller fragments using HpaII or MspI, we increase the representation by these isoschizomers to more than 1.32 million loci in the human genome, representing 98.5% of CpG islands and 91.1% of RefSeq promoters. This advance allows the development of a new, high-resolution version of our HpaII-tiny fragment Enrichment by Ligation-mediated PCR (HELP) assay to study cytosine methylation. We also show that the MspI representation generates information about copy-number variation, that the assay can be used on as little as 10 ng of DNA, and that massively parallel sequencing can be used as an alternative to microarrays to read the output of the assay, making this a powerful discovery platform for studies of genomic and epigenomic abnormalities.
Stem and progenitor cells in myelodysplastic syndromes show aberrant stage-specific expansion and harbor genetic and epigenetic alterations
Even though hematopoietic stem cell (HSC) dysfunction is presumed in myelodysplastic syndrome (MDS), the exact nature of quantitative and qualitative alterations is unknown. We conducted a study of phenotypic and molecular alterations in highly fractionated stem and progenitor populations in a variety of MDS subtypes. We observed an expansion of the phenotypically primitive long-term HSCs (lineage−/CD34+/CD38−/CD90+) in MDS, which was most pronounced in higher-risk cases. These MDS HSCs demonstrated dysplastic clonogenic activity. Examination of progenitors revealed that lower-risk MDS i
Role for the DNA methylation system in polycomb protein-mediated gene regulation
Chromatin structure and epigenetic mechanisms play an important role in initiating and
maintaining the intricate patterns of gene expression required for embryonic development.
One such mechanism, DNA methylation (5mC), involves the chemical modification of
cytosine bases in DNA and is implicated in maintaining patterns of transcription. However,
many fundamental aspects of DNA methylation are not fully understood, including the
mechanisms by which it influences transcriptional states. Recent data suggest functional
links between DNA methylation and a second epigenetic mechanism that has important roles
in transcriptional repression, the polycomb group (PcG) repressor system. Here, I suggest
that an intact DNA methylation system is required for the repression of many PcG target
genes by influencing the genomic targeting of the polycomb repressor 2 complex (PRC2)
and its signature histone modification, H3K27me3 (K27me3). I demonstrate differential
genomic localisation of K27me3 at gene promoter regions in hypomethylated mouse
embryonic fibroblast (MEF) cells deficient for the major maintenance DNA
methyltransferase, Dnmt1. Globally, Dnmt1-/- MEFs have a higher level of the K27me3
mark than controls, as assessed by western blot and immunofluorescence. I observe
increased K27me3 at a relatively small number of gene promoters in Dnmt1-/- MEFs that
often are associated with high levels of DNA methylation in wildtype MEFs, consistent with
the notion that DNA methylation is capable of antagonising PRC2 binding at certain loci.
Conversely, I show that a large number of developmentally important genes that are
normally repressed and highly bound by K27me3, including classic polycomb targets, the
Hox genes, display dramatically reduced association with K27me3 in Dnmt1-/- MEFs. Many
of these genes, but not all, show reciprocal increases in promoter H3K4me3 modification
and are transcriptionally de-repressed in Dnmt1-/- MEFs. I suggest that these genes are
mostly associated with CpG-rich promoters with low levels of DNA methylation in wildtype
cells, implying that their silencing is not dependent on the canonical role of DNA
methylation. Consistent with the findings of recently published work, I suggest a working
model where PRC2 binding in wildtype cells is restricted by CpG methylation. According to
this model, the differential genomic location of K27me3 in hypomethylated Dnmt1-/- MEFs
is explained by a redistribution of PRC2 to normally DNA methylated, unbound loci,
resulting in a titration effect and coincident loss of K27me3 from normal targets. It was also
apparent that certain PRC2-target genes, including the developmentally important Hox gene
clusters, are strongly affected in Dnmt1-/- MEFs, displaying striking loss of K27me3. As
intergenic transcription has been implicated in relief from polycomb silencing and abundant
intergenic transcription has been reported within Hox clusters, I measured RNA expression
at Hox clusters and a small number of other PcG target genes in Dnmt1-/- MEFs using
high-density tiling arrays. In Dnmt1-deficient MEFs, widespread increases in intergenic
transcription were observed within Hox clusters. In addition, mapping of the
elongating-polymerase-associated H3K36me3 histone modification showed widespread increases in this
mark at intergenic and promoter regions in Dnmt1-/- MEFs. Increased local intergenic RNA
and H3K36me3 were found to correlate with K27me3 loss for this cohort of genes. I suggest
a working model where increased intergenic transcription and H3K36me3 in Dnmt1-/- MEFs
lead to accelerated loss of K27me3 at certain loci, including Hox clusters. Taken together
with recently published data, this work suggests that a major role of DNA methylation is in
shaping the PRC2/K27me3 landscape. The potential implications of this putative role for
DNA methylation are widespread, including our knowledge of how DNA methylation
influences transcriptional regulation, and the consequence of rearranged DNA methylation
patterns that are observed in many diseases including cancers.