25 research outputs found
MaGIC: a machine learning tool set and web application for monoallelic gene inference from chromatin
Background: A large fraction of human and mouse autosomal genes are subject to random monoallelic expression (MAE), an epigenetic mechanism characterized by allele-specific gene expression that varies between clonal cell lineages. MAE is highly cell-type specific and mapping it in a large number of cell and tissue types can provide insight into its biological function. Its detection, however, remains challenging. Results: We previously reported that a sequence-independent chromatin signature identifies, with high sensitivity and specificity, genes subject to MAE in multiple tissue types using readily available ChIP-seq data. Here we present an implementation of this method as a user-friendly, open-source software pipeline for monoallelic gene inference from chromatin (MaGIC). The source code for the MaGIC pipeline and the Shiny app is available at https://github.com/gimelbrantlab/magic. Conclusion: The pipeline can be used by researchers to map monoallelic expression in a variety of cell types using existing models and to train new models with additional sets of chromatin marks. National Institutes of Health (U.S.) (award U54 HG007963
Chromatin signature of widespread monoallelic expression
In mammals, numerous autosomal genes are subject to mitotically stable monoallelic expression (MAE), including genes that play critical roles in a variety of human diseases. Due to challenges posed by the clonal nature of MAE, very little is known about its regulation; in particular, no molecular features have been specifically linked to MAE. In this study, we report an approach that distinguishes MAE genes in human cells with great accuracy: a chromatin signature consisting of chromatin marks associated with active transcription (H3K36me3) and silencing (H3K27me3) simultaneously occurring in the gene body. The MAE signature is present in ∼20% of ubiquitously expressed genes and over 30% of tissue-specific genes across cell types. Notably, it is enriched among key developmental genes that have bivalent chromatin structure in pluripotent cells. Our results open a new approach to the study of MAE that is independent of polymorphisms, and suggest that MAE is linked to cell differentiation. DOI: http://dx.doi.org/10.7554/eLife.01256.00
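The signature described in this abstract (an active mark and a silencing mark both present in the same gene body) can be illustrated with a minimal sketch. This is not the authors' classifier: the function name, the normalized enrichment scores, the 0.5 cutoffs, and the gene names are all hypothetical and chosen only to show the "both marks at once" logic.

```python
# Minimal sketch, NOT the published method: all names, scores and cutoffs
# below are hypothetical illustrations of the two-mark logic.

def has_mae_signature(h3k36me3, h3k27me3, active_cutoff=0.5, silent_cutoff=0.5):
    """Return True when both gene-body marks are at or above their cutoffs.

    h3k36me3: assumed normalized gene-body enrichment of the active mark.
    h3k27me3: assumed normalized gene-body enrichment of the silencing mark.
    """
    return h3k36me3 >= active_cutoff and h3k27me3 >= silent_cutoff


# Hypothetical genes with (H3K36me3, H3K27me3) gene-body scores.
genes = {
    "geneA": (0.9, 0.8),   # both marks present: candidate MAE gene
    "geneB": (0.9, 0.1),   # active mark only
    "geneC": (0.05, 0.9),  # silencing mark only
}

calls = {name: has_mae_signature(*marks) for name, marks in genes.items()}
print(calls)
```

In practice the cutoffs would have to be derived from data with known allelic status; the fixed values here are placeholders.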
Chromatin Signature Identifies Monoallelic Gene Expression Across Mammalian Cell Types
Monoallelic expression of autosomal genes (MAE) is a widespread epigenetic phenomenon which is poorly understood, due in part to current limitations of genome-wide approaches for assessing it. Recently, we reported that a specific histone modification signature is strongly associated with MAE and demonstrated that it can serve as a proxy of MAE in human lymphoblastoid cells. Here, we use murine cells to establish that this chromatin signature is conserved between mouse and human and is associated with MAE in multiple cell types. Our analyses reveal extensive conservation in the identity of MAE genes between the two species. By analyzing MAE chromatin signature in a large number of cell and tissue types, we show that it remains consistent during terminal cell differentiation and is predominant among cell-type specific genes, suggesting a link between MAE and specification of cell identity
Derivation of Pre-X Inactivation Human Embryonic Stem Cells under Physiological Oxygen Concentrations
The presence of two active X chromosomes (XaXa) is a hallmark of the ground state of pluripotency specific to murine embryonic stem cells (ESCs). Human ESCs (hESCs) invariably exhibit signs of X chromosome inactivation (XCI) and are considered developmentally more advanced than their murine counterparts. We describe the establishment of XaXa hESCs derived under physiological oxygen concentrations. Using these cell lines, we demonstrate that (1) differentiation of hESCs induces random XCI in a manner similar to murine ESCs, (2) chronic exposure to atmospheric oxygen is sufficient to induce irreversible XCI with minor changes of the transcriptome, (3) the Xa exhibits heavy methylation of the XIST promoter region, and (4) XCI is associated with demethylation and transcriptional activation of XIST along with H3K27-me3 deposition across the Xi. These findings indicate that the human blastocyst contains pre-X-inactivation cells and that this state is preserved in vitro through culture under physiological oxygen. Susan Whitehead; Hillel and Liliana Bachrac
An epigenetic state associated with areas of gene duplication
Asynchronous DNA replication is an epigenetically determined feature found in all cases of monoallelic expression, including genomic imprinting, X-inactivation, and random monoallelic expression of autosomal genes such as immunoglobulins and olfactory receptor genes. Most genes of the latter class were identified in experiments focused on genes functioning in the chemosensory and immune systems. We performed an unbiased survey of asynchronous replication in the mouse genome, excluding known asynchronously replicated genes. Fully 10% (eight of 80) of the genes tested exhibited asynchronous replication. A common feature of the newly identified asynchronously replicated areas is their proximity to areas of tandem gene duplication. Testing of other clustered areas supported the idea that such regions are enriched with asynchronously replicated genes
dbMAE: the database of autosomal monoallelic expression
Recently, data on ‘random’ autosomal monoallelic expression has become available for the entire genome in multiple human and mouse tissues and cell types, creating a need for better access and dissemination. The database of autosomal monoallelic expression (dbMAE; https://mae.hms.harvard.edu) incorporates data from multiple recent reports of genome-wide analyses. These include transcriptome-wide analyses of allelic imbalance in clonal cell populations based on sequence polymorphisms, as well as indirect identification, based on a specific chromatin signature present in MAE gene bodies. Currently, dbMAE contains transcriptome-wide chromatin identification calls for 8 human and 21 mouse tissues, and describes over 16 000 murine and ∼700 human cases of directly measured biased expression, compiled from allele-specific RNA-seq and genotyping array data. All data are manually curated. To ensure cross-publication uniformity, we performed re-analysis of transcriptome-wide RNA-seq data using the same pipeline. Data are accessed through an interface that allows for basic and advanced searches; all source references, including raw data, are clearly described and hyperlinked. This ensures the utility of the resource as an initial screening tool for those interested in investigating the role of monoallelic expression in their specific genes and tissues of interest
Risk alleles of genes with monoallelic expression are enriched in gain-of-function variants and depleted in loss-of-function variants for neurodevelopmental disorders
Over 3,000 human genes can be expressed from a single allele in one cell, and from the other allele – or both – in neighboring cells. Little is known about the consequences of this epigenetic phenomenon, monoallelic expression (MAE). We hypothesized that MAE increases expression variability, with potential impact on human disease. Here, we use a chromatin signature to infer MAE for genes in lymphoblastoid cell lines and human fetal brain tissue. We confirm that across clones, MAE status correlates with expression level, and that in human tissue datasets, MAE genes show increased expression variability. We then compare mono- and biallelic genes at three distinct scales. In the human population, we observe that genes with polymorphisms influencing expression variance are more likely to be MAE (P < 1.1 × 10⁻⁶). At the trans-species level, we find gene expression differences and directional selection between humans and chimpanzees more common among MAE genes (P < 0.05). Extending to human disease, we show that MAE genes are underrepresented in neurodevelopmental CNVs (P < 2.2 × 10⁻¹⁰), suggesting that pathogenic variants acting via expression level are less likely to involve MAE genes. Using neuropsychiatric SNP and SNV data, we see that genes with pathogenic expression-altering or loss-of-function variants are less likely MAE (P < 7.5 × 10⁻¹¹) and genes with only missense or gain-of-function variants are more likely MAE (P < 1.4 × 10⁻⁶). Together, our results suggest that MAE genes tolerate a greater range of expression level than BAE genes and this information may be useful in prediction of pathogenicity.
Replicate sequencing libraries are important for quantification of allelic imbalance
Allele-specific expression in diploid organisms can be quantified by RNA-seq, and it is common practice to rely on a single library. Here, the authors show that this standard approach has a variable error rate and present Qllelic as a tool to improve the reproducibility of allele-specific RNA-seq analysis.
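To make the point about replicate libraries concrete, here is a minimal sketch of how one might compare the allelic ratio of a gene between two replicate libraries. It is not the Qllelic algorithm: the function names and the read counts are hypothetical, and the score is a plain two-proportion z-statistic that assumes purely binomial sampling.

```python
# Minimal sketch, NOT the Qllelic method: function names and counts are
# hypothetical; the score assumes reads are sampled binomially.
from math import sqrt


def allelic_ratio(ref_count, alt_count):
    """Fraction of allele-informative reads that come from the reference allele."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no allele-informative reads")
    return ref_count / total


def replicate_discrepancy(rep1, rep2):
    """Difference between two replicates' allelic ratios, in units of the
    standard error expected from binomial sampling alone.

    rep1, rep2: (ref_count, alt_count) tuples for the same gene.
    """
    n1, n2 = sum(rep1), sum(rep2)
    r1, r2 = allelic_ratio(*rep1), allelic_ratio(*rep2)
    pooled = (rep1[0] + rep2[0]) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    # se is zero only when both replicates are fully monoallelic for the
    # same allele, in which case the ratios are identical.
    return (r1 - r2) / se if se > 0 else 0.0


# Hypothetical counts for one gene in two replicate libraries.
score = replicate_discrepancy((60, 40), (40, 60))
print(round(score, 3))
```

Scores that are often larger in magnitude than binomial sampling would predict suggest extra library-to-library variation, which a single library cannot reveal.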