58 research outputs found

    Characterization of Human Pseudogene-Derived Non-Coding RNAs for Functional Potential

    <div><p>Thousands of pseudogenes exist in the human genome and many are transcribed, but their functional potential remains elusive and understudied. To explore these issues systematically, we first developed a computational pipeline to identify transcribed pseudogenes from RNA-Seq data. Applying the pipeline to datasets from 16 distinct normal human tissues identified ∼3,000 pseudogenes that could produce non-coding RNAs at low abundance but with high tissue specificity under normal physiological conditions. Cross-tissue comparison revealed that the transcriptional profiles of pseudogenes and their parent genes showed mostly positive correlations, suggesting that pseudogene transcription could have a positive effect on the expression of their parent genes, perhaps by functioning as competing endogenous RNAs (ceRNAs), as previously suggested and demonstrated with the <i>PTEN</i> pseudogene, <i>PTENP1</i>. Our analysis of the ENCODE project data also found many transcriptionally active pseudogenes in the GM12878 and K562 cell lines; moreover, it showed that many human pseudogenes produced small RNAs (sRNAs) and that some pseudogene-derived sRNAs, especially those from antisense strands, exhibited evidence of interfering with gene expression. Further integrated analysis of transcriptomics and epigenomics data, however, demonstrated that trimethylation of histone 3 at lysine 9 (H3K9me3), a posttranslational modification typically associated with gene repression and heterochromatin, was enriched at many transcribed pseudogenes in a transcription-level-dependent manner in the two cell lines. The H3K9me3 enrichment was more prominent in pseudogenes that produced sRNAs at pseudogene loci and their adjacent regions, an observation further supported by the co-enrichment of SETDB1 (an H3K9 methyltransferase), suggesting that pseudogene sRNAs may have a role in regional chromatin repression.
Taken together, our comprehensive and systematic characterization of pseudogene transcription uncovers a complex picture of how pseudogene ncRNAs could influence gene and pseudogene expression, at both epigenetic and post-transcriptional levels.</p></div>

    Pseudogene-derived sRNAs and their relationship to parental gene repression.

    <p>A) Processed pseudogenes had higher sRNA read densities than any other annotated genomic element or randomly chosen genomic regions in both GM12878 and K562 cell lines. B) Pseudogenes with mapped sRNA reads (≥5 reads per kb) were separated into two groups based on the abundance of sRNA reads in the adjacent non-pseudogene regions (±1 kb, orange). Group I pseudogenes were considered to produce sRNA interactively with their parents, while group II pseudogenes produced sRNA independently. Venn diagrams compare the data from GM12878 (red) and K562 (green). C) The parental genes of group I pseudogenes showed significantly lower expression than either those of the pseudogenes without sRNA (control) or those of the group II pseudogenes, in both GM12878 (red) and K562 (green). The parents of antisense transcribed pseudogenes (>5 sRNA/kb) exhibited even lower expression. The same trends held when the analysis was carried out for pseudogenes with >10 sRNA/kb. Parents not expressed in the 16 normal tissues (i.e., FPKM = 0) were not included in these plots.</p>
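The "≥5 reads per kb" cutoff in panel B is a length-normalized read density. As a minimal sketch, assuming each pseudogene is given as a hypothetical (name, mapped sRNA read count, length in bp) tuple, the filter could look like the code below. The function names and the example records are illustrative and do not come from the paper's pipeline; only the 5 reads/kb threshold is taken from the caption.

```python
def srna_density_per_kb(read_count, length_bp):
    """Return mapped sRNA reads per kilobase of pseudogene sequence."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return read_count / (length_bp / 1000.0)


def pseudogenes_with_srna(records, min_density=5.0):
    """Keep names of pseudogenes whose sRNA density meets the cutoff.

    records: iterable of (name, read_count, length_bp) tuples (hypothetical layout).
    min_density: reads per kb; 5.0 mirrors the threshold quoted in the caption.
    """
    return [
        name
        for name, reads, length in records
        if srna_density_per_kb(reads, length) >= min_density
    ]


# Made-up example records, for illustration only.
example = [("pgA", 12, 1500), ("pgB", 3, 1200), ("pgC", 5, 1000)]
selected = pseudogenes_with_srna(example)
print(selected)
```

Raising `min_density` to 10.0 would correspond to the stricter ">10 sRNA/kb" analysis mentioned at the end of the caption.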

    Top pseudogene candidates of three different types of predicted functional potentials (ND, not determined). The full lists can be found in Table S1.

    <p>Top pseudogene candidates of three different types of predicted functional potentials (ND, not determined). The full lists can be found in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0093972#pone.0093972.s008" target="_blank">Table S1</a>.</p>

    Selection constraints on transcribed pseudogenes.

    <p>Comparison of nucleotide diversities in the human population (A) and cross-species conservation (B) between non-transcribed (‘n’) and transcribed pseudogenes (‘y’). AluY, a young repeat family that emerged recently in primates, was used as a control. For duplicated pseudogenes, the median diversities for transcribed and non-transcribed are 0.00051 and 0.00054 (p<0.02, Wilcoxon test); the values for processed pseudogenes are 0.00055 and 0.00064 (p<3e-06, Wilcoxon test).</p>

    Transcriptional correlations (ρ<sub>pg:g</sub>) between pseudogenes and their parents.

    <p>A) A heatmap of the distribution of ρ<sub>pg:g</sub>, including data from the separation of processed and duplicated pseudogenes into two groups based on the presence of a coding gene within 20 kb. The coefficients between transcribed pseudogenes and randomly chosen coding genes (top) were used as a control for p-value estimation. Colors represent relative numbers of pseudogenes in each ρ<sub>pg:g</sub> range (in Z-score transformation). B) Pseudogenes transcribed in the sense direction (S) exhibited higher ρ<sub>pg:g</sub> than those transcribed in the antisense direction (A). C) The transcriptional correlation between pseudogenes and their parents (ρ<sub>pg:g</sub>) is inversely correlated with the transcriptional correlation between miRNAs and their putative targets (ρ<sub>miRNA:g</sub>). Genes were binned on their ρ<sub>miRNA:g</sub> values (x-axis), and the mean and standard deviation of ρ<sub>pg:g</sub> (y-axis) for each group of genes were plotted. D) Expression of parental genes targeted by miRNAs was less affected by miRNA KD than that of the targeted genes without pseudogenes. Only genes responding to KD (up >1.3 fold) were analyzed here. Y-axis shows the fold change of KD over control. The miRNA targets were experimentally determined by the CLASH analysis <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0093972#pone.0093972-Helwak1" target="_blank">[49]</a>. The middle line in each boxplot marks the median and the box lines mark the first and third quartile values (same for boxplots below).</p>
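The coefficient ρ between a pseudogene and its parent is computed across tissues. The caption does not say which correlation measure was used, so the sketch below assumes a Spearman rank correlation (Pearson correlation of average ranks) over per-tissue FPKM vectors. The helper names and the four-tissue example values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up FPKM values across four tissues, for illustration only.
pseudogene_fpkm = [0.5, 2.0, 0.0, 8.1]
parent_fpkm = [12.0, 30.0, 3.0, 95.0]
rho_pg_g = spearman(pseudogene_fpkm, parent_fpkm)
print(rho_pg_g)
```

A rank-based measure is a reasonable default here because pseudogene expression is low and skewed, but a Pearson correlation on log-transformed FPKM would be an equally plausible reading of the caption.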

    High tissue specificity of pseudogene transcription.

    <p>A) Heatmap of the transcription levels of 982 highly transcribed pseudogenes (maximal FPKM >10). B) Violin plots showing tissue-specificity JS scores of lincRNAs, transcribed pseudogenes, their parents, and the coding genes without pseudogenes. C) Comparison of JS scores at different transcription levels. The white dots mark the median and the thick boxes mark the first and third quartile values.</p>
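The caption does not define the JS score. One commonly used tissue-specificity definition based on Jensen-Shannon divergence takes, for each tissue, one minus the JS distance between the normalized expression profile and an idealized profile expressed only in that tissue, and then reports the maximum over tissues. The sketch below implements that definition as an assumption; it may differ in detail from what the authors computed, and returning 0.0 for an unexpressed transcript is a convention chosen here.

```python
from math import log2, sqrt


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the sum.
        return sum(ai * log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def tissue_specificity(fpkm):
    """Return max over tissues of 1 - sqrt(JSD(profile, tissue-only profile)).

    fpkm: expression values across tissues. An all-zero profile scores 0.0
    (a convention assumed for this sketch).
    """
    total = sum(fpkm)
    if total <= 0:
        return 0.0
    profile = [v / total for v in fpkm]
    best = 0.0
    for t in range(len(profile)):
        tissue_only = [0.0] * len(profile)
        tissue_only[t] = 1.0
        # Clamp at 0 to guard against tiny negative values from rounding.
        distance = sqrt(max(0.0, js_divergence(profile, tissue_only)))
        best = max(best, 1.0 - distance)
    return best


# Made-up profiles: expressed in one tissue only vs. evenly across four tissues.
specific_score = tissue_specificity([0.0, 0.0, 10.0, 0.0])
uniform_score = tissue_specificity([5.0, 5.0, 5.0, 5.0])
print(specific_score, uniform_score)
```

Under this definition a transcript detected in a single tissue scores 1, and the score falls toward 0 as expression spreads evenly across tissues.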

    Enrichment of H3K9me3 modification at transcribed pseudogene loci.

    <p>A) Heatmap of H3K36me3 near the transcription start sites (TSS) and transcription end sites (TES) of transcribed (bottom) and non-transcribed pseudogenes (top). The color scheme is based on column-normalized data in GM12878, and each row is a pseudogene. B) Transcription-level-dependent enrichment of H3K9me3 at transcribed pseudogenes. Y-axis shows the average number of H3K9me3 ChIP-Seq reads per 500 bp. C) & D) The level of H3K9me3 (red) but not H3K27me3 (green) was significantly higher at group II pseudogenes (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0093972#pone-0093972-g005" target="_blank">Fig. 5</a>) than at group I pseudogenes or at pseudogene loci producing no sRNAs (“C”, controls). The H3K9me3 level at a randomly selected set of LINEs (blue) was also plotted as a positive control. Y-axis plots ChIP-Seq reads at pseudogene bodies, normalized per 500 bp of sequence. E) The densities of H3K36me3, H3K27me3, and H3K9me3 ChIP-Seq reads and sRNA-Seq reads at a region with multiple pseudogenes derived from a gene encoding NADH dehydrogenase. F–H) The average ChIP-Seq profiles, anchored on pseudogene centers, of H3K9me3 in GM12878 (F) and in K562 (G) and of SETDB1 in K562 (H) for the three groups of pseudogenes. Y-axes show the average numbers of ChIP-Seq reads per 100 bp.</p>

    Allele-Biased Expression in Differentiating Human Neurons: Implications for Neuropsychiatric Disorders

    <div><p>Stochastic processes and imprinting, along with genetic factors, lead to monoallelic or allele-biased gene expression. Stochastic monoallelic expression fine-tunes information processing in immune cells and the olfactory system, and imprinting plays an important role in development. Recent studies suggest that both stochastic events and imprinting may be more widespread than previously considered. We are interested in allele-biased gene expression occurring in the brain because parent-of-origin effects suggestive of imprinting appear to play a role in the transmission of schizophrenia (SZ) and autism spectrum disorders (ASD) in some families. In addition, allele-biased expression could help explain monozygotic (MZ) twin discordance and reduced penetrance. The ability to study allele-biased expression in human neurons has been transformed with the advent of induced pluripotent stem cell (iPSC) technology and next generation sequencing. Using transcriptome sequencing (RNA-Seq) we identified 801 genes in differentiating neurons that were expressed in an allele-biased manner. These included a number of putative SZ and ASD candidates, such as <em>A2BP1</em> (<em>RBFOX1</em>), <em>ERBB4, NLGN4X, NRG1, NRG3, NRXN1,</em> and <em>NLGN1</em>. Overall, there was a modest enrichment for SZ and ASD candidate genes among those that showed evidence for allele-biased expression (chi-square, p = 0.02). In addition to helping explain MZ twin discordance and reduced penetrance, the capacity to group many candidate genes affecting a variety of molecular and cellular pathways under a common regulatory process – allele-biased expression – could have therapeutic implications.</p> </div>

    Heat Shock Alters the Expression of Schizophrenia and Autism Candidate Genes in an Induced Pluripotent Stem Cell Model of the Human Telencephalon

    <div><p>Schizophrenia (SZ) and autism spectrum disorders (ASD) are highly heritable neuropsychiatric disorders, although environmental factors, such as maternal immune activation (MIA), play a role as well. Cytokines mediate the effects of MIA on neurogenesis and behavior in animal models. However, MIA stimulators can also induce a febrile reaction, which could have independent effects on neurogenesis through heat shock (HS)-regulated cellular stress pathways, a possibility that has not been well studied. To help understand the role of fever in MIA, we used a recently described model of human brain development in which induced pluripotent stem cells (iPSCs) differentiate into 3-dimensional neuronal aggregates that resemble a first trimester telencephalon. RNA-seq was carried out on aggregates that were heat shocked at 39°C for 24 hours, along with their control partners maintained at 37°C. 186 genes showed significant differences in expression following HS (p<0.05), including known HS-inducible genes, as expected, as well as those coding for <i>NGFR</i> and a number of SZ and ASD candidates, including <i>SMARCA2, DPP10, ARNT2, AHI1</i> and <i>ZNF804A.</i> The degree to which the expression of these genes decreases or increases during HS is similar to that found in copy loss and copy gain copy number variants (CNVs), although the effects of HS are likely to be transient. The dramatic effect on the expression of some SZ and ASD genes places HS, and perhaps other cellular stressors, into a common conceptual framework with disease-causing genetic variants. The findings also suggest that some candidate genes that are assumed to have a relatively limited impact on SZ and ASD pathogenesis based on a small number of positive genetic findings, such as <i>SMARCA2</i> and <i>ARNT2</i>, may in fact have a much more substantial role in these disorders - as targets of common environmental stressors.</p></div>