thesis

A comprehensive analysis of Med12 controlled (l)ncRNAs and characterization of a novel Sall1 antisense transcript

Abstract

The function of the Mediator subunit Med12 in gene regulation has been widely studied, and its interaction with and regulation of protein-coding genes are broadly documented. However, only recently has its interaction with non-coding genes been verified. Analysis of transcriptome data from Med12-deficient embryonic stem cells (ESCs) revealed hundreds of misregulated protein-coding genes, including multiple Wnt targets and genes involved in the developmental processes that were found to be affected in embryos previously generated with these cells. In addition to the protein-coding genes, multiple misregulated non-coding genes were found in the transcriptome data generated from Med12 mutant cells, including several putative novel transcripts. Among these, an uncharacterized long non-coding (lnc)RNA was found to be differentially expressed in cells, tissues and mouse embryos. This gene, designated LN-BP18, encodes antisense transcripts of Sall1, a gene also misregulated in the analysed cells. In humans, mutations in SALL1 are associated with Townes-Brocks syndrome (TBS), which shares several characteristics with MED12-associated X-linked intellectual disability syndromes. These features prompted a deeper characterization of LN-BP18. Detailed gene and transcript analyses of this novel lncRNA led to the identification of two distinct transcription start sites (TSSs), termed TSS1 and TSS2. While TSS1 was active in all analysed tissues, TSS2 was found to be active only in ESCs. In vitro differentiation of ESCs confirmed this observation: expression of transcripts originating from TSS1 increased throughout the differentiation protocol, while the opposite dynamic was observed for TSS2 transcripts. Characterization of the gene structure revealed a complex splicing pattern, with its 7 exons spliced into 9 different isoforms, including splice variants of three of the exons. Multiple analyses confirmed the lack of coding potential of all identified isoforms.
BLAST searches revealed no homologous transcripts in other species; however, a non-conserved predicted lncRNA has been described in human, which is also present in a divergent configuration relative to SALL1, suggesting a potential functional similarity to LN-BP18 despite the low sequence similarity. Expression analyses of the different mutant ESCs generated revealed a dynamic expression of LN-BP18. TSS2 transcripts, which were detected only in ESCs and not in embryonic tissues, showed a positive correlation with different pluripotency markers. This correlation, together with the ESC-specific activation of this TSS, suggested a potential role in the pluripotency network for isoforms originating from TSS2. Sall1 and LN-BP18 TSS1 transcripts were downregulated in Med12-depleted ESCs. Additionally, in Sall1-depleted cells, LN-BP18 was downregulated, with a stronger effect observed for TSS1 transcripts than for TSS2 transcripts. These observations, together with the co-expression of these two genes in embryonic tissues, suggested that LN-BP18, specifically its TSS1, is a target of Sall1 activation. This activation is potentially Med12-dependent, since the effect on LN-BP18 expression was stronger upon Med12 depletion than in Sall1-depleted cells. A heterozygous LN-BP18-β-galactosidase reporter mutant ESC line was generated to detect expression of LN-BP18 with greater sensitivity. In addition to the embryonic limb and caudal end expression seen by whole mount in situ hybridization (WISH), the reporter gene revealed clear expression in the pronephros, somites, neural tube, and forebrain/midbrain and midbrain/hindbrain boundaries, demonstrating the importance of this reporter line for studying LN-BP18 expression and function during development. Finally, RNA-seq data from Med12-depleted ESCs were analysed together with public Med12 chromatin immunoprecipitation sequencing (ChIP-seq) data from ESCs.
This analysis identified 12 lncRNAs, both annotated and newly predicted, as candidates whose expression is mediated by Med12. The compiled information presented here for these genes offers insight into possible systems for analysing them in future studies, as well as into putative targets. Data from this thesis describe the genetic structure and expression of a previously uncharacterized lncRNA. These data, together with the different mutants generated for this gene, establish the groundwork for future studies clarifying the functions of LN-BP18 in ESCs and during embryonic development.
