Long non-coding RNAs in siliques of Arabidopsis thaliana ecotypes and PRC2 mutants

Abstract

Transcriptomic studies of many eukaryotic species have shown that, in addition to protein-coding mRNA, there exists RNA that appears to have no protein-coding potential. One class of such RNAs is the long non-coding RNAs (lncRNAs), which are longer than 200 nucleotides. In this research project, the extent and diversity of lncRNAs expressed in developing siliques was investigated. To achieve this, next-generation RNA sequencing (Illumina®) was performed on Arabidopsis thaliana silique RNA from the ecotypes Col-0 and C24. Reciprocal crosses of these two ecotypes were also sequenced to investigate potential parent-of-origin expression in the developing seeds of these siliques. After transcript assembly, known gene models and transcripts with peptide-coding potential were removed, revealing 2,807 potential lncRNAs. The lncRNAs identified had diverse genomic locations: antisense to protein-coding mRNAs, and sense and antisense within both intergenic and intronic regions. The lncRNAs had a median length of approximately 500 nt and contained one or two exons, and a minority were alternatively spliced. Candidate lncRNAs were investigated further: some were potentially imprinted, others were in proximity to genes expressed exclusively in the endosperm, and a minority were methylated. In animals and plants, lncRNAs are known to bind Polycomb Repressive Complex 2 (PRC2) and to be located at loci targeted by PRC2. To investigate this further in silique and seed development, lncRNAs were identified in reproductive-specific PRC2 mutants. A total of 2,362 lncRNAs were identified; 55% (1,296) were identified exclusively in PRC2 mutants and are potentially targeted by PRC2. PRC2 mutations induced transcriptome-wide differential expression of 8,212 genes, in particular transcriptional regulators, transcription factors and DNA-binding proteins. Furthermore, 520 lncRNAs were differentially expressed in PRC2 mutants.
Novel lncRNA candidates were explored; many were expressed exclusively in the absence of PRC2 and were in proximity to key genes involved in transcriptional regulation, such as transcription factors. As PRC2 regulates endosperm development and governs post-zygotic hybridisation barriers, inter-genus crosses between Arabidopsis thaliana and Boechera pinetorum, created with mutations in PRC2, were investigated as part of a long-term aim. Using PCR and genotyping sequencing, it was confirmed that all hybrids contained the genomes of both distant parents. It was also confirmed that PRC2 mutations facilitated the generation of A. thaliana × B. pinetorum hybrids, although the mechanism is not known. The confirmed hybrids provide a platform for research into the roles of lncRNAs in alleviating post-zygotic hybridisation barriers. Overall, this research project identified a total of 4,147 lncRNAs in A. thaliana siliques from various ecotypes, crosses and mutants. Of these, 64% (2,652) were novel, not having been reported in any other study. However, further experiments are required to validate these lncRNAs and elucidate their functions. The 4,147 lncRNAs identified are an important contribution to plant lncRNA research, providing a novel resource for understanding the role lncRNAs play in plant biology.
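
The filtering steps summarised above (keeping assembled transcripts longer than 200 nt, then removing known gene models and transcripts with coding potential) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Transcript` record, the `overlaps_known_gene` flag, the 100-amino-acid ORF cutoff and the forward-strand-only ORF scan are all hypothetical stand-ins, since the abstract does not name the tools or thresholds used to assess coding potential.

```python
from dataclasses import dataclass

MIN_LNCRNA_LENGTH = 200  # nt; lncRNAs are defined as longer than this
MAX_ORF_AA = 100         # assumed cutoff for "no coding potential" (not from the study)
STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Transcript:
    name: str
    sequence: str              # spliced transcript sequence, DNA alphabet
    overlaps_known_gene: bool  # hypothetical flag from comparison with an annotation


def longest_orf_aa(seq: str) -> int:
    """Longest ATG-to-stop ORF (in amino acids) across the three forward frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons from the ATG up to, but not including, the stop.
                best = max(best, (i - start) // 3)
                start = None
    return best


def is_candidate_lncrna(t: Transcript) -> bool:
    """Apply the three filters: length, known gene models, coding potential."""
    return (
        len(t.sequence) > MIN_LNCRNA_LENGTH
        and not t.overlaps_known_gene
        and longest_orf_aa(t.sequence) <= MAX_ORF_AA
    )


def filter_lncrnas(transcripts):
    """Return only the transcripts that pass all filters."""
    return [t for t in transcripts if is_candidate_lncrna(t)]
```

In practice, coding potential is usually scored with dedicated tools rather than a bare ORF-length rule, so the `longest_orf_aa` check here should be read only as a placeholder for that step.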
