Transcriptomic studies from many eukaryotic species have shown
that, in addition to protein-coding mRNA, there exists RNA that
appears to have no protein-coding potential. One class of such
RNAs is the long non-coding RNAs (lncRNAs), which are longer
than 200 nucleotides. In this research project, the extent and
diversity of lncRNAs expressed in developing siliques were
investigated. To achieve this, next-generation RNA-sequencing
(Illumina®) was performed on Arabidopsis thaliana silique RNA
for ecotypes Col-0 and C24. Reciprocal crosses of these two
ecotypes were also sequenced to investigate potential
parent-of-origin expression in developing seeds of these
siliques. After transcript assembly, known gene models and
transcripts with peptide-coding potential were removed,
revealing 2,807
potential lncRNAs. The lncRNAs identified had diverse genomic
locations: antisense to protein-coding mRNAs, and sense and
antisense within both intergenic and intronic regions. The
lncRNAs had a median length of approximately 500 nt and
contained one to two exons, and only a minority were
alternatively spliced. Candidate lncRNAs were investigated: some
were potentially imprinted, others were in proximity to genes
expressed exclusively in the endosperm, and a minority were
methylated.
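The filtering step described above (length over 200 nt, removal of known gene models, removal of transcripts with peptide-coding potential) can be illustrated with a minimal sketch. It is not the pipeline used in this project: the `Transcript` class, the `overlaps_known_gene` flag, and the 100-codon open reading frame cut-off are assumptions made for illustration, and only the three forward reading frames are scanned, which assumes strand-specific transcripts.

```python
from dataclasses import dataclass

MIN_LENGTH_NT = 200   # lncRNA length cut-off stated in the text
MAX_ORF_CODONS = 100  # assumed coding-potential cut-off, not from the source

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Transcript:
    name: str
    sequence: str              # spliced transcript sequence, DNA alphabet
    overlaps_known_gene: bool  # hypothetical flag from an annotation comparison


def longest_orf_codons(sequence: str) -> int:
    """Longest ATG-initiated ORF (in codons, stop excluded) over the
    three forward frames. ORFs that run off the 3' end are counted too."""
    seq = sequence.upper()
    best = 0
    for frame in range(3):
        current = None  # None means we are not inside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if current is None:
                if codon == "ATG":
                    current = 1
            elif codon in STOP_CODONS:
                best = max(best, current)
                current = None
            else:
                current += 1
        if current is not None:  # ORF without an in-frame stop codon
            best = max(best, current)
    return best


def is_candidate_lncrna(t: Transcript) -> bool:
    """Apply the three filters: length, known gene model, ORF length."""
    return (
        len(t.sequence) > MIN_LENGTH_NT
        and not t.overlaps_known_gene
        and longest_orf_codons(t.sequence) <= MAX_ORF_CODONS
    )


def filter_lncrnas(transcripts):
    """Return the transcripts that pass all filters."""
    return [t for t in transcripts if is_candidate_lncrna(t)]
```

In practice a dedicated coding-potential tool would normally replace the simple ORF-length test; the sketch only shows how the successive filters combine.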
In animals and plants, it is known that lncRNAs bind to Polycomb
Repressive Complex 2 (PRC2) and are located at loci targeted by
PRC2. To further investigate this in silique and seed
development, lncRNAs were identified in reproductive-specific
PRC2 mutants. A total of 2,362 lncRNAs were identified; 55% (1,296)
were exclusively identified in PRC2 mutants and are potentially
targeted by PRC2. PRC2 mutants showed transcriptome-wide
differential expression of 8,212 genes, in particular those
encoding transcriptional regulators, transcription factors and
DNA-binding proteins. Furthermore, 520 lncRNAs were
differentially expressed in PRC2 mutants. Novel lncRNA
candidates were explored; many were expressed exclusively in the
absence of PRC2 and were in proximity to key genes involved in
transcriptional regulation, such as those encoding transcription
factors.
As PRC2 regulates endosperm development and governs post-zygotic
hybridisation barriers, inter-genus crosses between Arabidopsis
thaliana and Boechera pinetorum created with mutations in PRC2
were investigated as part of a long-term aim. Using PCR and
genotyping sequencing, it was confirmed that all hybrids
contained the genomes of both distant parents. PRC2 mutations
were also confirmed to facilitate the generation of A. thaliana
× B. pinetorum hybrids, although the mechanism is not known. The
confirmed hybrids provide a platform for research into the roles
of lncRNAs in alleviating post-zygotic hybridisation barriers.
Overall, this research project identified a total of 4,147 lncRNAs
in A. thaliana siliques from various ecotypes, crosses and
mutants. Of these, 64% (2,652) were novel, not having been
reported in any other study. However, further experiments are
required to validate
lncRNAs and elucidate their functions. The 4,147 lncRNAs
identified are an important contribution to plant lncRNA
research, providing a novel resource to understand the role
lncRNAs play in plant biology.