5 research outputs found

    Long-Read cDNA Sequencing Revealed Novel Expressed Genes and Dynamic Transcriptome Landscape of Triticale (x Triticosecale Wittmack) Seed at Different Developing Stages

    Developing seed is a unique stage of plant development with highly dynamic changes in the transcriptome. Here, we aimed to detect novel, previously unannotated genes of the triticale (x Triticosecale Wittmack, AABBRR genome constitution) genome that are expressed during different stages and in different parts of the developing seed. For this, we carried out Oxford Nanopore sequencing of cDNA obtained for the middle (15 days post-anthesis, dpa) and late (20 dpa) stages of seed development. The obtained data, together with our previous direct RNA sequencing of the early stage (10 dpa) of seed development, revealed 39,914 expressed genes, including 7128 (17.6%) genes that were not previously annotated in the A, B, and R genomes. The bioinformatic analysis showed that the identified genes belonged to long non-coding RNAs (lncRNAs), protein-coding RNAs, and TE-derived RNAs. The gene set analysis revealed the transcriptome dynamics during seed development, with distinct patterns of over-represented gene functions in the early and middle/late stages. We analysed the polymorphism of lncRNA genes and showed that the genes of some of the tested lncRNAs are indeed polymorphic in the triticale collection. Altogether, our results provide information on thousands of novel loci expressed during seed development that can be used as new targets for GWAS analysis, marker-assisted breeding of triticale, and functional elucidation.

    Searching for a Needle in a Haystack: Cas9-Targeted Nanopore Sequencing and DNA Methylation Profiling of Full-Length Glutenin Genes in a Big Cereal Genome

    Sequencing and epigenetic profiling of target genes in plants are important tasks with various applications ranging from marker design for plant breeding to the study of gene expression regulation. This is particularly interesting for plants with a big genome size, for which whole-genome sequencing can be time-consuming and costly. In this study, we asked whether the recently proposed Cas9-targeted nanopore sequencing (nCATS) is efficient for target gene sequencing in plant species with a big genome size. We applied nCATS to sequence the full-length glutenin genes (Glu-1Ax, Glu-1Bx and Glu-1By) and their promoters in hexaploid triticale (X Triticosecale, AABBRR, genome size 24 Gb). We showed that while the target gene enrichment per se was quite high for the three glutenin genes (up to 645×), the sequencing depth achieved from two MinION flowcells was relatively low (5–17×). However, this sequencing depth was sufficient for various tasks, including detection of InDels and single-nucleotide variations (SNPs), read phasing and methylation profiling. Using nCATS, we uncovered SNP and InDel variation in full-length glutenin genes, providing useful information for marker design and for deciphering the variation of individual Glu-1By alleles. Moreover, we demonstrated that glutenin genes possess a ‘gene-body’ methylation epigenetic profile with a hypermethylated CDS part and a hypomethylated promoter region. The obtained information raised an interesting question about the role of gene-body methylation in the regulation of glutenin gene expression. Taken together, our work discloses the potential of the nCATS approach for sequencing target genes in plants with a big genome size.

    Epigenetic Stress and Long-Read cDNA Sequencing of Sunflower (Helianthus annuus L.) Revealed the Origin of the Plant Retrotranscriptome

    Transposable elements (TEs) contribute not only to genome diversity but also to transcriptome diversity in plants. To unravel the sources of LTR retrotransposon (RTE) transcripts in sunflower, we exploited a recently developed transposon activation method (‘TEgenesis’) along with long-read cDNA Nanopore sequencing. This approach allowed the identification of 56 RTE transcripts from different genomic loci, including full-length and non-autonomous RTEs. Using the mobilome analysis, we provided a new set of expressed and transpositionally active sunflower RTEs for future studies. Among them, a Ty3/Gypsy RTE called SUNTY3 exhibited ongoing transposition activity, as detected by eccDNA analysis. We showed that the sunflower genome contains a diverse set of non-autonomous RTEs encoding a single RTE protein, including the previously described TR-GAG (terminal repeat with the GAG domain) as well as new categories: TR-RT-RH, TR-RH, and TR-INT-RT. Our results demonstrate that 40% of the loci for RTE-related transcripts (nonLTR-RTEs) lack their LTR sequences and resemble conventional eukaryotic genes encoding RTE-related proteins with unknown functions. Phylogenetic analysis showed that three nonLTR-RTEs encode GAG (HadGAG1-3) fused to a host protein. These HadGAG proteins have homologs in other plant species, potentially indicating GAG domestication. Ultimately, we found that the sunflower retrotranscriptome originated from the transcription of active RTEs, non-autonomous RTEs, and gene-like RTE transcripts, including those encoding domesticated proteins.

    Transposons Hidden in Arabidopsis thaliana Genome Assembly Gaps and Mobilization of Non-Autonomous LTR Retrotransposons Unravelled by Nanotei Pipeline

    Long-read data is a great tool to discover new active transposable elements (TEs). However, no ready-to-use tools were available to gather this information from low-coverage ONT datasets. Here, we developed a novel pipeline, nanotei, that allows detection of TE-containing structural variants, including individual TE transpositions. We exploited this pipeline to identify TE insertions in the Arabidopsis thaliana genome. Using nanotei, we identified tens of TE copies that were hidden in genome assembly gaps, including copies of the well-characterized ONSEN retrotransposon family. The results demonstrate that some TEs are inaccessible for analysis with the current A. thaliana (TAIR10.1) genome assembly. We further explored the mobilome of the ddm1 mutant, which has elevated TE activity. Nanotei captured all TEs previously known to be active in ddm1 and also identified transposition of non-autonomous TEs. Of them, one non-autonomous TE, derived from AT5TE33540, belongs to the TR-GAG retrotransposons, which have a single open reading frame (ORF) encoding the GAG protein. These results provide the first direct evidence that TR-GAGs and other non-autonomous LTR retrotransposons can transpose in the plant genome, albeit in the absence of most of the encoded proteins. In summary, nanotei is a useful tool to detect active TEs and their insertions in plant genomes using low-coverage data from Nanopore genome sequencing.