Different patterns of alternative splicing between total RNA-seq and polyribosomal RNA-seq.
<p>(A) Six classes of alternative splicing events in the two samples. RI-SI: retained intron or skipped intron. RE-SE: retained exon or skipped exon. IWI-TWI: initiation within intron or termination within intron. AA: alternative acceptor. AD: alternative donor. ATE: alternative terminal exon. A G-test was used to calculate likelihood ratio statistics. (B) Distribution of retained intron sizes predicted by PASA in total RNA-seq (top circle) and polyribosomal RNA-seq (bottom circle).</p>
Polyribosomal RNA-Seq Reveals the Decreased Complexity and Diversity of the Arabidopsis Translatome
<div><p>Recent RNA-seq studies reveal that the transcriptomes of animals and plants are more complex than previously thought, leading to the inclusion of many more splice isoforms in annotated genomes. However, a significant proportion of these transcripts may be spurious isoforms that do not contribute to functional proteins. One current hypothesis is that commonly used mRNA extraction methods isolate both pre-mature (nuclear) mRNA and mature (cytoplasmic) mRNA, and that the incompletely spliced pre-mature mRNAs account for a large proportion of the spurious transcripts. To investigate this, we compared a traditional RNA-seq dataset (total RNA-seq) and a ribosome-bound RNA-seq dataset (polyribosomal RNA-seq) from <i>Arabidopsis thaliana</i>. An integrative framework that combined <i>de novo</i> assembly and genome-guided assembly was applied to reconstruct the transcriptomes of the two datasets. Up to 44.8% of the <i>de novo</i> assembled transcripts in the total RNA-seq sample were of low abundance, whereas only 0.09% of those in the polyribosomal RNA-seq <i>de novo</i> assembly were of low abundance. The final round of assembly using PASA (Program to Assemble Spliced Alignments) produced more transcript assemblies for the total RNA-seq sample than for the polyribosomal sample. Comparison of alternative splicing (AS) patterns between total and polyribosomal RNA-seq showed a significant difference (G-test, p-value < 0.01) in intron retention events: 46.4% of AS events in the total sample were intron retention, whereas only 23.5% showed evidence of intron retention in the polyribosomal sample. It is likely that a large proportion of the retained introns in total RNA-seq result from incompletely spliced pre-mature mRNA. Overall, this study demonstrated that polyribosomal RNA-seq technology decreased the complexity and diversity of the coding transcriptome by eliminating pre-mature mRNAs, especially those of low abundance.</p></div>
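The abstract reports a G-test on intron retention proportions but does not give the underlying event counts. The following is a minimal sketch of a G-test of independence on a 2x2 table, not the authors' actual analysis. The function name `g_test_2x2` and the counts in the usage example are hypothetical; the counts are chosen only to mirror the reported 46.4% and 23.5% on an assumed 1000 events per sample.

```python
import math


def g_test_2x2(table):
    """G-test of independence for a 2x2 table of counts.

    table: [[a, b], [c, d]], rows are samples, columns are event classes.
    Returns (G statistic, p-value) using a chi-square reference
    distribution with 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # a zero cell contributes nothing to G
                g += 2.0 * observed * math.log(observed / expected)

    g = max(g, 0.0)  # guard against tiny negative rounding error
    # Survival function of chi-square with 1 d.f.: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Hypothetical counts (NOT from the study): intron retention vs. other
# AS events, assuming 1000 events in each sample.
hypothetical = [[464, 536],   # total RNA-seq
                [235, 765]]   # polyribosomal RNA-seq
g_stat, p = g_test_2x2(hypothetical)
print(f"G = {g_stat:.1f}, p = {p:.3g}")
```

With real data, the observed counts per event class from each assembly would replace the hypothetical table; for tables larger than 2x2 the degrees of freedom and the p-value calculation would need to change accordingly.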
Coverage profiles along <i>A. thaliana</i> chromosomes and TAIR10 annotated CDS.
<p>(A) Distribution of RNA-seq read density along chromosome length for total RNA-seq (left) and polyribosomal RNA-seq (right). The y-axis shows the median read density on a log2 scale. (B) Distribution of RNA-seq read coverage along the length of the transcriptional unit. The median depth of coverage (log2 scale) along the length of each individual TAIR10 annotated cDNA was calculated and plotted against the relative length of the transcriptional unit for total RNA-seq and polyribosomal RNA-seq. (C) Coverage over the length of TAIR10 annotated CDS. Box-and-whisker plots depict coverage, calculated as the percentage of bases along the length of the cDNA sequence that were supported by reads from the total and polyribosomal RNA-seq datasets. The bottom and top of the boxes represent the 25<sup>th</sup> and 75<sup>th</sup> percentiles, respectively. The lines within the boxes represent the medians.</p>
Alternative splicing of <i>ATGSTF11</i> (A) and <i>AFC2</i> (B) genes.
<p>TAIR10 gene models, full-length cDNA (FL-cDNA), assembled transcripts and read alignments in the total RNA-seq and polyribosomal RNA-seq datasets are shown from top to bottom. Retained introns in total RNA-seq are highlighted with red rectangles. In the read-alignment tracks, red and blue represent forward and reverse reads, respectively.</p>
Summary MCMC tree of HPIV-3 genome sequences.
<p>Phylogenetic tree estimated from 268 genome sequences, including those sequenced in this project and those available in GenBank. The x-axis represents a time scale in years. Because of its size, the tree has been split to make it easier to visualize. A) Lineage C3a has been collapsed. B) All subclades besides lineage C3a are collapsed. Subclades are highlighted with different colors and labels. Branch tips are labelled with the GenBank accession, the ISO 3166-1 alpha-3 code of the collection country, and the year of sample collection.</p>
Evolutionary rates for each coding sequence in HPIV-1 and HPIV-3.
<p>Error bars represent the 95% highest posterior density interval.</p>
Sites under positive selection for HPIV-1 and HPIV-3.
