13 research outputs found

    Full-Length Enriched cDNA Libraries and ORFeome Analysis of Sugarcane Hybrid and Ancestor Genotypes

    Sugarcane is a major crop used for food and bioenergy production. Modern cultivars are hybrids derived from crosses between Saccharum officinarum and Saccharum spontaneum. Hybrid cultivars combine favorable characteristics from ancestral species and contain a genome that is highly polyploid and aneuploid, containing 100–130 chromosomes. These complex genomes represent a huge challenge for molecular studies and for the development of biotechnological tools that can facilitate sugarcane improvement. Here, we describe full-length enriched cDNA libraries for Saccharum officinarum, Saccharum spontaneum, and one hybrid genotype (SP803280) and analyze the set of open reading frames (ORFs) in their genomes (i.e., their ORFeomes). We found 38,195 (19%) sugarcane-specific transcripts that did not match transcripts from other databases. Less than 1.6% of all transcripts were ancestor-specific (i.e., not expressed in SP803280). We also found 78,008 putative new sugarcane transcripts that were absent in the largest sugarcane expressed sequence tag database (SUCEST). Functional annotation showed a high frequency of protein kinases and stress-related proteins. We also detected natural antisense transcript expression, which mapped to 94% of all plant KEGG pathways; however, each genotype showed different pathways enriched in antisense transcripts. Our data appeared to cover 53.2% (17,563 genes) and 46.8% (937 transcription factors) of all sugarcane full-length genes and transcription factors, respectively. This work represents a significant advancement in defining the sugarcane ORFeome and will be useful for protein characterization, single nucleotide polymorphism and splicing variant identification, evolutionary and comparative studies, and sugarcane genome assembly and annotation.

    Description of cloned full-length cDNA libraries.

    a: Excluding low-quality sequences and repeated clones. b: Excluding sugarcane-specific transcripts.

    Functional annotation of sugarcane full-length cDNA contigs.

    a: For some databases, several contigs showed more than one annotation or categorization. b: Based on annotation from Sorghum bicolor, Z. mays, P. virgatum, Setaria italica, O. sativa, and B. distachyon.

    Full-length enrichment for library cloning and next generation sequencing (NGS).

    Full-length (blue line with 5′ cap) or truncated (short blue line without 5′ cap) mRNAs were reverse transcribed into first-strand cDNA using oligo-dT primers (red arrow). The mRNA:cDNA hybrid was treated with RNase I (scissors) to remove the single-stranded RNA that was not fully extended by the first-strand cDNA, followed by selection for full-length transcripts using Cap-antibody magnetic beads to enrich the full-length mRNA:cDNA. The full-length single-stranded DNA (FLssDNA) was eluted from the beads and used for both cDNA library cloning (lower left) and NGS (lower right). For full-length library cloning, a double-stranded adaptor (green) was linked to the 5′ end of the ssDNA. Second-strand cDNA synthesis was then carried out, followed by cloning into a vector. For NGS, the full-length enriched ssDNA was fragmented by sonication to target fragments in the range of 200–400 bp, followed by ligation of the double-stranded DNA sequencing adaptor mixture (purple) to the 3′ and 5′ ends of the ssDNA. To maintain the complexity of the library while enriching the full-length cDNA for NGS, the original polyA mRNA was also fragmented using RNase III, followed by ligation of the double-stranded RNA sequencing adaptor mixture (brown) to the 3′ and 5′ ends of the mRNA. After first- and second-strand synthesis, the polyA and capped mRNA and polyA and non-capped mRNA samples were mixed in a 3:1 ratio and applied to the downstream NGS procedure.

    Length distribution of sugarcane ORFs, full-length transcripts (FL), and UTRs.

    Graphs denote the comparison of sugarcane length distribution (gray bars) of ORFs (A) and full-length transcripts (B) to other grasses (colored lines). Length distribution of 5′ and 3′ UTRs (C, black and white bars, respectively) of sugarcane full-length transcripts is shown as well.

    Number of full-length transcripts identified by each analysis.

    Sugarcane transcripts were first mapped against full-length grass transcripts and were then assigned as full-length if the CDS coverage was ≥95% and the identity was ≥80% (Analysis I). In this analysis, we found 9,960 full-length contigs, which aligned against 29,914 grass genes. Analysis II took into account more than one contig that mapped to the same grass gene (Figure S2, http://www.plosone.org/article/info:doi/10.1371/journal.pone.0107351#pone.0107351.s002), and non-overlapped coverage was calculated. Those that fit in the criteria of coverage ≥95% and identity ≥80% were considered full-length. This analysis yielded 26,384 contigs representing 3,952 unique grass genes and therefore 3,952 full-length sugarcane transcripts. For Analysis III, we calculated the average CDS size based on six grasses (Table S2, http://www.plosone.org/article/info:doi/10.1371/journal.pone.0107351#pone.0107351.s004), and all contigs with a predicted protein larger than this average CDS size were considered to be full-length.
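The three criteria in this caption can be illustrated with a minimal Python sketch. Only the two thresholds (coverage ≥95%, identity ≥80%) come from the caption. The function names, the half-open interval representation of alignments, the rule that each contig contributing to Analysis II must itself meet the identity threshold, and the choice to compare lengths in the same unit for Analysis III are all assumptions made for illustration; this is not the authors' pipeline.

```python
from typing import Iterable, List, Tuple

# Thresholds quoted in the caption; everything else below is a hypothetical illustration.
MIN_COVERAGE = 0.95
MIN_IDENTITY = 0.80

# Assumed representation: half-open [start, end) positions on the reference grass CDS.
Interval = Tuple[int, int]


def merged_length(intervals: Iterable[Interval]) -> int:
    """Number of reference positions covered, counting overlapping regions once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block (if any) and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def is_full_length_single(aligned: Interval, cds_length: int, identity: float) -> bool:
    """Analysis I style check: one contig aligned to one grass CDS."""
    start, end = aligned
    coverage = (end - start) / cds_length
    return coverage >= MIN_COVERAGE and identity >= MIN_IDENTITY


def is_full_length_combined(hits: List[Tuple[Interval, float]], cds_length: int) -> bool:
    """Analysis II style check: several contigs on the same CDS, non-overlapping coverage.

    Assumption: a contig contributes only if its own identity meets the threshold.
    """
    kept = [interval for interval, identity in hits if identity >= MIN_IDENTITY]
    return merged_length(kept) / cds_length >= MIN_COVERAGE


def is_full_length_by_size(predicted_cds_length: int, average_cds_length: float) -> bool:
    """Analysis III style check: predicted coding length above the grass average.

    Assumption: both lengths are expressed in the same unit (e.g. nucleotides).
    """
    return predicted_cds_length > average_cds_length
```

For example, two contigs aligned to positions 0–500 and 450–980 of a 1,000-nt CDS cover 980 distinct positions between them, so together they would pass the 95% coverage criterion even though neither does alone.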

    Number of grass transcripts mapping to sugarcane contigs.

    A, Percentage of total grass transcripts mapping to sugarcane contigs. B, Total grass transcripts mapping to sugarcane contigs (white bars) and total sugarcane contigs mapping to each grass database (black bars). C, Total sugarcane contigs mapping to grasses, Uniprot, and NR databases and total unmatched sugarcane contigs (putative sugarcane-specific transcripts).

    Functional annotation of full-length contigs (40,407).

    The graph shows the 20 most frequent categories from Phytozome annotation. In total, 5,038 categories were observed (Table S4, http://www.plosone.org/article/info:doi/10.1371/journal.pone.0107351#pone.0107351.s006).