
    High-Throughput Sequencing and <i>De Novo</i> Assembly of <i>Brassica oleracea</i> var. <i>Capitata</i> L. for Transcriptome Analysis

    <div><p>Background</p><p>The cabbage, <i>Brassica oleracea</i> var. <i>capitata</i> L., has a distinguishable phenotype within the genus <i>Brassica</i>. Despite the economic and genetic importance of cabbage, there is little genomic data for cabbage, and most studies of <i>Brassica</i> are focused on other species or other <i>B. oleracea</i> subspecies. The lack of genomic data for cabbage, a non-model organism, hinders research on its molecular biology. Hence, the construction of reliable transcriptomic data based on high-throughput sequencing technologies is needed to enhance our understanding of cabbage and provide genomic information for future work.</p><p>Methodology/Principal Findings</p><p>We constructed cDNAs from total RNA isolated from the roots, leaves, flowers, seedlings, and calcium-limited seedling tissues of two cabbage genotypes: 102043 and 107140. We sequenced a total of six different samples using the Illumina HiSeq platform, producing 40.5 Gbp of sequence data comprising 401,454,986 short reads. We assembled 205,046 transcripts (≥ 200 bp) using the Velvet and Oases assemblers and predicted 53,562 loci from the transcripts. We annotated 35,274 of the loci with 55,916 plant peptides in the Phytozome database. The average length of the annotated loci was 1,419 bp. We confirmed the reliability of the sequencing assembly using reverse-transcriptase PCR to identify tissue-specific gene candidates among the annotated loci.</p><p>Conclusion</p><p>Our study provides valuable transcriptome sequence data for <i>B. oleracea</i> var. <i>capitata</i> L., offering a new resource for studying <i>B. oleracea</i> and closely related species. Our transcriptomic sequences will enhance the quality of gene annotation and functional analysis of the cabbage genome and serve as a material basis for future genomic research on cabbage. The sequencing data from this study can be used to develop molecular markers and to identify the extreme differences among the phenotypes of different species in the genus <i>Brassica</i>.</p></div>

    KEGG annotation of the cabbage assembly.

    <p>KEGG annotation was performed using 18,761 TAIR IDs; 733 of the TAIR IDs covered 14 KEGG pathways. The 1,410 cabbage loci annotated by those TAIR IDs were sorted to the corresponding KEGG pathways.</p>

    Histogram of the GO classification.

    <p>The cabbage loci were annotated in three ontology categories: ‘Biological Processes’, ‘Cellular Component’, and ‘Molecular Function’.</p>

    RT-PCR of tissue-specific cabbage genes.

    <p>RT-PCR was performed with leaf and root samples of cultivar 107140 and the flower sample of cultivar 102043. The RT-PCR results of the leaf-specific (A), flower-specific (B) and root-specific (C) candidate loci are shown.</p>

    Length distribution and reference gene coverage rate of the full-length cabbage loci.

    <p>Of the 35,274 loci annotated with genes from the Phytozome database using BLAST, 11,438 loci were predicted to be full-length loci. (A) The minimum length was 226 bp, and the maximum length was 16,439 bp. The largest number of full-length loci was in the range of 1,201–1,500 bp. (B) Pie chart of the 35,274 loci classified by percentage of coverage on the reference gene.</p>

    Summary statistics of the assemblies of the cabbage sequence data showing the performances of the multiple-k <i>de novo</i> assemblies.

    <p>1. <i>k-mer</i>: Length of the identical overlap that Velvet requires between two reads.</p><p>2. N50: Contig length-weighted median.</p><p>3. Average length: Average length of a contig = total length / the number of contigs.</p><p>4. Max length: Length of the longest contig.</p><p>5. Total length: Summed length of all contigs.</p>
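
    The summary statistics defined in these footnotes can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `assembly_stats` and the toy contig lengths are assumptions made for the example, and N50 is computed as the length of the contig at which the running total of the longest-first sorted lengths first reaches half of the total length.

```python
def assembly_stats(contig_lengths):
    """Return N50, average, maximum and total length for a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")

    # Sort longest first so the running sum reaches half the total as early as possible.
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)

    running = 0
    n50 = 0
    for length in lengths:
        running += length
        # N50: length of the contig at which half of the total length is covered.
        if running * 2 >= total:
            n50 = length
            break

    return {
        "n50": n50,
        "average_length": total / len(lengths),  # total length / number of contigs
        "max_length": lengths[0],
        "total_length": total,
    }


# Toy contig lengths in bp (hypothetical, not taken from the study).
stats = assembly_stats([200, 300, 400, 500, 600, 1000])
print(stats)
```

    For the toy input the total is 3,000 bp, so the two longest contigs (1,000 bp and 600 bp) already cover more than half of the assembly and the N50 is 600 bp.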

    Workflow of the transcriptome assembly and the analysis of high-throughput sequencing data.

    <p>The transcriptome assembly and the identification of full-length transcripts were carried out as a single workflow. The quality analysis of the sequence data, the data trimming, and the read length sorting were performed by the Solexa QA, Dynamic Trim, and Length sort programs, respectively. The optimal hash length for the assembly was selected by testing several hash lengths with an in-house pipeline. The assembled transcripts with more than 90% coverage of the Arabidopsis genome were analyzed to identify full-length transcripts. The transcripts with both a 5′UTR and a 3′UTR were defined as full-length transcripts (fl-transcripts).</p>

    E-values of the cabbage loci annotation.

    <p>We annotated 35,274 of 53,562 cabbage loci (65.9%) with 26,971 plant peptide sequences from the Phytozome database. The e-values of 25,472 of the cabbage loci were equal to zero, accounting for more than 72% of the annotated loci.</p>
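
    The two proportions quoted in this caption follow directly from the locus counts. The short check below uses only the counts stated above; the helper name `percent` is an assumption made for the illustration.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts as stated in the caption.
annotated_share = percent(35_274, 53_562)    # annotated loci among all predicted loci
zero_evalue_share = percent(25_472, 35_274)  # loci with an e-value of zero among annotated loci

print(annotated_share, zero_evalue_share)
```

    The first value reproduces the 65.9% given in the caption, and the second comes out just above 72%, consistent with "more than 72%".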