46 research outputs found

    Comparisons of De Novo Transcriptome Assemblers in Diploid and Polyploid Species Using Peanut (Arachis spp.) RNA-Seq Data

    The narrow genetic base and limited genetic information on Arachis species have hindered the process of marker-assisted selection of peanut cultivars. However, recent developments in sequencing technologies have expanded opportunities to exploit genetic resources, and at lower cost. To use the genetic information for Arachis species available at the transcriptome level, it is important to have a good-quality reference transcriptome. The available Tifrunner 454 FLEX transcriptome sequences have an assembly with 37,000 contigs and low N50 values of 500-751 bp. Therefore, we generated de novo transcriptome assemblies, with about 38 million reads in the tetraploid cultivar OLin and 16 million reads in each of the diploids, A. duranensis K38901 and A. ipaënsis KGBSPSc30076, using three different de novo assemblers: Trinity, SOAPdenovo-Trans and Trans-ABySS. All these assemblers can use single k-mer analysis, and the latter two also permit multiple k-mer analysis. Assemblies generated for all three samples had N50 values ranging from 1,278 to 1,641 bp in Arachis hypogaea (AABB), 1,401 to 1,492 bp in Arachis duranensis (AA), and 1,107 to 1,342 bp in Arachis ipaënsis (BB). Comparison with legume ESTs and protein databases suggests that the assemblies generated had more than 40% full-length transcripts with good continuity. Also, on mapping the raw reads to each of the assemblies generated, Trinity had a higher success rate in assembling sequences than both Trans-ABySS and SOAPdenovo-Trans. The de novo assembly of OLin had a greater number of contigs (67,098) and longer contig length (N50 = 1,641 bp) than the Tifrunner TSA. Despite having a shorter read length (2 × 50) than the Tifrunner 454 FLEX TSA, the de novo assembly of OLin proved superior. Assemblies generated to represent different genome combinations may serve as a valuable resource for the peanut research community.
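The abstract above compares assemblies by their N50 values. As a minimal sketch of how that statistic is conventionally computed (the function name `n50` and the sample lengths are illustrative assumptions, not taken from the paper), N50 is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length:

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2            # half of the total assembly length
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running total to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths, for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

A higher N50 means that more of the assembled sequence sits in long contigs, which is why the abstract treats the increase from 500-751 bp to over 1,000 bp as an improvement.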

    PeanutMap: an online genome database for comparative molecular maps of peanut

    BACKGROUND: Molecular maps have been developed for many species, and are of particular importance for varietal development and comparative genomics. However, despite the existence of multiple sets of linkage maps, databases of these data are lacking for many species, including peanut. DESCRIPTION: PeanutMap provides a web-based interface for viewing specific linkage groups of a map set. PeanutMap can display and compare multiple maps of a set based upon marker or trait correspondences, which is particularly important as cultivated peanut is a disomic tetraploid. The database can also compare linkage groups among multiple map sets, allowing identification of corresponding linkage groups from results of different research projects. Data from the two published peanut genome map sets, and also from three map sets of phenotypic traits, are present in the database. Data from PeanutMap have been incorporated into the Legume Information System website to allow peanut map data to be used for cross-species comparisons. CONCLUSION: The utility of the database is expected to increase as several SSR-based maps are currently being developed, and expanded efforts for comparative mapping of legumes are underway. Optimal use of these data will benefit from the development of tools to facilitate comparative analysis.

    A First Insight into Population Structure and Linkage Disequilibrium in the U.S. Peanut Minicore Collection

    Knowledge of genetic diversity, population structure, and degree of linkage disequilibrium (LD) in target association mapping populations is of great importance and is a prerequisite for LD-based mapping. In the present study, 96 genotypes comprising 92 accessions of the US peanut minicore collection, a component line of the tetraploid variety Florunner, diploid progenitors A. duranensis (AA) and A. ipaënsis (BB), and synthetic amphidiploid accession TxAG-6 were investigated with 392 simple sequence repeat (SSR) marker bands amplified using 32 highly polymorphic SSR primer pairs. Both distance- and model-based (Bayesian) cluster analysis revealed the presence of structured diversity. In general, the wild-species accessions and the synthetic amphidiploid grouped separately from most minicore accessions except for COC155, and were eliminated from most subsequent analyses. UPGMA analysis divided the population into four subgroups: two major subgroups representing subspecies fastigiata and hypogaea, a third group containing individuals from each subspecies or possibly of mixed ancestry, and a fourth group, either consisting of COC155 alone if wild species were excluded, or of COC155, the diploid species, and the synthetic amphidiploid. Model-based clustering identified four subgroups: one each for the fastigiata and hypogaea subspecies, a third consisting of individuals of both subspecies or of mixed ancestry predominantly from Africa or Asia, and a fourth group consisting of individuals predominantly of var. fastigiata, peruviana, and aequatoriana accessions from South America, including COC155. Analysis of molecular variance (AMOVA) revealed statistically significant (P < 0.0001) genetic variance of 16.87% among subgroups. A total of 4.85% of SSR marker pairs revealed significant LD (at r² ≥ 0.1). Of the syntenic marker pairs separated by distances < 10 cM, 11–20 cM, 21–50 cM, and > 50 cM, 19.33, 5.19, 6.25 and 5.29% of marker pairs, respectively, were found in strong LD (P ≤ 0.01), in accord with LD extending to great distances in self-pollinated crops. A threshold value of r² > 0.035 was found to distinguish mean r² values of linkage distance groups statistically from the mean r² values of unlinked markers; LD was found to extend to 10 cM over the entire minicore collection by this criterion. However, there were large differences in r² values among marker pairs, even among tightly linked markers. The implications of these findings with regard to the possibility of using association mapping for detection of genome-wide SSR marker-phenotype association are discussed.
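The LD results above are reported as r² values between pairs of marker bands. As a rough sketch of the standard definition, assuming each band is scored as present (1) or absent (0) in every accession, r² is the squared covariance of the two bands divided by the product of their variances. The function name `ld_r2` and the scores below are hypothetical; the abstract does not say which software or exact estimator the authors used.

```python
def ld_r2(band_a, band_b):
    """Squared correlation (r^2) between two 0/1-scored marker bands."""
    n = len(band_a)
    if n == 0 or n != len(band_b):
        raise ValueError("bands must be non-empty and of equal length")
    p_a = sum(band_a) / n                                  # frequency of band A
    p_b = sum(band_b) / n                                  # frequency of band B
    p_ab = sum(a * b for a, b in zip(band_a, band_b)) / n  # both bands present
    d = p_ab - p_a * p_b                                   # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic band")
    return d * d / denom


# Hypothetical scores for four accessions, for illustration only.
perfectly_linked = ld_r2([1, 1, 0, 0], [1, 1, 0, 0])
independent = ld_r2([1, 1, 0, 0], [1, 0, 1, 0])
```

Under this definition, the r² ≥ 0.1 and r² > 0.035 thresholds in the abstract correspond to weak but detectable correlation between band scores.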

    Assemblies compared in a pairwise fashion using MUMmer, and the proportions covered from each of the assemblies are shown below.

    The upper triangular values for each accession represent the proportion of sequences at the left that were present in the sequences at the top of the triangle. The lower triangular values represent the proportion of sequences in the accession at the top that were present in the accession at the left.

    Statistics on assemblies generated after merging multiple k-mer assemblies using Trans-ABySS and the non-redundant assemblies from Trinity and ABySS [dedup – no redundant sequences, Mmer – multiple merged k-mer assemblies].
