
    The Cassava Genome: Current Progress, Future Directions

    The starchy swollen roots of cassava provide an essential food source for nearly a billion people, as well as possibilities for bioenergy, yet improvements to nutritional content and resistance to threatening diseases are currently impeded. A 454-based whole genome shotgun sequence has been assembled, which covers 69% of the predicted genome size and 96% of protein-coding gene space, with genome finishing underway. The predicted 30,666 genes and 3,485 alternate splice forms are supported by 1.4 M expressed sequence tags (ESTs). Maps based on simple sequence repeat (SSR)- and EST-derived single nucleotide polymorphisms (SNPs) already exist. Thanks to the genome sequence, a high-density linkage map is currently being developed from a cross between two diverse cassava cultivars: one susceptible to cassava brown streak disease, the other resistant. An efficient genotyping-by-sequencing (GBS) approach is being developed to catalog SNPs both within the mapping population and among diverse African farmer-preferred varieties of cassava. These resources will accelerate marker-assisted breeding programs, allowing improvements in disease resistance and nutrition, and will help us understand the genetic basis for disease resistance.

    QTL mapping in autotetraploids using SNP dosage information

    Dense linkage maps derived by analysing SNP dosage in autotetraploids provide detailed information about the location of, and genetic model at, quantitative trait loci. Recent developments in sequencing and genotyping technologies enable researchers to generate high-density single nucleotide polymorphism (SNP) genotype data for mapping studies. For polyploid species, the SNP genotypes are informative about allele dosage, and Hackett et al. (PLoS ONE 8:e63939, 2013) presented theory about how dosage information can be used in linkage map construction and quantitative trait locus (QTL) mapping for an F1 population in an autotetraploid species. Here, QTL mapping using dosage information is explored for simulated phenotypic traits of moderate heritability and possibly non-additive effects. Different mapping strategies are compared, looking at additive and more complicated models, and model fitting as a single step or by iteratively re-weighted modelling. We recommend fitting an additive model without iterative re-weighting, and then exploring non-additive models for the genotype means estimated at the most likely position. We apply this strategy to re-analyse traits of high heritability from a potato population of 190 F1 individuals: flower colour, maturity, height and resistance to late blight (Phytophthora infestans (Mont.) de Bary) and potato cyst nematode (Globodera pallida), using a map of 3839 SNPs. The approximate confidence intervals for QTL locations have been improved by the detailed linkage map, and more information about the genetic model at each QTL has been revealed. For several of the reported QTLs, candidate SNPs can be identified, and used to propose candidate trait genes. We conclude that the high marker density is informative about the genetic model at loci of large effects, but that larger populations are needed to detect smaller QTLs.
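
The recommendation in this abstract (fit an additive model first, then inspect genotype means for non-additive patterns) can be illustrated with a deliberately simplified single-marker sketch. It assumes each individual has a known allele dosage from 0 to 4 at one SNP and fits an ordinary least-squares line. This is not the interval-mapping procedure of Hackett et al.; the function names `fit_additive` and `dosage_class_means` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def fit_additive(dosages: List[int], phenotypes: List[float]) -> Tuple[float, float]:
    """Least-squares fit of phenotype = intercept + slope * dosage.

    Returns (intercept, slope). The slope is the estimated effect of
    one extra copy of the counted allele under a purely additive model.
    """
    if len(dosages) != len(phenotypes) or not dosages:
        raise ValueError("dosages and phenotypes must be non-empty and equal in length")
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("all individuals share one dosage; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def dosage_class_means(dosages: List[int], phenotypes: List[float]) -> Dict[int, float]:
    """Mean phenotype per dosage class, for spotting non-additive patterns."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for x, y in zip(dosages, phenotypes):
        groups[x].append(y)
    return {x: sum(ys) / len(ys) for x, ys in sorted(groups.items())}


# Toy data (hypothetical): a perfectly additive trait.
toy_dosages = [0, 1, 2, 3, 4]
toy_phenotypes = [10.0, 12.0, 14.0, 16.0, 18.0]
intercept, slope = fit_additive(toy_dosages, toy_phenotypes)
class_means = dosage_class_means(toy_dosages, toy_phenotypes)
```

Comparing `class_means` with the fitted line is a rough analogue of the second step: class means that depart strongly from the line would suggest a non-additive model is worth exploring.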

    Scanning and filling: ultra-dense SNP genotyping combining genotyping-by-sequencing, SNP array and whole-genome resequencing data

    Genotyping-by-sequencing (GBS) represents a highly cost-effective high-throughput genotyping approach. By nature, however, GBS is subject to generating sizeable amounts of missing data and these will need to be imputed for many downstream analyses. The extent to which such missing data can be tolerated in calling SNPs has not been explored widely. In this work, we first explore the use of imputation to fill in missing genotypes in GBS datasets. Importantly, we use whole genome resequencing data to assess the accuracy of the imputed data. Using a panel of 301 soybean accessions, we show that over 62,000 SNPs could be called when tolerating up to 80% missing data, a five-fold increase over the number called when tolerating up to 20% missing data. At all levels of missing data examined (between 20% and 80%), the resulting SNP datasets were of uniformly high accuracy (96–98%). We then used imputation to combine complementary SNP datasets derived from GBS and a SNP array (SoySNP50K). We thus produced an enhanced dataset of >100,000 SNPs and the genotypes at the previously untyped loci were again imputed with a high level of accuracy (95%). Of the >4,000,000 SNPs identified through resequencing 23 accessions (among the 301 used in the GBS analysis), 1.4 million tag SNPs were used as a reference to impute this large set of SNPs on the entire panel of 301 accessions. These previously untyped loci could be imputed with around 90% accuracy. Finally, we used the 100K SNP dataset (GBS + SoySNP50K) to perform a GWAS on seed oil content within this collection of soybean accessions. Both the number of significant marker-trait associations and the peak significance levels were improved considerably using this enhanced catalog of SNPs relative to a smaller catalog resulting from GBS alone at 20% missing data. Our results demonstrate that imputation can be used to fill in both missing genotypes and untyped loci with very high accuracy and that this leads to more powerful genetic analyses.
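
The effect of the missing-data tolerance described in this abstract can be sketched as a simple per-SNP filter. The example below is a minimal illustration, assuming genotypes are stored as alternate-allele counts with `None` for a missing call; the function names and the three toy SNPs are hypothetical and are not taken from the study's pipeline.

```python
from typing import Dict, List, Optional

# Alternate-allele count (0, 1 or 2), or None when the call is missing.
Genotype = Optional[int]


def missing_fraction(calls: List[Genotype]) -> float:
    """Fraction of samples with no genotype call at one SNP."""
    if not calls:
        raise ValueError("SNP has no samples")
    return sum(c is None for c in calls) / len(calls)


def filter_by_missingness(
    snps: Dict[str, List[Genotype]], max_missing: float
) -> Dict[str, List[Genotype]]:
    """Keep only SNPs whose missing fraction is at most max_missing."""
    return {
        name: calls
        for name, calls in snps.items()
        if missing_fraction(calls) <= max_missing
    }


# Toy data (hypothetical): three SNPs typed on five samples.
toy = {
    "snp_a": [0, 1, 2, 0, 1],              # 0% missing
    "snp_b": [0, None, 2, None, 1],        # 40% missing
    "snp_c": [None, None, None, None, 1],  # 80% missing
}
strict = filter_by_missingness(toy, 0.20)   # tolerate up to 20% missing
lenient = filter_by_missingness(toy, 0.80)  # tolerate up to 80% missing
```

Raising the threshold retains more SNPs at the cost of more genotypes to impute afterwards, which is the trade-off the abstract evaluates against resequencing data.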

    Single nucleotide polymorphism discovery in elite North American potato germplasm

    BACKGROUND: Current breeding approaches in potato rely almost entirely on phenotypic evaluations; molecular markers, with the exception of a few linked to disease resistance traits, are not widely used. Large-scale sequence datasets generated primarily through Sanger Expressed Sequence Tag projects are available from a limited number of potato cultivars and access to next generation sequencing technologies permits rapid generation of sequence data for additional cultivars. When coupled with the advent of high throughput genotyping methods, an opportunity now exists for potato breeders to incorporate considerably more genotypic data into their decision-making. RESULTS: To identify a large number of Single Nucleotide Polymorphisms (SNPs) in elite potato germplasm, we sequenced normalized cDNA prepared from three commercial potato cultivars: 'Atlantic', 'Premier Russet' and 'Snowden'. For each cultivar, we generated 2 Gb of sequence which was assembled into a representative transcriptome of ~28-29 Mb for each cultivar. Using the Maq SNP filter, which filters on read depth, density, and quality, 575,340 SNPs were identified within these three cultivars. In parallel, 2,358 SNPs were identified within existing Sanger sequences for three additional cultivars, 'Bintje', 'Kennebec', and 'Shepody'. Using a stringent set of filters in conjunction with the potato reference genome, we identified 69,011 high confidence SNPs from these six cultivars for use in genotyping with the Infinium platform. Ninety-six of these SNPs were used with a BeadXpress assay to assess allelic diversity in a germplasm panel of 248 lines; 82 of the SNPs proved sufficiently informative for subsequent analyses. Within diverse North American germplasm, the chip processing market class was most distinct, clearly separated from all other market classes. The round white and russet market classes both include fresh market and processing cultivars. Nevertheless, the russet and round white market classes are more distant from each other than processing types are from fresh market types within these two groups. CONCLUSIONS: The genotype data generated in this study, albeit limited in number, have revealed distinct relationships among the market classes of potato. The SNPs identified in this study will enable high-throughput genotyping of germplasm and populations, which in turn will enable more efficient marker-assisted breeding efforts in potato.

    Sequence-Based Genotyping for Marker Discovery and Co-Dominant Scoring in Germplasm and Populations

    Conventional marker-based genotyping platforms are widely available, but not without their limitations. In this context, we developed Sequence-Based Genotyping (SBG), a technology for simultaneous marker discovery and co-dominant scoring, using next-generation sequencing. SBG offers users several advantages including a generic sample preparation method, a highly robust genome complexity reduction strategy to facilitate de novo marker discovery across entire genomes, and a uniform bioinformatics workflow strategy to achieve genotyping goals tailored to individual species, regardless of the availability of a reference sequence. The most distinguishing features of this technology are the ability to genotype any population structure, regardless of whether parental data is included, and the ability to co-dominantly score SNP markers segregating in populations. To demonstrate the capabilities of SBG, we performed marker discovery and genotyping in Arabidopsis thaliana and lettuce, two plant species of diverse genetic complexity and backgrounds. Initially we obtained 1,409 SNPs for arabidopsis, and 5,583 SNPs for lettuce. Further filtering of the SNP dataset produced over 1,000 high quality SNP markers for each species. We obtained a genotyping rate of 201.2 genotypes/SNP and 58.3 genotypes/SNP for arabidopsis (n = 222 samples) and lettuce (n = 87 samples), respectively. Linkage mapping using these SNPs resulted in stable map configurations. We have therefore shown that the SBG approach presented provides users with the utmost flexibility in garnering high quality markers that can be directly used for genotyping and downstream applications. Until technological advances and lower costs allow for routine whole-genome sequencing of populations, we expect that sequence-based genotyping technologies such as SBG will be essential for genotyping of model and non-model genomes alike.

    Genetic Characterization of a Core Set of a Tropical Maize Race Tuxpeño for Further Use in Maize Improvement

    The tropical maize race Tuxpeño is a well-known race of Mexican dent germplasm which has greatly contributed to the development of tropical and subtropical maize gene pools. In order to investigate how it could be exploited in future maize improvement, a panel of maize germplasm accessions was assembled and characterized using genome-wide Single Nucleotide Polymorphism (SNP) markers. This panel included 321 core accessions of the Tuxpeño race from the International Maize and Wheat Improvement Center (CIMMYT) germplasm bank collection, 94 CIMMYT maize lines (CMLs) and 54 U.S. Germplasm Enhancement of Maize (GEM) lines. The panel also included other diverse sources of reference germplasm: 14 U.S. maize landrace accessions, 4 temperate inbred lines from the U.S. and China, and 11 CIMMYT populations (a total of 498 entries with 795 plants). Clustering analyses (CA) based on Modified Rogers Distance (MRD) clearly partitioned all 498 entries into their corresponding groups. No subclusters were observed within the Tuxpeño core set. Various breeding strategies for using the Tuxpeño core set, based on the grouping of the studied germplasm and the genetic distances among them, are discussed. In order to facilitate sampling diversity within the Tuxpeño core, a minicore subset of 64 Tuxpeño accessions (about 20% of the core set) representing the diversity of the core set was developed, using an approach combining phenotypic and molecular data. The untapped diversity of the Tuxpeño landrace can be used further for maize improvement through the core and/or minicore subsets available to the maize community.
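
The abstract does not state how the Modified Rogers Distance was computed. The sketch below assumes the commonly cited form, MRD = sqrt( sum over loci and alleles of (p - q)^2 / (2m) ), where p and q are the allele frequencies of two entries and m is the number of loci. The function name and toy frequencies are hypothetical.

```python
import math
from typing import Sequence


def modified_rogers_distance(
    p: Sequence[Sequence[float]], q: Sequence[Sequence[float]]
) -> float:
    """Modified Rogers distance between two entries.

    p[i][j] and q[i][j] are the frequencies of allele j at locus i in
    each entry. Assumes the usual form sqrt(sum (p - q)^2 / (2 * m)),
    which ranges from 0 (identical) to 1 (fixed for different alleles).
    """
    if len(p) != len(q) or not p:
        raise ValueError("both entries need the same non-zero number of loci")
    total = 0.0
    for locus_p, locus_q in zip(p, q):
        if len(locus_p) != len(locus_q):
            raise ValueError("allele counts differ at a locus")
        total += sum((a - b) ** 2 for a, b in zip(locus_p, locus_q))
    return math.sqrt(total / (2 * len(p)))


# Toy biallelic data (hypothetical): two loci per entry.
fixed_ref = [[1.0, 0.0], [1.0, 0.0]]  # fixed for the first allele
fixed_alt = [[0.0, 1.0], [0.0, 1.0]]  # fixed for the second allele
mixed = [[0.5, 0.5], [0.5, 0.5]]      # both alleles at equal frequency

d_opposite = modified_rogers_distance(fixed_ref, fixed_alt)
d_same = modified_rogers_distance(fixed_ref, fixed_ref)
d_half = modified_rogers_distance(fixed_ref, mixed)
```

A pairwise matrix of such distances over all entries is the kind of input a clustering analysis would take; the clustering method itself is not specified in the abstract.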

    Common garden experiments in the genomic era: new perspectives and opportunities

    PdV was supported by a doctoral studentship from the French Ministère de la Recherche et de l’Enseignement Supérieur. OEG was supported by the Marine Alliance for Science and Technology for Scotland (MASTS).
    The study of local adaptation is made difficult by many confounding evolutionary phenomena (e.g. genetic drift and demographic history). When complex traits are involved in local adaptation, phenomena such as phenotypic plasticity further hamper the efforts of evolutionary biologists to study the complex relationships between phenotype, genotype and environment. In this perspective paper, we suggest that the common garden experiment, specifically designed to deal with phenotypic plasticity, has a clear role to play in the study of local adaptation, even (if not especially) in the genomic era. After a quick review of some high-throughput genotyping protocols relevant in the context of a common garden, we explore how to improve common garden analyses with dense marker panel data and recent statistical methods. We then show how combining approaches from population genomics and genome-wide association studies with the settings of a common garden can yield a very efficient, thorough and integrative study of local adaptation. In particular, evidence of genomic (e.g. genome scan) and phenotypic origin provides independent insights into the possibility of local adaptation scenarios, and genome-wide association studies in the context of a common garden experiment make it possible to decipher the genetic bases of adaptive traits.
    Postprint. Peer reviewed.