15 research outputs found

    Construction and Analysis of High-Density Linkage Map Using High-Throughput Sequencing Data

    No full text
    <div><p>Linkage maps enable the study of important biological questions. The construction of high-density linkage maps appears more feasible since the advent of next-generation sequencing (NGS), which eases SNP discovery and high-throughput genotyping of large population. However, the marker number explosion and genotyping errors from NGS data challenge the computational efficiency and linkage map quality of linkage study methods. Here we report the HighMap method for constructing high-density linkage maps from NGS data. HighMap employs an iterative ordering and error correction strategy based on a k-nearest neighbor algorithm and a Monte Carlo multipoint maximum likelihood algorithm. Simulation study shows HighMap can create a linkage map with three times as many markers as ordering-only methods while offering more accurate marker orders and stable genetic distances. Using HighMap, we constructed a common carp linkage map with 10,004 markers. The singleton rate was less than one-ninth of that generated by JoinMap4.1. Its total map distance was 5,908 cM, consistent with reports on low-density maps. HighMap is an efficient method for constructing high-density, high-quality linkage maps from high-throughput population NGS data. It will facilitate genome assembling, comparative genomic analysis, and QTL studies. HighMap is available at <a href="http://highmap.biomarker.com.cn/" target="_blank">http://highmap.biomarker.com.cn/</a>.</p></div

    Modules of HighMap algorithm.

    No full text
    <p>A: The single-linkage clustering algorithm was used to partition the marker loci into linkage groups based on a pairwise modified independence LOD score for the recombination frequency. B and B': The ordering module combines Gibbs sampling, spatial sampling, and simulated annealing algorithm to order markers and estimate map distances. C: The error correction module identified singletons according to parental contribution of genotypes and eliminated them from the data using <i>k</i>-nearest neighbor algorithm. To order markers correctly, the processes of ordering and error correction were carried out iteratively. D: Heat maps and haplotype maps were constructed to evaluate map quality.</p

    SLAF-seq: An Efficient Method of Large-Scale <em>De Novo</em> SNP Discovery and Genotyping Using High-Throughput Sequencing

    Get PDF
    <div><p>Large-scale genotyping plays an important role in genetic association studies. It has provided new opportunities for gene discovery, especially when combined with high-throughput sequencing technologies. Here, we report an efficient solution for large-scale genotyping. We call it specific-locus amplified fragment sequencing (SLAF-seq). SLAF-seq technology has several distinguishing characteristics: i) deep sequencing to ensure genotyping accuracy; ii) reduced representation strategy to reduce sequencing costs; iii) pre-designed reduced representation scheme to optimize marker efficiency; and iv) double barcode system for large populations. In this study, we tested the efficiency of SLAF-seq on rice and soybean data. Both sets of results showed strong consistency between predicted and practical SLAFs and considerable genotyping accuracy. We also report the highest density genetic map yet created for any organism without a reference genome sequence, common carp in this case, using SLAF-seq data. We detected 50,530 high-quality SLAFs with 13,291 SNPs genotyped in 211 individual carp. The genetic map contained 5,885 markers with 0.68 cM intervals on average. A comparative genomics study between common carp genetic map and zebrafish genome sequence map showed high-quality SLAF-seq genotyping results. SLAF-seq provides a high-resolution strategy for large-scale genotyping and can be generally applicable to various species and populations.</p> </div

    Genetic map validation by recombination mapping.

    No full text
    <p>Each two rows represent a genome in a CP population including 211 progenies and 2 parents. Columns correspond to chromosomes. Red and blue shading indicate maternal or paternal haplotype, respectively. Pink shading indicates ambiguous haplotypes, and grey shading indicates missing data. Only 1.51% of the markers were found in small recombination blocks.</p

    Pilot SLAF-seq data analysis using rice and soybeans.

    No full text
    <p>(a)and(b) Insert size distribution of SLAFs. SLAF length was found to cluster tightly around a mean of 430 bp, with 85% of SLAFs in the centermost 50 bp. (c) and (d)Distribution of SLAFs on the chromosomes. SLAFs were evenly distributed on the chromosomes in rice and soybeans. The gap in the middle was caused by the absence of centromere sequences. (e)and(f) Customized SLAF density design. In the rice pilot case, the density was designed using 20 kb per SLAF. In soybeans, 40 kb per SLAF was used. Both rice and soybean pilot SLAF data were found to be consistent with theoretical predictions.</p

    Genotyping quality in common F1 population.

    No full text
    <p>The genotyping quality score was used to select qualified markers and individuals for subsequent analysis. This is a dynamic optimization process. We counted low-quality markers for each SLAF marker and for each individual, and deleted the worst marker or individual. We repeated this process, deleting an individual or a marker each time until the average genotyping quality score of all SLAF markers reached the cutoff value, which was 13. (a)Detailed genotyping quality of SLAF-seqdata. (b) Cumulative quality score distribution of 7559 markers in 166 individuals.</p
    corecore