
    Construction and Analysis of High-Density Linkage Map Using High-Throughput Sequencing Data

    <div><p>Linkage maps enable the study of important biological questions. The construction of high-density linkage maps appears more feasible since the advent of next-generation sequencing (NGS), which eases SNP discovery and high-throughput genotyping of large populations. However, the marker number explosion and genotyping errors from NGS data challenge the computational efficiency and linkage map quality of linkage study methods. Here we report the HighMap method for constructing high-density linkage maps from NGS data. HighMap employs an iterative ordering and error correction strategy based on a k-nearest neighbor algorithm and a Monte Carlo multipoint maximum likelihood algorithm. A simulation study shows that HighMap can create a linkage map with three times as many markers as ordering-only methods while offering more accurate marker orders and stable genetic distances. Using HighMap, we constructed a common carp linkage map with 10,004 markers. The singleton rate was less than one-ninth of that generated by JoinMap4.1. Its total map distance was 5,908 cM, consistent with reports on low-density maps. HighMap is an efficient method for constructing high-density, high-quality linkage maps from high-throughput population NGS data. It will facilitate genome assembly, comparative genomic analysis, and QTL studies. HighMap is available at <a href="http://highmap.biomarker.com.cn/" target="_blank">http://highmap.biomarker.com.cn/</a>.</p></div>
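The abstract above names a k-nearest neighbor algorithm as the basis of the error correction step but does not spell it out. The following is only a minimal sketch of the general idea, not the HighMap implementation: along an already ordered map, a genotype call that disagrees with all of its k nearest neighbors on either side is treated as a likely genotyping error (a singleton) and overwritten. The function name `correct_singletons`, the `'A'`/`'B'` parental-origin coding, and the unanimity rule are all assumptions made for illustration.

```python
from collections import Counter


def correct_singletons(haplotype, k=2):
    """Overwrite isolated calls that contradict all nearby calls.

    haplotype: parental-origin calls ('A', 'B', or None for missing)
    for one individual, listed in map order. Hypothetical coding.
    """
    corrected = list(haplotype)
    n = len(haplotype)
    for i, call in enumerate(haplotype):
        lo, hi = max(0, i - k), min(n, i + k + 1)
        # Up to k non-missing neighbours on each side of marker i.
        neighbours = [
            haplotype[j]
            for j in range(lo, hi)
            if j != i and haplotype[j] is not None
        ]
        if not neighbours:
            continue
        winner, count = Counter(neighbours).most_common(1)[0]
        # Change the call only when every neighbour agrees with each
        # other and the call itself disagrees (this also fills in
        # missing calls that sit inside a uniform block).
        if count == len(neighbours) and call != winner:
            corrected[i] = winner
    return corrected
```

Under this rule an isolated `B` inside a run of `A` calls is corrected, while a genuine crossover such as `AAABBB` is left alone because the neighbours around the breakpoint do not agree with each other.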

    Modules of HighMap algorithm.

    <p>A: The single-linkage clustering algorithm was used to partition the marker loci into linkage groups based on a pairwise modified independence LOD score for the recombination frequency. B and B': The ordering module combines Gibbs sampling, spatial sampling, and simulated annealing algorithms to order markers and estimate map distances. C: The error correction module identified singletons according to the parental contribution of genotypes and eliminated them from the data using a <i>k</i>-nearest neighbor algorithm. To order markers correctly, the processes of ordering and error correction were carried out iteratively. D: Heat maps and haplotype maps were constructed to evaluate map quality.</p>
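The grouping step in module A can be pictured with a short sketch. It assumes the pairwise LOD scores have already been computed and stored in a symmetric matrix; how the "modified independence LOD" is calculated is not given here, so the sketch takes the matrix as input. The function name `single_linkage_groups` and the threshold value in the test are illustrative assumptions, not part of HighMap.

```python
def single_linkage_groups(lod, threshold):
    """Partition markers into linkage groups by single linkage.

    lod: symmetric n x n matrix (list of lists) of pairwise LOD scores.
    Two markers land in the same group if a chain of pairs with
    LOD >= threshold connects them.
    """
    n = len(lod)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        # Follow parent links to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if lod[i][j] >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Single linkage merges groups through any one strong pair, so markers 0 and 2 can end up together even when their own LOD score is low, as long as both are linked to marker 1.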

    SLAF-seq: An Efficient Method of Large-Scale <em>De Novo</em> SNP Discovery and Genotyping Using High-Throughput Sequencing

    <div><p>Large-scale genotyping plays an important role in genetic association studies. It has provided new opportunities for gene discovery, especially when combined with high-throughput sequencing technologies. Here, we report an efficient solution for large-scale genotyping. We call it specific-locus amplified fragment sequencing (SLAF-seq). SLAF-seq technology has several distinguishing characteristics: i) deep sequencing to ensure genotyping accuracy; ii) reduced representation strategy to reduce sequencing costs; iii) pre-designed reduced representation scheme to optimize marker efficiency; and iv) double barcode system for large populations. In this study, we tested the efficiency of SLAF-seq on rice and soybean data. Both sets of results showed strong consistency between predicted and practical SLAFs and considerable genotyping accuracy. We also report the highest density genetic map yet created for any organism without a reference genome sequence, common carp in this case, using SLAF-seq data. We detected 50,530 high-quality SLAFs with 13,291 SNPs genotyped in 211 individual carp. The genetic map contained 5,885 markers with 0.68 cM intervals on average. A comparative genomics study between the common carp genetic map and the zebrafish genome sequence map showed high-quality SLAF-seq genotyping results. SLAF-seq provides a high-resolution strategy for large-scale genotyping and can be generally applicable to various species and populations.</p></div>

    Genetic map validation by recombination mapping.

    <p>Every two rows represent one genome in a CP population comprising 211 progeny and 2 parents. Columns correspond to chromosomes. Red and blue shading indicate maternal and paternal haplotypes, respectively. Pink shading indicates ambiguous haplotypes, and grey shading indicates missing data. Only 1.51% of the markers were found in small recombination blocks.</p>

    Pilot SLAF-seq data analysis using rice and soybeans.

    <p>(a) and (b) Insert size distribution of SLAFs. SLAF length was found to cluster tightly around a mean of 430 bp, with 85% of SLAFs in the centermost 50 bp. (c) and (d) Distribution of SLAFs on the chromosomes. SLAFs were evenly distributed on the chromosomes in rice and soybeans. The gap in the middle was caused by the absence of centromere sequences. (e) and (f) Customized SLAF density design. In the rice pilot case, the density was designed using 20 kb per SLAF. In soybeans, 40 kb per SLAF was used. Both rice and soybean pilot SLAF data were found to be consistent with theoretical predictions.</p>

    SLAF-seq flowchart.

    <p>i) Pre-design scheme for SLAF selection using training data. The reduced representation design must be decided based on marker efficiency characteristics, which include random distribution throughout the genome, uniqueness in the genome, and consistent amplification efficiency among selected markers. A pilot experiment was performed to evaluate the amplification efficiency based on the pre-designed scheme. ii) SLAF-seq library construction. Genomic DNA was digested by groups of enzymes designed for individuals. Double barcodes were added in two rounds of PCR to discriminate each individual and to facilitate the pooling of samples for size selection, which maintained consistent fragment size among individuals. iii) Deep sequencing for the pooled RRLs with the Illumina paired-end sequencing protocol, and genotype definition and validation by software.</p>
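To illustrate how a double barcode system lets pooled reads be traced back to individuals, here is a minimal sketch of demultiplexing. It is not the SLAF-seq software: it assumes, purely for illustration, that each paired-end read carries one fixed-length barcode at the start of each mate and that only exact matches are accepted. The function name `demultiplex`, the `sample_sheet` layout, and the barcode sequences in the test are hypothetical.

```python
def demultiplex(read_pairs, sample_sheet, bc_len=4):
    """Assign paired reads to individuals by their barcode pair.

    read_pairs: iterable of (read1, read2) sequence strings.
    sample_sheet: dict mapping (barcode1, barcode2) -> individual ID.
    bc_len: assumed barcode length at the 5' end of each mate.
    """
    bins = {sample: [] for sample in sample_sheet.values()}
    unassigned = []
    for r1, r2 in read_pairs:
        key = (r1[:bc_len], r2[:bc_len])
        sample = sample_sheet.get(key)
        if sample is None:
            # Unknown or error-containing barcode pair: set aside.
            unassigned.append((r1, r2))
        else:
            # Store the reads with the barcodes trimmed off.
            bins[sample].append((r1[bc_len:], r2[bc_len:]))
    return bins, unassigned
```

Because an individual is identified by the combination of two barcodes, a modest set of barcode sequences can label a large population: with a barcodes in the first position and b in the second, up to a × b individuals can be distinguished.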