Reference-free high-throughput SNP detection in pea: an example of discoSnp usage for a non-model complex genome

Abstract

Background / Purpose: Detecting Single Nucleotide Polymorphisms (SNPs) between genomes is a routine task with next-generation sequencing (NGS) data. SNP detection methods generally need a reference genome. As non-model organisms are increasingly investigated, reference-free methods are needed. The discoSnp method detects SNPs directly from raw NGS data set(s) without using any third-party information. Pea, a non-model organism, has a complex 4.5 Gb genome with no reference sequence. We compared, on the same set of low-depth pea sequences, the SNPs generated by discoSnp with those published using a previous SNP discovery pipeline and with those generated using a classical mapping approach combining the Bowtie2 and GATK tools.

Main conclusion: The quality of the discoSnp results, together with its very low memory needs and short run time, led us to choose this software for a SNP discovery and direct Genotyping-by-Sequencing project on a set of 48 pea genomic DNA libraries from a recombinant inbred line subpopulation sequenced with Illumina HiSeq2000 technology. The analysis identified 88,851 polymorphic SNPs in this population, of which around 60k will be genetically mapped.
