143 research outputs found
Genome-Wide Distribution and Organization of Microsatellites in Plants: An Insight into Marker Development in Brachypodium
Plant genomes are complex and contain large amounts of repetitive DNA including microsatellites that are distributed across entire genomes. Whole genome sequences of several monocot and dicot plants that are available in the public domain provide an opportunity to study the origin, distribution and evolution of microsatellites, and also facilitate the development of new molecular markers. In the present investigation, a genome-wide analysis of microsatellite distribution in monocots (Brachypodium, sorghum and rice) and dicots (Arabidopsis, Medicago and Populus) was performed. A total of 797,863 simple sequence repeats (SSRs) were identified in the whole genome sequences of six plant species. Characterization of these SSRs revealed that mono-nucleotide repeats were the most abundant repeats, and that the frequency of repeats decreased with increase in motif length both in monocots and dicots. However, the frequency of SSRs was higher in dicots than in monocots both for nuclear and chloroplast genomes. Interestingly, GC-rich repeats were the dominant repeats only in monocots, with the majority of them being present in the coding region. These coding GC-rich repeats were found to be involved in different biological processes, predominantly binding activities. In addition, a set of 22,879 SSR markers that were validated by e-PCR were developed and mapped on different chromosomes in Brachypodium for the first time, with a frequency of 101 SSR markers per Mb. Experimental validation of 55 markers showed successful amplification of 80% SSR markers in 16 Brachypodium accessions. An online database "BraMi" (Brachypodium microsatellite markers) of these genome-wide SSR markers was developed and made available in the public domain. The observed differential patterns of SSR marker distribution would be useful for studying microsatellite evolution in a monocot-dicot system. SSR markers developed in this study would be helpful for genomic studies in Brachypodium and related grass species, especially for the map based cloning of the candidate gene(s).
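The abstract above describes scanning genome sequence for SSRs and classifying them by motif length. A minimal sketch of that kind of scan is given here for perfect repeats only. The minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions made for illustration; the abstract does not state the thresholds or the tool that the study used.

```python
import re

# Minimum number of repeat units per motif length (1-6 bp).
# These thresholds are illustrative assumptions, not values from the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n-1,} demands at least n copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT"),
            # so each repeat is reported once under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


example = "CCG" + "A" * 12 + "CGT" + "AG" * 7 + "C"
print(find_ssrs(example))
```

Counting the returned tuples by `len(motif)` gives the per-motif-length frequencies that the abstract compares between monocots and dicots.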
Scanning and filling : ultra-dense SNP genotyping combining genotyping-by-sequencing, SNP array and whole-genome resequencing data
Genotyping-by-sequencing (GBS) represents a highly cost-effective high-throughput genotyping approach. By nature, however, GBS is subject to generating sizeable amounts of missing data and these will need to be imputed for many downstream analyses. The extent to which such missing data can be tolerated in calling SNPs has not been explored widely. In this work, we first explore the use of imputation to fill in missing genotypes in GBS datasets. Importantly, we use whole genome resequencing data to assess the accuracy of the imputed data. Using a panel of 301 soybean accessions, we show that over 62,000 SNPs could be called when tolerating up to 80% missing data, a five-fold increase over the number called when tolerating up to 20% missing data. At all levels of missing data examined (between 20% and 80%), the resulting SNP datasets were of uniformly high accuracy (96-98%). We then used imputation to combine complementary SNP datasets derived from GBS and a SNP array (SoySNP50K). We thus produced an enhanced dataset of >100,000 SNPs and the genotypes at the previously untyped loci were again imputed with a high level of accuracy (95%). Of the >4,000,000 SNPs identified through resequencing 23 accessions (among the 301 used in the GBS analysis), 1.4 million tag SNPs were used as a reference to impute this large set of SNPs on the entire panel of 301 accessions. These previously untyped loci could be imputed with around 90% accuracy. Finally, we used the 100K SNP dataset (GBS + SoySNP50K) to perform a GWAS on seed oil content within this collection of soybean accessions. Both the number of significant marker-trait associations and the peak significance levels were improved considerably using this enhanced catalog of SNPs relative to a smaller catalog resulting from GBS alone at 20% missing data. Our results demonstrate that imputation can be used to fill in both missing genotypes and untyped loci with very high accuracy and that this leads to more powerful genetic analyses.
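Two steps in the abstract above lend themselves to a short illustration: keeping SNPs up to a tolerated missing-data fraction, and scoring imputed genotypes against resequencing calls. The sketch below assumes genotypes coded as 0/1/2 with `None` for a missing call. The function names and the toy data are hypothetical, and the imputation step itself is not shown because the abstract does not name the method.

```python
def filter_by_missing(genotypes, max_missing):
    """Keep SNPs whose fraction of missing calls is at most max_missing.

    genotypes maps a SNP id to a list of calls, one per accession,
    with None marking a missing call.
    """
    kept = {}
    for snp, calls in genotypes.items():
        missing = sum(c is None for c in calls) / len(calls)
        if missing <= max_missing:
            kept[snp] = calls
    return kept


def concordance(imputed, truth):
    """Fraction of imputed calls that match the truth set (None in truth is skipped)."""
    pairs = [(i, t) for i, t in zip(imputed, truth) if t is not None]
    return sum(i == t for i, t in pairs) / len(pairs)


# Toy data for five accessions; values are invented for illustration only.
toy = {
    "snp1": [0, None, 2, 1, 0],           # 20% missing
    "snp2": [None, None, None, None, 1],  # 80% missing
    "snp3": [None, None, None, 1, 2],     # 60% missing
}
print(sorted(filter_by_missing(toy, 0.2)))  # strict threshold
print(sorted(filter_by_missing(toy, 0.8)))  # permissive threshold
print(concordance([0, 1, 2, 1, 0], [0, 1, 2, 2, 0]))
```

Raising `max_missing` from 0.2 to 0.8 admits more SNPs, which mirrors the trade-off the abstract reports; whether accuracy holds up depends on the imputation method and the panel.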
Development of a RAD-Seq Based DNA Polymorphism Identification Software, AgroMarker Finder, and Its Application in Rice Marker-Assisted Breeding
Rapid and accurate genome-wide marker detection is essential to marker-assisted breeding and functional genomics studies. In this work, we developed an integrated software package, AgroMarker Finder (AMF: http://erp.novelbio.com/AMF), which provides a graphical user interface (GUI) to facilitate the analysis of restriction-site associated DNA (RAD) sequencing data in rice. Using AMF, a total of 90,743 high-quality markers (82,878 SNPs and 7,865 InDels) were detected between rice varieties JP69 and Jiaoyuan5A. The density of the identified markers is 0.2 per Kb for SNP markers and 0.02 per Kb for InDel markers. Sequencing validation revealed that the accuracy of genome-wide marker detection by AMF is 93%. In addition, a validated subset of 82 SNPs and 31 InDels was found to be closely linked to 117 important agronomic trait genes, providing a basis for subsequent marker-assisted selection (MAS) and variety identification. Furthermore, we selected 12 of the 31 validated InDel markers to verify the seed authenticity of variety Jiaoyuanyou69, and we also identified 10 markers closely linked to the fragrance gene BADH2 to minimize linkage drag in the selection of Wuxiang075 (BADH2 donor)/Jiachang1 recombinants. Therefore, this software provides an efficient approach for marker identification from RAD-seq data, and it would be a valuable tool for plant MAS and variety protection.
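The marker densities quoted above are marker counts divided by genome length in kilobases. The sketch below reproduces that arithmetic. The genome length `ASSUMED_GENOME_BP` is a round placeholder chosen for illustration, not a figure from the abstract, and `density_per_kb` is a hypothetical helper name.

```python
# Round placeholder for the reference genome length in base pairs.
# This is an assumption for illustration; the abstract does not give the length used.
ASSUMED_GENOME_BP = 400_000_000


def density_per_kb(n_markers, genome_bp=ASSUMED_GENOME_BP):
    """Markers per kilobase of genome sequence."""
    return n_markers / (genome_bp / 1000)


print(round(density_per_kb(82_878), 1))  # SNP markers
print(round(density_per_kb(7_865), 2))   # InDel markers
```

Under this assumed length the rounded values agree with the densities reported in the abstract (0.2 and 0.02 per Kb); a different reference length would shift the unrounded figures.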
Genome survey of pistachio (Pistacia vera L.) by next generation sequencing: Development of novel SSR markers and genetic diversity in Pistacia species
Transcriptome analysis in switchgrass discloses ecotype difference in photosynthetic efficiency
Targeted association mapping demonstrating the complex molecular genetics of fatty acid formation in soybean
GBSX: a toolkit for experimental design and demultiplexing genotyping by sequencing experiments
QTL mapping and candidate genes for resistance to Fusarium ear rot and fumonisin contamination in maize
Analysis of microsatellites in the vulnerable orchid Gastrodia flavilabella: the development of microsatellite markers, and cross-species amplification in Gastrodia