Quantitative and qualitative differences in celiac disease epitopes among durum wheat varieties identified through deep RNA-amplicon sequencing
BACKGROUND: Wheat gluten is important for the industrial quality of bread wheat (Triticum aestivum L.) and durum wheat (T. turgidum L.). Gluten proteins are also the source of immunogenic peptides that can trigger a T cell reaction in celiac disease (CD) patients, leading to inflammatory responses in the small intestine. Various peptides with three major T cell epitopes involved in CD are derived from the alpha-gliadin fraction of gluten. Alpha-gliadins are encoded by a large multigene family, and amino acid variation in the CD epitopes is known to influence the immunogenicity of individual gene family members. Current commercial methods of gluten detection are unable to distinguish between immunogenic and non-immunogenic CD epitope variants and thus cannot accurately quantify the overall CD epitope load of a given wheat variety. Such quantification is indispensable for correct selection of wheat varieties with low potential to cause CD. RESULTS: A 454 RNA-amplicon sequencing method was developed for alpha-gliadin transcripts encompassing the three major CD epitopes and their variants. The method was used to screen developing grains on plants of 61 different durum wheat cultivars and accessions. A dedicated sequence analysis pipeline returned a total of 304 unique alpha-gliadin transcripts, corresponding to a total of 171 ‘unique deduced protein fragments’ of alpha-gliadins. The numbers of these fragments obtained in each plant were used to calculate quantitative and qualitative differences between the CD epitopes expressed in the endosperm of these wheat plants. A few plants showed a lower fraction of CD epitope-encoding alpha-gliadin transcripts, but none were free of CD epitopes. CONCLUSIONS: The dedicated 454 RNA-amplicon sequencing method enables 1) the grouping of wheat plants according to the genetic variation in alpha-gliadin transcripts, and 2) the screening for plants which are potentially less CD-immunogenic.
The resulting alpha-gliadin sequence database will be useful as a reference in proteomics analysis regarding the immunogenic potential of mature wheat grains.
Characterisation of sugar beet (Beta vulgaris L. ssp. vulgaris) varieties using microsatellite markers
Background: Sugar beet is an obligate outcrossing species. Varieties consist of mixtures of plants from various parental combinations. As the number of informative morphological characteristics is limited, this leads to some problems in variety registration research. Results: We have developed 25 new microsatellite markers for sugar beet. A selection of 12 markers with high quality patterns was used to characterise 40 diploid and triploid varieties. For each variety 30 individual plants were genotyped. The markers amplified 3-21 different alleles. Varieties had up to 7 different alleles at one marker locus. All varieties could be distinguished. For the diploid varieties, the expected heterozygosity ranged from 0.458 to 0.744. The average inbreeding coefficient F_is was 0.282 ± 0.124, but it varied widely among marker loci, from F_is = +0.876 (heterozygote deficiency) to F_is = -0.350 (excess of heterozygotes). The genetic differentiation among diploid varieties was relatively constant among markers (F_st = 0.232 ± 0.027). Among triploid varieties the genetic differentiation was much lower (F_st = 0.100 ± 0.010). The overall genetic differentiation between diploid and triploid varieties was F_st = 0.133 across all loci. Part of this differentiation may coincide with the differentiation among breeders' gene pools, which was F_st = 0.063. Conclusions: Based on a combination of scores for individual plants all varieties can be distinguished using the 12 markers developed here. The markers may also be used for mapping and in molecular breeding. In addition, they may be employed in studying gene flow from crop to wild populations.
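The heterozygosity and inbreeding statistics quoted above can be illustrated with a minimal sketch for a single diploid marker locus. It uses the textbook definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and F_is as one minus the ratio of observed to expected heterozygosity). The abstract does not say which estimators or software the authors used, so the function names and the toy genotypes are assumptions made purely for illustration.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Expected heterozygosity He = 1 - sum(p_i^2) at one diploid locus.

    genotypes: list of (allele, allele) tuples, one per plant.
    """
    alleles = [a for genotype in genotypes for a in genotype]
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def observed_heterozygosity(genotypes):
    """Fraction of plants carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def inbreeding_coefficient(genotypes):
    """F_is = 1 - Ho / He (positive: heterozygote deficiency)."""
    he = expected_heterozygosity(genotypes)
    if he == 0:
        raise ValueError("locus is monomorphic; F_is is undefined")
    return 1.0 - observed_heterozygosity(genotypes) / he


# Hypothetical genotypes (allele sizes in bp) for four plants of one variety.
plants = [(152, 152), (158, 158), (152, 158), (152, 158)]
print(expected_heterozygosity(plants), inbreeding_coefficient(plants))
```

A positive F_is indicates fewer heterozygotes than expected under random mating and a negative value indicates an excess, which matches the interpretation given in the abstract.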
Development of microsatellite markers for identifying Brazilian Coffea arabica varieties
Microsatellite markers, also known as SSRs (Simple Sequence Repeats), have proved to be excellent tools for identifying varieties and determining genetic relationships. A set of 127 SSR markers was used to analyze genetic similarity in twenty-five Coffea arabica varieties. These comprised nineteen commercially important Brazilian varieties and six interspecific hybrids of Coffea arabica, Coffea canephora and Coffea liberica. The set used comprised 52 newly developed SSR markers derived from microsatellite-enriched libraries, 56 designed on the basis of coffee SSR sequences available from public databases, 6 already published, and 13 universal chloroplast microsatellite markers. Only 22 were polymorphic, detecting 2-7 alleles per marker, with an average of 2.5. Based on the banding patterns generated by polymorphic SSR loci, the set of twenty-five coffee varieties was clustered into two main groups, one composed of only Brazilian varieties, and the other of interspecific hybrids with a few Brazilian varieties. Color mutants could not be separated. Clustering was in accordance with material genealogy, thereby revealing high similarity
Analysis of inheritance mode in chrysanthemum using EST-derived SSR markers
To study the inheritance mode of hexaploid chrysanthemum (random or preferential chromosome pairing), a segregation analysis was carried out using SSR markers derived from chrysanthemum ESTs in the public domain. A total of 248 EST-SSR primer pairs were screened in chrysanthemum cultivars 'Dancer' and 'Puma White', of which 49 EST-SSRs were selected as polymorphic and informative markers. These polymorphic markers were used for genotyping an F1 pseudo-testcross population derived from a cross between these two cultivars. The 49 EST-SSRs detected 210 marker alleles with an average number of 4.29 marker alleles per locus. For 180 of these polymorphic SSR marker alleles, segregation could be estimated using a χ2 goodness of fit test (α=0.05) with the expected segregation ratios for hexasomic or disomic inheritance. For 65 SSR marker alleles the segregation ratio was informative for the type of inheritance; 33 marker alleles gave a good fit to the expected segregation ratio for hexasomic inheritance, whereas 24 marker alleles gave a good fit for disomic inheritance, showing a higher number of marker alleles supporting autopolyploid segregation in chrysanthemum. In addition, the observed ratio of non-simplex to simplex markers was 20:80 (25 vs. 99), which supported hexasomic inheritance. Furthermore, random marker allele assortment was found within the six fully informative markers, giving conclusive evidence for hexasomic inheritance in chrysanthemum at these chromosomal regions
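The χ2 goodness-of-fit comparison described above can be sketched as follows. The abstract does not list the segregation ratios that were tested, so the example assumes a marker allele present in two copies in one parent and absent in the other, scored as present or absent in the offspring. Ignoring double reduction, hexasomic inheritance then predicts 4:1, while disomic inheritance with the two copies on different homologous pairs predicts 3:1. The offspring counts are invented for illustration.

```python
# Critical value of the chi-square distribution for 1 degree of freedom
# at alpha = 0.05 (two phenotype classes: allele present / absent).
CRITICAL_DF1_ALPHA05 = 3.841


def chi_square_stat(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits(observed, expected_ratio, critical=CRITICAL_DF1_ALPHA05):
    """True if the expected ratio is not rejected at the chosen critical value."""
    return chi_square_stat(observed, expected_ratio) < critical


# Hypothetical counts: 320 offspring with the allele, 80 without.
observed = [320, 80]
print("hexasomic 4:1 fits:", fits(observed, [4, 1]))
print("disomic 3:1 fits:  ", fits(observed, [3, 1]))
```

With smaller populations both ratios may be accepted for the same marker allele, which is one reason only a subset of marker alleles is informative about the type of inheritance.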
Comparative Subsequence Sets Analysis (CoSSA) is a robust approach to identify haplotype specific SNPs; mapping and pedigree analysis of a potato wart disease resistance gene Sen3
Standard strategies to identify genomic regions involved in a specific trait variation are often limited by time- and resource-consuming genotyping methods. Other limiting prerequisites are the phenotyping of large segregating populations or of diversity panels and the availability and quality of a closely related reference genome. To overcome these limitations, we designed efficient Comparative Subsequence Sets Analysis (CoSSA) workflows to identify haplotype specific SNPs linked to a trait of interest from Whole Genome Sequencing data. As a model, we used the resistance to Synchytrium endobioticum pathotypes 2, 6 and 18 that co-segregated in a tetraploid full sib population. Genomic DNA from both parents, pedigree genotypes, unrelated potato varieties lacking the wart resistance traits and pools of resistant and susceptible siblings were sequenced. Set algebra and depth filtering of subsequences (k-mers) were used to delete unlinked and common SNPs and to enrich for SNPs from the haplotype(s) harboring the resistance gene(s). Using CoSSA, we identified a major and a minor effect locus. Upon comparison to the reference genome, it was inferred that the major resistance locus, referred to as Sen3, was located on the north arm of chromosome 11 between 1,259,552 and 1,519,485 bp. Furthermore, we could anchor the unanchored superscaffold DMB734 from the potato reference genome to a syntenic interval. CoSSA was also successful in identifying Sen3 in a reference-genome-independent way thanks to the de novo assembly of paired end reads matching haplotype specific k-mers. The de novo assembly provided more R haplotype specific polymorphisms than the corresponding region of the reference genome. CoSSA also offers possibilities for pedigree analysis. The origin of Sen3 was traced back to Ora. Finally, the diagnostic power of the haplotype specific markers was shown using a panel of 56 tetraploid varieties.
CoSSA is an efficient, robust and versatile set of workflows for the genetic analysis of a trait of interest using WGS data. Because the WGS data are used without intermediate reads mapping, CoSSA does not require the use of a reference genome. This approach allowed the identification of Sen3 and the design of haplotype specific, diagnostic markers
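The set algebra and depth filtering of k-mers mentioned in this abstract can be illustrated with a small sketch. It is not the CoSSA implementation: the function names, the depth threshold and the toy reads are assumptions, and reverse complements and sequencing errors are ignored for brevity. The idea shown is only that k-mers seen at sufficient depth in a resistant pool and never in a susceptible pool are candidates for the haplotype carrying the resistance.

```python
from collections import Counter


def kmers(seq, k):
    """All overlapping subsequences of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_counts(reads, k):
    """Count how often each k-mer occurs across a collection of reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def candidate_kmers(resistant_reads, susceptible_reads, k, min_depth=2):
    """k-mers with depth >= min_depth in the resistant pool that are
    absent from the susceptible pool (set difference)."""
    resistant = kmer_counts(resistant_reads, k)
    susceptible = set(kmer_counts(susceptible_reads, k))
    deep = {km for km, c in resistant.items() if c >= min_depth}
    return deep - susceptible


# Toy reads differing by one SNP (C vs. A at position 6).
resistant_pool = ["ACGTACGG", "ACGTACGG"]
susceptible_pool = ["ACGTAAGG"]
print(sorted(candidate_kmers(resistant_pool, susceptible_pool, k=4)))
```

Further set operations of the same kind (for example intersecting with the k-mers of the resistant parent, or subtracting those of unrelated susceptible varieties) would narrow the candidate set in the same way.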
Conclusive evidence for hexasomic inheritance in chrysanthemum based on analysis of a 183 k SNP array
Background: Cultivated chrysanthemum is an outcrossing hexaploid (2n = 6× = 54) with a disputed mode of inheritance. In this paper, we present a single nucleotide polymorphism (SNP) selection pipeline that was used to design an Affymetrix Axiom array with 183 k SNPs from RNA sequencing data (1). With this array, we genotyped four bi-parental populations (with sizes of 405, 53, 76 and 37 offspring plants respectively), and a cultivar panel of 63 genotypes. Further, we present a method for dosage scoring in hexaploids from signal intensities of the array based on mixture models (2) and validation of selection steps in the SNP selection pipeline (3). The resulting genotypic data is used to draw conclusions on the mode of inheritance in chrysanthemum (4), and to make an inference on allelic expression bias (5). Results: With use of the mixture model approach, we successfully called the dosage of 73,936 out of 183,130 SNPs (40.4%) that segregated in any of the bi-parental populations. To investigate the mode of inheritance, we analysed markers that segregated in the large bi-parental population (n = 405). Analysis of segregation of duplex x nulliplex SNPs resulted in evidence for genome-wide hexasomic inheritance. This evidence was substantiated by the absence of strong linkage between markers in repulsion, which indicated absence of full disomic inheritance. We present the success rate of SNP discovery out of RNA sequencing data as affected by different selection steps, among which SNP coverage over genotypes and use of different types of sequence read mapping software. Genomic dosage highly correlated with relative allele coverage from the RNA sequencing data, indicating that most alleles are expressed according to their genomic dosage. Conclusions: The large population, genotyped with a very large number of markers, is a unique framework for extensive genetic analyses in hexaploid chrysanthemum.
As starting point, we show conclusive evidence for genome-wide hexasomic inheritance
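To make the duplex x nulliplex argument concrete, the sketch below computes the expected offspring dosage classes under hexasomic inheritance. It assumes random chromosome pairing without double reduction, so that a gamete receives three of the six homologous chromosomes at random and the number of allele copies it carries follows a hypergeometric distribution. The function name is hypothetical, and this is not the authors' dosage-scoring or analysis code.

```python
from fractions import Fraction
from math import comb


def gamete_dosage_distribution(parent_dosage, ploidy=6):
    """Probability of each allele dosage in a gamete under polysomic
    inheritance (random pairing, no double reduction)."""
    half = ploidy // 2
    total = comb(ploidy, half)
    distribution = {}
    for k in range(half + 1):
        ways = comb(parent_dosage, k) * comb(ploidy - parent_dosage, half - k)
        if ways:
            distribution[k] = Fraction(ways, total)
    return distribution


# Duplex parent (2 copies) crossed with a nulliplex parent (0 copies):
# the nulliplex gamete adds nothing, so offspring dosage equals the
# dosage of the gamete from the duplex parent.
for dosage, prob in gamete_dosage_distribution(2).items():
    print(dosage, prob)
```

Under these assumptions the offspring fall into nulliplex, simplex and duplex classes in a 1:3:1 ratio. Under strict disomic inheritance no more than two dosage classes are expected from such a cross, so an array that scores dosage can separate the two models.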
De Novo assembly of complete chloroplast genomes from non-model species based on a K-mer frequency-based selection of chloroplast reads from total DNA sequences
Whole Genome Shotgun (WGS) sequences of plant species often contain an abundance of reads that are derived from the chloroplast genome. Up to now these reads have generally been identified and assembled into chloroplast genomes based on homology to chloroplasts from related species. This re-sequencing approach may select against structural differences between the genomes especially in non-model species for which no close relatives have been sequenced before. The alternative approach is to de novo assemble the chloroplast genome from total genomic DNA sequences. In this study, we used k-mer frequency tables to identify and extract the chloroplast reads from the WGS reads and assemble these using a highly integrated and automated custom pipeline. Our strategy includes steps aimed at optimizing assemblies and filling gaps which are left due to coverage variation in the WGS dataset. We have successfully de novo assembled three complete chloroplast genomes from plant species with a range of nuclear genome sizes to demonstrate the universality of our approach: Solanum lycopersicum (0.9 Gb), Aegilops tauschii (4 Gb) and Paphiopedilum henryanum (25 Gb). We also highlight the need to optimize the choice of k and the amount of data used. This new and cost-effective method for de novo short read assembly will facilitate the study of complete chloroplast genomes with more accurate analyses and inferences, especially in non-model plant genomes
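The k-mer frequency-based selection of chloroplast reads can be sketched in a few lines. Because the chloroplast genome is present in many copies per cell, k-mers derived from it occur far more often in WGS data than single-copy nuclear k-mers. The sketch keeps reads whose median k-mer count exceeds a threshold. The median criterion, the threshold value and the function names are assumptions for illustration; the abstract does not describe the pipeline's actual selection rule, and nuclear repeats would also pass such a simple filter.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping subsequences of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def select_high_copy_reads(reads, k, min_median_count):
    """Return reads whose median k-mer count in the whole read set is at
    least min_median_count (a stand-in for 'likely organellar')."""
    table = Counter(km for read in reads for km in kmers(read, k))
    selected = []
    for read in reads:
        counts = [table[km] for km in kmers(read, k)]
        if counts and median(counts) >= min_median_count:
            selected.append(read)
    return selected


# Toy data: one sequence sampled five times, two sequences sampled once.
reads = ["AAAACCCC"] * 5 + ["GGGGTTTT", "ACGTACGA"]
print(select_high_copy_reads(reads, k=4, min_median_count=3))
```

Both k and the threshold interact with sequencing depth, which is in line with the abstract's remark that the choice of k and the amount of data need to be optimized.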