Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence
Pucker B, Holtgräwe D, Weisshaar B. Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Research Notes. 2017;10(1): 667.
Abstract
Objective
The Arabidopsis thaliana Niederzenz-1 genome sequence was recently published with an ab initio gene prediction. In-depth analysis of the predicted gene set revealed some errors involving genes with non-canonical splice sites in their introns. Since non-canonical splice sites are difficult to predict ab initio, we checked for options to improve the annotation by transferring annotation information from Araport11, the recently released annotation of the Columbia-0 reference genome sequence.
Results
Incorporation of hints generated from Araport11 enabled the precise prediction of non-canonical splice sites. Manual inspection of RNA-Seq read mapping and RT-PCR were applied to validate the structural annotations of non-canonical splice sites. Predictions of untranslated regions were also updated by harnessing the potential of Araport11's information, which was generated by using high-coverage RNA-Seq data. The improved gene set of the Nd-1 genome assembly (GeneSet_Nd-1_v1.1) was evaluated via comparison to the initial gene prediction (GeneSet_Nd-1_v1.0) as well as against Araport11 for the Col-0 reference genome sequence. GeneSet_Nd-1_v1.1 contains previously missed non-canonical splice sites in 1256 genes. Reciprocal best hits for 24,527 (89.4%) of all nuclear Col-0 genes against the GeneSet_Nd-1_v1.1 indicate a high gene prediction quality.
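The reciprocal-best-hit criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes that pairwise similarity scores between the two gene sets are already available as (query, subject, score) tuples; the abstract does not state which search tool or thresholds were used, and all gene names and scores below are made-up toy values.

```python
def best_hits(scores):
    """Map each query to its single highest-scoring subject.

    scores: iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in scores:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy, hypothetical scores (not data from the publication).
ref_vs_new = [
    ("ref_A", "new_1", 950.0),
    ("ref_A", "new_2", 400.0),
    ("ref_B", "new_2", 700.0),
    ("ref_C", "new_2", 300.0),
]
new_vs_ref = [
    ("new_1", "ref_A", 940.0),
    ("new_2", "ref_B", 710.0),
    ("new_2", "ref_C", 290.0),
]

pairs = reciprocal_best_hits(ref_vs_new, new_vs_ref)
ref_genes = {query for query, _, _ in ref_vs_new}
fraction = len(pairs) / len(ref_genes)  # share of reference genes with an RBH
print(pairs, fraction)
```

In this toy case ref_C has no reciprocal partner because its best hit, new_2, prefers ref_B. The fraction of reference genes with a reciprocal partner is the kind of figure the abstract reports as 89.4%.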
Chloroplast genome sequence of Arabidopsis thaliana accession Landsberg erecta assembled from Single-Molecule, Real-Time sequencing data
Stadermann KB, Holtgräwe D, Weisshaar B. Chloroplast genome sequence of Arabidopsis thaliana accession Landsberg erecta assembled from Single-Molecule, Real-Time sequencing data. Genome Announcements. 2016;4(5): e00975-16.
A publicly available dataset from Pacific Biosciences was used to create an assembly of the chloroplast genome sequence of the Arabidopsis thaliana Landsberg erecta genotype. The assembly is solely based on SMRT sequencing data and hence provides high resolution of the two inverted repeat regions typically contained in chloroplast genomes.
SMRT sequencing only de novo assembly of the sugar beet (Beta vulgaris) chloroplast genome
Stadermann KB, Weisshaar B, Holtgräwe D. SMRT sequencing only de novo assembly of the sugar beet (Beta vulgaris) chloroplast genome. BMC Bioinformatics. 2015;16(1): 295.
Background
Third generation sequencing methods, like SMRT (Single Molecule, Real-Time) sequencing developed by Pacific Biosciences, offer much longer read lengths than Next Generation Sequencing (NGS) methods. Hence, they are well suited for de novo sequencing or resequencing projects. Sequence data generated for these purposes will contain not only reads originating from the nuclear genome, but also a significant number of reads originating from the organelles of the target organism. These reads are usually discarded, but they can also be used for an assembly of organellar replicons. The long read length supports resolution of repetitive regions and repeats within the organelle genomes, which might be problematic when using only short read data. Additionally, SMRT sequencing is less influenced by GC-rich areas and by long stretches of the same base.
Results
We describe a workflow for a de novo assembly of the sugar beet (Beta vulgaris ssp. vulgaris) chloroplast genome sequence based only on data originating from a SMRT sequencing dataset targeted on its nuclear genome. We show that the data obtained from such an experiment are sufficient to create a high quality assembly with a higher reliability than assemblies derived from e.g. Illumina reads only. The chloroplast genome is especially challenging for de novo assembly as it contains two large inverted repeat (IR) regions. We also describe some limitations that still apply even though long reads are used for the assembly.
Conclusions
SMRT sequencing reads extracted from a dataset created for nuclear genome (re)sequencing can be used to obtain a high quality de novo assembly of the chloroplast genome of the sequenced organism. Even with a relatively small overall coverage for the nuclear genome, it is possible to collect more than enough reads to generate a high quality assembly that outperforms short read based assemblies. However, even with long reads it is not always possible to reliably clarify the order of elements of a chloroplast genome sequence, which we could demonstrate with Fosmid End Sequences (FES) generated with Sanger technology. This limitation also applies to short read sequencing data, but in that case it is reached at a much earlier stage during finishing.
Analysis of a c0t-1 library enables the targeted identification of minisatellite and satellite families in Beta vulgaris
Zakrzewski F, Wenke T, Holtgräwe D, Weisshaar B, Schmidt T. Analysis of a c0t-1 library enables the targeted identification of minisatellite and satellite families in Beta vulgaris. BMC Plant Biology. 2010;10(1): 8.
Background
Repetitive DNA is a major fraction of eukaryotic genomes and occurs particularly often in plants. Currently, the sequencing of the sugar beet (Beta vulgaris) genome is under way and knowledge of repetitive DNA sequences is critical for the genome annotation. We generated a c0t-1 library, representing highly to moderately repetitive sequences, for the characterization of the major B. vulgaris repeat families. While highly abundant satellites are well-described, minisatellites are only poorly investigated in plants. Therefore, we focused on the identification and characterization of these tandemly repeated sequences.
Results
Analysis of 1763 c0t-1 DNA fragments, providing 442 kb sequence data, shows that the satellites pBV and pEV are the most abundant repeat families in the B. vulgaris genome while other previously described repeats show lower copy numbers. We isolated 517 novel repetitive sequences and used this fraction for the identification of minisatellite and novel satellite families. Bioinformatic analysis and Southern hybridization revealed that minisatellites are moderately to highly amplified in B. vulgaris. FISH showed a dispersed localization along most chromosomes clustering in arrays of variable size and number with exclusion and depletion in distinct regions.
Conclusion
The c0t-1 library represents major repeat families of the B. vulgaris genome, and analysis of the c0t-1 DNA was proven to be an efficient method for identification of minisatellites. We established, so far, the broadest analysis of minisatellites in plants and observed their chromosomal localization providing a background for the annotation of the sugar beet genome and for the understanding of the evolution of minisatellites in plant genomes.
Genome Sequences of Both Organelles of the Grapevine Rootstock Cultivar 'Börner'
Frommer B, Holtgräwe D, Hausmann L, et al. Genome Sequences of Both Organelles of the Grapevine Rootstock Cultivar 'Börner'. Microbiology Resource Announcements. 2020;9(15): e01471-19.
Genomic long reads of the interspecific grapevine rootstock cultivar 'Börner' (Vitis riparia GM183 × Vitis cinerea Arnold) were used to assemble its chloroplast and mitochondrion genome sequences. We annotated 133 chloroplast and 172 mitochondrial genes, including the RNA editing sites. The organelle genomes in 'Börner' were maternally inherited from Vitis riparia.
Rapid gene identification in sugar beet using deep sequencing of DNA from phenotypic pools selected from breeding panels
Ries D, Holtgräwe D, Viehöver P, Weisshaar B. Rapid gene identification in sugar beet using deep sequencing of DNA from phenotypic pools selected from breeding panels. BMC Genomics. 2016;17(1): 236.
Background
The combination of bulk segregant analysis (BSA) and next generation sequencing (NGS), also known as mapping by sequencing (MBS), has been shown to significantly accelerate the identification of causal mutations for species with a reference genome sequence. The usual approach is to cross homozygous parents that differ for the monogenic trait of interest, to perform deep sequencing of DNA from F2 plants pooled according to their phenotype, and subsequently to analyze the allele frequency distribution based on a marker table for the parents studied. The method has been successfully applied for EMS-induced mutations as well as natural variation. Here, we show that pooling genetically diverse breeding lines according to a contrasting phenotype also allows high resolution mapping of the causal gene in a crop species. The test case was the monogenic locus causing red vs. green hypocotyl color in Beta vulgaris (R locus).
Results
We determined the allele frequencies of polymorphic sequences using sequence data from two diverging phenotypic pools of 180 B. vulgaris accessions each. A single interval of about 31 kbp among the nine chromosomes was identified which indeed contained the causative mutation.
Conclusions
By applying a variation of the mapping by sequencing approach, we demonstrated that phenotype-based pooling of diverse accessions from breeding panels and subsequent direct determination of the allele frequency distribution can be successfully applied for gene identification in a crop species. Our approach made it possible to identify a small interval around the causative gene. Sequencing of parents or individual lines was not necessary. Whenever the appropriate plant material is available, the approach described saves time compared to the generation of an F2 population. In addition, we provide clues for planning similar experiments with regard to pool size and the sequencing depth required
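The core idea of comparing allele frequencies between two phenotypic pools can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis pipeline of the publication: it assumes per-site read counts for one allele are already available for both pools, it uses the absolute frequency difference as the signal, and the window size, the helper names, and all numbers are hypothetical.

```python
def allele_freq(alt_reads, depth):
    """Fraction of reads supporting the alternative allele (0.0 if no coverage)."""
    return alt_reads / depth if depth else 0.0


def delta_af(sites):
    """Absolute allele frequency difference between two pools per site.

    sites: list of (position, alt_pool_a, depth_pool_a, alt_pool_b, depth_pool_b).
    """
    return [
        (pos, abs(allele_freq(alt_a, dep_a) - allele_freq(alt_b, dep_b)))
        for pos, alt_a, dep_a, alt_b, dep_b in sites
    ]


def best_window(deltas, size):
    """Return (start, end, mean_delta) for the run of `size` consecutive sites
    with the highest mean frequency difference, or None if too few sites."""
    best = None
    for i in range(len(deltas) - size + 1):
        window = deltas[i:i + size]
        mean = sum(delta for _, delta in window) / size
        if best is None or mean > best[2]:
            best = (window[0][0], window[-1][0], mean)
    return best


# Toy, made-up read counts for five polymorphic sites on one chromosome.
sites = [
    (1000, 50, 100, 48, 100),
    (2000, 60, 100, 40, 100),
    (3000, 95, 100, 5, 100),
    (4000, 98, 100, 2, 100),
    (5000, 55, 100, 50, 100),
]
deltas = delta_af(sites)
peak = best_window(deltas, size=2)
print(peak)
```

Sites where the pools differ strongly in allele frequency cluster around the causal locus, so the window with the highest mean difference marks the candidate interval. A real analysis would additionally need coverage filters and a statistical threshold, which are omitted here.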
Analysis of the Rpv12 locus in a haplotype-separated grapevine genome sequence
Plasmopara viticola, the grapevine downy mildew pathogen, causes severe losses in viticulture if not counteracted by fungicide sprays that need to be repeatedly applied during each growing season. To reduce the amount of plant protection, modern grapevine breeding generates fungus-resistant grapevine cultivars by introgression of resistance loci from wild Vitis spec. sources. However, the presence of only a single resistance locus may provoke the emergence of pathogen races able to overcome the resistance trait of the host. Therefore, a combination of several independently acting resistance loci is required for sustainable genetic resistance. Little is known about the resistance-conferring genes within the various grapevine resistance loci. To improve this situation and make stacking of resistance loci more efficient, the Rpv12 locus originating from the Asian Vitis amurensis was sequenced and characterized. The complete genome of breeding line Gf.99-03, carrying Rpv12 in heterozygous state, was analyzed. Haplotypes were resolved by assigning the reads to one of the parents of Gf.99-03 using trio binning. Annotation of the resulting genomic sequences was based on RNA-Seq data and predicted gene models. The haplotype carrying the Rpv12 locus, delimited by markers UDV-014 and UDV-370 on chromosome 14 (Venuti et al., 2013), diverges strongly from the susceptible haplotype as well as from the reference genome PN40024 12X.v2. It was found to contain two important gene clusters. One cluster includes pathogen-inducible genes similar to the gene ACCELERATED CELL DEATH 6 (A. thaliana), likely involved in the hypersensitive response upon pathogen attack. The second cluster comprises positional resistance candidate genes corresponding to typical NLRs (nucleotide binding site, leucine rich repeats), hypothesized to be involved in pathogen perception and cellular defense signalling.
QTL analysis of flowering time and ripening traits suggests an impact of a genomic region on linkage group 1 in Vitis.
Fechter I, Hausmann L, Zyprian E, et al. QTL analysis of flowering time and ripening traits suggests an impact of a genomic region on linkage group 1 in Vitis. Theoretical and Applied Genetics. 2014;127(9):1857-1872.
In the recent past, genetic analyses of grapevine focused mainly on the identification of resistance loci for major diseases such as powdery and downy mildew. Currently, breeding programs make intensive use of these results by applying molecular markers linked to the resistance traits. However, modern genetics also makes it possible to address additional agronomic traits that have considerable impact on the selection of grapevine cultivars. In this study, we have used linkage mapping for the identification and characterization of flowering time and ripening traits in a mapping population from a cross of V3125 ('Schiava Grossa' × 'Riesling') and the interspecific rootstock cultivar 'Börner' (Vitis riparia × Vitis cinerea). Comparison of the flowering time QTL mapping with data derived from a second independent segregating population identified several common QTLs. A large region on linkage group 1 in particular proved to be of special interest given the genetic divergence of the parents of the two populations. The proximity of the QTL region contains two CONSTANS-like genes. In accordance with data from other plants such as Arabidopsis thaliana and Oryza sativa, we hypothesize that these genes are major contributors to the control of flowering time in Vitis.
A Partially Phase-Separated Genome Sequence Assembly of the Vitis Rootstock 'Börner' (Vitis riparia × Vitis cinerea) and Its Exploitation for Marker Development and Targeted Mapping
Holtgräwe D, Rosleff Soerensen T, Hausmann L, et al. A Partially Phase-Separated Genome Sequence Assembly of the Vitis Rootstock 'Börner' (Vitis riparia × Vitis cinerea) and Its Exploitation for Marker Development and Targeted Mapping. Frontiers in Plant Science. 2020;11: 156.
Grapevine breeding has become highly relevant due to upcoming challenges like climate change, a decrease in the number of available fungicides, increasing public concern about plant protection, and the demand for a sustainable production. Downy mildew caused by Plasmopara viticola is one of the most devastating diseases worldwide of cultivated Vitis vinifera. In modern breeding programs, therefore, genetic marker technologies and genomic data are used to develop new cultivars with defined and stacked resistance loci. Potential sources of resistance are wild species of American or Asian origin. The interspecific hybrid of Vitis riparia Gm 183 × Vitis cinerea Arnold, available as the rootstock cultivar 'Börner', carries several relevant resistance loci. We applied next-generation sequencing to enable the reliable identification of simple sequence repeats (SSR), and we also generated a draft genome sequence assembly of 'Börner' to access genome-wide sequence variations in a comprehensive and highly reliable way. These data were used to cover the 'Börner' genome with genetic marker positions. A subset of these marker positions was used for targeted mapping of the P. viticola resistance locus, Rpv14, to validate the marker position list. Based on the reference genome sequence PN40024, the position of this resistance locus can be narrowed down to less than 0.5 Mbp on chromosome 5.