131 research outputs found

    De novo Nd-1 genome assembly reveals genomic diversity of Arabidopsis thaliana and facilitates genome-wide non-canonical splice site analysis across plant species

    Pucker B. De novo Nd-1 genome assembly reveals genomic diversity of Arabidopsis thaliana and facilitates genome-wide non-canonical splice site analysis across plant species. Bielefeld: Universität Bielefeld; 2019

    Mapping-based genome size estimation

    Pucker B. Mapping-based genome size estimation. bioRxiv. 2019. While the size of chromosomes can be measured under a microscope, the size of genomes cannot be measured precisely. Biochemical methods and k-mer distribution-based approaches allow only estimations. An alternative approach to predict the genome size based on high contiguity assemblies and short read mappings is presented here and optimized on Arabidopsis thaliana and Beta vulgaris. Brachypodium distachyon, Solanum lycopersicum, Vitis vinifera, and Zea mays were also analyzed to demonstrate the broad applicability of this approach. Mapping-based Genome Size Estimation (MGSE) and additional scripts are available on GitHub: https://github.com/bpucker/MGSE

    Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data

    Schilbert H, Rempel A, Pucker B. Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data. Plants. 2020;9(4): 439. High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling ste

    Large scale genomic rearrangements in selected Arabidopsis thaliana T-DNA lines are caused by T-DNA insertion mutagenesis

    BACKGROUND: Experimental proof of gene function assignments in plants is based on mutant analyses. T-DNA insertion lines provided an invaluable resource of mutants and enabled systematic reverse genetics-based investigation of the functions of Arabidopsis thaliana genes during the last decades. RESULTS: We sequenced the genomes of 14 A. thaliana GABI-Kat T-DNA insertion lines, which eluded flanking sequence tag-based attempts to characterize their insertion loci, with Oxford Nanopore Technologies (ONT) long reads. Complex T-DNA insertions were resolved and 11 previously unknown T-DNA loci identified, resulting in about 2 T-DNA insertions per line and suggesting that this number was previously underestimated. T-DNA mutagenesis caused fusions of chromosomes along with compensating translocations to keep the gene set complete throughout meiosis. Also, an inverted duplication of 800 kbp was detected. About 10 % of GABI-Kat lines might be affected by chromosomal rearrangements, some of which do not involve T-DNA. Local assembly of selected reads was shown to be a computationally effective method to resolve the structure of T-DNA insertion loci. We developed an automated workflow to support investigation of long read data from T-DNA insertion lines. All steps from DNA extraction to assembly of T-DNA loci can be completed within days. CONCLUSIONS: Long read sequencing was demonstrated to be an effective way to resolve complex T-DNA insertions and chromosome fusions. Many T-DNA insertions comprise not just a single T-DNA, but complex arrays of multiple T-DNAs. It is becoming obvious that T-DNA insertion alleles must be characterized by exact identification of both T-DNA::genome junctions to generate clear genotype-to-phenotype relations

    Animal, fungi, and plant genome sequences harbour different non-canonical splice sites

    Frey K, Pucker B. Animal, fungi, and plant genome sequences harbour different non-canonical splice sites. Cells. 2020;9(2): 458. Most protein-encoding genes in eukaryotes contain introns which are interwoven with exons. After transcription, introns need to be removed in order to generate the final mRNA which can be translated into an amino acid sequence by the ribosome. Precise excision of introns by the spliceosome requires conserved dinucleotides which mark the splice sites. However, there are variations of the highly conserved combination of GT at the 5' end and AG at the 3' end of an intron in the genome. GC-AG and AT-AC are two major non-canonical splice site combinations which have been known for many years. During the last few years, various minor non-canonical splice site combinations were detected with all possible dinucleotide permutations. Here we expand systematic investigations of non-canonical splice site combinations in plant genomes to all eukaryotes by analysing fungal and animal genome sequences. Comparisons of splice site combinations between these three kingdoms revealed several differences such as a substantially increased CT-AC frequency in fungal genomes. In addition, high numbers of GA-AG splice site combinations were observed in two animal species. In-depth investigation of splice site usage based on RNA-Seq read mappings indicates a generally higher flexibility of the 3' splice site compared to the 5' splice site

    Peer-review as a teaching method

    Friedrich A, Pucker B. Peer-review as a teaching method. Working Paper der AG Forschendes Lernen in der dghd. Vol 2, 2018. Carl von Ossietzky Universität Oldenburg; 2018. Peer-reviews are a common and valued teaching tool at Anglo-American and Asian universities. Previous studies recommended a scaffolded peer-review process and pre-specified criteria. The current study investigated the feasibility and acceptance of scaffolded peer-reviews as a teaching method in German college students. Participants were 7 psychology students and 13 life science students. The students had to write a project report about a psychological experiment or genome research projects. All reports underwent a scaffolded peer-review process according to pre-specified criteria. The students’ feasibility and acceptance ratings were evaluated using customized questionnaires. The results indicated a good feasibility and acceptance in both courses, although the small samples and the different measurements impair comparability and restrict generalization. Descriptive data and qualitative comments indicated similarities between psychology and life science students. In line with evidence from other countries, this subsample of German college students provided first empirical evidence that the scaffolded peer-review might be a feasible and well-accepted teaching method. Future studies should include methodological improvements (e.g. control condition)

    High Contiguity De Novo Genome Sequence Assembly of Trifoliate Yam (Dioscorea dumetorum) Using Long Read Sequencing

    Siadjeu C, Pucker B, Viehöver P, Albach DC, Weisshaar B. High Contiguity De Novo Genome Sequence Assembly of Trifoliate Yam (Dioscorea dumetorum) Using Long Read Sequencing. Genes. 2020;11(3): 274. Trifoliate yam (Dioscorea dumetorum) is one example of an orphan crop, not traded internationally. Post-harvest hardening of the tubers of this species starts within 24 h after harvesting and renders the tubers inedible. Genomic resources are required for D. dumetorum to improve breeding for non-hardening varieties as well as for other traits. We sequenced the D. dumetorum genome and generated the corresponding annotation. The two haplophases of this highly heterozygous genome were separated to a large extent. The assembly represents 485 Mbp of the genome with an N50 of over 3.2 Mbp. A total of 35,269 protein-encoding gene models as well as 9941 non-coding RNA genes were predicted, and functional annotations were assigned

    Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence

    Pucker B, Holtgräwe D, Weisshaar B. Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Research Notes. 2017;10(1): 667. OBJECTIVE: The Arabidopsis thaliana Niederzenz-1 genome sequence was recently published with an ab initio gene prediction. In-depth analysis of the predicted gene set revealed some errors involving genes with non-canonical splice sites in their introns. Since non-canonical splice sites are difficult to predict ab initio, we checked for options to improve the annotation by transferring annotation information from the recently released Columbia-0 reference genome sequence annotation Araport11. RESULTS: Incorporation of hints generated from Araport11 enabled the precise prediction of non-canonical splice sites. Manual inspection of RNA-Seq read mapping and RT-PCR were applied to validate the structural annotations of non-canonical splice sites. Predictions of untranslated regions were also updated by harnessing the potential of Araport11’s information, which was generated by using high coverage RNA-Seq data. The improved gene set of the Nd-1 genome assembly (GeneSet_Nd-1_v1.1) was evaluated via comparison to the initial gene prediction (GeneSet_Nd-1_v1.0) as well as against Araport11 for the Col-0 reference genome sequence. GeneSet_Nd-1_v1.1 contains previously missed non-canonical splice sites in 1256 genes. Reciprocal best hits for 24,527 (89.4%) of all nuclear Col-0 genes against the GeneSet_Nd-1_v1.1 indicate a high gene prediction quality