A high-quality genome assembly of the waterlily aphid Rhopalosiphum nymphaeae
The waterlily aphid, Rhopalosiphum nymphaeae (Linnaeus), is a host-alternating aphid known to feed on both terrestrial and aquatic hosts. It causes damage through direct herbivory and by acting as a vector for plant viruses, affecting Prunus spp. fruits and aquatic plants worldwide. R. nymphaeae’s ability to thrive in both aquatic and terrestrial conditions sets it apart from other aphids, offering a unique perspective on adaptation. We present the first high-quality R. nymphaeae genome assembly, with a size of 324.4 Mb, generated using PacBio long-read sequencing. The resulting assembly is highly contiguous, with a contig N50 of 12.7 Mb. BUSCO evaluation suggested 97.5% completeness. The R. nymphaeae genome comprises 16.9% repetitive elements and contains 16,834 predicted protein-coding genes. Phylogenetic analysis positioned R. nymphaeae within the tribe Aphidini, showing a close relationship to R. maidis and R. padi. The high-quality reference genome of R. nymphaeae provides a unique resource for understanding genome evolution in aphids and lays the foundation for understanding host plant adaptation mechanisms and developing pest control strategies.
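The contig N50 quoted above is a standard contiguity statistic: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size. A minimal sketch of that calculation follows; the function name and the toy contig lengths are hypothetical and are not taken from the R. nymphaeae assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downward until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in arbitrary units), not real data.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))  # 6 + 5 = 11 covers half of 20, so this prints 5
```

A larger N50 means that more of the assembly sits in long contigs, which is why the value is reported alongside total assembly size.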
Doctor of Philosophy
Whole genome sequencing projects have expanded our understanding of evolution, organism development, and human disease. Now advances in second-generation technologies are making whole genome sequencing routine even for small laboratories. However, advances in annotation technology have not kept pace with genome sequencing, and annotation has become the major bottleneck for many genome projects (especially those with limited bioinformatics expertise). At the same time, challenges associated with genomics research extend beyond merely annotating genomes, as annotations must be subjected to diverse downstream analyses, the complexities of which can confound smaller research groups. Additionally, with improvements in genome assembly and the wide availability of next-generation transcriptome data (mRNA-seq), researchers have the opportunity to re-annotate previously published genomes, which creates new difficulties for data integration and management that are not well addressed by existing tools. In response to the challenges facing second-generation genome projects, I have developed the annotation pipeline MAKER2 together with accessory software for downstream analysis and data management. The MAKER2 annotation pipeline finds repeats within a genome, aligns ESTs and cDNAs, identifies sites of protein homology, and produces database-ready gene annotations in association with supporting evidence. However, MAKER2 can go beyond structural annotation to identify and integrate functional annotations. MAKER2 also provides researchers with the capability to re-annotate legacy genome datasets and to incorporate mRNA-seq. Additionally, MAKER2 supports distributed parallelization on computer clusters, thus providing a scalable solution for datasets of any size. Annotations produced by MAKER2 can be directly loaded into many popular downstream annotation analysis and management tools from the Generic Model Organism Database Project. By using MAKER2 with these tools, research groups can quickly build genome annotations, perform analyses, and distribute their data to the wider scientific community. Here I describe the internal architecture of MAKER2 and document its computational capabilities. I also describe my work to annotate and analyze eight emerging model organism genomes in collaboration with their associated genome projects. Thus, in the course of my thesis work, I have addressed a specific need within the scientific community for easy-to-use annotation and analysis tools while also expanding our understanding of evolution and biology.
Multiple independent genetic code reassignments of the UAG stop codon in phyllopharyngean ciliates
The translation of nucleotide sequences into amino acid sequences, governed by the genetic code, is one of the most conserved features of molecular biology. The standard genetic code, which uses 61 sense codons to encode the 20 standard amino acids and 3 stop codons (UAA, UAG, and UGA) to terminate translation, is used by most extant organisms. The protistan phylum Ciliophora (the 'ciliates') is the most prominent exception to this norm, exhibiting the greatest diversity of nuclear genetic code variants and evidence of repeated changes in the code. In this study, we report the discovery of multiple independent genetic code changes within the Phyllopharyngea class of ciliates. By mining publicly available ciliate genome datasets, we discovered that three ciliate species from the TARA Oceans eukaryotic metagenome dataset use the UAG codon to putatively encode leucine. We identified novel suppressor tRNA genes in two of these genomes which are predicted to decode the reassigned UAG codon to leucine. Phylogenomic analysis revealed that these three uncultivated taxa form a monophyletic lineage within the Phyllopharyngea class. Expanding our analysis by reassembling published phyllopharyngean genome datasets led to the discovery that the UAG codon had also been reassigned to putatively code for glutamine in Hartmannula sinica and Trochilia petrani. Phylogenomic analysis suggests that this occurred via two independent genetic code change events. These data demonstrate that the reassigned UAG codons have widespread usage as sense codons within the phyllopharyngean ciliates. Furthermore, we show that the function of UAA is firmly fixed as the preferred stop codon. These findings shed light on the evolvability of the genetic code in understudied microbial eukaryotes.
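The effect of reassigning UAG from a stop codon to a sense codon can be illustrated with a small translation routine. The sketch below builds the standard genetic code, derives two variant tables (UAG as leucine, UAG as glutamine), and translates one short sequence under each. The function names and the example RNA sequence are hypothetical, chosen only for illustration, and do not come from the ciliate datasets.

```python
from itertools import product

# Standard genetic code in the conventional UCAG ordering ("*" marks a stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def with_reassignment(codon, amino_acid):
    """Return a copy of the standard code with one codon reassigned."""
    table = dict(STANDARD)
    table[codon] = amino_acid
    return table


def translate(rna, table):
    """Translate an RNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = table[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy transcript with an in-frame UAG.
rna = "AUGUAGGCUUAA"
print(translate(rna, STANDARD))                       # M
print(translate(rna, with_reassignment("UAG", "L")))  # MLA
print(translate(rna, with_reassignment("UAG", "Q")))  # MQA
```

Under the standard code the in-frame UAG truncates the product after the first residue, whereas under either reassigned code translation continues to the UAA stop.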
Convolutional Neural Network-Based Gene Prediction Using Buffalograss as a Model System
Gene prediction has seen little algorithmic improvement since the first gene prediction algorithms were developed thirty years ago. Rather than iteratively improving the underlying algorithms by adopting better-performing models, most current approaches update existing tools by incorporating increasing amounts of extrinsic data. Genes are traditionally predicted using Hidden Markov Models (HMMs), which are constrained by strict assumptions about the independence of genes that do not always hold true. To address this, a Convolutional Neural Network (CNN) based gene prediction tool, named GeneCNN, was developed. Due to their nonlinearity, neural networks are adept at capturing complex relationships between data points when applied to sufficiently large datasets such as whole genomes. Convolutional neural networks further improve upon neural networks by incorporating spatial dependence among individual data points. GeneCNN was trained using a sequenced buffalograss (Bouteloua dactyloides) genome. In training, GeneCNN achieved 97% accuracy in correctly identifying genic sequences in test data. When given a 10 million nucleotide buffalograss genome sequence as input, GeneCNN uniquely identified a greater number of genes than the existing gene prediction tools BRAKER3, AUGUSTUS, and Fgenesh, at 1,089, 1,535, and 478 respectively. Gene predictions made by combinations of BRAKER3, AUGUSTUS, and Fgenesh were compared to those of GeneCNN to assess the percentage of GeneCNN predictions supported by at least one other tool; support ranged from 40.5% to 84.1% of all GeneCNN gene predictions across every combination of BRAKER3, AUGUSTUS, and Fgenesh. The findings in this study support the use of CNNs for gene prediction and serve as a valuable resource for the improvement of gene prediction algorithms in future research.
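The idea that a convolutional layer captures spatial dependence along a sequence can be sketched with a single filter slid over a one-hot encoded DNA string. This is only an illustration of the general technique. The abstract does not describe GeneCNN's input encoding or architecture, so the encoding, the filter, and all names below are assumptions.

```python
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Characters outside ACGT (for example N) become all-zero rows.
    """
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def conv1d(encoded, kernel):
    """Slide a (k x 4) filter over an (L x 4) input without padding.

    Each output value is the sum of elementwise products between the filter
    and one window of k adjacent positions, so it depends on neighbouring
    bases jointly rather than on each base independently.
    """
    k = len(kernel)
    scores = []
    for start in range(len(encoded) - k + 1):
        window = encoded[start:start + k]
        scores.append(sum(w * x
                          for k_row, x_row in zip(kernel, window)
                          for w, x in zip(k_row, x_row)))
    return scores


# Hypothetical hand-set filter that responds to the motif ATG; in a trained
# CNN the filter weights would be learned from data instead.
atg_filter = one_hot("ATG")
print(conv1d(one_hot("CCATGGA"), atg_filter))  # [0.0, 0.0, 3.0, 1.0, 0.0]
```

The output peaks at the window that matches the motif exactly, which shows how a filter turns local sequence context into a position-specific signal.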
Genomic analysis of the necrotrophic fungal pathogens Sclerotinia sclerotiorum and Botrytis cinerea
Sclerotinia sclerotiorum and Botrytis cinerea are closely related necrotrophic plant pathogenic fungi notable for their wide host ranges and environmental persistence. These attributes have made these species models for understanding the complexity of necrotrophic, broad host-range pathogenicity. Despite their similarities, the two species differ in mating behaviour and the ability to produce asexual spores. We have sequenced the genomes of one strain of S. sclerotiorum and two strains of B. cinerea. The comparative analysis of these genomes relative to one another and to other sequenced fungal genomes is provided here. Their 38–39 Mb genomes include 11,860–14,270 predicted genes, which share 83% amino acid identity on average between the two species. We have mapped the S. sclerotiorum assembly to 16 chromosomes and found large-scale co-linearity with the B. cinerea genomes. Seven percent of the S. sclerotiorum genome comprises transposable elements compared to <1% of B. cinerea. The arsenal of genes associated with necrotrophic processes is similar between the species, including genes involved in plant cell wall degradation and oxalic acid production. Analysis of secondary metabolism gene clusters revealed an expansion in number and diversity of B. cinerea–specific secondary metabolites relative to S. sclerotiorum. The potential diversity in secondary metabolism might be involved in adaptation to specific ecological niches. Comparative genome analysis revealed the basis of differing sexual mating compatibility systems between S. sclerotiorum and B. cinerea. The organization of the mating-type loci differs, and their structures provide evidence for the evolution of heterothallism from homothallism. These data shed light on the evolutionary and mechanistic bases of the genetically complex traits of necrotrophic pathogenicity and sexual mating. 
This resource should facilitate the functional studies designed to better understand what makes these fungi such successful and persistent pathogens of agronomic crops.
Affiliation: Ten Have, Arjen. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Mar del Plata. Instituto de Investigaciones Biológicas; Argentina. Universidad Nacional de Mar del Plata. Facultad de Ciencias Exactas y Naturales. Instituto de Investigaciones Biológicas; Argentina
Affiliation: Amselem, Joelle. Institut National de la Recherche Agronomique; France
Affiliation: Cuomo, Christina A. Broad Institute of MIT and Harvard; United States
Affiliation: van Kan, Jan A. L. Wageningen University; Netherlands
Affiliation: Viaud, Muriel. Institut National de la Recherche Agronomique; France
Affiliation: Benito, Ernesto P. Universidad de Salamanca; Spain
Affiliation: Couloux, Arnaud. Centre National de Séquençage. Genoscope; France
Affiliation: Coutinho, Pedro M. Centre National de la Recherche Scientifique; France
Affiliation: Vries, Ronald P. de. Microbiology and Kluyver Centre for Genomics of Industrial Fermentations; Netherlands. Fungal Biodiversity Centre; Netherlands
Affiliation: Dyer, Paul S. The University of Nottingham; United Kingdom
Affiliation: Fillinger, Sabine. Institut National de la Recherche Agronomique; France
Affiliation: Fournier, Elisabeth. Institut National de la Recherche Agronomique; France. Centre de coopération internationale en recherche agronomique pour le développement; France
Affiliation: Gout, Lilian. Institut National de la Recherche Agronomique; France
Affiliation: Hahn, Matthias. University of Kaiserslautern; Germany
Affiliation: Kohn, Linda. University of Toronto; Canada
Affiliation: Lapalu, Nicolas. Institut National de la Recherche Agronomique; France
Affiliation: Plummer, Kim M. La Trobe University; Australia
Affiliation: Pradier, Jean-Marc. Institut National de la Recherche Agronomique; France
Affiliation: Quévillon, Emmanuel. Institut National de la Recherche Agronomique; France. Centre National de la Recherche Scientifique; France
Affiliation: Sharon, Amir. Tel Aviv University. Department of Molecular Biology and Ecology of Plants; Israel
Affiliation: Simon, Adeline. Institut National de la Recherche Agronomique; France
Affiliation: Tudzynski, Bettina. Institut für Biologie und Biotechnologie der Pflanzen; Germany
Affiliation: Tudzynski, Paul. Institut für Biologie und Biotechnologie der Pflanzen; Germany
Affiliation: Wincker, Patrick. Centre National de Séquençage. Genoscope; France
Affiliation: Andrew, Marion. University of Toronto; Canada
Affiliation: Anthouard, Véronique. Centre National de Séquençage. Genoscope; France
Affiliation: Beever, Ross E. Landcare Research; New Zealand
Affiliation: Beffa, Rolland. Centre National de la Recherche Scientifique; France
Affiliation: Benoit, Isabelle. Microbiology and Kluyver Centre for Genomics of Industrial Fermentations; Netherlands
Affiliation: Bouzid, Ourdia. Microbiology and Kluyver Centre for Genomics of Industrial Fermentations; Netherlands
Genomic analysis of the necrotrophic fungal pathogens Sclerotinia sclerotiorum and Botrytis cinerea
Sclerotinia sclerotiorum and Botrytis cinerea are closely related necrotrophic plant pathogenic fungi notable for their wide host ranges and environmental persistence. These attributes have made these species models for understanding the complexity of necrotrophic, broad host-range pathogenicity. Despite their similarities, the two species differ in mating behaviour and the ability to produce asexual spores. We have sequenced the genomes of one strain of S. sclerotiorum and two strains of B. cinerea. The comparative analysis of these genomes relative to one another and to other sequenced fungal genomes is provided here. Their 38-39 Mb genomes include 11,860-14,270 predicted genes, which share 83% amino acid identity on average between the two species. We have mapped the S. sclerotiorum assembly to 16 chromosomes and found large-scale co-linearity with the B. cinerea genomes. Seven percent of the S. sclerotiorum genome comprises transposable elements compared to <1% of B. cinerea. The arsenal of genes associated with necrotrophic processes is similar between the species, including genes involved in plant cell wall degradation and oxalic acid production. Analysis of secondary metabolism gene clusters revealed an expansion in number and diversity of B. cinerea-specific secondary metabolites relative to S. sclerotiorum. The potential diversity in secondary metabolism might be involved in adaptation to specific ecological niches. Comparative genome analysis revealed the basis of differing sexual mating compatibility systems between S. sclerotiorum and B. cinerea. The organization of the mating-type loci differs, and their structures provide evidence for the evolution of heterothallism from homothallism. These data shed light on the evolutionary and mechanistic bases of the genetically complex traits of necrotrophic pathogenicity and sexual mating. 
This resource should facilitate the functional studies designed to better understand what makes these fungi such successful and persistent pathogens of agronomic crops.
The Sclerotinia sclerotiorum genome project was supported by the USDA Cooperative State Research, Education and Extension Service (USDA-NRI 2004). Sclerotinia sclerotiorum ESTs were funded by a grant to JA Rollins from USDA specific cooperative agreement 58-5442-4-281. The genome sequence of Botrytis cinerea strain T4 was funded by Genoscope, CEA, France. M Viaud was funded by the “Projet INRA Jeune-Equipe”. PM Coutinho and B Henrissat were funded by the ANR to project E-Tricel (grant ANR-07-BIOE-006). The CAZy database is funded in part by GIS-IBiSA. DM Soanes and NJ Talbot were partly funded by the UK Biotechnology and Biological Sciences Research Council. KM Plummer was partially funded by the New Zealand Bio-Protection Research Centre, http://bioprotection.org.nz/. BJ Howlett and A Sexton were partially funded by the Australian Grains Research and Development Corporation, www.grdc.com.au. L Kohn was partially funded by NSERC Discovery Grant (Natural Sciences and Engineering Research Council of Canada), grant number 458078. M Dickman was supported by the NSF grant MCB-092391 and BARD grant US-4041-07C. O Yarden was supported by BARD grant US-4041-07C. EG Danchin obtained financial support from the European Commission (STREP FungWall grant, contract LSHB-CT-2004-511952). A Botrytis Genome Workshop (Kaiserslautern, Germany) was supported by a grant from the German Science Foundation (DFG; HA1486) to M Hahn.
