Improved annotation with de novo transcriptome assembly in four social amoeba species
Background: Annotation of gene models and transcripts is a fundamental step in genome sequencing projects. Often this is performed with automated prediction pipelines, which can miss complex and atypical genes or transcripts. RNA sequencing (RNA-seq) data can aid the annotation with empirical data. Here we present de novo transcriptome assemblies generated from RNA-seq data in four Dictyostelid species: D. discoideum, P. pallidum, D. fasciculatum and D. lacteum. The assemblies were integrated with existing gene models to determine corrections and improvements on a whole-genome scale. This is the first time this has been performed in these eukaryotic species. Results: An initial de novo transcriptome assembly was generated by Trinity for each species and then refined with the Program to Assemble Spliced Alignments (PASA). Completeness and quality were assessed with the Benchmarking Universal Single-Copy Orthologs (BUSCO) and Transrate tools at each stage of assembly. The final datasets of 11,315-12,849 transcripts contained 5,610-7,712 updates and corrections to >50% of existing gene models, including changes to hundreds or thousands of protein products. Putative novel genes were also identified, and alternative splice isoforms were observed for the first time in P. pallidum, D. lacteum and D. fasciculatum. Conclusions: By taking a whole-transcriptome approach to genome annotation with empirical data, we have been able to enrich the annotations of four existing genome sequencing projects. In doing so we have identified updates to the majority of the gene annotations across all four species under study and found putative novel genes and transcripts that merit follow-up. The new transcriptome data we present here will be a valuable resource for genome curators in the Dictyostelia, and we propose this effective methodology for use in other genome annotation projects.
Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss
Background: The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with regard to definition of pathogenic strains. Results: To better understand genome evolution and evolution of virulence characteristics in Listeria, we used a next generation sequencing approach to generate draft genomes for seven strains representing Listeria species or clades for which genome sequences were not available. Comparative analyses of these draft genomes and six publicly available genomes, which together represent the main Listeria species, showed evidence for (i) a pangenome with 2,032 core and 2,918 accessory genes identified to date, (ii) a critical role of gene loss events in transition of Listeria species from facultative pathogen to saprotroph, even though a consistent pattern of gene loss seemed to be absent, and a number of isolates representing non-pathogenic species still carried some virulence associated genes, and (iii) divergence of modern pathogenic and non-pathogenic Listeria species and strains, most likely circa 47 million years ago, from a pathogenic common ancestor that contained key virulence genes. Conclusions: Genome evolution in Listeria involved limited gene loss and acquisition as supported by (i) a relatively high coverage of the predicted pan-genome by the observed pan-genome, (ii) conserved genome size (between 2.8 and 3.2 Mb), and (iii) a highly syntenic genome. Limited gene loss in Listeria did include loss of virulence associated genes, likely associated with multiple transitions to a saprotrophic lifestyle. The genus Listeria thus provides an example of a group of bacteria that appears to evolve through a loss of virulence rather than acquisition of virulence characteristics. While Listeria includes a number of species-like clades, many of these putative species include clades or strains with atypical virulence associated characteristics. This information will allow for the development of genetic and genomic criteria for pathogenic strains, including development of assays that specifically detect pathogenic Listeria strains.