117 research outputs found

    Improving pan-genome annotation using whole genome multiple alignment

    Background: Rapid annotation and comparison of genomes from multiple isolates (pan-genomes) are becoming commonplace due to advances in sequencing technology. Genome annotations can contain inconsistencies and errors that hinder comparative analysis even within a single species. Tools are needed to compare and improve annotation quality across sets of closely related genomes. Results: We introduce a new tool, Mugsy-Annotator, that identifies orthologs and evaluates annotation quality in prokaryotic genomes using whole genome multiple alignment. Mugsy-Annotator identifies anomalies in annotated gene structures, including inconsistently located translation initiation sites and disrupted genes due to draft genome sequencing or pseudogenes. An evaluation of species pan-genomes using the tool indicates that such anomalies are common, especially at translation initiation sites. Mugsy-Annotator reports alternate annotations that improve consistency and are candidates for further review. Conclusions: Whole genome multiple alignment can be used to efficiently identify orthologs and annotation problem areas in a bacterial pan-genome. Comparisons of annotated gene structures within a species may show more variation than is actually present in the genome, indicating errors in genome annotation. Our new tool Mugsy-Annotator assists re-annotation efforts by highlighting edits that improve annotation consistency. https://doi.org/10.1186/1471-2105-12-27

    Draft Genome Sequence of Pseudomonas sp. Strain LD120, Isolated from the Marine Alga Saccharina latissima

    We report the draft genome sequence of Pseudomonas sp. strain LD120, which was isolated from a brown macroalga in the Baltic Sea. The genome of this marine Pseudomonas protegens subgroup bacterium harbors biosynthetic gene clusters for toxic metabolites typically produced by members of this Pseudomonas subgroup, including 2,4-diacetylphloroglucinol, pyoluteorin, and rhizoxin analogs. ISSN:2576-098

    Serendipitous discovery of Wolbachia genomes in multiple Drosophila species

    BACKGROUND: The Trace Archive is a repository for the raw, unanalyzed data generated by large-scale genome sequencing projects. The existence of this data offers scientists the possibility of discovering additional genomic sequences beyond those originally sequenced. In particular, if the source DNA for a sequencing project came from a species that was colonized by another organism, then the project may yield substantial amounts of genomic DNA, including near-complete genomes, from the symbiotic or parasitic organism. RESULTS: By searching the publicly available repository of DNA sequencing trace data, we discovered three new species of the bacterial endosymbiont Wolbachia pipientis in three different species of fruit fly: Drosophila ananassae, D. simulans, and D. mojavensis. We extracted all sequences with partial matches to a previously sequenced Wolbachia strain and assembled those sequences using customized software. For one of the three new species, the data recovered were sufficient to produce an assembly that covers more than 95% of the genome; for a second species the data produce the equivalent of a 'light shotgun' sampling of the genome, covering an estimated 75-80% of the genome; and for the third species the data cover approximately 6-7% of the genome. CONCLUSIONS: The results of this study reveal an unexpected benefit of depositing raw data in a central genome sequence repository: new species can be discovered within this data. The differences between these three new Wolbachia genomes and the previously sequenced strain revealed numerous rearrangements and insertions within each lineage and hundreds of novel genes. The three new genomes, with annotation, have been deposited in GenBank.

    Correction: Serendipitous discovery of Wolbachia genomes in multiple Drosophila species

    A correction to Serendipitous discovery of Wolbachia genomes in multiple Drosophila species by SL Salzberg, JC Dunning Hotopp, AL Delcher, M Pop, DR Smith, MB Eisen and WC Nelson. Genome Biology 2005, 6:R2

    New criteria for selecting the origin of DNA replication in Wolbachia and closely related bacteria

    © 2007 Ioannidis et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The definitive version was published in BMC Genomics 8 (2007): 182, doi:10.1186/1471-2164-8-182. Background: The annotated genomes of two closely related strains of the intracellular bacterium Wolbachia pipientis have been reported without identification of the putative origin of replication (ori). Identifying the ori of these bacteria and related alpha-Proteobacteria as well as their patterns of sequence evolution will aid studies of cell replication and cell density, as well as the potential genetic manipulation of these widespread intracellular bacteria. Results: Using features that have been previously experimentally verified in the alpha-Proteobacterium Caulobacter crescentus, the origin of DNA replication (ori) regions were identified in silico for Wolbachia strains and eleven other related bacteria belonging to Ehrlichia, Anaplasma, and Rickettsia genera. These features include DnaA-, CtrA- and IHF-binding sites as well as the flanking genes in C. crescentus. The Wolbachia ori boundary genes were found to be hemE and COG1253 protein (CBS domain protein). Comparisons of the putative ori region among related Wolbachia strains showed higher conservation of bases within binding sites. Conclusion: The sequences of the ori regions described here are only similar among closely related bacteria while fundamental characteristics like presence of DnaA and IHF binding sites as well as the boundary genes are more widely conserved. The relative paucity of CtrA binding sites in the ori regions, as well as the absence of key enzymes associated with DNA replication in the respective genomes, suggest that several of these obligate intracellular bacteria may have altered replication mechanisms. 
Based on these analyses, criteria are set forth for identifying the ori region in genome sequencing projects. PI, PS, SS, GT and KB acknowledge support of their work from intramural funding from the University of Ioannina. SB, JDH, LB and JW acknowledge support of their work from the U.S. National Science Foundation grant EF-0328363. SB also acknowledges the support from the NASA Astrobiology Institute (NNA04CC04A

    Complete genome sequences of dengue virus type 2 strains from Kilifi, Kenya

    Dengue infection remains poorly characterized in Africa and little is known regarding its associated viral genetic diversity. Here, we report dengue virus type 2 (DENV-2) sequence data from 10 clinical samples, including 5 complete genome sequences of the cosmopolitan genotype, obtained from febrile adults seeking outpatient care in coastal Kenya.

    Rapid transcriptome sequencing of an invasive pest, the brown marmorated stink bug Halyomorpha halys

    Halyomorpha halys (Stål) (Insecta:Hemiptera;Pentatomidae), commonly known as the Brown Marmorated Stink Bug (BMSB), is an invasive pest of the mid-Atlantic region of the United States, causing economically important damage to a wide range of crops. Native to Asia, BMSB was first observed in Allentown, PA, USA, in 1996, and this pest is now well-established throughout the US mid-Atlantic region and beyond. In addition to the serious threat BMSB poses to agriculture, BMSB has become a nuisance to homeowners, invading home gardens and congregating in large numbers in human-made structures, including homes, to overwinter. Despite its significance as an agricultural pest with limited control options, only 100 bp of BMSB sequence data was available in public databases when this project began. Transcriptome sequencing was undertaken to provide a molecular resource to the research community to inform the development of pest control strategies and to provide molecular data for population genetics studies of BMSB. Using normalized, strand-specific libraries, we sequenced pools of all BMSB life stages on the Illumina HiSeq. Trinity was used to assemble 200,000 putative transcripts in >100,000 components. A novel bioinformatic method that analyzed the strand-specificity of the data reduced this to 53,071 putative transcripts from 18,573 components. By integrating multiple other data types, we narrowed this further to 13,211 representative transcripts. Bacterial endosymbiont genes were identified in this dataset, some of which have a copy number consistent with being lateral gene transfers between endosymbiont genomes and Hemiptera, including ankyrin-repeat related proteins, lysozyme, and mannanase. Such genes and endosymbionts may provide novel targets for BMSB-specific biocontrol. 
This study demonstrates the utility of strand-specific sequencing in generating shotgun transcriptomes and that rapid sequencing of shotgun transcriptomes is possible without the need for extensive inbreeding to generate homozygous lines. Such sequencing can provide a rapid response to pest invasions similar to that already described for disease epidemiology. https://doi.org/10.1186/1471-2164-15-73