117 research outputs found

Improving pan-genome annotation using whole genome multiple alignment

Author: Angiuoli Samuel V
Dunning Hotopp Julie C
Salzberg Steven L
Tettelin Herve
Publication date: 30/06/2011

Background: Rapid annotation and comparisons of genomes from multiple isolates (pan-genomes) is becoming commonplace due to advances in sequencing technology. Genome annotations can contain inconsistencies and errors that hinder comparative analysis even within a single species. Tools are needed to compare and improve annotation quality across sets of closely related genomes. Results: We introduce a new tool, Mugsy-Annotator, that identifies orthologs and evaluates annotation quality in prokaryotic genomes using whole genome multiple alignment. Mugsy-Annotator identifies anomalies in annotated gene structures, including inconsistently located translation initiation sites and disrupted genes due to draft genome sequencing or pseudogenes. An evaluation of species pan-genomes using the tool indicates that such anomalies are common, especially at translation initiation sites. Mugsy-Annotator reports alternate annotations that improve consistency and are candidates for further review. Conclusions: Whole genome multiple alignment can be used to efficiently identify orthologs and annotation problem areas in a bacterial pan-genome. Comparisons of annotated gene structures within a species may show more variation than is actually present in the genome, indicating errors in genome annotation. Our new tool Mugsy-Annotator assists re-annotation efforts by highlighting edits that improve annotation consistency.
https://doi.org/10.1186/1471-2105-12-27

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

Draft Genome Sequence of Pseudomonas sp. Strain LD120, Isolated from the Marine Alga Saccharina latissima

Author: Dunning Hotopp Julie C.
Heiman Clara Margot
Keel Christoph
Kupferschmied Peter
Maurhofer Monika
Vacheron Jordan
Wiese Jutta
Publication venue: American Society for Microbiology
Publication date: 01/01/2020

We report the draft genome sequence of Pseudomonas sp. strain LD120, which was isolated from a brown macroalga in the Baltic Sea. The genome of this marine Pseudomonas protegens subgroup bacterium harbors biosynthetic gene clusters for toxic metabolites typically produced by members of this Pseudomonas subgroup, including 2,4-diacetylphloroglucinol, pyoluteorin, and rhizoxin analogs.
ISSN:2576-098

OceanRep

Repository for Publications and Research Data

Serveur académique lausannois

Cost effective, experimentally robust differential-expression analysis for human/mammalian, pathogen and dual-species transcriptomics.

Author: Bruno Vincent M
Chung Matthew
Dunning Hotopp Julie C
Filler Scott G
Fraser Claire M
Mahurkar Anup
Mattick John
McCracken Carrie
Rasko David A
Shetty Amol C
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

As sequencing read length has increased, researchers have quickly adopted longer reads for their experiments. Here, we examine 14 pathogen or host-pathogen differential gene expression data sets to assess whether using longer reads is warranted. A variety of data sets was used to assess what genomic attributes might affect the outcome of differential gene expression analysis including: gene density, operons, gene length, number of introns/exons and intron length. No genome attribute was found to influence the data in principal components analysis, hierarchical clustering with bootstrap support, or regression analyses of pairwise comparisons that were undertaken on the same reads, looking at all combinations of paired and unpaired reads trimmed to 36, 54, 72 and 101 bp. Read pairing had the greatest effect when there was little variation in the samples from different conditions or in their replicates (e.g. little differential gene expression). But overall, 54 and 72 bp reads were typically most similar. Given differences in costs and mapping percentages, we recommend 54 bp reads for organisms with no or few introns and 72 bp reads for all others. In a third of the data sets, read pairing had absolutely no effect, despite paired reads having twice as much data. Therefore, single-end reads seem robust for differential-expression analyses, but in eukaryotes paired-end reads are likely desired to analyse splice variants and should be preferred for data sets that are acquired with the intent to be community resources that might be used in secondary data analyses.

eScholarship - University of California

Serendipitous discovery of Wolbachia genomes in multiple Drosophila species

Author: Delcher Arthur L
Eisen Michael B
Hotopp Julie C Dunning
Nelson William C
Pop Mihai
Salzberg Steven L
Smith Douglas R
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: The Trace Archive is a repository for the raw, unanalyzed data generated by large-scale genome sequencing projects. The existence of this data offers scientists the possibility of discovering additional genomic sequences beyond those originally sequenced. In particular, if the source DNA for a sequencing project came from a species that was colonized by another organism, then the project may yield substantial amounts of genomic DNA, including near-complete genomes, from the symbiotic or parasitic organism. RESULTS: By searching the publicly available repository of DNA sequencing trace data, we discovered three new species of the bacterial endosymbiont Wolbachia pipientis in three different species of fruit fly: Drosophila ananassae, D. simulans, and D. mojavensis. We extracted all sequences with partial matches to a previously sequenced Wolbachia strain and assembled those sequences using customized software. For one of the three new species, the data recovered were sufficient to produce an assembly that covers more than 95% of the genome; for a second species the data produce the equivalent of a 'light shotgun' sampling of the genome, covering an estimated 75-80% of the genome; and for the third species the data cover approximately 6-7% of the genome. CONCLUSIONS: The results of this study reveal an unexpected benefit of depositing raw data in a central genome sequence repository: new species can be discovered within this data. The differences between these three new Wolbachia genomes and the previously sequenced strain revealed numerous rearrangements and insertions within each lineage and hundreds of novel genes. The three new genomes, with annotation, have been deposited in GenBank.

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Digital Repository at the University of Maryland

Correction: Serendipitous discovery of Wolbachia genomes in multiple Drosophila species

Author: Salzberg Steven L
Dunning Hotopp Julie C
Delcher Arthur L
Pop Mihai
Smith Douglas R
Eisen Michael B
Nelson William C
Publication venue: BioMed Central
Publication date: 24/06/2005

A correction to Serendipitous discovery of Wolbachia genomes in multiple Drosophila species by SL Salzberg, JC Dunning Hotopp, AL Delcher, M Pop, DR Smith, MB Eisen and WC Nelson. Genome Biology 2005, 6:R2

Crossref

PubMed Central

Caltech Authors

New criteria for selecting the origin of DNA replication in Wolbachia and closely related bacteria

Author: Baldo Laura
Bordenstein Seth R.
Bourtzis Kostas
Dunning Hotopp Julie C.
Ioannidis Panagiotis
Sapountzis Panagiotis
Siozios Stefanos
Tsiamis Georgios
Werren John H.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

© 2007 Ioannidis et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The definitive version was published in BMC Genomics 8 (2007): 182, doi:10.1186/1471-2164-8-182.
Background: The annotated genomes of two closely related strains of the intracellular bacterium Wolbachia pipientis have been reported without the identifications of the putative origin of replication (ori). Identifying the ori of these bacteria and related alpha-Proteobacteria as well as their patterns of sequence evolution will aid studies of cell replication and cell density, as well as the potential genetic manipulation of these widespread intracellular bacteria. Results: Using features that have been previously experimentally verified in the alpha-Proteobacterium Caulobacter crescentus, the origin of DNA replication (ori) regions were identified in silico for Wolbachia strains and eleven other related bacteria belonging to Ehrlichia, Anaplasma, and Rickettsia genera. These features include DnaA-, CtrA- and IHF-binding sites as well as the flanking genes in C. crescentus. The Wolbachia ori boundary genes were found to be hemE and COG1253 protein (CBS domain protein). Comparisons of the putative ori region among related Wolbachia strains showed higher conservation of bases within binding sites. Conclusion: The sequences of the ori regions described here are only similar among closely related bacteria while fundamental characteristics like presence of DnaA and IHF binding sites as well as the boundary genes are more widely conserved. The relative paucity of CtrA binding sites in the ori regions, as well as the absence of key enzymes associated with DNA replication in the respective genomes, suggest that several of these obligate intracellular bacteria may have altered replication mechanisms. Based on these analyses, criteria are set forth for identifying the ori region in genome sequencing projects.
PI, PS, SS, GT and KB acknowledge support of their work from intramural funding from the University of Ioannina. SB, JDH, LB and JW acknowledge support of their work from the U.S. National Science Foundation grant EF-0328363. SB also acknowledges the support from the NASA Astrobiology Institute (NNA04CC04A

Crossref

Woods Hole Open Access Server

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Future-Proofing Your Microbiology Resource Announcements Genome Assembly for Reproducibility and Clarity.

Author: Baltrus David A
Cuomo Christina A
Dennehy John J
Dunning Hotopp Julie C
Maresca Julia A
Newton Irene LG
Rasko David A
Rokas Antonis
Roux Simon
Stajich Jason E
Publication venue: eScholarship, University of California
Publication date: 05/09/2019

Descriptions of resources, like the genome assemblies reported in Microbiology Resource Announcements, are often frozen at their time of publication, yet they will need to be interpreted in the midst of continually evolving technologies. It is therefore important to ensure that researchers accessing published resources have access to all of the information required to repeat, interpret, and extend these original analyses. Here, we provide a set of suggestions to help make certain that published resources remain useful and repeatable for the foreseeable future.

eScholarship - University of California

Complete genome sequences of dengue virus type 2 strains from Kilifi, Kenya

Author: Agoti Charles N.
Cotten Matthew
de Laurent Zaydah R.
Delwart Eric
Dunning Hotopp Julie C.
Gitonga John
Kamau Everlyn
Ngoi Joyce M.
Nokes D. James
Phan My V. T.
Sanders Eduard
Warimwe George M.
Publication venue: American Society for Microbiology
Publication date: 24/01/2019

Dengue infection remains poorly characterized in Africa and little is known regarding its associated viral genetic diversity. Here, we report dengue virus type 2 (DENV-2) sequence data from 10 clinical samples, including 5 complete genome sequences of the cosmopolitan genotype, obtained from febrile adults seeking outpatient care in coastal Kenya.

Warwick Research Archives Portal Repository

Genomics of Loa loa, a Wolbachia-free filarial parasite of humans

Author: Birren Bruce W.
Cerqueira Gustavo C.
Desjardins Christopher A.
Fan Lin
Fink Doran L.
Goldberg Jonathan M.
Haas Brian J.
Hotopp Julie C. Dunning
Levin Joshua Z.
Nutman Thomas B.
Ribeiro José M.C.
Russ Carsten
Saif Sakina
Wortman Jennifer R.
Zeng Qiandong
Zucker Jeremy
Publication venue: Springer Science and Business Media LLC
Publication date: 02/12/2014

Loa loa, the African eyeworm, is a major filarial pathogen of humans. Unlike most filariae, Loa loa does not contain the obligate intracellular Wolbachia endosymbiont. We describe the 91.4 Mb genome of Loa loa, and the genome of the related filarial parasite Wuchereria bancrofti, and predict 14,907 Loa loa genes based on microfilarial RNA sequencing. By comparing these genomes to that of another filarial parasite, Brugia malayi, and to several other nematode genomes, we demonstrate synteny among filariae but not with non-parasitic nematodes. The Loa loa genome encodes many immunologically relevant genes, as well as protein kinases targeted by drugs currently approved for humans. Despite lacking Wolbachia, Loa loa shows no new metabolic synthesis or transport capabilities compared to other filariae. These results suggest that the role played by Wolbachia in filarial biology is more subtle than previously thought and reveal marked differences between parasitic and non-parasitic nematodes.

Harvard University - DASH

Rapid transcriptome sequencing of an invasive pest, the brown marmorated stink bug Halyomorpha halys

Author: Chibucos Marcus C
Creasy Todd
Daugherty Sean
Dunning Hotopp Julie C
Flowers Melissa
Ioannidis Panagiotis
Kumar Nikhil
Lu Yong
Orvis Joshua
Ott Sandra
Pick Leslie
Sengamalay Naomi
Shetty Amol
Tallon Luke J
Publication venue: Springer Nature
Publication date: 01/01/2014

Halyomorpha halys (Stål) (Insecta:Hemiptera;Pentatomidae), commonly known as the Brown Marmorated Stink Bug (BMSB), is an invasive pest of the mid-Atlantic region of the United States, causing economically important damage to a wide range of crops. Native to Asia, BMSB was first observed in Allentown, PA, USA, in 1996, and this pest is now well-established throughout the US mid-Atlantic region and beyond. In addition to the serious threat BMSB poses to agriculture, BMSB has become a nuisance to homeowners, invading home gardens and congregating in large numbers in human-made structures, including homes, to overwinter. Despite its significance as an agricultural pest with limited control options, only 100 bp of BMSB sequence data was available in public databases when this project began. Transcriptome sequencing was undertaken to provide a molecular resource to the research community to inform the development of pest control strategies and to provide molecular data for population genetics studies of BMSB. Using normalized, strand-specific libraries, we sequenced pools of all BMSB life stages on the Illumina HiSeq. Trinity was used to assemble 200,000 putative transcripts in >100,000 components. A novel bioinformatic method that analyzed the strand-specificity of the data reduced this to 53,071 putative transcripts from 18,573 components. By integrating multiple other data types, we narrowed this further to 13,211 representative transcripts. Bacterial endosymbiont genes were identified in this dataset, some of which have a copy number consistent with being lateral gene transfers between endosymbiont genomes and Hemiptera, including ankyrin-repeat related proteins, lysozyme, and mannanase. Such genes and endosymbionts may provide novel targets for BMSB-specific biocontrol. 
This study demonstrates the utility of strand-specific sequencing in generating shotgun transcriptomes and that rapid sequencing of shotgun transcriptomes is possible without the need for extensive inbreeding to generate homozygous lines. Such sequencing can provide a rapid response to pest invasions similar to that already described for disease epidemiology.
https://doi.org/10.1186/1471-2164-15-73

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland