
VMD: a community annotation database for oomycetes and microbial genomes

By Sucheta Tripathy, Varun N. Pandey, Bing Fang, Fidel Salas and Brett M. Tyler


The VBI Microbial Database (VMD) is a database system designed to host a range of microbial genome sequences. At present, the database contains genome sequence and annotation data for two plant pathogens, Phytophthora sojae and Phytophthora ramorum. With the completion of the draft genome sequences of these pathogens in collaboration with the DOE Joint Genome Institute (JGI), we have created this resource to make the sequences publicly available. The genome sequences (95 Mb for P. sojae and 65 Mb for P. ramorum) were annotated with ∼19 000 and ∼16 000 gene models, respectively. We used two different statistical methods to validate these gene models: Fickett's method and a log-likelihood method. Functional annotation of the gene models is based on results from BlastX and InterProScan screens. From the InterProScan results, we could assign putative functions to 17 694 genes in P. sojae and 14 700 genes in P. ramorum. We created an easy-to-use genome browser for viewing the genome sequence data, which links to a detailed annotation page for each gene model. A community annotation interface is available for registered community members to add or edit annotations. So far, ∼1600 gene models for P. sojae and ∼700 gene models for P. ramorum have been manually curated. A toolkit is provided as an additional resource for users to perform a variety of sequence analysis jobs. The database is publicly available at

Topics: Article
Publisher: Oxford University Press
OAI identifier: oai:pubmedcentral.nih.gov:1347405
Provided by: PubMed Central
