Abstract

Background: Modern biomedical research depends on a complete and accurate proteome. With the widespread adoption of new sequencing technologies, genome sequences are generated at a near-exponential rate, diminishing the time and effort that can be invested in genome annotation. The resulting gene set contains numerous errors in even the most basic form of annotation: the primary structure of the proteins.

Results: The application of experimental proteomics data to genome annotation, called proteogenomics, can quickly and efficiently discover misannotations, yielding a more accurate and complete genome annotation. We present a comprehensive proteogenomic analysis of the plague bacterium, Yersinia pestis KIM. We discover non-annotated genes, correct protein boundaries, remove spuriously annotated ORFs, and make major advances towards accurate identification of signal peptides. Finally, we apply our data to 21 other Yersinia genomes, correcting and enhancing their annotations.

Conclusions: In total, 141 gene models were altered and have been updated in RefSeq and GenBank; the updated models can be accessed seamlessly through any NCBI tool (e.g. BLAST) or downloaded directly. Along with the improved gene models, we discover new, more accurate means of identifying signal peptides in proteomics data.

Rembert Pieper

Samuel H Payne

Shih-Ting Huang

A proteogenomic update to Yersinia: enhancing genome annotation

BMC Genomics

http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3091656
