<i>De Novo</i> Protein Sequencing by Combining Top-Down and Bottom-Up Tandem Mass Spectra
Abstract
There are two approaches for <i>de novo</i> protein sequencing:
Edman degradation and mass spectrometry (MS). Existing MS-based methods
characterize a novel protein by assembling tandem mass spectra of
overlapping peptides generated from multiple proteolytic digestions
of the protein. Because each tandem mass spectrum covers only a short
peptide of the target protein, the key to high-coverage protein sequencing
is to find spectral pairs from overlapping peptides so that their tandem
mass spectra can be assembled into longer ones. However, overlapping regions of
peptides may be too short to be confidently identified. High-resolution
mass spectrometers have become accessible to many laboratories. These
instruments can analyze molecules of large mass, which has boosted
the development of top-down MS. Top-down tandem mass
spectra cover whole proteins. However, top-down tandem mass spectra,
even combined, rarely provide full ion fragmentation coverage of a
protein. We propose an algorithm, TBNovo, for <i>de novo</i> protein sequencing by combining top-down and bottom-up MS. In TBNovo,
a top-down tandem mass spectrum is utilized as a scaffold, and bottom-up
tandem mass spectra are aligned to the scaffold to increase sequence
coverage. Experiments on data sets of two proteins showed that TBNovo
achieved high sequence coverage and high sequence accuracy.
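
The scaffold idea can be illustrated with a toy sketch. This is not TBNovo's actual algorithm, which the abstract does not specify. It assumes each spectrum has already been reduced to a list of prefix residue masses, and it uses a simple "try every mass shift and count matching peaks" heuristic. The function names (`count_matches`, `best_shift`, `merge`), the tolerance, and the example sequence are all hypothetical.

```python
# Toy protein GASPVGA; monoisotopic residue masses (Da):
# G 57.02146, A 71.03711, S 87.03203, P 97.05276, V 99.06841.
# Top-down scaffold: prefix masses of the protein with gaps
# (the masses for GA, GASP and GASPVG are assumed to be unobserved).
scaffold = [57.02146, 215.09060, 411.21177, 539.27034]

# Bottom-up spectrum of the peptide SPVG: prefix masses measured
# from the peptide's own start (0 = empty prefix).
peptide = [0.0, 87.03203, 184.08479, 283.15320, 340.17466]

def count_matches(scaffold, masses, tol):
    """Count how many masses lie within tol of some scaffold mass."""
    return sum(any(abs(m - s) <= tol for s in scaffold) for m in masses)

def best_shift(scaffold, peptide, tol=0.01):
    """Try every scaffold-minus-peptide difference as a candidate shift
    and return (matches, shift) for the first best-supported one."""
    best = (0, 0.0)
    for s in scaffold:
        for p in peptide:
            shift = s - p
            n = count_matches(scaffold, [q + shift for q in peptide], tol)
            if n > best[0]:
                best = (n, shift)
    return best

def merge(scaffold, peptide, shift, tol=0.01):
    """Add shifted peptide masses that the scaffold does not already contain."""
    shifted = [p + shift for p in peptide]
    new = [m for m in shifted if not any(abs(m - s) <= tol for s in scaffold)]
    return sorted(scaffold + new)

matches, shift = best_shift(scaffold, peptide)
merged = merge(scaffold, peptide, shift)
print(matches, round(shift, 5), len(merged))
```

In this toy case the peptide aligns at a shift equal to the mass of the prefix GA, supported by two shared masses, and merging fills the three gaps in the scaffold. Real spectra contain noise peaks, mixed ion types, and mass errors, so a practical method would need scoring and error handling well beyond this sketch.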