De Novo Sequencing of Peptides from Top-Down Tandem Mass Spectra
Abstract
De novo sequencing of proteins and peptides is one of the most important
problems in mass spectrometry-driven proteomics. A variety
of methods have been developed to accomplish this task from a set
of bottom-up tandem (MS/MS) mass spectra. However, the more recently
emerged top-down technology, which is steadily gaining popularity,
opens new perspectives for protein analysis and characterization
and creates a need for efficient algorithms to process this kind of MS/MS
data. Here, we describe a method that retrieves long and accurate
sequence fragments of the proteins contained in the sample from
a set of top-down MS/MS spectra. To this end, we outline a
strategy for generating high-quality sequence tags from top-down spectra,
and introduce the concept of a <i>T</i>-Bruijn graph by
adapting the notion of an <i>A</i>-Bruijn graph, widely used
in genomics, to the case of tags. The output of the proposed approach
is the set of amino acid strings spelled out by optimal paths
in the connected components of a <i>T</i>-Bruijn graph.
We illustrate its performance on top-down data sets acquired from
carbonic anhydrase 2 (CAH2) and the Fab region of alemtuzumab