Proteogenomic Definition of Biomarkers for the Large <i>Roseobacter</i> Clade and Application for a Quick Screening
of New Environmental Isolates
Abstract
Whole-cell, matrix-assisted laser desorption/ionization time-of-flight
(MALDI-TOF) mass spectrometry has become a routine and reliable method
for microbial characterization due to its simplicity, low cost, and
high reproducibility. The identification of microbial isolates relies
on the resemblance of their low-molecular-weight protein spectra to those of
isolates already present in the databases. This is a gold standard for clinicians
who have a finite number of well-defined pathogenic strains but represents
a problem for environmental microbiologists with an overwhelming number
of organisms to be defined. Here we set a milestone for implementing
whole-cell MALDI-TOF mass spectrometry to identify isolates from the
biosphere. To make this technique accessible for environmental studies,
we propose to (i) define biomarkers that will always show up with
an intense <i>m</i>/<i>z</i> signal in the MALDI-TOF
spectra and (ii) create a database with all the possible <i>m</i>/<i>z</i> values that these biomarkers can generate to
screen new isolates. We tested our method with the relevant marine <i>Roseobacter</i> lineage. The use of shotgun nanoLC-MS/MS proteomics
on the small proteome fraction of nine <i>Roseobacter</i> strains and the proteogenomic toolbox helped us to identify potential
biomarkers in terms of protein abundance and low variability among
strains. We show that the DNA-binding protein HU and the ribosomal
proteins L29 and L30 are the most robust biomarkers within the <i>Roseobacter</i> clade. The molecular weights of these three
biomarkers, as for other conserved homologous proteins, vary due to
sequence variation above the genus level. Therefore, we calculated
the <i>m</i>/<i>z</i> values expected for each
one of the known <i>Roseobacter</i> genera and tested our
strategy during an extensive screening of natural marine isolates
obtained from coastal waters of the Western Mediterranean Sea. The
use of this technique versus standard sequencing methods is discussed.
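
The idea of predicting biomarker m/z values per genus and screening spectra against them can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes singly protonated ions, standard average residue masses, and no post-translational processing such as N-terminal methionine removal. The function names, the input layout (`{genus: {marker: sequence}}`), and the 500 ppm tolerance are all hypothetical choices made for illustration.

```python
# Average residue masses in Da (standard values for the 20 common amino acids).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528   # mass of H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of the charge-carrying proton


def average_mass(sequence):
    """Average molecular mass of an unmodified protein sequence (assumption:
    no post-translational modification or N-terminal processing)."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def predicted_mz(sequence, charge=1):
    """Expected m/z of the [M + zH]z+ ion; MALDI ions are assumed to be
    singly charged by default."""
    return (average_mass(sequence) + charge * PROTON) / charge


def build_database(marker_sequences):
    """Map {genus: {marker: sequence}} to {genus: {marker: predicted m/z}}."""
    return {
        genus: {name: predicted_mz(seq) for name, seq in markers.items()}
        for genus, markers in marker_sequences.items()
    }


def screen(peaks, database, tol_ppm=500.0):
    """For each genus, list the markers whose predicted m/z lies within
    tol_ppm of at least one observed peak (tolerance is a hypothetical value)."""
    hits = {}
    for genus, markers in database.items():
        hits[genus] = [
            name
            for name, mz in markers.items()
            if any(abs(peak - mz) <= mz * tol_ppm / 1e6 for peak in peaks)
        ]
    return hits
```

In use, `build_database` would be fed the HU, L29, and L30 sequences retrieved for each genus, and `screen` would be called with the peak list of a new isolate. A genus for which all three markers match is a candidate assignment, which can then be confirmed by sequencing.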