
Mining whole sample mass spectrometry proteomics data for biomarkers: an overview

By R McDonald, Paul Skipp, Chris N. Potts, Lyn C. Thomas, David O'Connor and J. Bennell


In this paper we aim to provide a concise overview of designing and conducting an MS proteomics experiment in such a way as to allow statistical analysis that may lead to the discovery of novel biomarkers. We summarise the stages that make up such an experiment, highlighting the need for experimental goals to be decided upon in advance. We discuss issues in experimental design at the sample collection stage, and good practice for standardising protocols within the proteomics laboratory. We then describe approaches to the data mining stage of the experiment, including the processing steps that transform a raw mass spectrum into a usable form. We propose a permutation-based procedure for determining the significance of reported error rates. Finally, because of its general advantages in speed and cost, we suggest that MS proteomics may be a good candidate for an early primary screening approach to disease diagnosis, identifying areas of risk and making referrals for more specific tests without necessarily making a diagnosis in its own right. Our discussion is illustrated with examples drawn from experiments on bovine blood serum conducted in the Centre for Proteomic Research (CPR) at Southampton University.
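
The abstract does not spell out the permutation-based procedure, so the following is only a minimal sketch of one common form of such a test, not the authors' method. It assumes that the reported error rate comes from a leave-one-out estimate, and it compares that estimate with the error rates obtained after randomly shuffling the class labels. The nearest-centroid classifier, the function names, and the toy data are all assumptions made for illustration.

```python
import random


def nearest_centroid_loo_error(X, y):
    """Leave-one-out error rate of a nearest-centroid classifier (illustrative choice)."""
    n = len(X)
    errors = 0
    for i in range(n):
        # Build class centroids from every sample except sample i.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            label = y[j]
            if label not in sums:
                sums[label] = [0.0] * len(X[j])
                counts[label] = 0
            sums[label] = [s + v for s, v in zip(sums[label], X[j])]
            counts[label] += 1

        def squared_distance(label):
            # Squared Euclidean distance from sample i to the centroid of `label`.
            return sum((s / counts[label] - v) ** 2
                       for s, v in zip(sums[label], X[i]))

        predicted = min(sorted(sums), key=squared_distance)
        errors += predicted != y[i]
    return errors / n


def permutation_p_value(X, y, error_fn, n_permutations=200, seed=0):
    """Estimate how often shuffled labels give an error rate as low as the observed one."""
    rng = random.Random(seed)
    observed = error_fn(X, y)
    shuffled = list(y)
    as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real link between spectra and labels
        if error_fn(X, shuffled) <= observed:
            as_good += 1
    # The +1 terms count the observed labelling itself as one permutation.
    p_value = (as_good + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical toy data: two well-separated groups of ten "spectra" with two features each.
X = [[0.1 * i, 0.0] for i in range(10)] + [[10.0 + 0.1 * i, 0.0] for i in range(10)]
y = [0] * 10 + [1] * 10
observed_error, p_value = permutation_p_value(X, y, nearest_centroid_loo_error)
print(observed_error, p_value)
```

A small p-value suggests that the reported error rate is unlikely to arise when the labels carry no information. With real spectra, `error_fn` would be replaced by whatever classifier and validation scheme produced the reported error rate.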

Topics: H1
Year: 2009
Provided by: e-Prints Soton

