Search CORE

52 research outputs found

A mass accuracy sensitive probability based scoring algorithm for database searching of tandem mass spectrometry data

Author: A Keller
AG Sullivan
AI Nesvizhskii
B Paizs
BT Hansen
DF Hunt
DN Perkins
EA Kapp
EN Nikolaev
Hua Xu
J Colinge
JEP Syka
JK Eng
JV Olsen
K Biemann
KG Standing
KR Clauser
LY Geer
M Havilio
M Mann
Michael A Freitas
MJ MacCoss
N Zhang
R Bakhtiar
RG Sadygov
V Bafna
V Dancík
W Qian
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has become one of the most used tools in mass spectrometry based proteomics. Various algorithms have since been developed to automate the process for modern high-throughput LC-MS/MS experiments. Results: A probability based statistical scoring model for assessing peptide and protein matches in tandem MS database search was derived. The statistical scores in the model represent the probability that a peptide match is a random occurrence based on the number or the total abundance of matched product ions in the experimental spectrum. The model also calculates probability based scores to assess protein matches. Thus the protein scores in the model reflect the significance of protein matches and can be used to differentiate true from random protein matches. Conclusion: The model is sensitive to high mass accuracy and implicitly takes mass accuracy into account during scoring. High mass accuracy not only reduces false positives but also improves the scores of true positive matches. The algorithm is incorporated in an automated database search program, MassMatrix.
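
The idea of scoring a peptide match by the probability that it is a random occurrence can be illustrated with a binomial tail. The sketch below is only an illustration under stated assumptions: the abstract does not give the MassMatrix formulas, so the binomial model, the function names, and the way the single-ion chance probability is derived from the mass tolerance are all hypothetical.

```python
import math


def random_match_probability(n_theoretical: int, n_matched: int, p_single: float) -> float:
    """Probability that at least n_matched of n_theoretical product ions
    match by chance, assuming independent matches (binomial upper tail)."""
    return sum(
        math.comb(n_theoretical, k)
        * p_single ** k
        * (1.0 - p_single) ** (n_theoretical - k)
        for k in range(n_matched, n_theoretical + 1)
    )


def chance_of_single_match(n_peaks: int, tolerance_da: float, mass_range_da: float) -> float:
    """Hypothetical estimate of the chance that one theoretical ion hits any
    experimental peak: the fraction of the m/z range covered by the
    +/- tolerance windows around the peaks."""
    return min(1.0, n_peaks * 2.0 * tolerance_da / mass_range_da)


def match_score(probability: float) -> float:
    """Convert a probability into a score; a smaller probability gives a larger score."""
    return -math.log10(probability)


# Same spectrum and same 8 of 20 matched ions, scored at two mass tolerances.
p_low_accuracy = chance_of_single_match(n_peaks=50, tolerance_da=0.5, mass_range_da=1000.0)
p_high_accuracy = chance_of_single_match(n_peaks=50, tolerance_da=0.01, mass_range_da=1000.0)
score_low = match_score(random_match_probability(20, 8, p_low_accuracy))
score_high = match_score(random_match_probability(20, 8, p_high_accuracy))
```

Under these assumptions a tighter tolerance lowers the single-ion chance probability, so the same set of matched ions receives a higher score, which is one way a model can take mass accuracy into account implicitly.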

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

KnowledgeBank at OSU

PubMed Central

A Dynamic Noise Level Algorithm for Spectral Screening of Peptide MS/MS Spectra

Author: DN Perkins
H Xu
Hua Xu
I Sures
JE Elias
JWH Wong
K Flikka
LW Zhang
LY Geer
M Bern
Michael A Freitas
R Aebersold
R Craig
RE Moore
RG Sadygov
S Purvine
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: High-throughput shotgun proteomics data contain a significant number of spectra from non-peptide ions or spectra of too poor quality to obtain highly confident peptide identifications. These spectra cannot be identified with any positive peptide matches in some database search programs or are identified with false positives in others. Removing these spectra can improve the database search results and lower computational expense. Results: A new algorithm has been developed to filter tandem mass spectra of poor quality from shotgun proteomic experiments. The algorithm determines the noise level dynamically and independently for each spectrum in a tandem mass spectrometric data set. Spectra are filtered based on a minimum number of required signal peaks with a signal-to-noise ratio of 2. The algorithm was tested with 23 sample data sets containing 62,117 total spectra. Conclusions: The spectral screening removed 89.0% of the tandem mass spectra that did not yield a peptide match when searched with the MassMatrix database search software. Only 6.0% of tandem mass spectra that yielded peptide matches considered to be true positive matches were lost after spectral screening. The algorithm was found to be very effective at removal of unidentified spectra in other database search programs including Mascot, OMSSA, and X!Tandem (75.93%-91.00%) with a small loss (3.59%-9.40%) of true positive matches.
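
The filtering rule described above (keep a spectrum only if it has a minimum number of signal peaks with a signal-to-noise ratio of 2) can be sketched as follows. The abstract does not say how the noise level is estimated, so using the median peak intensity as the per-spectrum noise level is an assumption made here, as are the function name and the `min_signal_peaks` default.

```python
from statistics import median


def passes_spectral_screen(intensities, snr_threshold=2.0, min_signal_peaks=5):
    """Return True if the spectrum has enough peaks above the noise level.

    The noise level is estimated independently for each spectrum. ASSUMPTION:
    the median peak intensity stands in for the paper's dynamic noise level.
    """
    if not intensities:
        return False
    noise_level = median(intensities)
    if noise_level <= 0:
        return False
    signal_peaks = [i for i in intensities if i / noise_level >= snr_threshold]
    return len(signal_peaks) >= min_signal_peaks


# Invented example spectra (intensity values only).
flat_spectrum = [10, 11, 9, 10, 12, 10, 11, 9]
rich_spectrum = [10, 9, 11, 10, 12, 10, 9, 11, 100, 80, 60, 55, 40]

keep_flat = passes_spectral_screen(flat_spectrum)  # False: no peak reaches twice the median
keep_rich = passes_spectral_screen(rich_spectrum)  # True: five peaks reach twice the median
```

Because the noise level is computed from each spectrum's own peaks, no global intensity cutoff is needed, which matches the "dynamically and independently for each spectrum" description.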

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

SAMPI: Protein Identification with Mass Spectra Alignments

Author: A Bairoch
A Wilke
Andreas Wilke
C Wenk
D Bylund
D Gusfield
D Perkins
DC Chamrad
F Schütz
Hans-Michael Kaltenbach
HM Kaltenbach
M Havilio
M Karas
R Aebersold
RG Sadygov
S Böcker
S Gay
Sebastian Böcker
SF Altschul
V Bafna
W Zhang
WE Wolski
WJ Henzel
X Huang
Y Wan
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Mass spectrometry based peptide mass fingerprints (PMFs) offer a fast, efficient, and robust method for protein identification. A protein is digested (usually by trypsin) and its mass spectrum is compared to simulated spectra for protein sequences in a database. However, existing tools for analyzing PMFs often suffer from missing or heuristic analysis of the significance of search results and insufficient handling of missing and additional peaks. RESULTS: We present a unified framework for analyzing peptide mass fingerprints that offers a number of advantages over existing methods. First, comparison of mass spectra is based on a scoring function that can be custom-designed for certain applications and explicitly takes missing and additional peaks into account. The method is able to simulate almost every additive scoring scheme. Second, we present an efficient deterministic method for assessing the significance of a protein hit, independent of the underlying scoring function and sequence database. We prove the applicability of our approach using biological mass spectrometry data and compare our results to the standard software Mascot. CONCLUSION: The proposed framework for analyzing peptide mass fingerprints shows performance comparable to Mascot on small peak lists. Introducing more noise peaks, we are able to keep identification rates at a similar level by using the flexibility introduced by scoring schemes.
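
To make "an additive scoring scheme that takes missing and additional peaks into account" concrete, here is a minimal sketch. It is not the SAMPI alignment algorithm: it uses a simple two-pointer pass over sorted peak lists, and the function name, tolerance, and the three weights are hypothetical values chosen for illustration.

```python
def additive_pmf_score(measured, theoretical, tol=0.2,
                       match=1.0, missing=-0.5, additional=-0.25):
    """Score a measured peak list against a simulated one.

    Each matched pair adds `match`; each theoretical peak with no measured
    partner adds `missing`; each measured peak with no theoretical partner
    adds `additional`. All weights are illustrative assumptions.
    """
    measured = sorted(measured)
    theoretical = sorted(theoretical)
    i = j = 0
    score = 0.0
    while i < len(measured) and j < len(theoretical):
        diff = measured[i] - theoretical[j]
        if abs(diff) <= tol:
            score += match        # peak pair within the mass tolerance
            i += 1
            j += 1
        elif diff < 0:
            score += additional   # measured peak has no theoretical partner
            i += 1
        else:
            score += missing      # theoretical peak was not observed
            j += 1
    # Whatever is left over on either side is unmatched.
    score += additional * (len(measured) - i)
    score += missing * (len(theoretical) - j)
    return score


# Invented masses: two matches, one additional peak (750.0), one missing peak (900.0).
example_score = additive_pmf_score([500.1, 750.0, 1200.3], [500.0, 900.0, 1200.2])
# example_score == 1.25  (1.0 - 0.25 - 0.5 + 1.0)
```

Changing the three weights changes the scoring scheme without touching the comparison logic, which is the kind of flexibility the abstract attributes to custom-designed scoring functions.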

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Publications at Bielefeld University

Identification of alternative splice variants in Aspergillus flavus through comparison of multiple tandem MS search algorithms

Author: A Marchler-Bauer
AI Nesvizhskii
AJ Link
BC Searle
BM Balgley
C Barber
C Hughes
David C Muddiman
DL Tabb
DN Perkins
DR Georgianna
EA Kapp
H Choi
HM Holden
J Cox
JB Thoden
JE Elias
JK Eng
Kung-Yen Chang
KY Chang
L Florea
L Käll
LY Geer
M Margulies
MN Bainbridge
MP Washburn
N Edwards
R Craig
RG Sadygov
S Heber
S Tanner
SF Altschul
W Yu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Database searching is the most frequently used approach for automated peptide assignment and protein inference of tandem mass spectra. The results, however, depend on the sequences in target databases and on search algorithms. Recently, by using an alternative splicing database, we identified more proteins than with the annotated proteins in Aspergillus flavus. In this study, we aimed at finding a greater number of eligible splice variants based on newly available transcript sequences and the latest genome annotation. The improved database was then used to compare four search algorithms: Mascot, OMSSA, X! Tandem, and InsPecT. Results: The updated alternative splicing database predicted 15833 putative protein variants, 61% more than the previous results. There was transcript evidence for 50% of the updated genes compared to the previous 35% coverage. Database searches were conducted using the same set of spectral data, search parameters, and protein database but with different algorithms. The false discovery rates of the peptide-spectrum matches were estimated to be < 2%. The numbers of the total identified proteins varied from 765 to 867 between algorithms. Whereas 42% (1651/3891) of peptide assignments were unanimous, the comparison showed that 51% (568/1114) of the RefSeq proteins and 15% (11/72) of the putative splice variants were inferred by all algorithms. 12 plausible isoforms were discovered by focusing on the consensus peptides which were detected by at least three different algorithms. The analysis found different conserved domains in two putative isoforms of UDP-galactose 4-epimerase. Conclusions: We were able to detect dozens of new peptides using the improved alternative splicing database with the recently updated annotation of the A. flavus genome. Unlike the identifications of the peptides and the RefSeq proteins, large variations existed between the putative splice variants identified by different algorithms. 12 candidates of putative isoforms were reported based on the consensus peptide-spectrum matches. This suggests that applications of multiple search engines effectively reduced the possible false positive results and validated the protein identifications from tandem mass spectra using an alternative splicing database.
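
The consensus step described above (keeping peptides detected by at least three different algorithms) amounts to counting, per peptide, how many engines reported it. The sketch below assumes each engine's output has already been reduced to a collection of peptide sequences; the function name and the peptide sequences are invented for illustration and are not data from the study.

```python
from collections import Counter


def consensus_peptides(results_by_engine, min_engines=3):
    """Return the peptides reported by at least `min_engines` search engines.

    `results_by_engine` maps an engine name to an iterable of peptide
    sequences; duplicates within one engine are counted once.
    """
    counts = Counter()
    for peptides in results_by_engine.values():
        counts.update(set(peptides))
    return {peptide for peptide, n in counts.items() if n >= min_engines}


# Invented example results for the four engines named in the abstract.
example_results = {
    "Mascot":    ["AAGTK", "LLSPR", "VEEDK"],
    "OMSSA":     ["AAGTK", "LLSPR"],
    "X! Tandem": ["AAGTK", "VEEDK", "VEEDK"],
    "InsPecT":   ["AAGTK", "LLSPR", "GGHWR"],
}

consensus = consensus_peptides(example_results)
# consensus == {"AAGTK", "LLSPR"}
```

Setting `min_engines` to the number of engines gives the unanimous set instead of the three-of-four consensus.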

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

OpenMS – An open-source software framework for mass spectrometry

Author: A Keller
A Savitzky
Alexandra Zerck
Andreas Bertsch
Andreas Hildebrandt
BM Mayr
C Gröpl
CA Smith
CC Chang
Clemens Gröpl
D Ballard
DM Horn
DN Perkins
E Lange
EA Kapp
Eva Lange
G Stockman
J Hartler
JB Breen
K Reinert
KC Leptos
Knut Reinert
LNN Mueller
LY Geer
M Bellew
M Katajamaa
Marc Sturm
ME Monroe
N Pfeifer
Nico Pfeifer
O Kohlbacher
O Schulz-Trieglaff
Ole Schulz-Trieglaff
Oliver Kohlbacher
P Soille
PGA Pedrioli
R Hussong
Rene Hussong
RG Sadygov
S Orchard
S Tanner
VB Di Marco
W Press
XJ Li
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever increasing amounts of data. Consequently, analysis of the data is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow for rapid prototyping of a new analysis workflow. Results: We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy-to-use and robust while offering a rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies. Conclusion: OpenMS is available under the Lesser GNU Public License (LGPL) from the project website at http://www.openms.de.

Crossref

Springer

Springer - Publisher Connector

Directory of Open Access Journals

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

PubMed Central

Gutenberg Open

Modular Mass Spectrometric Tool for Analysis of Composition and Phosphorylation of Protein Complexes

Author: A Makarov
AD Rudner
AJ Tackett
AN Krutchinsky
Andrew N. Krutchinsky
AS Robeva
AV Loboda
C Bassmann
C Kraft
Changhui Deng
Chao Tang
D Fenyo
EJ Chang
EM Phizicky
G Rigaut
G Stafford Jr
IH Su
IM Cristea
IV Chernushevich
J Qin
JC Schwartz
JE Elias
JE Syka
JM Peters
JT Wu
Justin D. Blethrow
JV Olsen
JW Hager
JZ Ye
Jürg Bähler
LM de Godoy
M Knop
M Schroeder
ML Vestal
O Puig
Q Hu
R Odegrip
R Yost
RA Yost
RG Sadygov
RL Martin
RS Annan
RW Purves
S Ghaemmaghami
SE Martin
TJ Garrett
W Ens
X Zhang
Publication venue: Public Library of Science
Publication date: 04/04/2007

The combination of high accuracy, sensitivity and speed of single and multiple-stage mass spectrometric analyses enables the collection of comprehensive sets of data containing detailed information about complex biological samples. To achieve these properties, we combined two high-performance matrix-assisted laser desorption ionization mass analyzers in one modular mass spectrometric tool, and applied this tool for dissecting the composition and post-translational modifications of protein complexes. As an example of this approach, we here present studies of the Saccharomyces cerevisiae anaphase-promoting complexes (APC) and elucidation of phosphorylation sites on its components. In general, the modular concept we describe could be useful for assembling mass spectrometers operating with both matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI) ion sources into powerful mass spectrometric tools for the comprehensive analysis of complex biological samples.

Crossref

Directory of Open Access Journals

PubMed Central

The influence of cultivation methods on Shewanella oneidensis physiology and proteome expression

High-throughput analyses that are central to microbial systems biology and ecophysiology research benefit from highly homogeneous and physiologically well-defined cell cultures. While attention has focused on the technical variation associated with high-throughput technologies, biological variation introduced as a function of cell cultivation methods has been largely overlooked. This study evaluated the impact of cultivation methods, controlled batch or continuous culture in bioreactors versus shake flasks, on the reproducibility of global proteome measurements in Shewanella oneidensis MR-1. Variability in dissolved oxygen concentration and consumption rate, metabolite profiles, and proteome was greater in shake flask than controlled batch or chemostat cultures. Proteins indicative of suboxic and anaerobic growth (e.g., fumarate reductase and decaheme c-type cytochromes) were more abundant in cells from shake flasks compared to bioreactor cultures, a finding consistent with data demonstrating that "aerobic" flask cultures were O2 deficient due to poor mass transfer kinetics. The work described herein establishes the necessity of controlled cultivation for ensuring highly reproducible and homogeneous microbial cultures. By decreasing cell-to-cell variability, higher quality samples will allow for the interpretive accuracy necessary for drawing conclusions relevant to microbial systems biology research.

Crossref

Springer - Publisher Connector

PubMed Central

Altered Retinoic Acid Metabolism in Diabetic Mouse Kidney Identified by 18O Isotopic Labeling and 2D Mass Spectrometry

Numerous metabolic pathways have been implicated in diabetes-induced renal injury, yet few studies have utilized unbiased systems biology approaches for mapping the interconnectivity of diabetes-dysregulated proteins that are involved. We utilized a global, quantitative, differential proteomic approach to identify a novel retinoic acid hub in renal cortical protein networks dysregulated by type 2 diabetes. Total proteins were extracted from renal cortex of control and db/db mice at 20 weeks of age (after 12 weeks of hyperglycemia in the diabetic mice). Following trypsinization, 18O- and 16O-labeled control and diabetic peptides, respectively, were pooled and separated by two dimensional liquid chromatography (strong cation exchange creating 60 fractions further separated by nano-HPLC), followed by peptide identification and quantification using mass spectrometry. Proteomic analysis identified 53 proteins with fold change ≥1.5 and p ≤ 0.05 after Benjamini-Hochberg adjustment (out of 1,806 proteins identified), including alcohol dehydrogenase (ADH) and retinaldehyde dehydrogenase (RALDH1/ALDH1A1). Ingenuity Pathway Analysis identified retinoic acid as a key signaling hub that was altered in the diabetic renal cortical proteome. Western blotting and real-time PCR confirmed diabetes-induced upregulation of RALDH1, which was localized by immunofluorescence predominantly to the proximal tubule in the diabetic renal cortex, while PCR confirmed the downregulation of ADH identified with mass spectrometry. Despite increased renal cortical tissue levels of retinol and RALDH1 in db/db versus control mice, all-trans-retinoic acid was significantly decreased in association with a significant decrease in PPARbeta/delta mRNA. Our results indicate that retinoic acid metabolism is significantly dysregulated in diabetic kidneys, and suggest that a shift in all-trans-retinoic acid metabolism is a novel feature in type 2 diabetic renal disease. Our observations provide novel insights into potential links between altered lipid metabolism and other gene networks controlled by retinoic acid in the diabetic kidney, and demonstrate the utility of using systems biology to gain new insights into diabetic nephropathy.
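
The selection criterion quoted in this abstract, a fold change of at least 1.5 together with p at most 0.05 after Benjamini-Hochberg adjustment, can be sketched as below. This is a plain implementation of the standard Benjamini-Hochberg step-up adjustment, not the authors' pipeline; treating fold change symmetrically (a ratio of 0.5 counts as a 2-fold change) and the example numbers are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_proteins(ratios, p_values, min_fold=1.5, alpha=0.05):
    """Indices of proteins passing both the fold-change and adjusted-p cutoffs.

    `ratios` are positive abundance ratios. ASSUMPTION: up- and
    down-regulation are treated symmetrically via max(r, 1/r).
    """
    adjusted = benjamini_hochberg(p_values)
    return [
        i
        for i, (ratio, q) in enumerate(zip(ratios, adjusted))
        if max(ratio, 1.0 / ratio) >= min_fold and q <= alpha
    ]


# Invented example values for four proteins.
example_ratios = [2.0, 1.2, 0.5, 1.6]
example_p = [0.01, 0.04, 0.03, 0.005]
selected = select_proteins(example_ratios, example_p)
# selected == [0, 2, 3]; protein 1 fails the fold-change cutoff
```

Applying the fold-change cutoff together with the adjusted p-value, rather than the raw p-value, is what keeps the expected proportion of false discoveries among the selected proteins bounded when many proteins are tested at once.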

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The SEQUEST Family Tree

Author: A Goffeau
A Keller
B Lu
B MacLean
BA Risk
BJ Diament
BK Faherty
CD Wenger
CL Gatlin
CY Park
DF Hunt
DH Lundgren
F Gluck
HS Chittum
JA Milloy
JJ Howbert
JK Eng
JR Yates
JR Yates III
KG Owens
L Käll
M Mann
MJ MacCoss
MP Washburn
PR Griffin
R Sadygov
RG Sadygov
S Dasari
S Kim
S McIlwain
V Dorfer
WH McDonald
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref

An improved machine learning protocol for the identification of correct Sequest search results

Author: A Keller
AC Gavin
AI Nesvizhskii
D Tabb
DC Anderson
DN Perkins
H Steen
Hui Lu
J Davis
J Eriksson
J Fang
J Friedman
J Razumovskaya
J Wan
JA Falkner
JE Elias
JK Eng
JR Quinlan
L Breiman
M Bern
M Kinter
Morten Källberg
MP Washburn
MS Lipton
N Bhardwaj
NS Baliga
PJ Ulintz
R Aebersold
R Craig
R Langlois
RE Langlois
RE Moore
RG Sadygov
T Guina
Y Freund
Y Ho
Z Song
Publication venue: BMC
Publication date: 01/01/2010

Background: Mass spectrometry has become a standard method by which the proteomic profile of cell or tissue samples is characterized. To fully take advantage of tandem mass spectrometry (MS/MS) techniques in large scale protein characterization studies, robust and consistent data analysis procedures are crucial. In this work we present a machine learning based protocol for the identification of correct peptide-spectrum matches from Sequest database search results, improving on previously published protocols. Results: The developed model improves on published machine learning classification procedures by 6% as measured by the area under the ROC curve. Further, we show how the developed model can be presented as an interpretable tree of additive rules, thereby effectively removing the 'black-box' notion often associated with machine learning classifiers, allowing for comparison with expert rule-of-thumb. Finally, a method for extending the developed peptide identification protocol to give probabilistic estimates of the presence of a given protein is proposed and tested. Conclusions: We demonstrate the construction of a high accuracy classification model for Sequest search results from MS/MS spectra obtained by using MALDI ionization. The developed model performs well in identifying correct peptide-spectrum matches and is easily extendable to the protein identification problem. The relative ease with which additional experimental parameters can be incorporated into the classification framework, to give additional discriminatory power, allows for future tailoring of the model to take advantage of information from specific instrument set-ups.
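
The comparison metric used in this abstract, the area under the ROC curve, can be computed directly from classifier scores and correct/incorrect labels. The sketch below uses the pairwise-ranking definition of AUC (the fraction of correct/incorrect pairs in which the correct match scores higher, with ties counted as half); the function name and example values are invented and are not taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` are truthy for correct peptide-spectrum matches and falsy for
    incorrect ones. Runs in O(n_pos * n_neg), which is fine for a sketch.
    """
    positives = [s for s, label in zip(scores, labels) if label]
    negatives = [s for s, label in zip(scores, labels) if not label]
    if not positives or not negatives:
        raise ValueError("need at least one correct and one incorrect match")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented classifier scores for four peptide-spectrum matches.
example_auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
# example_auc == 0.75: three of the four correct/incorrect pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a sense of scale for an improvement reported in AUC terms.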

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)