Protein Inference Using Peptide Quantification Patterns
Determining the list of proteins present in a sample, based on
the list of identified peptides, is a crucial step in the untargeted
proteomics LC–MS/MS data-processing pipeline. This step, commonly
referred to as protein inference, turns out to be a very challenging
problem because many peptide sequences are found across multiple proteins.
Current protein inference engines typically use peptide-to-spectrum
match (PSM) quality measures and spectral count information to score
protein identifications in LC–MS/MS data sets. This is, however,
not enough to confidently validate or otherwise rule out many of the
proteins. Here we introduce the basis for a new way of performing
protein inference based on accurate quantification patterns of identified
peptides, using the correlation of these patterns to validate
peptide-to-protein matches. For the first implementation of this new approach,
we focused on (1) distinguishing between unambiguously and ambiguously
identified proteins and (2) generating hypotheses for the discrimination
of subsets of the ambiguously identified proteins. Our preprocessing
pipelines support either labeled LC–MS/MS or label-free LC–MS
followed by LC–MS/MS to provide the peptide quantification.
We apply our procedure to two published data sets and show that it
is able to detect and infer proteins that would otherwise not be confidently
inferred.
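
The two goals named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which correlation measure or reference profile is used, so Pearson correlation against the mean profile of each protein's unique peptides is an assumption made here. All peptide names, protein names, and abundance values are hypothetical toy data.

```python
import numpy as np

# Hypothetical toy data: each peptide maps to the proteins containing its sequence.
peptide_to_proteins = {
    "pepA": {"P1"},
    "pepB": {"P1"},
    "pepC": {"P2"},
    "pepD": {"P1", "P2"},  # shared between two proteins
    "pepE": {"P3", "P4"},  # shared; neither protein has a unique peptide
}

# Hypothetical peptide abundances measured across four samples.
quant = {
    "pepA": [1.0, 2.0, 3.0, 4.0],
    "pepB": [2.0, 4.1, 5.9, 8.0],
    "pepC": [4.0, 3.0, 2.0, 1.0],
    "pepD": [1.1, 2.0, 3.2, 3.9],
    "pepE": [1.0, 1.0, 2.0, 2.0],
}


def classify_proteins(peptide_to_proteins):
    """Split proteins into those with at least one unique peptide and the rest."""
    proteins = set().union(*peptide_to_proteins.values())
    unambiguous = {
        next(iter(prots))
        for prots in peptide_to_proteins.values()
        if len(prots) == 1
    }
    return unambiguous, proteins - unambiguous


def score_shared_peptides(peptide_to_proteins, quant):
    """Correlate each shared peptide with each candidate protein's reference profile.

    Assumption: the reference profile is the mean abundance pattern of the
    protein's unique peptides, and the score is the Pearson correlation.
    """
    unique_rows = {}
    for pep, prots in peptide_to_proteins.items():
        if len(prots) == 1:
            unique_rows.setdefault(next(iter(prots)), []).append(quant[pep])
    reference = {prot: np.mean(rows, axis=0) for prot, rows in unique_rows.items()}

    scores = {}
    for pep, prots in peptide_to_proteins.items():
        if len(prots) > 1:
            for prot in sorted(prots):
                if prot in reference:  # skip proteins with no unique peptides
                    r = np.corrcoef(quant[pep], reference[prot])[0, 1]
                    scores[(pep, prot)] = float(r)
    return scores


unambiguous, ambiguous = classify_proteins(peptide_to_proteins)
scores = score_shared_peptides(peptide_to_proteins, quant)
```

In this toy setup, `pepD` follows the pattern of the unique peptides of `P1` and runs opposite to that of `P2`, so the correlation supports the `pepD`-to-`P1` match. `P3` and `P4` have no unique peptides and receive no score, which corresponds to the ambiguous case that the abstract addresses by generating hypotheses.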