Aggregation, Dissemination, and Analysis of High-throughput Scientific Data Sets in the Field of Proteomics.
Open-access software and data sets play an important role in the analysis of high-throughput proteomics data sets and in the reproducibility of that analysis. This work details the development of a set of tools, websites, and practices for aggregating, disseminating, and analyzing high-throughput mass spectrometry-based proteomics data sets. The majority of this work is framed around the development of the ProteomeCommons.org website and the Tranche P2P network. Work is also presented on the development of an open-access reference data set for proteomics and a post-analysis refinement tool that can identify unexpected peptide and protein modifications, estimate identifiable spectra, and act as a spectral library for cross-data-set comparisons.
Ph.D., Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/58433/1/jfalkner_1.pd
Proteomics FASTA Archive and Reference Resource
A FASTA file archive and reference resource has been added to ProteomeCommons.org. Motivation for this new functionality derives from two primary sources. The first is the recent FASTA standardization work done by the Human Proteome Organization's Proteomics Standards Initiative (HUPO-PSI). The second is the general lack of a uniform mechanism to properly cite FASTA files used in a study and to publicly access such FASTA files post-publication. An extension to the Tranche data sharing network has been developed that includes web pages, documentation, and tools for facilitating the use of FASTA files. These include conversion to the new HUPO-PSI format and provisions for both citing and publicly archiving FASTA files. This new resource is available immediately, free of charge, and can be accessed at http://www.proteomecommons.org/data/fasta/. Source code for related tools is also freely available under the BSD license.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/58584/1/1756_ftp.pd
A HUPO test sample study reveals common problems in mass spectrometry-based proteomics
We performed a test sample study to identify errors leading to irreproducibility, including incompleteness of peptide sampling, in liquid chromatography-mass spectrometry-based proteomics. We distributed an equimolar test sample, comprising 20 highly purified recombinant human proteins, to 27 laboratories. Each protein contained one or more unique tryptic peptides of 1,250 Da to test for ion selection and sampling in the mass spectrometer. Of the 27 labs, members of only 7 labs initially reported all 20 proteins correctly, and members of only 1 lab reported all tryptic peptides of 1,250 Da. Centralized analysis of the raw data, however, revealed that all 20 proteins and most of the 1,250 Da peptides had been detected in all 27 labs. Our centralized analysis identified missed identifications (false negatives), environmental contamination, database matching, and curation of protein identifications as sources of problems. Improved search engines and databases are needed for mass spectrometry-based proteomics.
8 page(s)