16 research outputs found

    Analysis of the Saccharomyces cerevisiae proteome with PeptideAtlas

    We present the Saccharomyces cerevisiae PeptideAtlas composed from 47 diverse experiments and 4.9 million tandem mass spectra. The observed peptides align to 61% of Saccharomyces Genome Database (SGD) open reading frames (ORFs), 49% of the uncharacterized SGD ORFs, 54% of S. cerevisiae ORFs with a Gene Ontology annotation of 'molecular function unknown', and 76% of ORFs with Gene names. We highlight the use of this resource for data mining, construction of high quality lists for targeted proteomics, validation of proteins, and software development

    The Generation R Study: design and cohort update 2010

    The Generation R Study is a population-based prospective cohort study from fetal life until young adulthood. The study is designed to identify early environmental and genetic causes of normal and abnormal growth, development and health during fetal life, childhood and adulthood. The study focuses on four primary areas of research: (1) growth and physical development; (2) behavioural and cognitive development; (3) diseases in childhood; and (4) health and healthcare for pregnant women and children. In total, 9,778 mothers with a delivery date from April 2002 until January 2006 were enrolled in the study. General follow-up rates until the age of 4 years exceed 75%. Data collection in mothers, fathers and preschool children included questionnaires, detailed physical and ultrasound examinations, behavioural observations, and biological samples. A genome-wide association screen is available in the participating children. Regular detailed hands-on assessments are performed from the age of 5 years onwards. Eventually, results forthcoming from the Generation R Study have to contribute to the development of strategies for optimizing health and healthcare for pregnant women and children.

    Building consensus spectral libraries for peptide identification in proteomics

    Spectral searching has drawn increasing interest as an alternative to sequence-database searching in proteomics. We developed and validated an open-source software toolkit, SpectraST, to enable proteomics researchers to build spectral libraries and to integrate this promising approach in their data-analysis pipeline. It allows individual researchers to condense raw data into spectral libraries, summarizing information about observed proteomes into a concise and retrievable format for future data analyses

    Tryptic Peptide Reference Data Sets for MALDI Imaging Mass Spectrometry on Formalin-fixed Ovarian Cancer Tissues

    MALDI imaging mass spectrometry is a powerful tool for morphology-based proteomic tissue analysis. However, peptide identification is still a major challenge due to low S/N ratios, low mass accuracy and difficulties in correlating observed m/z species with peptide identities. To address this, we have analyzed tryptic digests of formalin-fixed paraffin-embedded tissue microarray cores, from 31 ovarian cancer patients, by LC–MS/MS. The sample preparation closely resembled the MALDI imaging workflow in order to create representative reference data sets containing peptides also observable in MALDI imaging experiments. This resulted in 3844 distinct peptide sequences, at a false discovery rate of 1%, for the entire cohort and an average of 982 distinct peptide sequences per sample. From this, a total of 840 proteins and, on average, 297 proteins per sample could be inferred. To support the efforts of the Chromosome-centric Human Proteome Project Consortium, we have annotated these proteins with their respective chromosome location. In the presented work, the benefit of using a large cohort of data sets was exemplified by correct identification of several m/z species observed in a MALDI imaging experiment. The tryptic peptide data sets generated will facilitate peptide identification in future MALDI imaging studies on ovarian cancer

    Differential Plasma Glycoproteome of p19 Skin Cancer Mouse Model Using the Corra Label-Free LC-MS Proteomics Platform

    A proof-of-concept demonstration of the use of label-free quantitative glycoproteomics for biomarker discovery workflow is presented here, using a mouse model for skin cancer as an example. Blood plasma was collected from 10 control mice, and 10 mice having a mutation in the p19(ARF) gene, conferring on them a high propensity to develop skin cancer after carcinogen exposure. We enriched for N-glycosylated plasma proteins, ultimately generating deglycosylated forms of the modified tryptic peptides for liquid chromatography mass spectrometry (LC-MS) analyses. LC-MS runs for each sample were then performed with a view to identifying proteins that were differentially abundant between the two mouse populations. We then used a recently developed computational framework, Corra, to perform peak picking and alignment, and to compute the statistical significance of any observed changes in individual peptide abundances. Once determined, the most discriminating peptide features were then fragmented and identified by tandem mass spectrometry with the use of inclusion lists. We next assessed the identified proteins to see if there were sets of proteins indicative of specific biological processes that correlate with the presence of disease, and specifically cancer, according to their functional annotations. As expected for such sick animals, many of the proteins identified were related to host immune response. However, a significant number of proteins were also directly associated with processes linked to cancer development, including proteins related to the cell cycle, localisation, transport, and cell death. Additional analysis of the same samples in profiling mode, and in triplicate, confirmed that replicate MS analysis of the same plasma sample generated less variation than that observed between plasma samples from different individuals, demonstrating that the reproducibility of the LC-MS platform was sufficient for this application. These results thus show that an LC-MS-based workflow can be a useful tool for the generation of candidate proteins of interest as part of a disease biomarker discovery effort

    The standard protein mix database: a diverse data set to assist in the production of improved Peptide and protein identification software tools

    Tandem mass spectrometry (MS/MS) is frequently used in the identification of peptides and proteins. Typical proteomic experiments rely on algorithms such as SEQUEST and MASCOT to compare thousands of tandem mass spectra against the theoretical fragment ion spectra of peptides in a database. The probabilities that these spectrum-to-sequence assignments are correct can be determined by statistical software such as PeptideProphet or through estimations based on reverse or decoy databases. However, many of the software applications that assign probabilities for MS/MS spectra to sequence matches were developed using training data sets from 3D ion-trap mass spectrometers. Given the variety of types of mass spectrometers that have become commercially available over the last 5 years, we sought to generate a data set of reference data covering multiple instrumentation platforms to facilitate both the refinement of existing computational approaches and the development of novel software tools. We analyzed the proteolytic peptides in a mixture of tryptic digests of 18 proteins, named the `ISB standard protein mix`, using 8 different mass spectrometers. These include linear and 3D ion traps, two quadrupole time-of-flight platforms (qq-TOF), and two MALDI-TOF-TOF platforms. The resulting data set, which has been named the Standard Protein Mix Database, consists of over 1.1 million spectra in 150+ replicate runs on the mass spectrometers. The data were inspected for quality of separation and searched using SEQUEST. All data, including the native raw instrument and mzXML formats and the PeptideProphet validated peptide assignments, are available at http://regis-web.systemsbiology.net/PublicDatasets/
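
The abstract above mentions estimating the correctness of spectrum-to-sequence assignments from reverse or decoy databases. A minimal sketch of one common form of that estimate follows, assuming a simple decoys-over-targets ratio at a score threshold. The function name `decoy_fdr`, the score values, and the threshold are hypothetical and are not taken from the Standard Protein Mix Database or from PeptideProphet.

```python
def decoy_fdr(psms, threshold):
    """Estimate the false discovery rate among target matches.

    psms: list of (score, is_decoy) pairs, one per peptide-spectrum match.
    threshold: matches scoring at or above this value are accepted.
    Assumption: the number of accepted decoy hits approximates the number
    of incorrect accepted target hits, so FDR ~= decoys / targets.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so no false discoveries to report
    return decoys / targets


# Hypothetical search scores; True marks a hit against the decoy database.
psms = [
    (9.1, False), (8.7, False), (8.2, True), (7.9, False),
    (7.5, False), (6.0, True), (5.5, False),
]

print(decoy_fdr(psms, 7.0))  # 4 targets and 1 decoy pass the threshold
print(decoy_fdr(psms, 8.5))  # 2 targets and 0 decoys pass the threshold
```

Raising the threshold trades identifications for a lower estimated error rate, which is why such estimates are usually reported alongside the threshold used.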

    Internal calibrants allow high accuracy peptide matching between MALDI imaging MS and LC-MS/MS

    One of the important challenges for MALDI imaging mass spectrometry (MALDI-IMS) is the unambiguous identification of measured analytes. One way to do this is to match tryptic peptide MALDI-IMS m/z values with LC-MS/MS identified m/z values. Matching using current MALDI-TOF/TOF MS instruments is difficult due to the variability of in situ time-of-flight (TOF) m/z measurements. This variability is currently addressed using external calibration, which limits achievable mass accuracy for MALDI-IMS and makes it difficult to match these data to downstream LC-MS/MS results. To overcome this challenge, the work presented here details a method for internally calibrating data sets generated from tryptic peptide MALDI-IMS on formalin-fixed paraffin-embedded sections of ovarian cancer. By calibrating all spectra to internal peak features the m/z error for matches made between MALDI-IMS m/z values and LC-MS/MS identified peptide m/z values was significantly reduced. This improvement was confirmed by follow up matching of LC-MS/MS spectra to in situ MS/MS spectra from the same m/z peak features. The sum of the data presented here indicates that internal calibrants should be a standard component of tryptic peptide MALDI-IMS experiments.
    Johan O.R. Gustafsson, James S. Eddes, Stephan Meding, Tomas Koudelka, Martin K. Oehler, Shaun R. McColl and Peter Hoffman
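
The matching of MALDI-IMS m/z values to LC-MS/MS identified m/z values described above can be illustrated with a small sketch that pairs each imaging peak with its nearest reference value and keeps the pair if the error is within a parts-per-million tolerance. This is only an illustration under stated assumptions: the function names, the example m/z values, and the 100 ppm tolerance are hypothetical, and the abstract does not describe the authors' actual matching or calibration procedure.

```python
def ppm_error(observed, reference):
    """Signed mass error of an observed m/z relative to a reference m/z, in ppm."""
    return (observed - reference) / reference * 1e6


def match_peaks(imaging_mz, lcms_mz, tol_ppm):
    """Match each imaging m/z to the closest LC-MS/MS m/z within tol_ppm.

    Returns a list of (imaging m/z, matched LC-MS/MS m/z, ppm error) tuples.
    Imaging peaks with no reference value inside the tolerance are skipped.
    """
    if not lcms_mz:
        return []
    matches = []
    for obs in imaging_mz:
        # Nearest reference value by absolute ppm error.
        best = min(lcms_mz, key=lambda ref: abs(ppm_error(obs, ref)))
        err = ppm_error(obs, best)
        if abs(err) <= tol_ppm:
            matches.append((obs, best, err))
    return matches


# Hypothetical peak lists; the tolerance of 100 ppm is an assumed value.
imaging_peaks = [1000.05, 1500.0]
lcms_peaks = [1000.0, 1200.0]

for obs, ref, err in match_peaks(imaging_peaks, lcms_peaks, tol_ppm=100):
    print(f"{obs} matched {ref} ({err:.1f} ppm)")
```

Better calibration shrinks the ppm errors, which allows a tighter tolerance and therefore fewer ambiguous matches between the two data sets.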