47 research outputs found

    Discovering and linking public omics data sets using the Omics Discovery Index.

    Biomedical data are being produced at an unprecedented rate owing to the falling cost of experiments and wider access to genomics, transcriptomics, proteomics and metabolomics platforms [1, 2]. As a result, public deposition of omics data is on the increase. This presents new challenges, including finding ways to store, organize and access different types of biomedical data stored on different platforms. Here, we present the Omics Discovery Index (OmicsDI; http://www.omicsdi.org), an open-source platform that enables access, discovery and dissemination of omics data sets.

    Metrics for the Human Proteome Project 2015: Progress on the Human Proteome and Guidelines for High-Confidence Protein Identification

    Remarkable progress continues on the annotation of the proteins identified in the Human Proteome and on finding credible proteomic evidence for the expression of "missing proteins". Missing proteins are those with no previous protein-level evidence or insufficient evidence to make a confident identification upon reanalysis in PeptideAtlas and curation in neXtProt. Enhanced with several major new data sets published in 2014, the human proteome presented as neXtProt, version 2014-09-19, has 16 491 unique confident proteins (PE level 1), up from 13 664 at 2012-12 and 15 646 at 2013-09. That leaves 2948 missing proteins from genes classified as having protein existence level PE 2, 3, or 4, as well as 616 dubious proteins at PE 5. Here, we document the progress of the HPP and discuss the importance of assessing the quality of evidence, confirming automated findings and considering alternative protein matches for spectra and peptides. We provide guidelines for proteomics investigators to apply in reporting newly identified proteins.

    Metrics for the Human Proteome Project 2016: Progress on Identifying and Characterizing the Human Proteome, Including Post-Translational Modifications

    The HUPO Human Proteome Project (HPP) has two overall goals: (1) stepwise completion of the protein parts list, the draft human proteome, confidently identifying and characterizing at least one protein product from each protein-coding gene, with increasing emphasis on the sequence variants, post-translational modifications, and splice isoforms of those proteins, and (2) making proteomics an integrated counterpart to genomics throughout the biomedical and life sciences community. PeptideAtlas and GPMDB reanalyze all major mass spectrometry datasets available through ProteomeXchange with standardized protocols and stringent quality filters; neXtProt curates and integrates mass spectrometry and other findings. The HPP Guidelines for Mass Spectrometry Data Interpretation version 2.0 were applied to manuscripts submitted for this 2016 C-HPP-led special issue [www.thehpp.org/guidelines]. The Human Proteome presented as neXtProt version 2016-02 has 16,518 confident protein identifications (Protein Existence [PE] Level 1), up from 13,664 at 2012-12, 15,646 at 2013-09, and 16,491 at 2014-10. There are 485 proteins that would have been PE1 under the Guidelines v1.0 from 2012, but now have insufficient evidence due to the agreed-upon more stringent Guidelines v2.0 to reduce false-positives. neXtProt and PeptideAtlas now both require two non-nested, uniquely-mapping (proteotypic) peptides of at least 9 aa in length. There are 2949 missing proteins (PE2+3+4) as the baseline for submissions for the 4th annual C-HPP special issue of Journal of Proteome Research. PeptideAtlas has 14,629 canonical (plus 1187 uncertain and 1755 redundant) entries. GPMDB has 16,190 EC4 entries, and the Human Protein Atlas has 10,475 entries with supportive evidence. neXtProt, PeptideAtlas, and GPMDB are rich resources of information about PTMs, SAAVs, and splice isoforms. Meanwhile, the Biology and Disease-driven B/D-HPP has created comprehensive SRM resources, generated popular-protein lists to guide targeted proteomics assays for specific diseases, and launched an Early Career Researchers initiative.