16 research outputs found

    A proteomics sample metadata representation for multiomics integration and big data analysis

    The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.

    Method for Identification of Threonine Isoforms in Peptides by Ultraviolet Photofragmentation of Cold Ions

    Identification of isomeric amino acid residues in peptides and proteins is challenging but often highly desired in proteomics. One practically important case that requires isomeric assignment is the single-nucleotide polymorphism substitution of Met residues by Thr in cancer-related proteins. These genetically encoded substitutions can, however, be confused with chemical modifications arising from protein alkylation by iodoacetamide, which is commonly used in the standard procedure of sample preparation for proteomic analysis. Similar to the genetically encoded mutations, the alkylation also induces a conversion of methionine residues, but to the iso-threonine form. Recognition of the mutations therefore requires isoform-sensitive detection techniques. Herein, we demonstrate an analytical method for reliable identification of isoforms of threonine residues in tryptic peptides. It is based on ultraviolet photodissociation mass spectrometry of cryogenically cooled ions and a machine-learning algorithm. The measured photodissociation mass spectra exhibit isoform-specific patterns, which are independent of the residues adjacent to threonine or iso-threonine in a peptide sequence. A comprehensive metric-based evaluation demonstrates that, once calibrated with a set of model peptides, the method allows for isomeric identification of threonine residues in peptides of arbitrary sequence.

    Identification of Alternative Splicing in Proteomes of Human Melanoma Cell Lines without RNA Sequencing Data

    Alternative splicing is one of the main regulation pathways in living cells beyond simple changes in the level of protein expression. Most of the approaches proposed in proteomics for the identification of specific splicing isoforms require a preliminary deep transcriptomic analysis of the sample under study, which is not always available, especially in the case of the re-analysis of previously acquired data. Herein, we developed new algorithms for the identification and validation of protein splice isoforms in proteomic data in the absence of RNA sequencing of the samples under study. The bioinformatic approaches were tested on the results of proteome analysis of human melanoma cell lines, obtained earlier by high-resolution liquid chromatography and mass spectrometry (LC-MS). A search for alternative splicing events for each of the cell lines studied was performed against two databases: one generated from all known transcripts (RefSeq) and one composed of peptide sequences that included all biologically possible combinations of exons. The identifications were filtered using the prediction of both retention times and relative intensities of fragment ions in the corresponding mass spectra. The fragmentation mass spectra corresponding to the discovered alternative splicing events were additionally examined for artifacts. Selected splicing events were further validated at the mRNA level by quantitative PCR.

    MS/MS-Free Protein Identification in Complex Mixtures Using Multiple Enzymes with Complementary Specificity

    In this work, we evaluate a workflow that employs a multienzyme digestion strategy for MS1-based protein identification in “shotgun” proteomic applications. In the proposed strategy, several cleavage reagents of different specificity were used for parallel digestion of the protein sample, followed by a search based on MS1 and retention time (RT). Proof of principle for the proposed strategy was obtained using experimental data for the annotated 48-protein standard. By using the developed approach, up to 90% of proteins from the standard were unambiguously identified. The approach was further applied to HeLa proteome data. For a sample of this complexity, the proposed MS1-only strategy correctly determined up to 34% of all proteins identified using a standard MS/MS-based database search. It was also found that the results of the MS1-only search were independent of the chromatographic gradient time over a wide range of gradients, from 15 to 120 min. Potentially, rapid MS1-only proteome characterization can be an alternative or a complement to MS/MS-based “shotgun” analyses in studies in which the experimental time is more important than the depth of proteome coverage.

    IdentiPy: An Extensible Search Engine for Protein Identification in Shotgun Proteomics

    We present an open-source, extensible search engine for shotgun proteomics. Implemented in the Python programming language, IdentiPy shows competitive processing speed and sensitivity compared with the state-of-the-art search engines. It is equipped with a user-friendly web interface, IdentiPy Server, enabling the use of a single server installation accessed from multiple workstations. Using a simplified version of the X!Tandem scoring algorithm and its novel “autotune” feature, IdentiPy outperforms the popular alternatives on high-resolution data sets. Autotune adjusts the search parameters for the particular data set, resulting in improved search efficiency and simplifying the user experience. IdentiPy with the autotune feature shows higher sensitivity compared with the evaluated search engines. IdentiPy Server has built-in postprocessing and protein inference procedures and provides graphic visualization of the statistical properties of the data set and the search results. It is open-source, can be freely extended to use third-party scoring functions or processing algorithms, and allows customization of the search workflow for specialized applications.

    Multi-Omics Analysis of Glioblastoma Cells’ Sensitivity to Oncolytic Viruses

    Oncolytic viruses have gained momentum in the last decades as a promising tool for cancer treatment. Despite the progress, only a fraction of patients show a positive response to viral therapy. One of the key variable factors contributing to therapy outcomes is interferon-dependent antiviral mechanisms in tumor cells. Here, we evaluated this factor using patient-derived glioblastoma multiforme (GBM) cultures. Cell response to stimulation with type I interferons (IFNs) was characterized at mRNA and protein levels. Omics analysis revealed that GBM cells overexpress interferon-stimulated genes (ISGs) and upregulate their proteins, similar to the normal cells. A conserved molecular pattern unambiguously differentiates between the preserved and defective responses. By comparing ISGs’ portraits with titration-based measurements of cell sensitivity to a panel of viruses, we ranked the “strength” of IFN-induced resistance acquired by GBM cells. The study demonstrates that suppressing a single ISG encoding an essential antiviral protein does not necessarily increase sensitivity to viruses. Conversely, silencing IFIT3 and PLSCR1 genes in tumor cells can negatively affect the internalization of vesicular stomatitis and Newcastle disease viruses. We present evidence of a complex relationship between the interferon response genes and other factors affecting the sensitivity of tumor cells to viruses.