16 research outputs found
A proteomics sample metadata representation for multiomics integration and big data analysis
The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.
Method for Identification of Threonine Isoforms in Peptides by Ultraviolet Photofragmentation of Cold Ions
Identification of isomeric amino acid residues in peptides and proteins is challenging but often highly desired in proteomics. One of the practically important cases that require isomeric assignments is that associated with single-nucleotide polymorphism substitutions of Met residues by Thr in cancer-related proteins. These genetically encoded substitutions can yet be confused with the chemical modifications, arising from protein alkylation by iodoacetamide, which is commonly used in the standard procedure of sample preparation for proteomic analysis. Similar to the genetically encoded mutations, the alkylation also induces a conversion of methionine residues, but to the iso-threonine form. Recognition of the mutations therefore requires isoform-sensitive detection techniques. Herein, we demonstrate an analytical method for reliable identification of isoforms of threonine residues in tryptic peptides. It is based on ultraviolet photodissociation mass spectrometry of cryogenically cooled ions and a machine-learning algorithm. The measured photodissociation mass spectra exhibit isoform-specific patterns, which are independent of the residues adjacent to threonine or iso-threonine in a peptide sequence. A comprehensive metric-based evaluation demonstrates that, being calibrated with a set of model peptides, the method allows for isomeric identification of threonine residues in peptides of arbitrary sequence
Identification of Alternative Splicing in Proteomes of Human Melanoma Cell Lines without RNA Sequencing Data
Alternative splicing is one of the main regulation pathways in living cells beyond simple changes in the level of protein expression. Most of the approaches proposed in proteomics for the identification of specific splicing isoforms require a preliminary deep transcriptomic analysis of the sample under study, which is not always available, especially in the case of the re-analysis of previously acquired data. Herein, we developed new algorithms for the identification and validation of protein splice isoforms in proteomic data in the absence of RNA sequencing of the samples under study. The bioinformatic approaches were tested on the results of proteome analysis of human melanoma cell lines, obtained earlier by high-resolution liquid chromatography and mass spectrometry (LC-MS). A search for alternative splicing events for each of the cell lines studied was performed against the database generated from all known transcripts (RefSeq) and the one composed of peptide sequences, which included all biologically possible combinations of exons. The identifications were filtered using the prediction of both retention times and relative intensities of fragment ions in the corresponding mass spectra. The fragmentation mass spectra corresponding to the discovered alternative splicing events were additionally examined for artifacts. Selected splicing events were further validated at the mRNA level by quantitative PCR
MS/MS-Free Protein Identification in Complex Mixtures Using Multiple Enzymes with Complementary Specificity
In this work, we present the results of evaluation of a workflow that employs a multienzyme digestion strategy for MS1-based protein identification in “shotgun” proteomic applications. In the proposed strategy, several cleavage reagents of different specificity were used for parallel digestion of the protein sample followed by MS1 and retention time (RT) based search. Proof of principle for the proposed strategy was performed using experimental data obtained for the annotated 48-protein standard. By using the developed approach, up to 90% of proteins from the standard were unambiguously identified. The approach was further applied to HeLa proteome data. For the sample of this complexity, the proposed MS1-only strategy determined correctly up to 34% of all proteins identified using standard MS/MS-based database search. It was also found that the results of MS1-only search were independent of the chromatographic gradient time in a wide range of gradients from 15–120 min. Potentially, rapid MS1-only proteome characterization can be an alternative or complementary to the MS/MS-based “shotgun” analyses in the studies, in which the experimental time is more important than the depth of the proteome coverage
IdentiPy: An Extensible Search Engine for Protein Identification in Shotgun Proteomics
We present an open-source, extensible search engine for shotgun proteomics. Implemented in Python programming language, IdentiPy shows competitive processing speed and sensitivity compared with the state-of-the-art search engines. It is equipped with a user-friendly web interface, IdentiPy Server, enabling the use of a single server installation accessed from multiple workstations. Using a simplified version of X!Tandem scoring algorithm and its novel “autotune” feature, IdentiPy outperforms the popular alternatives on high-resolution data sets. Autotune adjusts the search parameters for the particular data set, resulting in improved search efficiency and simplifying the user experience. IdentiPy with the autotune feature shows higher sensitivity compared with the evaluated search engines. IdentiPy Server has built-in postprocessing and protein inference procedures and provides graphic visualization of the statistical properties of the data set and the search results. It is open-source and can be freely extended to use third-party scoring functions or processing algorithms and allows customization of the search workflow for specialized applications
Multi-Omics Analysis of Glioblastoma Cells’ Sensitivity to Oncolytic Viruses
Oncolytic viruses have gained momentum in the last decades as a promising tool for cancer treatment. Despite the progress, only a fraction of patients show a positive response to viral therapy. One of the key variable factors contributing to therapy outcomes is interferon-dependent antiviral mechanisms in tumor cells. Here, we evaluated this factor using patient-derived glioblastoma multiforme (GBM) cultures. Cell response to the type I interferons’ (IFNs) stimulation was characterized at mRNA and protein levels. Omics analysis revealed that GBM cells overexpress interferon-stimulated genes (ISGs) and upregulate their proteins, similar to the normal cells. A conserved molecular pattern unambiguously differentiates between the preserved and defective responses. Comparing ISGs’ portraits with titration-based measurements of cell sensitivity to a panel of viruses, the “strength” of IFN-induced resistance acquired by GBM cells was ranked. The study demonstrates that suppressing a single ISG encoding an essential antiviral protein does not necessarily increase sensitivity to viruses. Conversely, silencing IFIT3 and PLSCR1 genes in tumor cells can negatively affect the internalization of vesicular stomatitis and Newcastle disease viruses. We present evidence of a complex relationship between the interferon response genes and other factors affecting the sensitivity of tumor cells to viruses