18 research outputs found

    Efficient visualization of high-throughput targeted proteomics experiments: TAPIR

    Motivation: Targeted mass spectrometry comprises a set of powerful methods to obtain accurate and consistent protein quantification in complex samples. To fully exploit these techniques, a cross-platform and open-source software stack based on standardized data exchange formats is required. Results: We present TAPIR, a fast and efficient Python visualization software for chromatograms and peaks identified in targeted proteomics experiments. The input formats are open, community-driven standardized data formats (mzML for raw data storage and TraML for encoding the hierarchical relationships between transitions, peptides and proteins). TAPIR is scalable to proteome-wide targeted proteomics studies (as enabled by SWATH-MS), allowing researchers to visualize high-throughput datasets. The framework integrates well with existing automated analysis pipelines and can be extended beyond targeted proteomics to other types of analyses. Availability and implementation: TAPIR is available for all computing platforms under the 3-clause BSD license at https://github.com/msproteomicstools/msproteomicstools. Supplementary information: Supplementary data are available at Bioinformatics online.

    A practical guide to interpreting and generating bottom-up proteomics data visualizations

    Mass spectrometry-based bottom-up proteomics is the main method to analyze proteomes comprehensively, and the rapid evolution of instrumentation and data analysis has made the technology widely available. Data visualization is an integral part of the analysis process and is crucial for the communication of results. This is a major challenge due to the immense complexity of MS data. In this review, we provide an overview of commonly used visualizations, starting with raw data of traditional and novel MS technologies, then basic peptide and protein level analyses, and finally visualization of highly complex datasets and networks. We specifically provide guidance on how to critically interpret and discuss the multitude of different proteomics data visualizations. Furthermore, we highlight Python-based libraries and other open science tools that can be applied for independent and transparent generation of customized visualizations. To further encourage programmatic data visualization, we provide the Python code used to generate all data figures in this review on GitHub. Data availability statement: Proteomics data from the following ProteomeXchange repositories were reused to generate figures in this study: PXD012867, PXD017703, PXD010697, PXD010103.

    Methods in automated glycosaminoglycan tandem mass spectra analysis

    Glycosylation is the process by which a glycan is enzymatically attached to a protein, and is one of the most common post-translational modifications in nature. One class of glycans is the glycosaminoglycans (GAGs), which are long, linear polysaccharides that are variably sulfated and make up the glycan portion of proteoglycans (PGs). PGs are located on the cellular surface and in the extracellular matrix (ECM), making them important molecules for cell signaling and ligand binding. The GAG sulfation sequence is a determining factor for the signaling capacity of binding complexes, so accurate determination of the sequence is critical. Historically, GAG sequencing using tandem mass spectrometry (MS2) has been a difficult, manual process; however, with the advent of faster computational techniques and higher-resolution MS2, high-throughput GAG sequencing is within reach. Two steps in the pipeline of biomolecule sequencing using MS2 are discovery and interpretation of spectral peaks. The discovery step traditionally is performed using methods that rely on the concept of averagine, or the average molecular building block for the analyte in question. These methods were developed for protein sequencing, but perform considerably worse on GAG sequences, due to the non-uniform distribution of sulfur atoms along the chain and the relatively high isotope abundance of 34S. The interpretation step traditionally is performed manually, which takes time and introduces potential user error. To combat these problems, I developed GAGfinder, the first GAG-specific MS2 peak finding and annotation software. GAGfinder is described in detail in chapter two. Another step in MS2 sequencing is the determination of the sequence using the found MS2 fragments. For a given GAG composition, there are many possible sequences, and peak finding algorithms such as GAGfinder return a list of the peaks in the MS2 mass spectrum. 
    The many-to-many relationship between sequences and fragments can be represented using a bipartite network, and node-ranking techniques can be employed to generate likelihood scores for possible sequences. I developed a bipartite network-based sequencing tool, GAGrank, based on a bipartite network extension of Google’s PageRank algorithm for ranking websites. GAGrank is described in detail in chapter three.
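
    To make the bipartite ranking idea concrete, the following is a minimal sketch of damped score propagation on a sequence-fragment graph. It is not GAGrank's implementation: the symmetric degree normalization, the damping factors `alpha` and `beta`, the uniform priors, and the labels `seqA`, `f1`, and so on are all assumptions chosen for illustration.

```python
from math import sqrt


def bipartite_rank(edges, alpha=0.85, beta=0.85, iters=100, tol=1e-10):
    """Rank both node sets of a bipartite graph by damped score propagation.

    edges: iterable of (sequence, fragment) pairs; labels are hypothetical.
    Returns (sequence_scores, fragment_scores) as dicts.
    """
    edges = list(edges)
    seqs = sorted({s for s, _ in edges})
    frags = sorted({f for _, f in edges})

    # Node degrees, used to normalize each edge symmetrically.
    deg_s = {s: 0 for s in seqs}
    deg_f = {f: 0 for f in frags}
    for s, f in edges:
        deg_s[s] += 1
        deg_f[f] += 1

    # Uniform priors (an assumption; peak intensities could be used instead).
    prior_s = {s: 1.0 / len(seqs) for s in seqs}
    prior_f = {f: 1.0 / len(frags) for f in frags}
    score_s, score_f = dict(prior_s), dict(prior_f)

    for _ in range(iters):
        # Fragments collect score from the sequences that could produce them.
        new_f = {f: (1 - beta) * prior_f[f] for f in frags}
        for s, f in edges:
            new_f[f] += beta * score_s[s] / sqrt(deg_s[s] * deg_f[f])
        # Sequences collect score from the fragments they explain.
        new_s = {s: (1 - alpha) * prior_s[s] for s in seqs}
        for s, f in edges:
            new_s[s] += alpha * new_f[f] / sqrt(deg_s[s] * deg_f[f])
        delta = sum(abs(new_s[s] - score_s[s]) for s in seqs)
        score_s, score_f = new_s, new_f
        if delta < tol:
            break
    return score_s, score_f


# Hypothetical example: seqA explains three observed fragments, seqB and seqC one each.
example_edges = [
    ("seqA", "f1"), ("seqA", "f2"), ("seqA", "f3"),
    ("seqB", "f1"), ("seqC", "f3"),
]
seq_scores, frag_scores = bipartite_rank(example_edges)
best_sequence = max(seq_scores, key=seq_scores.get)
```

    In this toy graph the candidate that explains the most fragments receives the highest score; a real tool would need a principled choice of priors and edge weights.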

    Automating Data Analysis for Two-Dimensional Gas Chromatography/Time-of-Flight Mass Spectrometry Non-Targeted Analysis of Comparative Samples

    Non-targeted analysis of environmental samples, using comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC × GC/ToF-MS), poses significant data analysis challenges due to the large number of possible analytes. Non-targeted data analysis of complex mixtures is prone to human bias and is laborious, particularly for comparative environmental samples such as contaminated soil pre- and post-bioremediation. To address this research bottleneck, we developed OCTpy, a Python™ script that acts as a data reduction filter to automate GC × GC/ToF-MS data analysis from LECO® ChromaTOF® software and facilitates selection of analytes of interest based on peak area comparison between comparative samples. We used data from polycyclic aromatic hydrocarbon (PAH) contaminated soil, pre- and post-bioremediation, to assess the effectiveness of OCTpy in facilitating the selection of analytes that have formed or degraded following treatment. Using datasets from the soil extracts pre- and post-bioremediation, OCTpy selected, on average, 18% of the initial suggested analytes generated by the LECO® ChromaTOF® software Statistical Compare feature. Based on this list, 63–100% of the candidate analytes identified by a highly trained individual were also selected by OCTpy. This process was accomplished in several minutes per sample, whereas manual data analysis took several hours per sample. OCTpy automates the analysis of complex mixtures of comparative samples, reduces the potential for human error during heavy data handling and decreases data analysis time by at least tenfold.
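
    As an illustration of filtering analytes by peak area comparison between two samples, here is a minimal sketch. It does not reproduce OCTpy's actual logic or the ChromaTOF export format: the function name, the fold-change threshold `min_fold`, the `floor` area for missing peaks, and the analyte names are assumptions made for this example.

```python
def compare_peak_areas(pre, post, min_fold=2.0, floor=1.0):
    """Select analytes whose peak area changes strongly between two samples.

    pre, post: dicts mapping analyte name -> peak area (hypothetical inputs).
    min_fold: minimum fold change required to keep an analyte (assumed value).
    floor: area substituted for missing or zero peaks to avoid division by zero.
    Returns {analyte: (label, fold_change)} for the selected analytes only.
    """
    selected = {}
    for name in sorted(set(pre) | set(post)):
        area_pre = max(pre.get(name, 0.0), floor)
        area_post = max(post.get(name, 0.0), floor)
        fold = area_post / area_pre
        if fold >= min_fold:
            selected[name] = ("formed", fold)
        elif fold <= 1.0 / min_fold:
            selected[name] = ("degraded", fold)
    return selected


# Hypothetical peak tables for a soil extract before and after treatment.
pre_treatment = {"phenanthrene": 8000.0, "pyrene": 5000.0, "unknown_1": 1200.0}
post_treatment = {"phenanthrene": 900.0, "pyrene": 4500.0,
                  "unknown_1": 1300.0, "unknown_2": 3000.0}
candidates = compare_peak_areas(pre_treatment, post_treatment)
```

    With these made-up numbers only the strongly reduced and the newly appearing analytes are kept, which is the data-reduction effect the abstract describes.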

    Development of a method for biomarkers characterization by mass spectrometry techniques

    The purpose of this study is to define an extractive approach for the detection of the low-molecular-weight peptide fraction from human plasma or serum and the subsequent analysis and interpretation of the obtained data, with the ultimate aim of developing a standardised protocol for the identification of potential biomarkers. The extraction of the low-molecular-weight protein fraction was developed using a series of standard peptide solutions and silica magnetic beads with different functionalisations, each intended to bind target molecules through a different type of intermolecular force. The samples, plasma or serum, were treated without proteases such as trypsin to generate digested lysates, and without electrophoresis or gel separation techniques, both to avoid adding complexity to the subsequent data interpretation and to use as little sample as possible. Both the peptides contained in the standard solution and those in the low-molecular-weight fraction of the pre-treated biological sample were separated and characterized through high-performance liquid chromatography (HPLC) coupled to full scan and tandem mass spectrometry equipped with an electrospray ion source (ESI-MS/MS). Samples from biological sources were subsequently analysed using MALDI-TOF mass spectrometry. In this project the development of the extraction method was followed by its application to real samples. The presence of low-molecular-weight peptides in plasma samples from nephrotic patients on dialysis at various stages of SARS-CoV-2 infection, and in plasma from healthy donors, was evaluated with the aim of finding significant differences between groups, especially in terms of qualitative and quantitative differences in the m/z ratios present in MS spectra.
    A bioinformatics approach to data processing was also implemented, either with statistical tools such as Venn diagrams or Significance Analysis of Microarrays (SAM), or with a series of Python scripts that process spectral data using algorithms based on in silico fragmentation rules. Outputs were compared with information from peptide databases to find significant correspondences between theoretical and experimental spectra.
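
    The comparison of theoretical and experimental spectra can be illustrated with a minimal sketch of in silico fragmentation. This is not the code from the study: it assumes singly charged b- and y-ions only, unmodified peptides, standard monoisotopic residue masses, and a hypothetical matching tolerance `tol` in Da. The peptide `PEPTIDE` and the observed peak list are made up for the example.

```python
# Monoisotopic residue masses in Da (standard values for unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_ions(peptide):
    """Return singly charged b- and y-ion m/z values as {label: mz}."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = {}
    for i in range(1, len(masses)):
        # b-ions keep the N-terminal i residues; y-ions keep the C-terminal i.
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions


def match_peaks(ions, observed_mz, tol=0.02):
    """Return labels of theoretical ions with an observed peak within tol Da."""
    return sorted(
        label for label, mz in ions.items()
        if any(abs(mz - obs) <= tol for obs in observed_mz)
    )


# Hypothetical usage: one candidate peptide against a made-up peak list.
theoretical = fragment_ions("PEPTIDE")
matched = match_peaks(theoretical, [98.06, 148.06, 500.0])
```

    Counting matched ions per candidate peptide is one simple way to score correspondence; real search tools use more elaborate scoring and account for charge states and modifications.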

    PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols

    Background: Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols. Results: We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy to learn wiki syntax effectively lowers the barriers to bring expert programmers and less computer savvy researchers on the same page. Conclusions: PyPedia demonstrates how wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complement existing resources, useful for local and multi-center research teams. Availability: PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License

    Self-assembly and anti-amyloid cytotoxicity activity of amyloid beta peptide derivatives

    The self-assembly of two derivatives of KLVFF, a fragment Abeta(16-20) of the amyloid beta (Abeta) peptide, is investigated and recovery of viability of neuroblastoma cells exposed to Abeta is observed at sub-stoichiometric peptide concentrations. Fluorescence assays show that NH2-KLVFF-CONH2 undergoes hydrophobic collapse and amyloid formation at the same critical aggregation concentration (cac). In contrast, NH2-K(Boc)LVFF-CONH2 undergoes hydrophobic collapse at a low concentration, followed by amyloid formation at a higher cac. These findings are supported by the beta-sheet features observed by FTIR. Electrospray ionization mass spectrometry indicates that NH2-K(Boc)LVFF-CONH2 forms a significant population of oligomeric species above the cac. Cryo-TEM, used together with SAXS to determine fibril dimensions, shows that the length and degree of twisting of peptide fibrils seem to be influenced by the net peptide charge. Grazing incidence X-ray scattering from thin peptide films shows features of beta-sheet ordering for both peptides, along with evidence for lamellar ordering of NH2-KLVFF-CONH2. This work provides a comprehensive picture of the aggregation properties of these two KLVFF derivatives and shows their utility, in unaggregated form, in restoring the viability of neuroblastoma cells against Abeta-induced toxicity.

    Modulation of Escherichia coli Translation by the Specific Inactivation of tRNAGly Under Oxidative Stress

    Bacterial oxidative stress responses are generally controlled by transcription factors that modulate the synthesis of RNAs with the aid of some sRNAs that control the stability, and in some cases the translation, of specific mRNAs. Here, we report that oxidative stress additionally leads to inactivation of tRNAGly in Escherichia coli, inducing a series of physiological changes. The observed inactivation of tRNAGly correlated with altered efficiency of translation of Gly codons, suggesting a possible mechanism of translational control of gene expression under oxidative stress. Changes in translation also depended on the availability of glycine, revealing a mechanism whereby bacteria modulate the response to oxidative stress according to the prevailing metabolic state of the cells.