MetAssign: probabilistic annotation of metabolites from LC–MS data using a Bayesian clustering approach
Motivation: The use of liquid chromatography coupled to mass spectrometry (LC–MS) has enabled the high-throughput profiling of the metabolite composition of biological samples. However, the large amount of data obtained can be difficult to analyse and often requires computational processing to understand which metabolites are present in a sample. This paper looks at the dual problem of annotating peaks in a sample with a metabolite, together with putatively annotating whether a metabolite is present in the sample. The starting point of the approach is a Bayesian clustering of peaks into groups, each corresponding to putative adducts and isotopes of a single metabolite.
Results: The Bayesian modelling introduced here combines information from the mass-to-charge ratio, retention time and intensity of each peak, together with a model of the inter-peak dependency structure, to increase the accuracy of peak annotation. The results inherently contain a quantitative estimate of confidence in the peak annotations and allow an accurate trade-off between precision and recall. Extensive validation experiments using authentic chemical standards show that this system is able to produce more accurate putative identifications than other state-of-the-art systems, while at the same time giving a probabilistic measure of confidence in the annotations.
Availability: The software has been implemented as part of the mzMatch metabolomics analysis pipeline, which is available for download at http://mzmatch.sourceforge.net/
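The precision/recall trade-off mentioned in the MetAssign abstract can be illustrated with a minimal sketch. It assumes each peak annotation carries a posterior probability and a known ground-truth label (for example, from authentic standards). The function name, the data and the thresholds are hypothetical and are not taken from MetAssign or mzMatch.

```python
from typing import List, Tuple


def precision_recall(
    annotations: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when keeping annotations with probability >= threshold.

    `annotations` holds (posterior probability, is_correct) pairs; the
    is_correct flags are assumed to come from a validation set.
    """
    accepted = [correct for prob, correct in annotations if prob >= threshold]
    true_positives = sum(accepted)
    total_correct = sum(correct for _, correct in annotations)
    # With nothing accepted there are no false positives, so report precision 1.0.
    precision = true_positives / len(accepted) if accepted else 1.0
    recall = true_positives / total_correct if total_correct else 0.0
    return precision, recall


# Hypothetical annotations, for illustration only.
annotations = [(0.95, True), (0.80, True), (0.60, False), (0.40, True), (0.10, False)]

for threshold in (0.5, 0.9):
    precision, recall = precision_recall(annotations, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

Raising the threshold keeps only high-confidence annotations, which in this toy data raises precision and lowers recall.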
mzMatch-ISO: an R tool for the annotation and relative quantification of isotope-labelled mass spectrometry data
Motivation: Stable isotope-labelling experiments have recently gained increasing popularity in metabolomics studies, providing unique insights into the dynamics of metabolic fluxes, beyond the steady-state information gathered by routine mass spectrometry. However, most liquid chromatography–mass spectrometry data analysis software lacks features that enable automated annotation and relative quantification of labelled metabolite peaks. Here, we describe mzMatch–ISO, a new extension to the metabolomics analysis pipeline mzMatch.R.
Results: Targeted and untargeted isotope profiling using mzMatch–ISO provides a convenient visual summary of the quality and quantity of labelling for every metabolite through four types of diagnostic plots that show (i) the chromatograms of the isotope peaks of each compound in each sample group; (ii) the ratio of mono-isotopic and labelled peaks indicating the fraction of labelling; (iii) the average peak area of mono-isotopic and labelled peaks in each sample group; and (iv) the trend in the relative amount of labelling in a predetermined isotopomer. To aid further statistical analyses, the values used for generating these plots are also provided as a tab-delimited file. We demonstrate the power and versatility of mzMatch–ISO by analysing a 13C-labelled metabolome dataset from trypanosomal parasites.
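As a rough illustration of the "fraction of labelling" in item (ii), the sketch below assumes the fraction is the summed peak area of the labelled isotopologues divided by the total area (mono-isotopic plus labelled). This definition, the function name and the numbers are assumptions for illustration. mzMatch–ISO itself is an R tool, and its exact calculation is not given in the abstract.

```python
from typing import List


def labelled_fraction(mono_area: float, labelled_areas: List[float]) -> float:
    """Fraction of the total peak area carried by labelled isotopologues.

    Assumed definition: labelled / (mono-isotopic + labelled).
    """
    labelled_total = sum(labelled_areas)
    total = mono_area + labelled_total
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return labelled_total / total


# Hypothetical peak areas for one metabolite in one sample.
fraction = labelled_fraction(600.0, [300.0, 100.0])
print(f"labelled fraction: {fraction:.2f}")
```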
Combination of deep XLMS with deep learning reveals an ordered rearrangement and assembly of a major protein component of the vaccinia virion
Vaccinia virus, the prototypical poxvirus and smallpox/monkeypox vaccine, has proven a challenging entity for structural biology, defying many of the approaches leading to molecular and atomic models for other viruses. Via a combination of deep learning and cross-linking mass spectrometry, we have developed an atomic-level model and an integrated processing/assembly pathway for a structural component of the vaccinia virion, protein P4a. Within the pathway, proteolytic separation of the C-terminal P4a-3 segment of P4a triggers a massive conformational rotation within the N-terminal P4a-1 segment that becomes fixed by disulfide-locking while removing a steric block to trimerization of the processing intermediate P4a-1+2. These events trigger the proteolytic separation of P4a-2, allowing the assembly of P4a-1 into a hexagonal lattice that encloses the nascent virion core.
mzMLb: A Future-Proof Raw Mass Spectrometry Data Format Based on Standards-Compliant mzML and Optimized for Speed and Storage Requirements
With ever-increasing amounts of data produced by mass spectrometry (MS) proteomics and metabolomics, and the sheer volume of samples now analyzed, the need for a common open format possessing both file size efficiency and faster read/write speeds has become paramount to drive the next generation of data analysis pipelines. The Proteomics Standards Initiative (PSI) has established a clear and precise extensible markup language (XML) representation for data interchange, mzML, receiving substantial uptake; nevertheless, storage and file access efficiency has not been the main focus. We propose an HDF5 file format "mzMLb" that is optimized for both read/write speed and storage of the raw mass spectrometry data. We provide an extensive validation of the write speed, random read speed, and storage size, demonstrating a flexible format that with or without compression is faster than all existing approaches in virtually all cases, while with compression is comparable in size to proprietary vendor file formats. Since our approach uniquely preserves the XML encoding of the metadata, the format implicitly supports future versions of mzML and is straightforward to implement: mzMLb's design adheres to both HDF5 and NetCDF4 standard implementations, which allows it to be easily utilized by third parties due to their widespread programming language support. A reference implementation within the established ProteoWizard toolkit is provided.
Thrombin activation of the factor XI dimer is a multistaged process for each subunit
Background: Factor (F)XI can be activated by proteases, including thrombin and FXIIa. The interactions of these enzymes with FXI are transient in nature and therefore difficult to study. Objectives: To identify the binding interface between thrombin and FXI and understand the dynamics underlying FXI activation. Methods: Crosslinking mass spectrometry was used to localize the binding interface of thrombin on FXI. Molecular dynamics simulations were applied to investigate conformational changes enabling thrombin-mediated FXI activation after binding. The proposed trajectory of activation was examined with nanobody 1C10, which was previously shown to inhibit thrombin-mediated activation of FXI. Results: We identified a binding interface of thrombin located on the light chain of FXI involving residue Pro520. After this initial interaction, FXI undergoes conformational changes driven by binding of thrombin to the apple 1 domain in a secondary step to allow migration toward the FXI cleavage site. The 1C10 binding site on the apple 1 domain supports this proposed trajectory of thrombin. We validated the results with known mutation sites on FXI. As Pro520 is conserved in prekallikrein (PK), we hypothesized and showed that thrombin can bind PK, even though it cannot activate PK. Conclusion: Our investigations show that the activation of FXI is a multistaged procedure. Thrombin first binds to Pro520 in FXI; thereafter, it migrates toward the activation site by engaging the apple 1 domain. This detailed analysis of the interaction between thrombin and FXI paves a way for future interventions for bleeding or thrombosis.
Oxonium Ion-Guided Optimization of Ion Mobility-Assisted Glycoproteomics on the timsTOF Pro
Spatial separation of ions in the gas phase, providing information about their size as collisional cross-sections, can readily be achieved through ion mobility. The timsTOF Pro (Bruker Daltonics) series combines a trapped ion mobility device with a quadrupole, collision cell, and a time-of-flight analyzer to enable the analysis of ions at great speed. Here, we show that the timsTOF Pro is capable of physically separating N-glycopeptides from nonmodified peptides and producing high-quality fragmentation spectra, both beneficial for glycoproteomics analyses of complex samples. The glycan moieties enlarge the size of glycopeptides compared with nonmodified peptides, yielding a clear cluster in the mobilogram that, next to increased dynamic range from the physical separation of glycopeptides and nonmodified peptides, can be used to make an effective selection filter for directing the mass spectrometer to analytes of interest. We designed an approach where we (1) focused on a region of interest in the ion mobilogram and (2) applied stepped collision energies to obtain informative glycopeptide tandem mass spectra on the timsTOF Pro: glyco-polygon–stepped collision energy-parallel accumulation serial fragmentation. This method was applied to selected glycoproteins, human plasma– and neutrophil-derived glycopeptides. We show that the achieved physical separation in the region of interest allows for improved extraction of information from the samples, even at shorter liquid chromatography gradients of 15 min. We validated our approach on human neutrophil and plasma samples of known makeup, in which we captured the anticipated glycan heterogeneity (paucimannose, phosphomannose, high mannose, hybrid and complex glycans) from plasma and neutrophil samples at the expected abundances. As the method is compatible with off-the-shelf data acquisition routines and data analysis software, it can readily be applied by any laboratory with a timsTOF Pro and is reproducible, as demonstrated by a comparison between two laboratories.