Generalized calibration across liquid chromatography setups for generic prediction of small-molecule retention times
Accurate prediction of liquid chromatographic retention times from small-molecule structures is useful for reducing experimental measurements and for improved identification in targeted and untargeted MS. However, different experimental setups (e.g., differences in columns, gradients, solvents, or stationary phase) have given rise to a multitude of prediction models that only predict accurate retention times for a specific experimental setup. In practice this typically results in the fitting of a new predictive model for each specific type of setup, which is not only inefficient but also requires substantial prior data to be accumulated on each such setup. Here we introduce the concept of generalized calibration, which allows retention time models to be mapped straightforwardly between different experimental setups. This concept builds on the database-controlled calibration approach implemented in PredRet and fits calibration curves on predicted retention times instead of only on observed retention times. We show that this approach results in substantially higher accuracy of elution-peak prediction than is achieved by setup-specific models.
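The calibration idea described in this abstract can be illustrated with a minimal sketch. It assumes a handful of calibrant compounds for which we have both a model-predicted retention time and a time observed on the new setup, and it substitutes plain piecewise-linear interpolation for whatever curve family the authors or PredRet actually fit. The function name `fit_calibration` and all numbers are hypothetical.

```python
from bisect import bisect_left


def fit_calibration(predicted, observed):
    """Return a function mapping predicted times onto the new setup's time scale.

    Assumption: a piecewise-linear curve through the calibrant points stands in
    for the calibration curve; this is an illustration, not the published method.
    """
    pairs = sorted(zip(predicted, observed))
    xs = [p for p, _ in pairs]
    ys = [o for _, o in pairs]
    if len(xs) < 2:
        raise ValueError("need at least two calibrant compounds")
    if any(b <= a for a, b in zip(xs, xs[1:])):
        raise ValueError("predicted times must be distinct")

    def calibrate(t):
        # Clamp to the calibrated range instead of extrapolating beyond it.
        if t <= xs[0]:
            return ys[0]
        if t >= xs[-1]:
            return ys[-1]
        i = bisect_left(xs, t)
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (t - x0) / (x1 - x0)

    return calibrate


# Hypothetical calibrants: predicted times (min) vs. times observed on the new setup.
predicted = [1.0, 3.0, 5.0, 9.0]
observed = [0.8, 2.0, 4.0, 6.0]
calibrate = fit_calibration(predicted, observed)
calibrated_time = calibrate(4.0)  # prediction for a compound not among the calibrants
```

Clamping at the ends is a deliberate, conservative choice in this sketch: a calibration curve says little about compounds eluting outside the range covered by the calibrants.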
A comparison of collision cross section values obtained via travelling wave ion mobility-mass spectrometry and ultra high performance liquid chromatography-ion mobility-mass spectrometry : application to the characterisation of metabolites in rat urine
A comprehensive Collision Cross Section (CCS) library was obtained via Travelling Wave Ion Guide mobility measurements through direct infusion (DI). The library consists of CCS and Mass Spectral (MS) data in negative and positive ElectroSpray Ionisation (ESI) mode for 463 and 479 endogenous metabolites, respectively. For both ionisation modes combined, TWCCSN2 data were obtained for 542 non-redundant metabolites. These data were acquired on two different ion mobility enabled orthogonal acceleration QToF MS systems in two different laboratories, with the majority of the resulting TWCCSN2 values (from detected compounds) found to be within 1% of one another. Validation of these results against two independent, external TWCCSN2 data sources and predicted TWCCSN2 values indicated that they were within 1-2% of these other values. The same metabolites were then analysed using a rapid reversed-phase ultra (high) performance liquid chromatographic (U(H)PLC) separation combined with IM and MS (IM-MS), thus providing retention time (tr), m/z and TWCCSN2 values (with the latter compared with the DI-IM-MS data). Analytes for which TWCCSN2 values were obtained by U(H)PLC-IM-MS showed good agreement with the results obtained from DI-IM-MS. The repeatability of the TWCCSN2 values obtained for these metabolites on the different ion mobility QToF systems, using either DI or LC, encouraged the further evaluation of the U(H)PLC-IM-MS approach via the analysis of samples of rat urine, from control and methotrexate-treated animals, in order to assess the potential of the approach for metabolite identification and profiling in metabolic phenotyping studies. Based on the database derived from the standards, 63 metabolites were identified in rat urine, using positive ESI, based on the combination of tr, TWCCSN2 and MS data.
Accurate peptide fragmentation predictions allow data driven approaches to replace and improve upon proteomics search engine scoring functions
Motivation: The use of post-processing tools to maximize the information gained from a proteomics search engine is widely accepted and used by the community, with the most notable example being Percolator, a semi-supervised machine learning model which learns a new scoring function for a given dataset. The usage of such tools is, however, bound to the search engine's scoring scheme, which doesn't always make full use of the intensity information present in a spectrum. We aim to show how this tool can be applied in such a way that maximizes the use of spectrum intensity information by leveraging another machine learning-based tool, MS2PIP, which predicts fragment ion peak intensities.
Results: We show how comparing predicted intensities to annotated experimental spectra by calculating direct similarity metrics provides enough information for a tool such as Percolator to accurately separate two classes of peptide-to-spectrum matches. This approach allows using more of the information in the data (compared with simpler intensity-based metrics, like peak counting or summing of explained intensities) while maintaining control of statistics such as the false discovery rate.
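As a rough illustration of what "direct similarity metrics" between predicted and observed fragment intensities can look like, the sketch below computes two common choices, Pearson correlation and cosine similarity, over matched fragment ions. The abstract does not list the metrics actually used, so these two, the function names, and the intensity values are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # undefined for constant vectors; report no correlation
    return cov / (sd_x * sd_y)


def cosine(x, y):
    """Cosine similarity of two equal-length intensity vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


# Hypothetical intensities for the same four fragment ions of one match.
predicted_intensities = [0.1, 0.4, 0.2, 0.3]
observed_intensities = [0.2, 0.8, 0.4, 0.6]

# One feature row per peptide-to-spectrum match, to be handed to a rescoring tool.
features = {
    "pearson": pearson(predicted_intensities, observed_intensities),
    "cosine": cosine(predicted_intensities, observed_intensities),
}
```

Feature rows of this kind would be computed for every match, target and decoy alike, before being passed on for rescoring.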
Structure Alignment
While many good textbooks are available on Protein Structure, Molecular
Simulations, Thermodynamics and Bioinformatics methods in general, there is no
good introductory level book for the field of Structural Bioinformatics. This
book aims to give an introduction into Structural Bioinformatics, which is
where the previous topics meet to explore three dimensional protein structures
through computational analysis. We provide an overview of existing
computational techniques, to validate, simulate, predict and analyse protein
structures. More importantly, it will aim to provide practical knowledge about
how and when to use such techniques. We will consider proteins from three major
vantage points: Protein structure quantification, Protein structure prediction,
and Protein simulation & dynamics.
The Protein DataBank (PDB) contains a wealth of structural information. In
order to investigate the similarity between different proteins in this
database, one can compare the primary sequence through pairwise alignment and
calculate the sequence identity (or similarity) over the two sequences. This
strategy will work particularly well if the proteins you want to compare are
close homologs. However, in this chapter we will explain that a structural
comparison through structural alignment will give you much more valuable
information that allows you to investigate similarities between proteins that
cannot be discovered by comparing the sequences alone.
Comment: editorial responsibility: K. Anton Feenstra, Sanne Abeln. This
chapter is part of the book "Introduction to Protein Structural
Bioinformatics". The Preface arXiv:1801.09442 contains links to all the
(published) chapters. The update adds available arxiv hyperlinks for the
chapter
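The sequence identity mentioned in this abstract can be computed directly from a pairwise alignment. The sketch below assumes the two sequences are already aligned, with `-` marking gaps, and divides the number of identical columns by the number of columns in which neither sequence has a gap. Other denominators (full alignment length, length of the shorter sequence) are also in use, so this convention is an assumption, as are the function name and the example sequences.

```python
def sequence_identity(aligned_a, aligned_b, gap="-"):
    """Fraction of identical residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    return matches / compared if compared else 0.0


# Hypothetical aligned fragments: six gap-free columns, five of them identical.
identity = sequence_identity("ACDE-FG", "ACEEHFG")
```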
Personalized Proteome: Comparing Proteogenomics and Open Variant Search Approaches for Single Amino Acid Variant Detection
Structural Property Prediction
While many good textbooks are available on Protein Structure, Molecular
Simulations, Thermodynamics and Bioinformatics methods in general, there is no
good introductory level book for the field of Structural Bioinformatics. This
book aims to give an introduction into Structural Bioinformatics, which is
where the previous topics meet to explore three dimensional protein structures
through computational analysis. We provide an overview of existing
computational techniques, to validate, simulate, predict and analyse protein
structures. More importantly, it will aim to provide practical knowledge about
how and when to use such techniques. We will consider proteins from three major
vantage points: Protein structure quantification, Protein structure prediction,
and Protein simulation & dynamics.
Some structural properties of proteins that are closely linked to their
function may be easier (or much faster) to predict from sequence than the
complete tertiary structure; for example, secondary structure, surface
accessibility, flexibility, disorder, interface regions or hydrophobic patches.
Serving as building blocks for the native protein fold, these structural
properties also contain important structural and functional information not
apparent from the amino acid sequence. Here, we will first give an introduction
into the application of machine learning for structural property prediction,
and explain the concepts of cross-validation and benchmarking. Next, we will
review various methods that incorporate knowledge of these concepts to predict
those structural properties, such as secondary structure, surface
accessibility, disorder and flexibility, and aggregation.
Comment: editorial responsibility: Juami H. M. van Gils, K. Anton Feenstra,
Sanne Abeln. This chapter is part of the book "Introduction to Protein
Structural Bioinformatics". The Preface arXiv:1801.09442 contains links to
all the (published) chapters.
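The cross-validation concept named in this abstract can be sketched as a plain k-fold split: the samples are divided into k folds, and each fold serves once as the test set while the remaining folds form the training set. The function name and sizes below are hypothetical. The sketch also ignores sequence similarity between proteins; in practice, related sequences would normally have to be kept in the same fold to avoid overly optimistic estimates.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Ten hypothetical proteins split into three folds.
folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    # A model would be fitted on train_idx and evaluated on test_idx here.
    pass
```

Each sample appears in exactly one test fold, so averaging the per-fold scores uses every sample once for evaluation.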
Cov-MS: A Community-Based Template Assay for Mass-Spectrometry-Based Protein Detection in SARS-CoV-2 Patients
Rising population density and global mobility are among the reasons why pathogens such as SARS-CoV-2, the virus that causes COVID-19, spread so rapidly across the globe. The policy response to such pandemics will always have to include accurate monitoring of the spread, as this provides one of the few alternatives to total lockdown. However, COVID-19 diagnosis is currently performed almost exclusively by reverse transcription polymerase chain reaction (RT-PCR). Although this is efficient, automatable, and acceptably cheap, reliance on one type of technology comes with serious caveats, as illustrated by recurring reagent and test shortages. We therefore developed an alternative diagnostic test that detects proteolytically digested SARS-CoV-2 proteins using mass spectrometry (MS). We established the Cov-MS consortium, consisting of 15 academic laboratories and several industrial partners, to increase applicability, accessibility, sensitivity, and robustness of this kind of SARS-CoV-2 detection. This, in turn, gave rise to the Cov-MS Digital Incubator that allows other laboratories to join the effort, navigate, and share their optimizations and translate the assay into their clinic. As this test relies on viral proteins instead of RNA, it provides an orthogonal and complementary approach to RT-PCR using other reagents that are relatively inexpensive and widely available, as well as orthogonally skilled personnel and different instruments. Data are available via ProteomeXchange with identifier PXD022550.
The difference is in the details : predicting the LC-IM-MS behaviour of metabolites and peptides
