OpenMS – An open-source software framework for mass spectrometry
Background: Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups that yield ever-increasing amounts of data. Consequently, data analysis is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow rapid prototyping of a new analysis workflow. Results: We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy to use and robust while offering rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies. Conclusion: OpenMS is available under the GNU Lesser General Public License (LGPL) from the project website at http://www.openms.de.
Optimal precursor ion selection for LC-MS/MS based proteomics
Shotgun proteomics with Liquid Chromatography (LC) coupled to Tandem Mass Spectrometry (MS/MS) is a key technology for protein identification and quantitation. Protein identification is done indirectly: detected peptide signals are fragmented by MS/MS and their sequence is reconstructed. Afterwards, the identified peptides are used to infer the proteins present in a sample. The problem of choosing the peptide signals that shall be identified with MS/MS is called precursor ion selection. Most workflows use data-dependent acquisition for precursor ion selection despite known drawbacks such as data redundancy, limited reproducibility, or a bias towards high-abundance proteins. In this thesis, we formulate optimization problems for different aspects of precursor ion selection to overcome these weaknesses. In the first part of this work we develop inclusion lists aiming at optimal precursor ion selection given different input information. We trace precursor ion selection back to known combinatorial problems and develop linear program (LP) formulations. The first method creates an inclusion list given a set of detected features in an LC-MS map. We show that this setting is an instance of the Knapsack Problem. The corresponding LP can be solved efficiently and yields inclusion lists that schedule more precursors than standard methods when the number of precursors per fraction is limited. Furthermore, we develop a method for inclusion list creation based on a list of proteins of interest. We employ retention time and detectability prediction to infer LC-MS features. Based on peptide detectability, we introduce protein detectabilities that reflect the likelihood of detecting and identifying a protein. By maximizing the sum of protein detectabilities we create an inclusion list of limited size that covers a maximum number of proteins. In the second part of the thesis, we focus on iterative precursor ion selection (IPS) with LC-MALDI MS/MS. Here, after a fixed number of MS/MS spectra have been acquired, their identification results are evaluated and used for the next round of precursor ion selection. The first method is a heuristic that creates a ranked precursor list. The second method, IPS LP, is a combination of the two inclusion list scenarios presented in the first part. Additionally, a protein-based exclusion is part of the objective function. For evaluation, we compared both IPS methods to a static inclusion list (SPS) created before the beginning of MS/MS acquisition. We simulated precursor ion selection on three data sets of different complexity and show that IPS LP can identify the same number of proteins with fewer selected precursors. This improvement is especially pronounced for low-abundance proteins. Additionally, we show that IPS LP decreases the bias towards high-abundance proteins. All presented algorithms were implemented in OpenMS, a software library for mass spectrometry. Finally, we present an online tool for IPS that has direct access to the instrument and controls the measurement.
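The knapsack view of inclusion-list creation described in the abstract above can be illustrated with a small sketch. This is not the thesis's LP formulation: it is a plain 0/1 knapsack solved by dynamic programming, and the feature names, integer acquisition costs, values, and the single shared `capacity` are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical feature record: (name, acquisition cost, value).
# "cost" and "value" are illustrative assumptions, not quantities from the thesis.
Feature = Tuple[str, int, float]


def select_precursors(features: List[Feature], capacity: int) -> Tuple[float, List[str]]:
    """Solve a 0/1 knapsack: pick features with total cost <= capacity, maximizing value."""
    n = len(features)
    # best[i][c] = maximum value using only the first i features with total cost <= c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i, (_, cost, value) in enumerate(features, start=1):
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip feature i
            if cost <= c:
                candidate = best[i - 1][c - cost] + value  # option 2: take feature i
                if candidate > best[i][c]:
                    best[i][c] = candidate
    # Backtrack to recover which features were taken.
    chosen: List[str] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            name, cost, _ = features[i - 1]
            chosen.append(name)
            c -= cost
    chosen.reverse()
    return best[n][capacity], chosen


if __name__ == "__main__":
    demo = [("f1", 2, 3.0), ("f2", 3, 4.0), ("f3", 4, 5.0), ("f4", 5, 6.0)]
    print(select_precursors(demo, capacity=5))
```

In a per-fraction setting one would presumably need one capacity constraint per fraction, which is where an LP or integer-programming formulation becomes more natural than this single-budget sketch.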
Optimale Prekursor-Ionenauswahl für die LC-MS/MS basierte Proteomik (Optimal precursor ion selection for LC-MS/MS based proteomics)
Liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) is a
key technology for protein identification and quantification in proteomic
samples. Proteins are identified indirectly: detected peptide signals are
fragmented by MS/MS and the peptide sequence is then reconstructed. The
identified peptides are finally used to identify the proteins in the sample.
The problem of choosing the peptide signals to be sequenced by MS/MS is called
precursor ion selection (PS). Most selection methods use purely
intensity-based approaches, known as data-dependent acquisition (DDA), despite
known weaknesses such as data redundancy, limited reproducibility, or a bias
towards identifying high-abundance proteins. In this thesis we formulate
different aspects of PS as optimization problems with the aim of counteracting
these weaknesses. In the first part, we create optimal inclusion lists for
different kinds of initial information. We trace PS back to known
combinatorial problems and develop linear program (LP) formulations to solve
them. The first method is based on a list of LC-MS features. We show that this
setting can be reduced to the knapsack problem. The corresponding LP creates
efficient inclusion lists that contain more precursors than standard methods
when the number of precursor ions per fraction is limited. In addition, we
develop a method based on a list of protein sequences to be identified. We use
predictors for retention time (RT) and detectability to predict representative
LC-MS features for these proteins. Based on peptide detectability, we
introduce a protein detectability. By maximizing the sum of protein
detectabilities, we create a size-limited inclusion list that covers a maximum
number of proteins. In the second part, we address iterative PS (IPS) with
LC-MALDI MS/MS. Here, after a certain number of MS/MS spectra have been
acquired, their identification results are evaluated and used for further PS.
First, we develop a heuristic that creates a prioritized inclusion list. For
the second method, IPS_LP, we combine the two LP formulations from the first
part and extend them with a protein-based exclusion. For evaluation, we
compare our IPS methods with a static inclusion list (SPS) created before the
start of the MS/MS measurement. We simulate PS on three data sets of different
complexity and show that IPS_LP identifies the same number of proteins as SPS
while requiring fewer MS/MS measurements. This improvement is particularly
pronounced for low-abundance proteins. We also show that the bias towards
identifying high-abundance proteins is reduced. Our algorithms were
implemented as part of OpenMS, a software library for mass spectrometry. In
the last part, we also present an online tool that has direct access to the
mass spectrometer and controls the measurements.
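The idea of building a size-limited inclusion list that maximizes the sum of protein detectabilities can be sketched as follows. The abstract does not say how peptide detectabilities are combined into a protein detectability, so this sketch assumes independence, i.e. a protein's detectability is 1 minus the product of (1 - d) over its selected peptides. It also swaps the LP for a simple greedy heuristic; all names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical record: (peptide id, protein id, predicted peptide detectability in [0, 1]).
Peptide = Tuple[str, str, float]


def greedy_inclusion_list(peptides: List[Peptide], max_size: int) -> Tuple[List[str], float]:
    """Greedily pick peptides to maximize the summed protein detectability.

    ASSUMPTION (not from the source): protein detectability is
    1 - prod(1 - d_i) over the selected peptides d_i of that protein.
    """
    miss: Dict[str, float] = {}  # protein -> probability that no selected peptide is detected
    chosen: List[str] = []
    remaining = list(peptides)
    while remaining and len(chosen) < max_size:
        # Marginal gain of a peptide = (current miss probability of its protein) * d
        best = max(remaining, key=lambda p: miss.get(p[1], 1.0) * p[2])
        remaining.remove(best)
        chosen.append(best[0])
        miss[best[1]] = miss.get(best[1], 1.0) * (1.0 - best[2])
    total = sum(1.0 - m for m in miss.values())
    return chosen, total


if __name__ == "__main__":
    demo = [("p1", "A", 0.9), ("p2", "A", 0.8), ("p3", "B", 0.5)]
    print(greedy_inclusion_list(demo, max_size=2))
```

Under this assumed model a second peptide of an already well-covered protein adds little, so the greedy step tends to spread the limited list across more proteins, which matches the stated goal of covering as many proteins as possible.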
An Iterative Strategy for Precursor Ion Selection for LC-MS/MS Based Shotgun Proteomics.
Current precursor ion selection strategies in LC-MS mainly choose the most prominent peptide signals for MS/MS analysis. Consequently, high-abundance proteins are identified by MS/MS of many peptides, whereas proteins of lower abundance might elude identification. We present a novel, iterative and result-driven approach for precursor ion selection that significantly increases the efficiency of an MS/MS analysis by decreasing data redundancy and analysis time. By simulating different strategies for precursor ion selection on an existing data set, we compare our method to existing result-driven strategies and evaluate its performance with regard to mass accuracy, database size, and sample complexity.
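A result-driven, iterative selection loop of the kind summarized above might be simulated roughly as follows. The abstract gives no algorithmic details, so everything here is an assumption for illustration: precursors are fragmented in order of intensity, a simulated lookup table (`truth`) stands in for the database search, and once a protein is identified, remaining precursors whose mass matches one of its predicted peptide masses within `tol` are excluded.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical precursor record: (id, mass, intensity).
Precursor = Tuple[str, float, float]


def simulate_iterative_selection(
    precursors: List[Precursor],
    truth: Dict[str, str],                           # simulated search result: precursor id -> protein
    protein_peptide_masses: Dict[str, List[float]],  # predicted peptide masses per protein
    budget: int,
    tol: float = 0.01,
) -> Tuple[List[str], Set[str]]:
    """Fragment up to `budget` precursors, excluding peptides of proteins already identified."""
    queue = sorted(precursors, key=lambda p: p[2], reverse=True)  # most intense first
    fragmented: List[str] = []
    identified: Set[str] = set()
    while queue and len(fragmented) < budget:
        pid, _, _ = queue.pop(0)
        fragmented.append(pid)
        protein = truth.get(pid)
        if protein is None:
            continue  # spectrum yielded no identification
        identified.add(protein)
        # Result-driven step: drop precursors explained by the identified protein.
        masses = protein_peptide_masses.get(protein, [])
        queue = [p for p in queue if not any(abs(p[1] - m) <= tol for m in masses)]
    return fragmented, identified


if __name__ == "__main__":
    precs = [("a1", 1000.50, 9.0), ("a2", 1200.60, 8.0), ("a3", 1500.70, 7.0), ("b1", 900.40, 2.0)]
    truth = {"a1": "A", "a2": "A", "a3": "A", "b1": "B"}
    masses = {"A": [1000.50, 1200.60, 1500.70], "B": [900.40]}
    print(simulate_iterative_selection(precs, truth, masses, budget=2))  # iterative
    print(simulate_iterative_selection(precs, truth, {}, budget=2))      # intensity-only baseline
```

Passing an empty mass table disables the exclusion step, which gives a purely intensity-driven baseline to compare against on the same simulated data.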
Network integration and modelling of dynamic drug responses at multi-omics levels
Uncovering cellular responses from heterogeneous genomic data is crucial for molecular medicine, in particular for drug safety. This can be realized by integrating the molecular activities in networks of interacting proteins. As a proof of concept, we challenge network modeling with time-resolved proteome, transcriptome and methylome measurements in iPSC-derived human 3D cardiac microtissues to elucidate adverse mechanisms of anthracycline cardiotoxicity measured with four different drugs (doxorubicin, epirubicin, idarubicin and daunorubicin). Dynamic molecular analysis at in vivo drug exposure levels reveals a network of 175 disease-associated proteins and identifies common modules of anthracycline cardiotoxicity in vitro, related to mitochondrial and sarcomere function as well as remodeling of the extracellular matrix. These in vitro-identified modules are transferable and are evaluated with biopsies of cardiomyopathy patients. This study, to our knowledge the most comprehensive on anthracycline cardiotoxicity, demonstrates a reproducible workflow for molecular medicine and serves as a template for detecting adverse drug responses from complex omics data.