10 research outputs found

The mzTab data exchange format: communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience.

Author: Bandeira Nuno
Cox Jürgen
Del Toro Noemi
Fan Jun
Gatto Laurent
Ghali Fawaz
Griss Johannes
Hartler Jürgen
Hermjakob Henning
Jones Andrew R
Kohlbacher Oliver
Neuhauser Nadin
Neumann Steffen
Pérez-Riverol Yasset
Reisinger Florian
Sachsenberg Timo
Salek Reza M
Steinbeck Christoph
Thallinger Gerhard G
Vizcaíno Juan Antonio
Walzer Mathias
Xenarios Ioannis
Xu Qing-Wei
Publication venue: Mol Cell Proteomics
Publication date: 01/01/2014

The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R. We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adopted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online.
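To make the "simple tabular format" idea concrete, here is a minimal Python sketch of reading such a file. It assumes the tab-separated layout of mzTab 1.0, in which each line starts with a three-letter prefix (MTD for metadata, PRH/PRT, PEH/PEP, PSH/PSM and SMH/SML for header and data rows, COM for comments). The abstract itself does not describe these prefixes, the function name `parse_mztab` is hypothetical, and the sample values are made up for illustration.

```python
# Data-row prefix -> matching header-row prefix (assumed mzTab 1.0 layout).
HEADER_FOR_ROW = {"PRT": "PRH", "PEP": "PEH", "PSM": "PSH", "SML": "SMH"}


def parse_mztab(text):
    """Split mzTab text into a metadata dict and per-section lists of row dicts."""
    metadata = {}
    headers = {}
    sections = {row_prefix: [] for row_prefix in HEADER_FOR_ROW}
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate sections
        fields = line.split("\t")
        prefix, values = fields[0], fields[1:]
        if prefix == "COM":
            continue  # comment line
        if prefix == "MTD" and len(values) >= 2:
            metadata[values[0]] = values[1]
        elif prefix in HEADER_FOR_ROW.values():
            headers[prefix] = values  # remember column names for this section
        elif prefix in HEADER_FOR_ROW:
            columns = headers.get(HEADER_FOR_ROW[prefix], [])
            sections[prefix].append(dict(zip(columns, values)))
    return metadata, sections


# Made-up sample content, for illustration only.
SAMPLE = "\n".join([
    "MTD\tmzTab-version\t1.0.0",
    "COM\texample only",
    "PRH\taccession\tdescription",
    "PRT\tP12345\tExample protein",
    "",
    "SMH\tidentifier\tchemical_formula",
    "SML\tCID:00027395\tC17H20N4O2",
])

metadata, sections = parse_mztab(SAMPLE)
```

Because every section is plain tab-separated text, the same rows could equally be loaded into a spreadsheet or an R data frame, which is the accessibility the abstract emphasizes.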

Crossref

Serveur académique lausannois

PubMed Central

Apollo (Cambridge)

MPG.PuRe

A Systematic Investigation into the Nature of Tryptic HCD Spectra

Author: Annette Michalski (1276377)
Jürgen Cox (1276383)
Matthias Mann (13750)
Nadin Neuhauser (1276380)

Modern mass spectrometry-based proteomics can produce millions of peptide fragmentation spectra, which are automatically identified in databases using sequence-specific b- or y-ions. Proteomics projects have mainly been performed with low resolution collision-induced dissociation (CID) in ion traps and beam-type fragmentation on triple quadrupole and QTOF instruments. Recently, the latter has also become available with Orbitrap instrumentation as higher energy collisional dissociation (HCD), routinely providing full mass range fragmentation with high mass accuracy. To systematically study the nature of HCD spectra, we made use of a large scale data set of tryptic peptides identified with an FDR of 0.0001, from which we extract a subset of more than 16 000 that have little or no contribution from cofragmented precursors. We employed a newly developed computer-assisted “Expert System”, which distills our experience and literature knowledge about fragmentation pathways. It aims to automatically annotate the peaks in high mass accuracy fragment spectra while strictly controlling the false discovery rate. Using this Expert System we determined that sequence-specific regular ions covering the entire sequence were present for almost all peptides with up to 10 amino acids (median 100%). Peptides up to 20 amino acid length contained sufficient fragmentation to cover 80% of the sequence. Internal fragments are common in HCD spectra but not in high resolution CID spectra (10% vs 1%). The low mass region contains abundant immonium ions (6% of fragment ion intensity), the characteristic a2, b2 ion pair (72% of spectra), side chain fragments and reporter ions for peptide modifications such as tyrosine phosphorylation. B- and y-ions account for only 20% of fragment ions by number but 53% by ion intensity. Overall, 84% of the fragment ion intensity was unambiguously explainable. Thus high mass accuracy HCD and CID data are near comprehensively and automatically interpretable.

The Francis Crick Institute

Andromeda - a peptide search engine integrated into the MaxQuant environment

Author: Cox Jürgen
Mann Matthias
Michalski Annette
Neuhauser Nadin
Olsen Jesper Velgaard
Scheltema Richard A
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2011

Copenhagen University Research Information System

Andromeda: A Peptide Search Engine Integrated into the MaxQuant Environment

Author: Annette Michalski
Jesper V. Olsen
Jürgen Cox
Matthias Mann
Nadin Neuhauser
Richard A. Scheltema
Publication venue: 'American Chemical Society (ACS)'

Crossref

High-accuracy identification and bioinformatic analysis of in vivo protein phosphorylation sites in yeast

Author: Cox Jürgen
de Godoy Lyris M F
Gnad Florian
Mann Matthias
Neuhauser Nadin
Olsen Jesper V
Ren Shubin
Publication venue: 'Wiley'
Publication date: 01/01/2009

Copenhagen University Research Information System

Andromeda: A Peptide Search Engine Integrated into the MaxQuant Environment

Author: Annette Michalski (1276377)
Jesper V. Olsen (23892)
Jürgen Cox (1276383)
Matthias Mann (13750)
Nadin Neuhauser (1276380)
Richard A. Scheltema (236180)

A key step in mass spectrometry (MS)-based proteomics is the identification of peptides in sequence databases by their fragmentation spectra. Here we describe Andromeda, a novel peptide search engine using a probabilistic scoring model. On proteome data, Andromeda performs as well as Mascot, a widely used commercial search engine, as judged by sensitivity and specificity analysis based on target decoy searches. Furthermore, it can handle data with arbitrarily high fragment mass accuracy, is able to assign and score complex patterns of post-translational modifications, such as highly phosphorylated peptides, and accommodates extremely large databases. The algorithms of Andromeda are provided. Andromeda can function independently or as an integrated search engine of the widely used MaxQuant computational proteomics platform and both are freely available at www.maxquant.org. The combination enables analysis of large data sets in a simple analysis workflow on a desktop computer. For searching individual spectra Andromeda is also accessible via a web server. We demonstrate the flexibility of the system by implementing the capability to identify cofragmented peptides, significantly improving the total number of identified peptides.
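The target-decoy evaluation mentioned in this abstract can be illustrated with a short Python sketch. It uses a common textbook estimate (decoy hits divided by target hits above a score threshold). The abstract does not say which estimator the authors used, so this is an assumption, and the function name and scores below are hypothetical.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate the false discovery rate among hits scoring >= threshold.

    psms is an iterable of (score, is_decoy) pairs. The estimate is
    decoys / targets, capped at 1.0 (an assumed, commonly used estimator).
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    if targets == 0:
        # No accepted target hits: report 0.0 if nothing passed, else 1.0.
        return 0.0 if decoys == 0 else 1.0
    return min(1.0, decoys / targets)


# Made-up peptide-spectrum match scores, for illustration only.
example_psms = [
    (120.0, False),
    (110.0, False),
    (95.0, True),
    (90.0, False),
    (80.0, False),
    (70.0, True),
]

estimate = fdr_at_threshold(example_psms, 90.0)  # 1 decoy vs 3 targets
```

Raising the threshold trades sensitivity (fewer accepted target hits) for specificity (fewer decoy hits), which is the kind of comparison such an analysis supports.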

The Francis Crick Institute

Andromeda: A Peptide Search Engine Integrated into the MaxQuant Environment

Author: Annette Michalski (1276377)
Jesper V. Olsen (23892)
Jürgen Cox (1276383)
Matthias Mann (13750)
Nadin Neuhauser (1276380)
Richard A. Scheltema (236180)

The Francis Crick Institute

High Performance Computational Analysis of Large-scale Proteome Data Sets to Assess Incremental Contribution to Coverage of the Human Genome

Author: Jürgen Cox (1276383)
Matthias Mann (13750)
Nadin Neuhauser (1276380)
Nagarjuna Nagaraj (1296663)
Peter McHardy (1296660)
Richard Scheltema (1296666)
Sara Zanivan (41266)

Computational analysis of shotgun proteomics data can now be performed in a completely automated and statistically rigorous way, as exemplified by the freely available MaxQuant environment. The sophisticated algorithms involved and the sheer amount of data translate into very high computational demands. Here we describe parallelization and memory optimization of the MaxQuant software with the aim of executing it on a large computer cluster. We analyze and mitigate bottlenecks in overall performance and find that the most time-consuming algorithms are those detecting peptide features in the MS1 data as well as the fragment spectrum search. These tasks scale with the number of raw files and can readily be distributed over many CPUs as long as memory access is properly managed. Here we compared the performance of a parallelized version of MaxQuant running on a standard desktop, an I/O performance optimized desktop computer (“game computer”), and a cluster environment. The modified gaming computer and the cluster vastly outperformed a standard desktop computer when analyzing more than 1000 raw files. We apply our high performance platform to investigate incremental coverage of the human proteome by high resolution MS data originating from in-depth cell line and cancer tissue proteome measurements.

The Francis Crick Institute

Expert System for Computer-assisted Annotation of MS/MS Spectra

Author: Annette Michalski
Jürgen Cox
Matthias Mann
Nadin Neuhauser
Publication venue: 'American Society for Biochemistry & Molecular Biology (ASBMB)'

Crossref

High performance computational analysis of large-scale proteome data sets to assess incremental contribution to coverage of the human genome

Author: Jürgen Cox
Matthias Mann
Nadin Neuhauser
Nagarjuna Nagaraj
Peter McHardy
Richard Scheltema
Sara Zanivan
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2013

Computational analysis of shotgun proteomics data can now be performed in a completely automated and statistically rigorous way, as exemplified by the freely available MaxQuant environment. The sophisticated algorithms involved and the sheer amount of data translate into very high computational demands. Here we describe parallelization and memory optimization of the MaxQuant software with the aim of executing it on a large computer cluster. We analyze and mitigate bottlenecks in overall performance and find that the most time-consuming algorithms are those detecting peptide features in the MS1 data as well as the fragment spectrum search. These tasks scale with the number of raw files and can readily be distributed over many CPUs as long as memory access is properly managed. Here we compared the performance of a parallelized version of MaxQuant running on a standard desktop, an I/O performance optimized desktop computer (“game computer”), and a cluster environment. The modified gaming computer and the cluster vastly outperformed a standard desktop computer when analyzing more than 1000 raw files. We apply our high performance platform to investigate incremental coverage of the human proteome by high resolution MS data originating from in-depth cell line and cancer tissue proteome measurements.

Crossref

Enlighten

MPG.PuRe