218 research outputs found

    MASPECTRAS: a platform for management and analysis of proteomics LC-MS/MS data

    BACKGROUND: Advances in proteomics technologies have led to a rapid increase in the number, size, and rate at which datasets are generated. Managing and extracting valuable information from such datasets requires data management platforms and computational approaches. RESULTS: We have developed the MAss SPECTRometry Analysis System (MASPECTRAS), a platform for management and analysis of proteomics LC-MS/MS data. MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo) relational database schema and follows the guidelines of the Proteomics Standards Initiative (PSI). Analysis modules include: 1) import and parsing of the results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA; 2) peptide validation; 3) clustering of proteins based on Markov Clustering and multiple alignments; and 4) quantification using the Automated Statistical Analysis of Protein Abundance Ratios algorithm (ASAPRatio). The system provides customizable data retrieval and visualization tools, as well as export to the PRoteomics IDEntifications public repository (PRIDE). MASPECTRAS is freely available at http://genome.tugraz.at/maspectras. CONCLUSION: Given its unique features and the flexibility gained from using standard software technology, our platform represents a significant advance and could be of great interest to the proteomics community.

    An Optimized Data Structure for High Throughput 3D Proteomics Data: mzRTree

    As an emerging field, MS-based proteomics still requires software tools for efficiently storing and accessing experimental data. In this work, we focus on the management of LC-MS data, which are typically made available in standard XML-based portable formats. The structures currently employed to manage these data can be highly inefficient, especially when dealing with high-throughput profile data. LC-MS datasets are usually accessed through 2D range queries. Optimizing this type of operation could dramatically reduce the complexity of data analysis. We propose a novel data structure for LC-MS datasets, called mzRTree, which embodies a scalable index based on the R-tree data structure. mzRTree can be efficiently created from the XML-based data formats and is suitable for handling very large datasets. We show experimentally that, on all range queries, mzRTree outperforms other known structures used for LC-MS data, even on the queries for which those structures are optimized. mzRTree is also more space efficient. As a result, mzRTree reduces the computational cost of data analysis for very large profile datasets. Comment: Paper details: 10 pages, 7 figures, 2 tables. To be published in Journal of Proteomics. Source code available at http://www.dei.unipd.it/mzrtre
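To make the notion of a 2D range query concrete, here is a minimal sketch. It is not mzRTree and uses no R-tree: it sorts peaks by retention time, narrows the retention-time window with binary search, and then filters by m/z. The `Peak` layout and the function names `build_index` and `range_query` are assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

# Hypothetical peak layout: (retention_time, mz, intensity)
Peak = Tuple[float, float, float]
Index = Tuple[List[float], List[Peak]]


def build_index(peaks: List[Peak]) -> Index:
    """Sort peaks by retention time and keep the sorted keys for bisection."""
    ordered = sorted(peaks, key=lambda p: p[0])
    return [p[0] for p in ordered], ordered


def range_query(index: Index, rt_min: float, rt_max: float,
                mz_min: float, mz_max: float) -> List[Peak]:
    """Return peaks inside the closed rectangle [rt_min, rt_max] x [mz_min, mz_max]."""
    rts, ordered = index
    lo = bisect_left(rts, rt_min)    # first peak with rt >= rt_min
    hi = bisect_right(rts, rt_max)   # one past the last peak with rt <= rt_max
    # The m/z dimension is filtered linearly; a spatial index would avoid this scan.
    return [p for p in ordered[lo:hi] if mz_min <= p[1] <= mz_max]


peaks = [
    (10.0, 400.2, 1500.0),
    (10.0, 812.4, 300.0),
    (12.5, 400.3, 2100.0),
    (30.0, 401.0, 50.0),
]
index = build_index(peaks)
hits = range_query(index, 9.0, 13.0, 400.0, 401.0)
```

The linear m/z filter inside the retention-time window is the step a two-dimensional index such as an R-tree is meant to avoid on large profile datasets.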

    PathwayExplorer: web service for visualizing high-throughput expression data on biological pathways

    While generation of high-throughput expression data is becoming routine, the fast, easy, and systematic presentation and analysis of these data in a biological context is still an obstacle. To address this need, we have developed PathwayExplorer, which maps expression profiles of genes or proteins simultaneously onto major, currently available regulatory, metabolic and cellular pathways from KEGG, BioCarta and GenMAPP. PathwayExplorer is a platform-independent web server application with an optional standalone Java application using a SOAP (simple object access protocol) interface. Mapped pathways are ranked for the easy selection of the pathway of interest, displaying all available genes of this pathway with their expression profiles in a selectable and intuitive color code. Pathway maps produced can be downloaded as PNG, JPG or as high-resolution vector graphics SVG. The web service is freely available at ; the standalone client can be downloaded at

    GOLD.db: genomics of lipid-associated disorders database

    BACKGROUND: The GOLD.db (Genomics of Lipid-Associated Disorders Database) was developed to address the need for integrating disparate information on the function and properties of genes and their products that are particularly relevant to the biology, diagnosis, management, treatment, and prevention of lipid-associated disorders. DESCRIPTION: The GOLD.db provides a reference for pathways and information about the relevant genes and proteins in an efficiently organized way. The main focus was to provide biological pathways with image maps and visual pathway information for lipid metabolism and obesity-related research. The database also makes it possible to map gene expression data individually to each pathway. Gene expression under different experimental conditions can be viewed sequentially in the context of the pathway. Related large-scale gene expression data sets were provided and can be searched for specific genes to integrate information regarding their expression levels in different studies and conditions. Analytic and data mining tools, reagents, protocols, references, and links to relevant genomic resources were included in the database. Finally, the usability of the database was demonstrated using an example about the regulation of Pten mRNA during adipocyte differentiation in the context of relevant pathways. CONCLUSIONS: The GOLD.db will be a valuable tool that allows researchers to efficiently analyze patterns of gene expression and to display them in a variety of useful and informative ways, allowing outside researchers to perform queries pertaining to gene expression results in the context of biological processes and pathways

    MARS: Microarray analysis, retrieval, and storage system

    BACKGROUND: Microarray analysis has become a widely used technique for the study of gene-expression patterns on a genomic scale. As more and more laboratories adopt microarray technology, there is a need for powerful and easy-to-use microarray databases facilitating array fabrication, labeling, hybridization, and data analysis. The wealth of data generated by this high-throughput approach renders adequate database and analysis tools crucial for the pursuit of insights into the transcriptomic behavior of cells. RESULTS: MARS (Microarray Analysis and Retrieval System) provides a comprehensive MIAME-supportive suite for storing, retrieving, and analyzing multicolor microarray data. The system comprises a laboratory information management system (LIMS), quality control management, and a sophisticated user management system. MARS is fully integrated into an analytical pipeline of microarray image analysis, normalization, gene expression clustering, and mapping of gene expression data onto biological pathways. The incorporation of ontologies and the use of MAGE-ML enable the export of studies stored in MARS to public repositories and other databases accepting these documents. CONCLUSION: We have developed an integrated system tailored to serve the specific needs of microarray-based research projects using a unique fusion of Web-based and standalone applications connected to the latest J2EE application server technology. The presented system is freely available for academic and non-profit institutions. More information can be found at

    A guide to the Proteomics Identifications Database proteomics data repository

    The Proteomics Identifications Database (PRIDE, http://www.ebi.ac.uk/pride) is one of the main repositories of MS derived proteomics data. Here, we point out the main functionalities of PRIDE both as a submission repository and as a source for proteomics data. We describe the main features for data retrieval and visualization available through the PRIDE web and BioMart interfaces. We also highlight the mechanism by which tailored queries in the BioMart can join PRIDE to other resources such as Reactome, Ensembl or UniProt to execute extremely powerful across-domain queries. We then present the latest improvements in the PRIDE submission process, using the new easy-to-use, platform-independent graphical user interface submission tool PRIDE Converter. Finally, we speak about future plans and the role of PRIDE in the ProteomExchange consortium

    Envelope: interactive software for modeling and fitting complex isotope distributions

    BACKGROUND: An important aspect of proteomic mass spectrometry involves quantifying and interpreting the isotope distributions arising from mixtures of macromolecules with different isotope labeling patterns. These patterns can be quite complex, in particular with in vivo metabolic labeling experiments producing fractional atomic labeling or fractional residue labeling of peptides or other macromolecules. In general, it can be difficult to distinguish the contributions of species with different labeling patterns to an experimental spectrum and difficult to calculate a theoretical isotope distribution to fit such data. There is a need for interactive and user-friendly software that can calculate and fit the entire isotope distribution of a complex mixture while comparing these calculations with experimental data and extracting the contributions from the differently labeled species. RESULTS: Envelope has been developed to be user-friendly while still being as flexible and powerful as possible. Envelope can simultaneously calculate the isotope distributions for any number of different labeling patterns for a given peptide or oligonucleotide, while automatically summing these into a single overall isotope distribution. Envelope can handle fractional or complete atom or residue-based labeling, and the contribution from each different user-defined labeling pattern is clearly illustrated in the interactive display and is individually adjustable. At present, Envelope supports labeling with ²H, ¹³C, and ¹⁵N, and supports adjustments for baseline correction, an instrument accuracy offset in the m/z domain, and peak width. Furthermore, Envelope can display experimental data superimposed on calculated isotope distributions, and calculate a least-squares goodness of fit between the two. All of this information is displayed on the screen in a single graphical user interface. Envelope supports high-quality output of experimental and calculated distributions in PNG or PDF format. Beyond simply comparing calculated distributions to experimental data, Envelope is useful for planning or designing metabolic labeling experiments, by visualizing hypothetical isotope distributions in order to evaluate the feasibility of a labeling strategy. Envelope is also useful as a teaching tool, with its real-time display capabilities providing a straightforward way to illustrate the key variable factors that contribute to an observed isotope distribution. CONCLUSION: Envelope is a powerful tool for the interactive calculation and visualization of complex isotope distributions for comparison to experimental data. It is available under the GNU General Public License from http://williamson.scripps.edu/envelope/.
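The idea of summing several labeling patterns into one overall distribution and scoring it against data can be sketched as follows. This is not Envelope's algorithm: it assumes a simplified model in which only carbon contributes and each carbon is independently ¹³C with a fixed probability (a binomial distribution). The function names, the 50-carbon species, and the 70/30 mixture are hypothetical values chosen for illustration.

```python
from math import comb
from typing import List, Sequence


def carbon_envelope(n_carbons: int, p13c: float) -> List[float]:
    """Binomial probability of k heavy carbons, for k = 0..n_carbons."""
    return [
        comb(n_carbons, k) * p13c**k * (1.0 - p13c) ** (n_carbons - k)
        for k in range(n_carbons + 1)
    ]


def mix(envelopes: Sequence[Sequence[float]],
        fractions: Sequence[float]) -> List[float]:
    """Weighted sum of equal-length envelopes, one weight per labeling pattern."""
    n = len(envelopes[0])
    return [sum(f * env[i] for env, f in zip(envelopes, fractions)) for i in range(n)]


def sum_squared_residuals(observed: Sequence[float],
                          calculated: Sequence[float]) -> float:
    """Least-squares misfit between observed and calculated peak intensities."""
    return sum((o - c) ** 2 for o, c in zip(observed, calculated))


# Hypothetical species with 50 carbons: natural abundance vs. 50% 13C labeling.
natural = carbon_envelope(50, 0.0107)
labeled = carbon_envelope(50, 0.5)
mixture = mix([natural, labeled], [0.7, 0.3])

# A wrong guess at the mixing fractions gives a larger misfit than the true one.
wrong_guess = mix([natural, labeled], [0.5, 0.5])
misfit_true = sum_squared_residuals(mixture, mixture)
misfit_wrong = sum_squared_residuals(mixture, wrong_guess)
```

Minimizing the misfit over the mixing fractions is the kind of fit the abstract describes; a real tool would also model the other elements, peak width, baseline, and m/z offset.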

    BioSunMS: a plug-in-based software for the management of patients information and the analysis of peptide profiles from mass spectrometry

    BACKGROUND: With the wide application of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS), statistical comparison of serum peptide profiles and management of patient information play an important role in clinical studies, such as early diagnosis, personalized medicine, and biomarker discovery. However, currently available software tools mainly focus on data analysis rather than providing a flexible platform for both the management of patient information and mass spectrometry (MS) data analysis. RESULTS: Here we present a plug-in-based software, BioSunMS, for both the management of patient information and statistical analysis based on serum peptide profiles. By integrating all functions into a user-friendly desktop application, BioSunMS provides a comprehensive solution for clinical researchers without any programming knowledge, as well as a plug-in architecture that lets developers add or modify functions without recompiling the entire application. CONCLUSION: BioSunMS provides a plug-in-based solution for managing, analyzing, and sharing high volumes of MALDI-TOF or SELDI-TOF MS data. The software is freely distributed under the GNU General Public License (GPL) and can be downloaded from http://sourceforge.net/projects/biosunms/.

    OpenMS – An open-source software framework for mass spectrometry

    BACKGROUND: Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever increasing amounts of data. Consequently, analysis of the data is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow for rapid prototyping of a new analysis workflow. RESULTS: We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy-to-use and robust while offering a rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies. CONCLUSION: OpenMS is available under the Lesser GNU Public License (LGPL) from the project website at http://www.openms.de.