
    Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset

    Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental designs are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.

    A metaproteomic approach to study human-microbial ecosystems at the mucosal luminal interface

    Aberrant interactions between the host and the intestinal bacteria are thought to contribute to the pathogenesis of many digestive diseases. However, studying the complex ecosystem at the human mucosal-luminal interface (MLI) is challenging and requires an integrative systems biology approach. Therefore, we developed a novel method integrating lavage sampling of the human mucosal surface, high-throughput proteomics, and a unique suite of bioinformatic and statistical analyses. Shotgun proteomic analysis of secreted proteins recovered from the MLI confirmed the presence of both human and bacterial components. To profile the MLI metaproteome, we collected 205 mucosal lavage samples from 38 healthy subjects and subjected them to high-throughput proteomics. The spectral data were passed through a rigorous data processing pipeline to optimize suitability for quantitation and analysis, and were then evaluated using a set of biostatistical tools. Compared to the mucosal transcriptome, the MLI metaproteome was enriched for extracellular proteins involved in response to stimulus and immune system processes. Analysis of the metaproteome revealed significant individual-related as well as anatomic region-related (biogeographic) features. Quantitative shotgun proteomics established the identity and confirmed the biogeographic association of 49 proteins (including 3 functional protein networks) demarcating the proximal and distal colon. This robust and integrated proteomic approach is thus effective for identifying functional features of the human mucosal ecosystem, and for gaining a fresh understanding of the basic biology and disease processes at the MLI. © 2011 Li et al.

    Peaks detection and alignment for mass spectrometry data

    The goal of this paper is to review existing methods for protein mass spectrometry data analysis, and to present a new methodology for automatic extraction of significant peaks (biomarkers). For the pre-processing step required for data from MALDI-TOF or SELDI-TOF spectra, we use a purely nonparametric approach that combines stationary invariant wavelet transform for noise removal and penalized spline quantile regression for baseline correction. We further present a multi-scale spectra alignment technique that is based on identification of statistically significant peaks from a set of spectra. This method allows one to find common peaks in a set of spectra that can subsequently be mapped to individual proteins. These peaks may serve as useful biomarkers in medical applications, or as individual features for further multidimensional statistical analysis. MALDI-TOF spectra obtained from serum samples are used throughout the paper to illustrate the methodology.

    Accurate peak list extraction from proteomic mass spectra for identification and profiling studies

    Background: Mass spectrometry is an essential technique in proteomics both to identify the proteins of a biological sample and to compare proteomic profiles of different samples. In both cases, the main phase of the data analysis is the procedure to extract the significant features from a mass spectrum. Its final output is the so-called peak list, which contains the mass, the charge and the intensity of every detected biomolecule. The main steps of the peak list extraction procedure are usually preprocessing, peak detection, peak selection, charge determination and monoisotoping operation. Results: This paper describes an original algorithm for peak list extraction from low and high resolution mass spectra. It has been developed principally to improve the precision of peak extraction in comparison to other reference algorithms. It contains many innovative features, among which is a sophisticated method for managing overlapping isotopic distributions. Conclusions: The performance of the basic version of the algorithm and of its optional functionalities has been evaluated in this paper on SELDI-TOF, MALDI-TOF and ESI-FTICR ECD mass spectra. Executable files of MassSpec, a MATLAB implementation of the peak list extraction procedure for Windows and Linux systems, can be downloaded free of charge for nonprofit institutions from the following web site: http://aimed11.unipv.it/MassSpec

    Peak annotation and data analysis software tools for mass spectrometry imaging

    Spatial metabolomics is the discipline that studies images of the distributions of low-molecular-weight chemical compounds (metabolites) on the surface of biological tissues to unveil interactions between molecules. Mass spectrometry imaging (MSI) is currently the principal technique for obtaining molecular imaging information for spatial metabolomics. MSI is a label-free molecular imaging technology that produces mass spectra preserving the spatial structures of tissue samples. This is achieved by ionizing small portions of a sample (a pixel) in a defined raster across its whole surface, which results in a collection of ion distribution images (registered as mass-to-charge ratios (m/z)) over the sample. This thesis aims to develop computational tools for peak annotation in MSI and to design workflows for the statistical and multivariate analysis of MSI data, including spatial segmentation. The work carried out in this thesis can be clearly separated into two parts. The first is the development of an isotope and adduct peak annotation tool suited to facilitating the identification of low mass range compounds. We can now easily find monoisotopic ions in our MSI datasets thanks to the rMSIannotation software package. The second is the development of software tools for data analysis and spatial segmentation based on soft clustering for MSI data. In this thesis, we have developed tools and methodologies to search for significant ions (rMSIKeyIon software package) and for the soft clustering of tissues (Fuzzy c-means algorithm).
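The idea of locating monoisotopic ions through their isotope peaks can be illustrated with a minimal sketch. It is not the rMSIannotation algorithm, whose details the abstract does not give. It assumes singly charged ions, a fixed m/z tolerance, and that the M+1 peak is less intense than the monoisotopic peak, which is a reasonable assumption only for low-mass compounds. The function name, the tolerance and the peak list are hypothetical.

```python
C13_C12_MASS_DIFF = 1.003355  # Da, mass difference between 13C and 12C


def find_isotope_pairs(peaks, tolerance_da=0.005):
    """Pair each peak with a candidate M+1 peak for singly charged ions.

    `peaks` is a list of (mz, intensity) tuples. A pair is reported when a
    second peak lies one 13C spacing above the first, within the tolerance,
    and is less intense than the first.
    """
    ordered = sorted(peaks)
    pairs = []
    for mz, intensity in ordered:
        for other_mz, other_intensity in ordered:
            shift = other_mz - mz
            close_enough = abs(shift - C13_C12_MASS_DIFF) <= tolerance_da
            if close_enough and other_intensity < intensity:
                pairs.append((mz, other_mz))
    return pairs


# Hypothetical peak list: (m/z, intensity).
example_peaks = [
    (180.063, 1000.0),
    (181.066, 70.0),
    (255.232, 400.0),
    (300.100, 50.0),
]
pairs = find_isotope_pairs(example_peaks)
```

Here only the peak at m/z 180.063 has a partner one 13C spacing above it, so it is the only candidate monoisotopic ion in the list. A real tool would also need to handle adducts, multiple charges and spatial correlation between ion images.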

    A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls.

    BACKGROUND: Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integrating the mass spectrometry data with the results of other experiments, such as microarray analysis, and with information from other databases requires central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. RESULTS: A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. CONCLUSION: The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large datasets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry will prevent the clustering of peaks of different peptides and allow the identification of differentially expressed proteins from the peptide profiles.
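The Wilcoxon-Mann-Whitney rank sum test mentioned above compares the ranks of intensities in two groups. The abstract states that the authors run it in R; the following is only a minimal Python sketch of the U statistic itself, with hypothetical function names and made-up intensities, and without the p-value calculation a statistics package would provide.

```python
from itertools import chain


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def mann_whitney_u(patients, controls):
    """Return the U statistic for the patient group."""
    pooled = list(chain(patients, controls))
    ranks = rank_with_ties(pooled)
    n1 = len(patients)
    rank_sum_patients = sum(ranks[:n1])
    return rank_sum_patients - n1 * (n1 + 1) / 2


# Hypothetical intensities of one peptide mass in two groups.
patient_intensities = [5.1, 6.3, 7.0, 8.2]
control_intensities = [1.9, 2.4, 3.3, 4.8]
u_stat = mann_whitney_u(patient_intensities, control_intensities)
```

With these values every patient intensity exceeds every control intensity, so U reaches its maximum of n1 × n2 = 16. In a profiling workflow, the test would be repeated for each peptide mass in the profile matrix.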

    Development of a complete advanced computational workflow for high-resolution LDI-MS metabolomics imaging data processing and visualization

    Mass spectrometry imaging (MSI) maps the spatial distributions of molecules in a sample. This allows spatially correlated metabolomics information to be extracted from tissue sections. MSI is not widely used in spatial metabolomics due to several limitations related to MALDI matrices, including the generation of interfering ions in the low mass range and lateral compound delocalization. We developed a workflow to improve the acquisition of metabolites using a MALDI instrument. We sputter an Au nano-layer directly onto the tissue section, enabling the acquisition of metabolites with minimal interference from background signals and at ultra-high spatial resolution. We developed an R package for image visualization and MSI data processing, which is optimized to manage datasets larger than the computer's memory using a multi-threaded implementation. Moreover, our software includes a label-free mass alignment algorithm for mass accuracy enhancement.

    Three-dimensional reconstruction of the tissue-specific multielemental distribution within Ceriodaphnia dubia via multimodal registration using laser ablation ICP-mass spectrometry and X-ray spectroscopic techniques

    In this work, the three-dimensional elemental distribution profile within the freshwater crustacean Ceriodaphnia dubia was constructed at a spatial resolution down to 5 µm via a data fusion approach employing state-of-the-art laser ablation inductively coupled plasma time-of-flight mass spectrometry (LA-ICP-TOFMS) and laboratory-based absorption microcomputed tomography (µ-CT). C. dubia was exposed to elevated Cu, Ni, and Zn concentrations, chemically fixed, dehydrated, stained, and embedded prior to µ-CT analysis. Subsequently, the sample was cut into 5 µm thin sections that were subjected to LA-ICP-TOFMS imaging. Multimodal image registration was performed to spatially align the 2D LA-ICP-TOFMS images relative to the corresponding slices of the 3D µ-CT reconstruction. Mass channels corresponding to the isotopes of a single element were merged to improve the signal-to-noise ratios within the elemental images. In order to aid the visual interpretation of the data, LA-ICP-TOFMS data were projected onto the µ-CT voxels representing tissue. Additionally, the image resolution and elemental sensitivity were compared to those obtained with synchrotron radiation based 3D confocal µ-X-ray fluorescence imaging upon a chemically fixed and air-dried C. dubia specimen.
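Merging the mass channels of one element can be sketched as follows. The abstract does not say how the channels were merged; this sketch assumes a plain pixel-wise sum of equally sized images, and the function name and the 2×2 count images are hypothetical.

```python
def merge_isotope_channels(channels):
    """Sum equally sized 2D images (lists of rows) pixel by pixel."""
    rows, cols = len(channels[0]), len(channels[0][0])
    merged = [[0.0] * cols for _ in range(rows)]
    for image in channels:
        for r in range(rows):
            for c in range(cols):
                merged[r][c] += image[r][c]
    return merged


# Hypothetical 2x2 count images for two isotopes of one element.
cu63 = [[10.0, 0.0], [4.0, 2.0]]
cu65 = [[5.0, 1.0], [2.0, 1.0]]
cu_total = merge_isotope_channels([cu63, cu65])
```

Under counting (Poisson) noise, the signal-to-noise ratio of a pixel grows roughly with the square root of its total counts, which is why adding the isotope channels of one element helps.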

    Peptide mass fingerprinting using field-programmable gate arrays

    The reconfigurable computing paradigm, which exploits the flexibility and versatility of field-programmable gate arrays (FPGAs), has emerged as a powerful solution for speeding up time-critical algorithms. This paper describes a reconfigurable computing solution for processing raw mass spectrometric data generated by MALDI-TOF instruments. The hardware-implemented algorithms for denoising, baseline correction, peak identification, and deisotoping, running on a Xilinx Virtex-2 FPGA at 180 MHz, generate a mass fingerprint over 100 times faster than an equivalent algorithm written in C running on a dual 3-GHz Xeon server. The results obtained using the FPGA implementation are virtually identical to those generated by the commercial software package MassLynx.

    A novel high-throughput and label-free phenotypic drug screening approach: MALDI-TOF mass spectrometry combined with machine learning strategies

    A renewed and growing interest in phenotypic drug screening approaches in the field of drug discovery is observed, as it has become apparent that target-oriented drug discovery assays have inherent limitations and cannot fulfil the urgent unmet medical need for novel drugs. The shortcomings of target-oriented drug screening assays are especially apparent in the field of antibiotic drug discovery, where target-based approaches largely failed to translate screening hits to clinically relevant drugs. In this thesis, a proteomics-based phenotypic drug screening approach using MALDI-TOF mass spectrometry was developed, which is able to detect sub-lethal stress in bacterial cells provoked by antibiotics. To achieve this, mass spectra of whole cells exposed to known antibiotics at concentrations below the minimal inhibitory concentration (MIC) were used to extract relevant mass spectral peaks with a data-dependent and automated computational pipeline created in the MATLAB environment. Using the selected subset of mass spectral peaks, classification models were trained to recognize general mass spectral responses provoked by unknown drugs in the cellular proteome. Additionally, the classification models proved capable of identifying the mechanisms of action of unknown drugs. To establish and validate the best performing classification modeling procedure, four different feature selection algorithms and nine classification models were analyzed in detail using an Escherichia coli data set composed of over 900 spectra, involving 17 antibiotics with four different mechanisms of action, at concentrations ranging from 1×MIC down to 1/32×MIC in a two-fold dilution series. Four different feature selection approaches were investigated to ensure the extraction of relevant mass spectral data in response to the different antibiotics for classification modeling. The selection approaches included (1) a random forest of decision trees, (2) sequential forward feature selection, and (3) sequential backward feature selection. Mass spectral peaks selected by two or all three of these feature selection approaches were combined into (4) an aggregated feature set. Classification models were trained for all combinations of nine model types and the four feature sets. In this thesis two classification problems were investigated. First, a binary classification model was trained to differentiate between affected and non-affected cells based on selected mass spectral peaks. Second, a multi-class model was trained to detect and distinguish between the different antibiotic mechanisms of action, a highly desired drug screening assay characteristic. The combination of these elements yielded 72 models, which were evaluated based on their overall classification accuracy. The overall classification accuracy was determined using internal 10-fold cross-validation and external validation, which was performed with a blind set of 20 drugs. The internal and external validation studies showed that the aggregated feature set in combination with a quadratic support vector machine-based model (Q-SVM) resulted in the best classification performance. For the E. coli data set, this was represented by an overall accuracy of 0.92 for internal validation and an accuracy of 0.95 for the external validation of the Q-SVM model. Classifying based on the mechanism of action of the antibiotics resulted in a classification accuracy of 0.67 for internal validation and 0.80 for external validation. Furthermore, it was shown that the peak selection method was able to identify relevant, known stress-associated proteins within the aggregated feature sets of both the binary and the mechanism of action model. After the experimental workflow and the computational pipeline were established based on E. coli data, the method was applied to four different organisms (the Gram-positive bacterium Staphylococcus aureus, the fungi Saccharomyces cerevisiae and Candida albicans, and a human HeLa cancer cell line) and different proteomic responses, to explore the versatility and transferability of the developed screening assay. The applicability of the method was demonstrated by the consistent performance of the classification models generated with the experimental and computational pipeline. This resulted in binary model accuracies between 0.92 and 0.97 for internal and 0.77 and 0.95 for external validation, depending on the assayed organism and data set complexity. For mechanism of action models, model accuracies ranged between 0.73 and 0.96 for internal and 0.66 and 0.93 for external validation. The application of the developed assay on different organisms with different drug stressors highlighted several advantageous characteristics of the developed MALDI-TOF MS screening approach. Both the binary and mechanism of action classification models of S. aureus correctly identified an antibiotic drug (fusidic acid) in the blind test set, which had a target binding activity that was not present in the training data set. This implies that the method can detect novel drugs within a known global mechanism of action for which the model was trained. Moreover, external validation of S. cerevisiae showed that the binary classification model is able to detect antifungal drugs (tavaborole, an antifungal protein synthesis inhibitor) with a mechanism of action which was not present in the training data set. This is a highly desirable property of any phenotypic screening assay, as it shows that the assay allows for the identification of drugs with novel mechanisms of action. Lastly, the proteomic effect of different types of drugs on mammalian cells was explored by using the HeLa cancer cell line. It was shown that the presented proteomic profiling approach can easily detect several types of drug-induced stresses in HeLa cells, in particular those caused by corticosteroids and tubulin (de)polymerization inhibitors, but is less suitable for distinguishing other drug classes (neurotransmitter antagonists, statins, opioids). Additionally, the application of the assay on HeLa cells demonstrated the ability to detect different types of stresses, such as the cells' proteomic response to UV exposure or heat shocks. These results pave the way for possible distinction between apoptosis and necrosis pathways in HeLa cells using the presented MALDI-TOF MS based method. To conclude, a high-throughput compatible, label-free, MALDI-TOF mass spectrometry-based screening assay is described in this thesis, which measures sub-lethal drug effects on the cellular proteome in a phenotypic and pharmacologically relevant setting. The method was found suitable for whole-cell screening of small libraries of drugs, and showed the ability to distinguish different types of stresses elicited on multiple types of cell cultures. The potential to find new, weakly active drugs within a known mechanism of action, as well as the ability to detect sub-lethal drug responses with new mechanisms of action for which the model was not trained, was demonstrated. The ability to identify novel mechanisms of action in a cell-based screen can be exploited to solve the most pressing issues in drug discovery today. In addition, mechanistic information on the drugs' activity can be used as a starting point for further target elucidation or to prioritize drug screening hits. The studies performed in this thesis have resulted in a solid foundation for further research that expands the capabilities of the MALDI-TOF MS-based assay in a broad range of phenotypic profiling applications in the drug discovery field.
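The aggregated feature set and the count of 72 models described in this abstract can be illustrated with a short sketch. The thesis pipeline was built in MATLAB; this Python version is only an illustration, and the function name, the m/z values and the placeholder model names are hypothetical.

```python
from collections import Counter
from itertools import product


def aggregate_features(*selections, min_votes=2):
    """Keep peaks chosen by at least `min_votes` of the selection methods."""
    votes = Counter(peak for selection in selections for peak in set(selection))
    return sorted(peak for peak, count in votes.items() if count >= min_votes)


# Hypothetical m/z values picked by each selection method.
random_forest_peaks = [4365, 5096, 6255, 7274]
forward_peaks = [4365, 6255, 9064]
backward_peaks = [5096, 6255, 9536]

aggregated = aggregate_features(random_forest_peaks, forward_peaks, backward_peaks)

# 9 model types x 4 feature sets x 2 classification problems = 72 models.
model_types = [f"model_{i}" for i in range(1, 10)]  # placeholder names
feature_sets = ["random_forest", "forward", "backward", "aggregated"]
problems = ["binary", "mechanism_of_action"]
model_grid = list(product(model_types, feature_sets, problems))
```

In this toy input, three peaks are selected by at least two methods and therefore enter the aggregated set, and the grid enumerates the 72 combinations of model type, feature set and classification problem.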