64 research outputs found

    Automated mass spectrometry-based metabolomics data processing by blind source separation methods

    One of the major bottlenecks in metabolomics is converting raw data into biologically interpretable information. Moreover, mass spectrometry-based metabolomics generates large and complex datasets characterized by co-eluting compounds and experimental artifacts. The main objective of this thesis is to develop automated strategies based on blind source separation that improve on current methods for addressing the limitations of the different steps of the metabolomics data processing workflow. A further objective is to develop tools capable of performing the entire metabolomics workflow for GC-MS, including pre-processing, spectral deconvolution, alignment and identification. As a result, three new automated methods for spectral deconvolution based on blind source separation were developed. These methods were embedded into two computational tools that automatically convert raw data into biologically interpretable information and thus make it possible to test biological hypotheses and gain new biological insights.

    Automated resolution of chromatographic signals by independent component analysis-orthogonal signal deconvolution in comprehensive gas chromatography/mass spectrometry-based metabolomics

    Comprehensive gas chromatography-mass spectrometry (GC×GC-MS) provides a different perspective in the metabolomic profiling of samples. However, algorithms for GC×GC-MS data processing are needed to process the data automatically and extract the purest possible information about the compounds appearing in complex biological samples. This study shows the capability of independent component analysis-orthogonal signal deconvolution (ICA-OSD), an algorithm based on blind source separation and distributed in an R package called osd, to extract the spectra of the compounds appearing in GC×GC-MS chromatograms in an automated manner. We studied the performance of ICA-OSD by quantifying 38 metabolites across a set of 20 Jurkat cell samples analyzed by GC×GC-MS. The quantification by ICA-OSD was compared with a supervised quantification by selective ions; most of the coefficients of determination showed good agreement (R² > 0.90), and up to 24 cases exhibited an excellent linear relation (R² > 0.95). We concluded that ICA-OSD can be used to resolve co-eluted compounds in GC×GC-MS. © 2016 Elsevier Ireland Ltd. All rights reserved. Postprint (author's final draft).
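The agreement criterion quoted in this abstract (R² above 0.90 or 0.95 between two quantifications of the same metabolite) can be illustrated with a minimal sketch. It assumes R² is the coefficient of determination of a simple least-squares line between the two sets of peak areas, which for a single predictor equals the squared Pearson correlation. The function name `r_squared` is hypothetical and is not taken from the osd package or from the paper.

```python
from statistics import mean


def r_squared(reference, estimate):
    """Coefficient of determination of the least-squares line estimate ~ a + b * reference.

    For a single predictor this equals the squared Pearson correlation.
    Hypothetical helper for illustration only.
    """
    if len(reference) != len(estimate) or len(reference) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_ref, mean_est = mean(reference), mean(estimate)
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_ref) ** 2 for x in reference)
    syy = sum((y - mean_est) ** 2 for y in estimate)
    sxy = sum((x - mean_ref) * (y - mean_est) for x, y in zip(reference, estimate))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one sequence is constant")
    return (sxy * sxy) / (sxx * syy)
```

In use, `reference` would hold the peak areas of one metabolite across samples from the supervised selective-ion quantification and `estimate` the areas from the automated method; one R² value is then obtained per metabolite.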

    Avoiding hard chromatographic segmentation: A moving window approach for the automated resolution of gas chromatography–mass spectrometry-based metabolomics signals by multivariate methods

    Gas chromatography–mass spectrometry (GC–MS) produces large and complex datasets characterized by co-eluted compounds, often at trace levels, and by a distinct compound ion redundancy resulting from the high fragmentation caused by electron impact ionization. Compounds in GC–MS can be resolved by applying multivariate resolution methods that take advantage of the multivariate nature of the data. However, multivariate methods have to be applied in small regions of the chromatogram, and therefore chromatograms are segmented before the algorithms are applied. Automating this segmentation is challenging because it requires separating informative data from noise in the chromatogram. This study demonstrates the capabilities of independent component analysis–orthogonal signal deconvolution (ICA–OSD) and multivariate curve resolution–alternating least squares (MCR–ALS) with an overlapping moving window implementation that avoids the typical hard chromatographic segmentation. After being resolved, compounds are aligned across samples by an automated alignment algorithm. We evaluated the proposed methods through a quantitative analysis of GC–qTOF MS data from 25 serum samples. The quantitative performance of the moving window ICA–OSD and MCR–ALS implementations was compared with the quantification of 33 compounds by the XCMS package. Results showed that most of the coefficients of determination exhibited a high correlation (R² > 0.90) for both the ICA–OSD and the MCR–ALS moving window-based approaches. Postprint (author's final draft).
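The contrast between hard segmentation and an overlapping moving window can be sketched as follows. This is only an illustration of the general idea: the abstract does not state the window width, the step, or how results from overlapping windows are merged, so `moving_windows`, `width` and `step` are hypothetical names and values.

```python
def moving_windows(n_scans, width, step):
    """Return half-open (start, end) scan ranges that cover 0..n_scans with overlap.

    Hypothetical helper: with step < width, consecutive windows share
    width - step scans, so no peak is cut by a fixed segment boundary.
    """
    if width <= 0 or step <= 0 or step > width:
        raise ValueError("require 0 < step <= width")
    if n_scans <= 0:
        return []
    if n_scans <= width:
        # The whole chromatogram fits in a single window.
        return [(0, n_scans)]
    windows = []
    start = 0
    while start + width < n_scans:
        windows.append((start, start + width))
        start += step
    # Final window is anchored to the end so the last scans are always covered.
    windows.append((n_scans - width, n_scans))
    return windows
```

Each returned range would be passed to a resolution method (for example ICA–OSD or MCR–ALS) as one small region of the chromatogram. Setting `step` equal to `width` reproduces a hard, non-overlapping segmentation, which makes the difference between the two schemes easy to compare.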

    Computational Expansion of High-Resolution-MSn Spectral Libraries

    Commonly, in MS-based untargeted metabolomics, some metabolites cannot be confidently identified due to ambiguities in resolving isobars and structurally similar species. To address this, analytical techniques beyond traditional MS2 analysis, such as MSn fragmentation, can be applied to probe metabolites for additional structural information. In MSn fragmentation, recursive cycles of activation are applied to fragment ions originating from the same precursor ion detected on an MS1 spectrum. This resonant-type collision-activated dissociation (CAD) can yield information that cannot be ascertained from MS2 spectra alone, which helps improve the performance of metabolite identification workflows. However, most approaches for metabolite identification require mass-to-charge (m/z) values measured with high resolution, as this enables the determination of accurate mass values. Unfortunately, high-resolution-MSn spectra are relatively rare in spectral libraries. Here, we describe a computational approach to generate a database of high-resolution-MSn spectra by converting existing low-resolution-MSn spectra using complementary high-resolution-MS2 spectra generated by beam-type CAD. Using this method, we have generated a database, derived from the NIST20 MS/MS database, of MSn spectral trees representing 9637 compounds and 19386 precursor ions where at least 90% of signal intensity was converted from low to high resolution. The GLOMICAVE project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 952908.
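One way to picture the low-to-high resolution conversion and the "at least 90% of signal intensity" criterion is the sketch below. It is an assumption-laden illustration, not the published method: it assumes each low-resolution peak is reassigned to the nearest high-resolution m/z within a fixed tolerance, and the names `convert_spectrum` and `tol`, as well as the default tolerance of 0.5, are hypothetical.

```python
def convert_spectrum(low_res_peaks, high_res_mzs, tol=0.5):
    """Reassign low-resolution peaks to nearby high-resolution m/z values.

    low_res_peaks: list of (mz, intensity) tuples from a low-resolution spectrum.
    high_res_mzs:  accurate m/z values from a complementary high-resolution spectrum.
    tol:           maximum absolute m/z difference for a match (assumed value).

    Returns (converted_peaks, converted_fraction), where converted_fraction is
    the share of total intensity that found a high-resolution match.
    """
    total = sum(intensity for _, intensity in low_res_peaks)
    converted = []
    matched_intensity = 0.0
    for mz, intensity in low_res_peaks:
        candidates = [h for h in high_res_mzs if abs(h - mz) <= tol]
        if not candidates:
            # No accurate mass available for this peak; it stays unconverted.
            continue
        nearest = min(candidates, key=lambda h: abs(h - mz))
        converted.append((nearest, intensity))
        matched_intensity += intensity
    fraction = matched_intensity / total if total > 0 else 0.0
    return converted, fraction
```

Under this sketch, a spectrum would count as converted when `converted_fraction` is at least 0.90. A real implementation would also need to handle several low-resolution peaks competing for the same high-resolution m/z, which is deliberately left out here.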