21 research outputs found

    Automated mass spectrometry-based metabolomics data processing by blind source separation methods

    One of the major bottlenecks in metabolomics is converting raw data into biologically interpretable information. Moreover, mass spectrometry-based metabolomics generates large and complex datasets characterized by co-eluting compounds and experimental artifacts. The main objective of this thesis is to develop automated strategies based on blind source separation that improve on current methods for addressing the limitations of the different steps in the metabolomics data processing workflow. A further objective is to develop tools capable of performing the entire metabolomics workflow for GC–MS, including pre-processing, spectral deconvolution, alignment and identification. As a result, three new automated methods for spectral deconvolution based on blind source separation were developed. These methods were embedded into two computational tools that automatically convert raw data into biologically interpretable information and thus make it possible to address biological hypotheses and gain new biological insights.

    Automated resolution of chromatographic signals by independent component analysis-orthogonal signal deconvolution in comprehensive gas chromatography/mass spectrometry-based metabolomics

    Comprehensive gas chromatography-mass spectrometry (GC×GC-MS) provides a different perspective in the metabolomics profiling of samples. However, algorithms for GC×GC-MS data processing are needed to process the data automatically and extract the purest possible information about the compounds appearing in complex biological samples. This study shows the capability of independent component analysis-orthogonal signal deconvolution (ICA-OSD), an algorithm based on blind source separation and distributed in an R package called osd, to extract the spectra of the compounds appearing in GC×GC-MS chromatograms in an automated manner. We studied the performance of ICA-OSD by quantifying 38 metabolites across a set of 20 Jurkat cell samples analyzed by GC×GC-MS. The quantification by ICA-OSD was compared with a supervised quantification by selective ions: most of the coefficients of determination indicated good agreement (R² > 0.90), while up to 24 cases exhibited an excellent linear relation (R² > 0.95). We concluded that ICA-OSD can be used to resolve co-eluted compounds in GC×GC-MS. (C) 2016 Elsevier Ireland Ltd. All rights reserved. Postprint (author's final draft)
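
The agreement figures above are coefficients of determination between two sets of quantified values for the same metabolite. As a minimal sketch, assuming a simple linear regression between the two quantifications (in which case R² equals the squared Pearson correlation), the comparison could be computed as below. The function name `r_squared` and the example numbers are hypothetical and do not come from the paper.

```python
from math import fsum


def r_squared(x, y):
    """Coefficient of determination of a simple linear fit of y on x.

    For a one-predictor linear regression this equals the squared
    Pearson correlation between x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    # Centered sums of products and squares.
    sxy = fsum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = fsum((a - mean_x) ** 2 for a in x)
    syy = fsum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one input is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical peak areas for one metabolite across four samples,
# quantified by two different methods.
automated = [1.0, 2.0, 3.0, 4.0]
reference = [2.0, 4.0, 6.0, 8.0]
agreement = r_squared(automated, reference)
```

In this sketch, one value of `agreement` would be computed per metabolite and then checked against a threshold such as 0.90 or 0.95.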

    Avoiding hard chromatographic segmentation: A moving window approach for the automated resolution of gas chromatography–mass spectrometry-based metabolomics signals by multivariate methods

    Gas chromatography–mass spectrometry (GC–MS) produces large and complex datasets characterized by co-eluted compounds, some at trace levels, and by a distinct compound ion redundancy resulting from the high fragmentation caused by electron impact ionization. Compounds in GC–MS can be resolved by applying multivariate resolution methods that take advantage of the multivariate nature of GC–MS data. However, multivariate methods have to be applied in small regions of the chromatogram, and chromatograms are therefore segmented before the algorithms are applied. Automating this segmentation is challenging because it requires separating informative data from noise in the chromatogram. This study demonstrates the capabilities of independent component analysis–orthogonal signal deconvolution (ICA–OSD) and multivariate curve resolution–alternating least squares (MCR–ALS) with an overlapping moving window implementation that avoids the typical hard chromatographic segmentation. After being resolved, compounds are aligned across samples by an automated alignment algorithm. We evaluated the proposed methods through a quantitative analysis of GC–qTOF MS data from 25 serum samples. The quantitative performance of both moving window ICA–OSD and MCR–ALS-based implementations was compared with the quantification of 33 compounds by the XCMS package. Results showed that most of the coefficients of determination indicated a high correlation (R² > 0.90) for both the ICA–OSD and the MCR–ALS moving window-based approaches. Postprint (author's final draft)
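
The overlapping moving window idea can be illustrated with a short indexing sketch. The abstract does not state the window width, the amount of overlap, or how results from neighbouring windows are merged, so the function `moving_windows` and its parameters below are hypothetical. The sketch only shows how a chromatogram of `n_scans` scans could be covered by overlapping regions rather than cut at hard boundaries.

```python
from typing import List, Tuple


def moving_windows(n_scans: int, width: int, overlap: int) -> List[Tuple[int, int]]:
    """Return half-open (start, stop) scan ranges covering [0, n_scans).

    Consecutive windows share `overlap` scans, so a peak falling on the
    edge of one window lies in the interior of a neighbouring one.
    """
    if width <= 0 or not 0 <= overlap < width:
        raise ValueError("need width > 0 and 0 <= overlap < width")
    if n_scans <= 0:
        return []
    step = width - overlap
    windows = []
    start = 0
    while True:
        stop = min(start + width, n_scans)
        windows.append((start, stop))
        if stop >= n_scans:
            break
        start += step
    return windows


# Ten scans, windows of four scans, two scans shared between neighbours.
example = moving_windows(10, 4, 2)
```

A resolution method would then be run on the data slice for each `(start, stop)` pair. Reconciling the compounds found twice in overlapping regions is a separate step that this sketch does not attempt.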

    Computational Expansion of High-Resolution-MSn Spectral Libraries

    Commonly, in MS-based untargeted metabolomics, some metabolites cannot be confidently identified due to ambiguities in resolving isobars and structurally similar species. To address this, analytical techniques beyond traditional MS2 analysis, such as MSn fragmentation, can be applied to probe metabolites for additional structural information. In MSn fragmentation, recursive cycles of activation are applied to fragment ions originating from the same precursor ion detected on an MS1 spectrum. This resonant-type collision-activated dissociation (CAD) can yield information that cannot be ascertained from MS2 spectra alone, which helps improve the performance of metabolite identification workflows. However, most approaches for metabolite identification require mass-to-charge (m/z) values measured with high resolution, as this enables the determination of accurate mass values. Unfortunately, high-resolution-MSn spectra are relatively rare in spectral libraries. Here, we describe a computational approach to generate a database of high-resolution-MSn spectra by converting existing low-resolution-MSn spectra using complementary high-resolution-MS2 spectra generated by beam-type CAD. Using this method, we have generated a database, derived from the NIST20 MS/MS database, of MSn spectral trees representing 9637 compounds and 19386 precursor ions where at least 90% of signal intensity was converted from low-to-high resolution. The GLOMICAVE project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 952908.
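
The low-to-high resolution conversion can be pictured as replacing each low-resolution fragment m/z with an accurate m/z taken from a complementary high-resolution spectrum. The abstract does not describe the matching rule, so the sketch below assumes a simple nearest-match within a fixed tolerance. The function `convert_peaks`, the 0.5 m/z tolerance and the example peaks are all hypothetical. It also reports the fraction of signal intensity converted, the quantity behind the 90% criterion mentioned above.

```python
def convert_peaks(low_res_peaks, high_res_mz, tol=0.5):
    """Map low-resolution (m/z, intensity) peaks onto accurate m/z values.

    Assumption: each low-resolution peak takes the nearest high-resolution
    m/z within `tol`; peaks without a candidate are left unconverted.
    Returns the converted peaks and the fraction of intensity converted.
    """
    converted = []
    matched_intensity = 0.0
    total_intensity = sum(intensity for _, intensity in low_res_peaks)
    for mz, intensity in low_res_peaks:
        candidates = [h for h in high_res_mz if abs(h - mz) <= tol]
        if candidates:
            nearest = min(candidates, key=lambda h: abs(h - mz))
            converted.append((nearest, intensity))
            matched_intensity += intensity
    fraction = matched_intensity / total_intensity if total_intensity else 0.0
    return converted, fraction


# Hypothetical low-resolution MSn peaks and high-resolution MS2 m/z values.
low = [(85.1, 100.0), (127.0, 50.0), (200.3, 50.0)]
high = [85.0284, 127.0390]
peaks, fraction_converted = convert_peaks(low, high)
```

In this example the peak at 200.3 has no high-resolution counterpart, so 75% of the intensity is converted and the spectrum would fall short of a 90% threshold.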

    Compound identification in gas chromatography/mass spectrometry-based metabolomics by blind source separation

    Metabolomics GC-MS samples yield highly complex data that must be effectively resolved to produce chemically meaningful results. Multivariate curve resolution-alternating least squares (MCR-ALS) is the most frequently reported technique for that purpose. More recently, independent component analysis (ICA) has been reported as an alternative to MCR. These algorithms attempt to infer a model describing the observed data, and the least squares regression used in MCR therefore assumes that the data is a linear combination of that model. However, due to the high complexity of real data, constructing a model that optimally describes the observed data is a critical step, and these algorithms should prevent the influence of outlier data. This study demonstrates independent component regression (ICR) as an alternative for GC-MS compound identification. Both ICR and MCR, though, require least squares regression to correctly resolve the mixtures. In this paper, a novel orthogonal signal deconvolution (OSD) approach is introduced, which uses principal component analysis to determine the compound spectra. The study includes a compound identification comparison between the results of ICA-OSD, MCR-OSD, ICR and MCR-ALS using pure standards and human serum samples. Results show that ICR may be used as an alternative to multivariate curve methods, as ICR efficiency is comparable to that of MCR-ALS. The study also demonstrates that the proposed OSD approach achieves greater spectral resolution accuracy than the traditional least squares approach when compounds elute under undue interference from biological matrices. (C) 2015 Elsevier B.V. All rights reserved.
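
The least squares step that the abstract contrasts with OSD rests on the assumption that the data are a linear combination of the model: each scan is a sum of compound spectra weighted by their elution profiles. The sketch below illustrates only that ordinary least squares step, for two co-eluting compounds whose elution profiles are taken as known. It is not ICA, ICR or OSD, whose details are not given here, and the function `ls_spectra_two` and the synthetic data are hypothetical.

```python
def ls_spectra_two(c1, c2, data):
    """Least squares spectra for two compounds with known elution profiles.

    Model: data[t][j] = c1[t] * s1[j] + c2[t] * s2[j], where t indexes
    scans and j indexes m/z channels. Solves the 2x2 normal equations
    for every m/z channel and returns (s1, s2).
    """
    a11 = sum(x * x for x in c1)
    a12 = sum(x * y for x, y in zip(c1, c2))
    a22 = sum(y * y for y in c2)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("elution profiles are collinear; spectra not identifiable")
    s1, s2 = [], []
    for j in range(len(data[0])):
        b1 = sum(c1[t] * data[t][j] for t in range(len(data)))
        b2 = sum(c2[t] * data[t][j] for t in range(len(data)))
        s1.append((a22 * b1 - a12 * b2) / det)
        s2.append((a11 * b2 - a12 * b1) / det)
    return s1, s2


# Synthetic, noise-free mixture of two overlapping peaks (hypothetical values).
profile_a = [0.0, 1.0, 2.0, 1.0, 0.0]
profile_b = [0.0, 0.0, 1.0, 2.0, 1.0]
spectrum_a = [10.0, 0.0, 5.0]
spectrum_b = [0.0, 8.0, 2.0]
mixture = [
    [profile_a[t] * spectrum_a[j] + profile_b[t] * spectrum_b[j] for j in range(3)]
    for t in range(5)
]
estimated_a, estimated_b = ls_spectra_two(profile_a, profile_b, mixture)
```

With noise-free data the estimated spectra match the true ones. With real data, any scan that deviates from the linear model pulls on the least squares fit, which is the sensitivity to outlier data that the abstract raises.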

    Enhanced Electrospray In-source Fragmentation for Higher Sensitivity Data Independent Acquisition and Autonomous METLIN Molecular Identification

    Electrospray ionization (ESI) in-source fragmentation (ISF) has traditionally been minimized to promote precursor molecular ion formation, and its value in molecular identification has therefore been underappreciated. Recently, a METLIN-guided in-source annotation (MISA) algorithm was introduced to increase confidence in putative identifications by using ubiquitous in-source fragments. However, MISA is limited by ESI sources that are generally designed to minimize ISF. In this study, enhanced ISF with MISA (eMISA) was created by tuning the ISF conditions to generate in-source fragmentation patterns comparable with the higher energy fragments generated at higher collision energies, as deposited in the METLIN MS/MS library, without compromising the intensity of precursor ions (median loss ≤ 10% in both positive and negative ionization modes). The analysis of 50 molecules was used to validate the approach against MS/MS spectra produced via data dependent acquisition (DDA) and data independent acquisition (DIA) modes with quadrupole time-of-flight mass spectrometry (QTOF-MS). Compared with QTOF DDA, enhanced ISF yields higher peak intensities for the precursor ions (median: 18 times in negative mode and 210 times in positive mode), with the eMISA fragmentation patterns consistent with METLIN for over 90% of the molecules with respect to fragment relative intensity and m/z. eMISA also provides higher peak intensity than QTOF DIA, with a median increase of 20% in negative mode and 80% in positive mode for all precursor ions. Metabolite identification with eMISA was also successfully validated through the analysis of a metabolic extract from macrophages. An interesting side benefit of enhanced ISF is that it significantly improved compound identification confidence in untargeted LC/MS experiments based on low resolution single quadrupole mass spectrometry. Overall, enhanced ISF allowed eMISA to be used as a more sensitive alternative to other QTOF DIA and DDA approaches, and it further enables ESI TOF and ESI single quadrupole mass spectrometry instruments to acquire spectra with higher sensitivity and improved molecular identification confidence.