A robust and complete workflow for metabolic profiling and data mining was described in detail. Three independent and complementary analytical techniques for metabolic profiling were applied: hydrophilic interaction chromatography (HILIC–LC–ESI–MS), reversed-phase liquid chromatography (RP–LC–ESI–MS), and gas chromatography (GC–TOF–MS) all coupled to mass spectrometry (MS). Unsupervised methods, such as principle component analysis (PCA) and clustering, and supervised methods, such as classification and PCA-DA (discriminatory analysis) were used for data mining. Genetic Algorithms (GA), a multivariate approach, was probed for selection of the smallest subsets of potentially discriminative predictors. From thousands of peaks found in total, small subsets selected by GA were considered as highly potential predictors allowing discrimination among groups. It was found that small groups of potential top predictors selected with PCA-DA and GA are different and unique. Annotated GC–TOF–MS data generated identified feature metabolites. Metabolites putatively detected with LC–ESI–MS profiling require further elemental composition assignment with accurate mass measurement by Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) and structure elucidation by nuclear magnetic resonance spectroscopy (NMR). GA was also used to generate correlated networks for pathway analysis. Several case studies, comprising groups of plant samples bearing different genotypes and groups of samples of human origin, namely patients and healthy volunteers’ urine samples, demonstrated that such a workflow combining comprehensive metabolic profiling and advanced data mining techniques provides a powerful approach for pattern recognition and biomarker discover
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.