7 research outputs found

    Multi-TGDR, a multi-class regularization method, identifies the metabolic profiles of hepatocellular carcinoma and cirrhosis infected with hepatitis B or hepatitis C virus

    Get PDF
    BACKGROUND: Over the last decade, metabolomics has evolved into a mainstream enterprise utilized by many laboratories globally. Like other “omics” data, metabolomics data has the characteristics of a smaller sample size compared to the number of features evaluated. Thus the selection of an optimal subset of features with a supervised classifier is imperative. We extended an existing feature selection algorithm, threshold gradient descent regularization (TGDR), to handle multi-class classification of “omics” data, and proposed two such extensions referred to as multi-TGDR. Both multi-TGDR frameworks were used to analyze a metabolomics dataset that compares the metabolic profiles of hepatocellular carcinoma (HCC) infected with hepatitis B (HBV) or C virus (HCV) with that of cirrhosis induced by HBV/HCV infection; the goal was to improve early-stage diagnosis of HCC. RESULTS: We applied two multi-TGDR frameworks to the HCC metabolomics data that determined TGDR thresholds either globally across classes, or locally for each class. Multi-TGDR global model selected 45 metabolites with a 0% misclassification rate (the error rate on the training data) and had a 3.82% 5-fold cross-validation (CV-5) predictive error rate. Multi-TGDR local selected 48 metabolites with a 0% misclassification rate and a 5.34% CV-5 error rate. CONCLUSIONS: One important advantage of multi-TGDR local is that it allows inference for determining which feature is related specifically to the class/classes. Thus, we recommend multi-TGDR local be used because it has similar predictive performance and requires the same computing time as multi-TGDR global, but may provide class-specific inference

    To select relevant features for longitudinal gene expression data by extending a pathway analysis method [version 1; referees: 2 approved]

    Get PDF
    The emerging field of pathway-based feature selection that incorporates biological information conveyed by gene sets/pathways to guide the selection of relevant genes has become increasingly popular and widespread. In this study, we adapt a gene set analysis method – the significance analysis of microarray gene set reduction (SAMGSR) algorithm to carry out feature selection for longitudinal microarray data, and propose a pathway-based feature selection algorithm – the two-level SAMGSR method. By using simulated data and a real-world application, we demonstrate that a gene’s expression profiles over time can be considered as a gene set. Thus a suitable gene set analysis method can be utilized or modified to execute the selection of relevant genes for longitudinal omics data. We believe this work paves the way for more research to bridge feature selection and gene set analysis with the development of novel pathway-based feature selection algorithms

    [<sup>18</sup>F]fluorination of biorelevant arylboronic acid pinacol ester scaffolds synthesized by convergence techniques

    Get PDF
    Aim: The development of small molecules through convergent multicomponent reactions (MCR) has been boosted during the last decade due to the ability to synthesize, virtually without any side-products, numerous small drug-like molecules with several degrees of structural diversity.(1) The association of positron emission tomography (PET) labeling techniques in line with the “one-pot” development of biologically active compounds has the potential to become relevant not only for the evaluation and characterization of those MCR products through molecular imaging, but also to increase the library of radiotracers available. Therefore, since the [18F]fluorination of arylboronic acid pinacol ester derivatives tolerates electron-poor and electro-rich arenes and various functional groups,(2) the main goal of this research work was to achieve the 18F-radiolabeling of several different molecules synthesized through MCR. Materials and Methods: [18F]Fluorination of boronic acid pinacol esters was first extensively optimized using a benzaldehyde derivative in relation to the ideal amount of Cu(II) catalyst and precursor to be used, as well as the reaction solvent. Radiochemical conversion (RCC) yields were assessed by TLC-SG. The optimized radiolabeling conditions were subsequently applied to several structurally different MCR scaffolds comprising biologically relevant pharmacophores (e.g. β-lactam, morpholine, tetrazole, oxazole) that were synthesized to specifically contain a boronic acid pinacol ester group. Results: Radiolabeling with fluorine-18 was achieved with volumes (800 μl) and activities (≤ 2 GBq) compatible with most radiochemistry techniques and modules. In summary, an increase in the quantities of precursor or Cu(II) catalyst lead to higher conversion yields. An optimal amount of precursor (0.06 mmol) and Cu(OTf)2(py)4 (0.04 mmol) was defined for further reactions, with DMA being a preferential solvent over DMF. RCC yields from 15% to 76%, depending on the scaffold, were reproducibly achieved. Interestingly, it was noticed that the structure of the scaffolds, beyond the arylboronic acid, exerts some influence in the final RCC, with electron-withdrawing groups in the para position apparently enhancing the radiolabeling yield. Conclusion: The developed method with high RCC and reproducibility has the potential to be applied in line with MCR and also has a possibility to be incorporated in a later stage of this convergent “one-pot” synthesis strategy. Further studies are currently ongoing to apply this radiolabeling concept to fluorine-containing approved drugs whose boronic acid pinacol ester precursors can be synthesized through MCR (e.g. atorvastatin)
    corecore