    A Machine Learning Approach for Predicting Caco-2 Cell Permeability in Natural Products from the Biodiversity in Peru

    Background: Peru is one of the most biodiverse countries in the world, which is reflected in its wealth of knowledge about medicinal plants. However, little is known about the intestinal absorption and permeability of its natural products. The human colon adenocarcinoma cell line (Caco-2) is used in an in vitro assay to measure apparent permeability. This study aims to develop a quantitative structure–property relationship (QSPR) model, using machine learning algorithms, to predict the Caco-2 apparent permeability of natural products from Peru. Methods: A dataset of 1817 compounds, including experimental log Papp values and molecular descriptors, was used. Six QSPR models were constructed: a multiple linear regression (MLR) model, a partial least squares regression (PLS) model, a support vector machine regression (SVM) model, a random forest (RF) model, a gradient boosting machine (GBM) model, and an SVM–RF–GBM model. Results: On the testing set, the MLR and PLS models achieved RMSE = 0.47 and R² = 0.63, whereas the SVM, RF, and GBM models achieved RMSE = 0.39–0.40 and R² = 0.73–0.74. The SVM–RF–GBM model performed best, with RMSE = 0.38 and R² = 0.76. The model predicted log Papp values for 502 natural products falling within the applicability domain; 68.9% (n = 346) showed high permeability, suggesting potential for intestinal absorption. Additionally, we categorized the natural products into six metabolic pathways and assessed their drug-likeness. Conclusions: Our results provide insights into the potential intestinal absorption of natural products from Peru, thus facilitating drug development and pharmaceutical discovery efforts.
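The abstract compares models by RMSE and R² and reports an SVM–RF–GBM model, but does not say how the three learners are combined. The sketch below is only an illustration under the assumption that the combination is an unweighted mean of the three models' log Papp predictions; the function names and the toy numbers are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def consensus(*model_predictions):
    """Average per-compound predictions across models.

    ASSUMPTION: an unweighted mean; the abstract does not specify
    how the SVM, RF, and GBM outputs are combined.
    """
    return [sum(values) / len(values) for values in zip(*model_predictions)]


if __name__ == "__main__":
    # Toy log Papp values, invented for illustration only.
    observed = [-5.0, -4.5, -6.0, -5.5]
    svm_pred = [-5.2, -4.4, -5.8, -5.6]
    rf_pred = [-4.9, -4.7, -6.1, -5.3]
    gbm_pred = [-5.1, -4.6, -6.2, -5.4]

    combined = consensus(svm_pred, rf_pred, gbm_pred)
    for name, pred in [("SVM", svm_pred), ("RF", rf_pred),
                       ("GBM", gbm_pred), ("consensus", combined)]:
        print(f"{name}: RMSE={rmse(observed, pred):.3f}, R2={r2(observed, pred):.3f}")
```

Averaging tends to help when the individual models make partly uncorrelated errors, which is one plausible reading of why the combined model edges out its components in the reported results.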