36 research outputs found

    Development and Validation of a Computational Model Ensemble for the Early Detection of BCRP/ABCG2 Substrates during the Drug Design Stage

    Breast Cancer Resistance Protein (BCRP) is an ATP-dependent efflux transporter linked to the multidrug resistance phenomenon in many diseases, such as epilepsy and cancer, and a potential source of drug interactions. For these reasons, the early identification of substrates and nonsubstrates of this transporter during the drug discovery stage is of great interest. We have developed a computational nonlinear model ensemble based on conformation-independent molecular descriptors using a combined strategy of genetic algorithms, J48 decision tree classifiers, and data fusion. The best model ensemble consists of averaging the ranking of the 12 decision trees that showed the best performance on the training set; this ensemble also performed well on the test set. It was experimentally validated using the ex vivo everted rat intestinal sac model. Five anticonvulsant drugs classified as nonsubstrates for BCRP by the model ensemble were experimentally evaluated, and none of them proved to be a BCRP substrate under the experimental conditions used, thus confirming the predictive ability of the model ensemble. The model ensemble reported here is a potentially valuable tool to be used as an in silico ADME filter in computer-aided drug discovery campaigns intended to overcome BCRP-mediated multidrug resistance issues and to prevent drug−drug interactions. Facultad de Ciencias Exactas; Laboratorio de Investigación y Desarrollo de Bioactivo
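    The rank-averaging fusion described in this abstract can be illustrated with a minimal sketch. It assumes each model outputs one score per compound, that higher scores mean "more likely substrate", and that ties share their average rank. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
from statistics import mean

def rank_scores(scores):
    """Return 1-based ranks (1 = highest score); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next compound has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def average_rank_ensemble(score_matrix):
    """score_matrix[m][i] is the score model m gives compound i.

    Returns the mean rank of each compound across all models.
    """
    per_model = [rank_scores(row) for row in score_matrix]
    return [mean(col) for col in zip(*per_model)]

# Made-up scores from three hypothetical models for three compounds.
scores = [
    [0.9, 0.2, 0.5],
    [0.8, 0.4, 0.4],
    [0.7, 0.1, 0.6],
]
fused = average_rank_ensemble(scores)
```

    A lower fused rank means the models collectively place the compound nearer the top. Averaging ranks rather than raw scores keeps one model with an unusual score scale from dominating the ensemble.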

    Development of Conformation Independent Computational Models for the Early Recognition of Breast Cancer Resistance Protein Substrates

    ABC efflux transporters are polyspecific members of the ABC superfamily that, acting as drug and metabolite carriers, provide a biochemical barrier against drug penetration and contribute to detoxification. Their overexpression is linked to multidrug resistance issues in a diversity of diseases. Breast cancer resistance protein (BCRP) is the most expressed ABC efflux transporter throughout the intestine and the blood-brain barrier, limiting oral absorption and brain bioavailability of its substrates. Early recognition of BCRP substrates is thus essential to optimize oral drug absorption, design novel therapeutics for central nervous system conditions, and overcome BCRP-mediated cross-resistance issues. We present the development of an ensemble of ligand-based machine learning algorithms for the early recognition of BCRP substrates, from a database of 262 substrates and nonsubstrates compiled from the literature. This dataset was rationally partitioned into training and test sets by application of a 2-step clustering procedure. The models were developed through application of linear discriminant analysis to random subsamples of Dragon molecular descriptors. Simple data fusion and statistical comparison of partial areas under the curve of ROC curves were applied to obtain the best 2-model combination, which presented an overall accuracy of 82% and 74.5% in the training and test set, respectively. Facultad de Ciencias Exacta

    Integrated Application of Enhanced Replacement Method and Ensemble Learning for the Prediction of BCRP/ABCG2 Substrates

    Breast Cancer Resistance Protein (BCRP or ABCG2) is a polyspecific efflux transporter that belongs to the ATP-binding Cassette superfamily. Up-regulation of BCRP is associated with multidrug resistance in a number of conditions, e.g. cancer and epilepsy. Recent proteomic studies show that high expression levels of BCRP are found in healthy human intestine and at the blood-brain barrier, limiting the absorption and brain distribution of its substrates. Here, we have jointly applied the Enhanced Replacement Method and ensemble learning approaches to obtain combinations of 2D linear classifiers capable of discriminating between substrates and non-substrates of the wild-type human BCRP. The best model ensemble obtained outperforms previously reported 2D linear classifiers, showing the ability of the Enhanced Replacement Method and ensemble learning schemes to optimize the performance of individual models. This is the first report of the Enhanced Replacement Method being used to solve classification problems. Facultad de Ciencias Exacta

    Machine Learning for Modelling Tissue Distribution of Drugs and the Impact of Transporters

    The ability to predict human pharmacokinetics in early stages of drug development is of paramount importance to prevent late-stage attrition as well as in managing toxicity. This thesis explores the machine learning modelling of one of the main pharmacokinetic parameters that determines the therapeutic success of a drug: volume of distribution. In order to do so, a variety of physiological phenomena with known mechanisms of impact on drug distribution were considered as input features during the modelling of volume of distribution, namely Solute Carrier-mediated uptake and ATP-binding Cassette-mediated efflux, drug-induced phospholipidosis and plasma protein binding. These were paired with molecular descriptors to provide both chemical and biological information to the building of the predictive models. Since the biological data used as input is limited, the various types of physiological descriptors were also modelled prior to modelling volume of distribution. Here, a focus was placed on harnessing the information contained in correlations within the two transporter families, which was done by using multi-label classification. The application of such an approach to transporter data is very recent and its use to model Solute Carriers data, for example, is reported here for the first time. For both transporter families, there was evidence that accounting for correlations between transporters offers useful information that is not portrayed by molecular descriptors. This effort also allowed uncovering new potential links between members of the Solute Carriers family, which are not obvious from a purely physiological standpoint. The models created for the different physiological parameters were then used to predict these parameters and fill in the gaps in the available experimental data, and the resulting merging of experimental and predicted data was used to model volume of distribution.
This exercise improved the accuracy of volume of distribution models, and the generated models incorporated a wide variety of the different physiological descriptors supplied along with molecular features. The use of most of these physiological descriptors in the modelling of distribution is unprecedented, which is one of the main novelty points of this thesis. Additionally, as parallel complementary work, a new method to characterize the predictive reliability of machine learning classification models was proposed, and an in-depth analysis of mispredictions, their trends and causes was carried out, using one of the transporter models as an example. This is an important complement to the main body of work in this thesis, as predictive performance is necessarily tied to prediction reliability.

    Documenting and predicting topic changes in Computers in Biology and Medicine: A bibliometric keyword analysis from 1990 to 2017

    The Computers in Biology and Medicine (CBM) journal promotes the use of computing machinery in the fields of bioscience and medicine. Since the first volume in 1970, the importance of computers in these fields has grown dramatically; this is evident in the diversification of topics and an increase in the publication rate. In this study, we quantify both change and diversification of topics covered in CBM. This is done by analysing the author-supplied keywords, which have been electronically captured since 1990. The analysis starts by selecting 40 keywords, related to Medical (M) (7), Data (D) (10), Feature (F) (17) and Artificial Intelligence (AI) (6) methods. Automated keyword clustering shows the statistical connection between the selected keywords. We found that the three most popular topics in CBM are: Support Vector Machine (SVM), Electroencephalography (EEG) and IMAGE PROCESSING. In a separate analysis step, we bagged the selected keywords into sequential one-year time slices and calculated the normalized appearance. The results were visualised with graphs that indicate the CBM topic changes. These graphs show that there was a transition from Artificial Neural Network (ANN) to SVM. In 2006 SVM replaced ANN as the most important AI algorithm. Our investigation helps the editorial board to manage and embrace topic change. Furthermore, our analysis is interesting for the general reader, as the results can help them to adjust their research directions.
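    The one-year slicing step mentioned in this abstract can be sketched as follows. The abstract does not say how the appearance counts were normalized, so this sketch assumes division by the number of papers published in the same year. The function name and the sample records are invented for illustration.

```python
from collections import Counter, defaultdict

def normalized_appearance(papers, selected):
    """papers: iterable of (year, keywords) pairs.

    Returns {year: {keyword: fraction of that year's papers listing it}}.
    Normalizing by papers-per-year is an assumption, not the study's stated method.
    """
    wanted = {k.lower() for k in selected}
    totals = Counter()            # papers per year
    hits = defaultdict(Counter)   # keyword occurrences per year
    for year, keywords in papers:
        totals[year] += 1
        # Count each selected keyword at most once per paper.
        for kw in {k.lower() for k in keywords} & wanted:
            hits[year][kw] += 1
    return {
        y: {k: hits[y][k] / totals[y] for k in sorted(wanted)}
        for y in sorted(totals)
    }

# Made-up records: (publication year, author-supplied keywords).
papers = [
    (2005, ["ANN", "EEG"]),
    (2005, ["ANN"]),
    (2006, ["SVM"]),
    (2006, ["SVM", "ANN"]),
]
trend = normalized_appearance(papers, ["ANN", "SVM"])
```

    Plotting each keyword's fraction against the year would give the kind of trend graph the abstract describes.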

    Genetic predictors for epilepsy development, treatment response and dosing

    Antiepileptic drug (AED) treatment is the first line strategy for seizure control in the majority of individuals with epilepsy but remains challenging, not least because of interindividual variability in efficacy, tolerability and dosing. The studies presented in this thesis set out to explore that variability from a genomic perspective in patients with newly diagnosed epilepsy from across the UK. Single nucleotide polymorphisms (SNPs) in genes encoding drug metabolising enzymes (DMEs) may be associated with the dose of carbamazepine (CBZ) required for seizure control. A cohort of 159 individuals who were seizure-free for 12 months on a stable dose of CBZ monotherapy was genotyped for 51 SNPs across six DMEs. Haplotype analysis identified 8 haplotype blocks across the genes. No single SNPs or haplotype blocks were associated with CBZ dose. Thus, it is unlikely that genetic variability in DMEs accounts for the individual differences in CBZ dose requirement. A splice site SNP (rs3812718) in the SCN1A gene was previously shown to influence maximum doses of AEDs. This SNP was genotyped in 817 patients and tested for association with maximum and maintenance doses of several AEDs. An association was identified between rs3812718 and maximum AED dose, with an interaction analysis suggestive of a drug specific effect. These findings suggest that this SCN1A variant contributes to variability in the limit of tolerability to AEDs. Response to AED treatment is multifactorial and likely to be influenced by multiple genes. Five SNPs previously reported to predict treatment outcome in epilepsy were genotyped in 772 patients and the resulting data, together with data from an Australian cohort, incorporated into a predictive algorithm. The algorithm failed to predict treatment outcome in general but was partially successful in identifying responders to CBZ and valproate. These five SNPs may be relevant to the prognosis of epilepsy, particularly when treated with specific AEDs. 
Primary generalised epilepsies (PGEs) are highly heritable and believed to be polygenic in origin. Predictive algorithms were employed to explore genetic influences on seizure (absence vs. myoclonus) and epilepsy (PGE vs. focal) type using 1,840 SNP genotypes available from 436 patients with PGE. Although the algorithms failed to distinguish PGE patients on the basis of genetic variants, they showed improved association over univariate methods of analysis. Such an approach may be suitable for future investigations using large genomic datasets. A recent genome-wide association study identified multiple genetic variants that approached genome-wide significance for association with 12-month remission from seizures. Five of these SNPs were genotyped in an independent cohort of 424 patients and tested for association with remission and time to remission. No significant associations were found, questioning the validity of the original observation or the method of replication. Further work is required to understand this outcome. In conclusion, the genetic bases of epilepsy, AED response and AED dose requirement are multigenic and thus far undetectable using traditional association studies in modestly sized patient cohorts. Further advances in genomic, bioinformatic and statistical methodologies are required before the genetic contribution to heterogeneity in epilepsy-related phenotypes can be translated into improved clinical care.