153 research outputs found

    Predicting P-Glycoprotein-Mediated Drug Transport Based On Support Vector Machine and Three-Dimensional Crystal Structure of P-glycoprotein

    Human P-glycoprotein (P-gp) is an ATP-binding cassette multidrug transporter that confers resistance to a wide range of chemotherapeutic agents in cancer cells by actively effluxing the drugs from cells. P-gp also plays a key role in limiting oral absorption and brain penetration and in facilitating biliary and renal elimination of structurally diverse drugs. Thus, identifying whether drugs or new molecular entities are P-gp substrates is of vital importance for predicting the pharmacokinetics, efficacy, safety, or tissue levels of drugs or drug candidates. At present, publicly available, reliable in silico models predicting P-gp substrates are scarce. In this study, a support vector machine (SVM) method was developed to predict P-gp substrates and P-gp-substrate interactions, based on a training data set of 197 known P-gp substrates and non-substrates collected from the literature. We showed that the SVM method had a prediction accuracy of approximately 80% on an independent external validation data set of 32 compounds. A homology model of human P-gp was constructed using the X-ray structure of mouse P-gp as a template. We showed that molecular docking to the P-gp structures successfully predicted the geometry of P-gp-ligand complexes. Our SVM prediction and molecular docking methods have been integrated into a free web server (http://pgp.althotas.com), which allows users to predict whether a given compound is a P-gp substrate and how it binds to and interacts with P-gp. Such a web server may prove valuable for both rational drug design and screening.
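
The abstract above describes training an SVM on labeled substrates and non-substrates and then reporting accuracy on a separate external set. The following is a minimal sketch of that workflow, not the authors' implementation: it assumes a linear SVM trained by stochastic sub-gradient descent on the hinge loss, and the two-descriptor toy data, the function names and the hyperparameters (`lam`, `eta`, `epochs`) are all invented for illustration.

```python
import random


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=200, seed=0):
    """Linear SVM via stochastic sub-gradient descent on the regularized hinge loss.

    Labels must be +1 (substrate) or -1 (non-substrate).
    lam, eta and epochs are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # L2 regularization shrinks the weights at every step.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0.0 else -1


def accuracy(w, b, X, y):
    correct = sum(1 for xi, yi in zip(X, y) if predict(w, b, xi) == yi)
    return correct / len(y)


# Hypothetical, already-scaled descriptors (two per compound); not real data.
X_train = [[2.0, 1.5], [1.8, 2.2], [2.5, 2.0], [1.6, 1.9],
           [-2.0, -1.5], [-1.7, -2.3], [-2.4, -1.8], [-1.5, -2.0]]
y_train = [1, 1, 1, 1, -1, -1, -1, -1]

# Compounds held out from training play the role of the external validation set.
X_test = [[3.0, 3.0], [2.0, 2.5], [-3.0, -3.0], [-2.5, -2.0]]
y_test = [1, 1, -1, -1]

w, b = train_linear_svm(X_train, y_train)
external_accuracy = accuracy(w, b, X_test, y_test)
print(f"external validation accuracy: {external_accuracy:.2f}")
```

The point of the separate `X_test` set is that accuracy is measured only on compounds the model never saw during training, which is what "independent external validation" refers to.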

    Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine

    Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. To reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. Classification performance was greatly improved compared with that reported in the literature, indicating that the particle-swarm-optimized support vector machine in this paper is well suited to the identification of skin sensitizers.

    Predicting Binding to P-Glycoprotein by Flexible Receptor Docking

    P-glycoprotein (P-gp) is an ATP-dependent transport protein that is selectively expressed at entry points of xenobiotics where, acting as an efflux pump, it prevents their entering sensitive organs. The protein also plays a key role in the absorption and blood-brain barrier penetration of many drugs, while its overexpression in cancer cells has been linked to multidrug resistance in tumors. The recent publication of the mouse P-gp crystal structure revealed a large and hydrophobic binding cavity with no clearly defined sub-sites that supports an “induced-fit” ligand binding model. We employed flexible receptor docking to develop a new prediction algorithm for P-gp binding specificity. We tested the ability of this method to differentiate between binders and nonbinders of P-gp using consistently measured experimental data from P-gp efflux and calcein-inhibition assays. We also subjected the model to a blind test on a series of peptidic cysteine protease inhibitors, confirming the ability to predict compounds more likely to be P-gp substrates. Finally, we used the method to predict cellular metabolites that may be P-gp substrates. Overall, our results suggest that many P-gp substrates bind deeper in the cavity than the cyclic peptide in the crystal structure and that specificity in P-gp is better understood in terms of physicochemical properties of the ligands (and the binding site), rather than being defined by specific sub-sites

    Računarski modeli za predviđanje transporta lekova posredovanog P-glikoproteinom (Computational models for predicting P-glycoprotein-mediated drug transport)

    P-glycoprotein (Pgp) is a transmembrane transporter which, by transporting structurally diverse compounds out of the cell, can influence the absorption, distribution and efficacy of a number of drugs. Pgp overexpression in cells is a major contributing factor to the development of drug resistance. For these reasons, the potential for compound efflux by Pgp should be assessed early in the drug discovery process, preferably even prior to compound synthesis. To meet this demand, numerous computational models have been developed during the past decade, capable of predicting Pgp-mediated transport based solely on chemical structure. This paper summarizes the various approaches that have been used for model development, discusses their advantages and disadvantages and focuses on key factors that influence model reliability. The promiscuous nature of the transporter is a major challenge for most computational chemistry methods. Nevertheless, the level of accuracy attained by literature models suggests that they can be useful in the drug discovery setting. Greater availability of experimental data and integration of predictions made by different modeling methods have the potential to further improve the reliability of computational predictions.

    Discovery of Novel Glycogen Synthase Kinase-3beta Inhibitors: Molecular Modeling, Virtual Screening, and Biological Evaluation

    Glycogen synthase kinase-3 (GSK-3) is a multifunctional serine/threonine protein kinase engaged in a variety of signaling pathways that regulate a wide range of cellular processes. Owing to its distinct regulation mechanism and unique substrate specificity in the molecular pathogenesis of human diseases, GSK-3 is one of the most attractive therapeutic targets for pathologies with unmet treatment needs, including type-II diabetes, cancers, inflammation, and neurodegenerative disease. Recent advances in drug discovery targeting GSK-3 have involved extensive computational modeling. Both ligand-based and structure-based approaches have been well explored to design ATP-competitive inhibitors. Molecular modeling combined with dynamics simulations can provide insight into the protein-substrate and protein-protein interactions at the substrate binding pocket and the C-lobe hydrophobic groove, which will benefit the discovery of non-ATP-competitive inhibitors. To identify structurally novel and diverse compounds that effectively inhibit GSK-3β, we performed virtual screening with a mixed ligand/structure-based approach, which included pharmacophore modeling, diversity analysis, and ensemble docking. The sensitivities of different docking protocols to the induced-fit effects at the ATP-competitive binding pocket of GSK-3β were explored. An enrichment study was used to verify the robustness of ensemble docking compared to individual docking in terms of retrieving active compounds from a decoy dataset. A total of 24 structurally diverse compounds obtained from the virtual screening experiment underwent biological validation. The bioassay results showed that 15 out of the 24 hit compounds are indeed GSK-3β inhibitors, and among them, one compound exhibiting sub-micromolar inhibitory activity is a reasonable starting point for further optimization.
    To further identify structurally novel GSK-3β inhibitors, we performed virtual screening with another mixed ligand-based/structure-based approach, which included quantitative structure-activity relationship (QSAR) analysis and docking prediction. To integrate and analyze complex data sets from multiple experimental sources, we devised and validated hierarchical QSAR, which adopts a multi-level structure to take data heterogeneity into account. A collection of 728 GSK-3 inhibitors with diverse structural scaffolds was obtained from the published papers of 7 research groups that used different experimental protocols. Support vector machines and random forests were implemented with wrapper-based feature selection algorithms to construct predictive learning models. The best models for each single group of compounds were then selected, based on both internal and external validation, and used to build the final hierarchical QSAR model. The predictive performance of the hierarchical QSAR model is demonstrated by an overall R2 of 0.752 for the 141 compounds in the test set. The compounds obtained from the virtual screening experiment underwent biological validation. The bioassay results confirmed that 2 hit compounds are indeed GSK-3β inhibitors exhibiting sub-micromolar inhibitory activity, and therefore validated hierarchical QSAR as an effective approach for virtual screening experiments.
    We have also implemented a variant of supervised learning, named multiple-instance learning, to predict the bioactive conformers of a given molecule that are responsible for the observed biological activity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of this project is to apply multiple-instance learning to drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. The proposed approach was shown not to suffer from overfitting and to be highly competitive with classical predictive models, making it a powerful tool for drug activity prediction. The approach was also validated as a useful method for identifying bioactive conformers.
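
The enrichment study mentioned in this abstract compares how well ensemble docking and single-structure docking rank known actives ahead of decoys. The sketch below illustrates one common way to quantify that, the enrichment factor at a chosen top fraction. It is an illustration under stated assumptions, not the thesis protocol: the docking scores are invented, lower scores are assumed to be better, and taking the best score across receptor conformations is only one possible way to combine an ensemble.

```python
def ensemble_scores(scores_by_conformation):
    """Keep each compound's best (lowest) score across receptor conformations.

    Assumption: the ensemble is combined by taking the minimum score;
    other aggregation rules (for example, averaging) are also used in practice.
    """
    compounds = scores_by_conformation[0].keys()
    return {c: min(conf[c] for conf in scores_by_conformation) for c in compounds}


def enrichment_factor(scores, actives, top_fraction):
    """EF = (fraction of actives in the top-ranked subset) / (fraction overall)."""
    ranked = sorted(scores, key=lambda c: scores[c])  # lower score ranks first
    n_top = max(1, round(len(ranked) * top_fraction))
    hits_top = sum(1 for c in ranked[:n_top] if c in actives)
    return (hits_top / n_top) / (len(actives) / len(ranked))


# Invented docking scores for two actives (A*) and eight decoys (D*).
conf1 = {"A1": -9.0, "A2": -5.0, "D1": -7.0, "D2": -6.5, "D3": -6.0,
         "D4": -5.5, "D5": -5.2, "D6": -4.8, "D7": -4.5, "D8": -4.0}
conf2 = {"A1": -6.0, "A2": -9.5, "D1": -6.8, "D2": -6.1, "D3": -7.2,
         "D4": -5.0, "D5": -5.9, "D6": -4.1, "D7": -4.9, "D8": -3.8}
actives = {"A1", "A2"}

ef_single = enrichment_factor(conf1, actives, top_fraction=0.2)
ef_ensemble = enrichment_factor(ensemble_scores([conf1, conf2]), actives, top_fraction=0.2)

# Experimental hit rate reported in the abstract: 15 confirmed inhibitors out of 24 tested.
hit_rate = 15 / 24

print(ef_single, ef_ensemble, hit_rate)
```

In this toy case, each active docks well against only one of the two conformations, so the single-structure ranking recovers one active in the top 20% while the ensemble ranking recovers both.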

    Development and application of distributed computing tools for virtual screening of large compound libraries

    In the current drug discovery process, the identification of new target proteins and their potential ligands is tedious, expensive and time-consuming. In silico techniques are therefore increasingly important and have proved to be a valuable strategy for detecting complex structural and bioactivity relationships. The growing demand for computational power in scientific fields and the need for timely analysis of the data generated require innovative strategies for efficient utilization of distributed computing resources in the form of computational grids. Such grids add a new aspect to existing technologies by providing and coordinating heterogeneous resources such as various organizations, people, computing, storage and networking facilities as well as data, knowledge, software and workflows. The aim of this study was to develop a university-wide applicable grid infrastructure, UVieCo (University of Vienna Condor pool), which can be used to implement standard structure- and ligand-based drug discovery applications using freely available academic software. Firewall and security issues were resolved with a virtual private network setup, whereas virtualization of computer hardware was done using the CoLinux concept, which allows Linux-executable jobs to run inside Windows machines. The effectiveness of the grid was assessed by performance measurement experiments using sequential and parallel tasks.
    As an application example, the association of expression/sensitivity profiles of ABC transporters with activity profiles of anticancer compounds was analyzed by mining the NCI (National Cancer Institute) data set. The datasets generated in this analysis were used with ligand-based computational methods such as shape similarity and classification algorithms to identify P-gp substrates and separate them from non-substrates. While developing predictive classification models, the problem of extremely imbalanced class distribution was addressed using the cost-sensitive bagging approach. Applicability domain experiments revealed that our model not only predicts NCI compounds well, but can also be applied to drug-like molecules. The developed models were relatively simple but precise enough for virtual screening of large chemical libraries, allowing early identification of P-gp substrates, which can potentially be useful for removing compounds with poor ADMET properties in an early phase of drug discovery.
    Additionally, shape-similarity and self-organizing map techniques were used to screen an in-house database as well as a large vendor database for novel compounds that resemble selective serotonin reuptake inhibitors (SSRIs) and can induce apoptosis. The retrieved hits possess novel chemical scaffolds and can be considered as starting points for lead optimization studies. The work described in this thesis will be useful for creating a distributed computing environment from the resources available within an organization, and can be applied to tasks such as efficient handling of imbalanced data classification problems or multistep virtual screening.
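
The abstract reports that cost-sensitive bagging was used to cope with a heavily imbalanced substrate/non-substrate distribution. The thesis's exact procedure is not given here, so the following is only a sketch of one simple variant: each bag is drawn by cost-proportionate resampling, so that the rare class is sampled more often, and a nearest-centroid classifier is used as a stand-in base learner. The costs, the data and all function names are hypothetical.

```python
import random


def weighted_bootstrap(X, y, cost, rng):
    """Draw len(X) samples with replacement, weighting each by its class cost."""
    weights = [cost[label] for label in y]
    idx = rng.choices(range(len(X)), weights=weights, k=len(X))
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_centroids(X, y):
    """Base learner (an assumption for this sketch): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, yi in zip(X, y) if yi == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def fit_bagging(X, y, cost, n_bags=25, seed=0):
    rng = random.Random(seed)
    return [fit_centroids(*weighted_bootstrap(X, y, cost, rng)) for _ in range(n_bags)]


def predict_bagging(models, x):
    """Majority vote over the bagged base learners."""
    votes = [predict_centroid(m, x) for m in models]
    return max(set(votes), key=votes.count)


# Invented, imbalanced toy data: 18 non-substrates (0) and 2 substrates (1).
X_major = [[0.1 * i - 0.9, 0.05 * i] for i in range(18)]
X_minor = [[5.0, 5.0], [4.8, 5.2]]
X = X_major + X_minor
y = [0] * 18 + [1] * 2

# Hypothetical costs: missing a substrate is treated as 9x worse,
# which makes both classes equally likely to be drawn into a bag.
cost = {0: 1.0, 1: 9.0}

models = fit_bagging(X, y, cost)
pred_substrate = predict_bagging(models, [5.1, 4.9])
pred_non_substrate = predict_bagging(models, [0.0, 0.3])
print(pred_substrate, pred_non_substrate)
```

With uniform bootstrap sampling, many bags would contain few or no minority examples. Weighting the draw by cost is one way to keep the rare class represented in every base model.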

    Integrated Application of Enhanced Replacement Method and Ensemble Learning for the Prediction of BCRP/ABCG2 Substrates

    Breast Cancer Resistance Protein (BCRP or ABCG2) is a polyspecific efflux transporter which belongs to the ATP-binding cassette superfamily. Up-regulation of BCRP is associated with multi-drug resistance in a number of conditions, e.g. cancer and epilepsy. Recent proteomic studies show that high expression levels of BCRP are found in healthy human intestine and at the blood-brain barrier, limiting the absorption and brain distribution of its substrates. Here, we have jointly applied the Enhanced Replacement Method and ensemble learning approaches to obtain combinations of 2D linear classifiers capable of discriminating between substrates and non-substrates of the wild-type human BCRP. The best model ensemble obtained outperforms previously reported 2D linear classifiers, showing the ability of the Enhanced Replacement Method and ensemble learning schemes to optimize the performance of individual models. This is the first report of the Enhanced Replacement Method being used to solve classification problems. Facultad de Ciencias Exacta

    Investigation of BCRP-inhibitors using QSAR and machine learning methods

    BCRP is the second member of subfamily G of the ABC transporters. BCRP is involved in several physiological functions, including protection of the human body from xenobiotics. The overexpression of this membrane protein in certain tumor cell lines leads to cross-resistance against various chemotherapeutic drugs. In this work, the inhibitory activities of several compounds against BCRP and P-gp were assayed using a Hoechst 33342 assay for BCRP and a Calcein AM assay for P-gp. Furthermore, the potency of the studied compounds was rationalized using a classical QSAR approach. Finally, three machine learning algorithms (Self-Organizing Maps, Support Vector Machine and k-Nearest Neighbors) were used to generate a global model that predicts whether small ligands are BCRP inhibitors.
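
Of the three algorithms named in this abstract, k-Nearest Neighbors is the simplest to illustrate. The sketch below shows the general idea only: a compound is labeled by majority vote among its `k` closest training compounds in descriptor space. The descriptors, labels and the choice of `k=3` and Euclidean distance are assumptions for illustration and are not taken from the study.

```python
import math
from collections import Counter


def knn_predict(X_train, y_train, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    neighbours = sorted(
        (math.dist(xi, x), label) for xi, label in zip(X_train, y_train)
    )
    top_labels = [label for _, label in neighbours[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Invented, already-scaled two-descriptor vectors; not real assay data.
X_train = [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9],
           [-1.0, -0.8], [-1.1, -1.2], [-0.7, -1.0]]
y_train = ["inhibitor", "inhibitor", "inhibitor",
           "non-inhibitor", "non-inhibitor", "non-inhibitor"]

label_a = knn_predict(X_train, y_train, [0.9, 1.1])
label_b = knn_predict(X_train, y_train, [-0.9, -1.0])
print(label_a, label_b)
```

Because distances depend on descriptor scale, descriptors are normally standardized before a k-NN model is applied. The toy vectors here are assumed to be scaled already.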