12 research outputs found

    Pharmacophore and Molecular Docking-Based Virtual Screening of B-Cell Lymphoma 2 (BCL 2) Inhibitor from Zinc Natural Database as Anti-Small Cell Lung Cancer

    Cancer is a disease involving genetic factors in its pathogenesis. Increased cell survival resulting from genetic changes that prevent apoptosis, such as Bcl2 (B-cell lymphoma-2) activation, causes the tumor to grow. The overexpression of Bcl2 in small cell lung cancer should be inhibited. This study aims to screen natural products that can inhibit Bcl2 overexpression in lung cancer using pharmacophore- and molecular docking-based virtual screening of the ZINC Natural Product database. Validation of the pharmacophore-based virtual screening with a three-feature pharmacophore model (2 hydrophobic interactions and 1 hydrogen bond donor) gave AUC, EF, Se, Sp, ACC, and GH values of 0.57, 3.8, 0.101, 0.957, 0.936, and 0.149, respectively. Validation of the molecular docking-based virtual screening gave RMSD values of 1.3 Å and 1.9 Å for Vina Wizard and AutoDock Wizard, respectively. The pharmacophore-based virtual screening first retrieved 6,615 compounds, and the molecular docking-based virtual screening then yielded 255 compounds whose ΔG and Ki values were lower than those of the native ligand. It was concluded that the virtual screening could yield as many as 255 potential anti-lung cancer drug candidates. Keywords: B-cell lymphoma 2 inhibitors, molecular docking, pharmacophore modeling, virtual screening
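The validation metrics named in this abstract (EF, Se, Sp, ACC, GH) are commonly derived from a confusion matrix of actives and decoys. The sketch below uses the usual textbook definitions, which the abstract itself does not spell out, so the formulas are an assumption about what the authors computed. The function name `screening_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return common virtual-screening validation metrics.

    tp/fn: actives retrieved / missed; fp/tn: decoys retrieved / rejected.
    Definitions are the commonly used ones (assumed, not from the abstract).
    """
    total = tp + fp + tn + fn      # D: all compounds in the validation set
    actives = tp + fn              # A: all known actives
    hits = tp + fp                 # Ht: compounds retrieved by the model

    se = tp / actives                              # sensitivity
    sp = tn / (tn + fp)                            # specificity
    acc = (tp + tn) / total                        # accuracy
    ef = (tp / hits) / (actives / total)           # enrichment factor
    # Goodness-of-hit score (Guner-Henry form)
    gh = (tp * (3 * actives + hits)) / (4 * hits * actives) \
        * (1 - (hits - tp) / (total - actives))
    return {"Se": se, "Sp": sp, "ACC": acc, "EF": ef, "GH": gh}


# Hypothetical counts, for illustration only
metrics = screening_metrics(tp=10, fp=40, tn=860, fn=90)
```

A low Se combined with a high Sp, as reported in the abstract, corresponds to a model that retrieves few of the known actives but rejects most decoys.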

    Visual and computational analysis of structure-activity relationships in high-throughput screening data

    Novel analytic methods are required to assimilate the large volumes of structural and bioassay data generated by combinatorial chemistry and high-throughput screening programmes in the pharmaceutical and agrochemical industries. This paper reviews recent work in visualisation and data mining that can be used to develop structure-activity relationships from such chemical/biological datasets

    Fusion of molecular representations and prediction of biological activity using convolutional neural network and transfer learning

    Basic structural features and physicochemical properties of chemical molecules determine their behaviour during chemical, physical, biological and environmental processes and hence need to be investigated for determining and modelling the actions of the molecule. Computational approaches such as machine learning methods are alternatives for predicting the physicochemical properties of molecules from their structures. However, the limited accuracy and the error rates of these predictions restrict their use. This study developed three classes of new methods based on deep learning convolutional neural networks for bioactivity prediction of chemical compounds. The molecules are presented to a convolutional neural network (CNN) in a new matrix format that represents the molecular structures. The first class of methods involved the introduction of three new molecular descriptors, namely Mol2toxicophore, based on molecular interaction with toxicophore features; Mol2Fgs, based on a distributed representation for constructing abstract feature maps of a selected set of small molecules; and Mol2mat, a molecular matrix representation adapted from the well-known 2D-fingerprint descriptors. The second class of methods was based on merging multi-CNN models that combined all the molecular representations. The third class of methods was based on automatic learning of features using values within the neurons of the last layer in the proposed CNN architecture. To evaluate the performance of the methods, a series of experiments was conducted using two standard datasets, namely the MDL Drug Data Report (MDDR) and Sutherland datasets. The MDDR datasets comprised 10 homogeneous and 10 heterogeneous activity classes, whilst the Sutherland datasets comprised four homogeneous activity classes. Based on the experiments, Mol2toxicophore showed satisfactory prediction rates of 92% and 80% for homogeneous and heterogeneous activity classes, respectively.
Mol2Fgs was better than Mol2toxicophore, with prediction accuracy of 95% for homogeneous and 90% for heterogeneous activity classes. The Mol2mat molecular representation had the highest prediction accuracy, with 97% and 94% for homogeneous and heterogeneous datasets, respectively. The combined multi-CNN model, leveraging the knowledge acquired from the three molecular representations, produced a better accuracy rate of 99% for the homogeneous and 98% for the heterogeneous datasets. In terms of molecular similarity measure, using the values in the neurons of the last hidden layer of the multi-CNN model as an automatically learned feature, a novel learned molecular representation, was found to perform well, with an average recall of 88.6% in the 5% of structures most similar to the target search. The results have demonstrated that the newly developed methods can be effectively used for bioactivity prediction and molecular similarity searching
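The recall figure quoted above is measured over the top 5% of a similarity-ranked list. A minimal sketch of that kind of measurement follows, assuming each database compound already has a similarity score to the query and a known active/inactive label. The function name `recall_at_top_fraction` and the toy data are hypothetical, and the abstract does not state exactly how the authors handled ties or rounding.

```python
def recall_at_top_fraction(scores, is_active, fraction=0.05):
    """Fraction of all actives found in the top `fraction` of the ranking.

    scores: similarity of each compound to the query (higher = more similar).
    is_active: 1 if the compound shares the query's activity class, else 0.
    """
    n_active = sum(is_active)
    if n_active == 0:
        raise ValueError("recall is undefined without any actives")
    # Size of the cutoff list; at least one compound is always inspected
    n_top = max(1, round(fraction * len(scores)))
    # Rank compound indices by decreasing similarity
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    found = sum(is_active[i] for i in ranked[:n_top])
    return found / n_active


# Toy example: 40 compounds, 4 actives, two of them ranked at the very top
toy_scores = [1.0 - 0.01 * i for i in range(40)]
toy_labels = [1 if i in (0, 1, 20, 30) else 0 for i in range(40)]
toy_recall = recall_at_top_fraction(toy_scores, toy_labels)
```

Averaging this value over many query compounds and activity classes gives a single summary number of the kind reported in the abstract.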

    In silico studies of inhibitors of the mutated oncoprotein BRAFV600E

    The MAPK signaling pathway, which affects cell proliferation, apoptosis, differentiation and survival, has attracted interest in anticancer research because mutations in its key proteins disrupt its normal function and lead to neoplasia. BRAF is a serine/threonine protein kinase that belongs to the RAF family of proteins, which are regulators of the MAPK pathway. Mutations that lead to constitutive activation of BRAF are found in various types of cancer, such as melanoma (50%), thyroid cancer (35-70%), colorectal cancer (5-20%), liver cancer (~14%) and ovarian cancer (~30%). BRAFV600E is the most frequent mutation, leading to hyperactivation of the pathway and to carcinogenesis. Vemurafenib (Zelboraf) and Dabrafenib (Tafinlar) are the two approved drugs, selective BRAFV600E inhibitors administered for the treatment of metastatic or unresectable melanoma carrying this mutation. Their efficacy is limited, however, by the emergence of resistance and by the induction of new cancers through paradoxical activation of the MAPK pathway in wt-BRAF cells that carry an oncogenic mutation in proteins upstream of BRAF in the pathway (RAS, receptor tyrosine kinases). Although combination therapies have given some good results, a new generation of selective BRAFV600E inhibitors (PLX7904 and PLX8394) has recently been developed that appears to evade the paradoxical activation of MAPK (paradox breakers). In addition, these inhibitors are effective against several resistance mechanisms and are currently in clinical trials.
In this thesis, virtual screening was carried out using pharmacophore models and molecular docking techniques, with the aim of finding new BRAFV600E inhibitors with an improved biological profile. The pharmacophore models were built to retain the strongest features of the pharmacophore models generated from the drug Dabrafenib and the paradox breaker PLX7904. The pharmacophore models were validated with a library of active and inactive molecules from the ChEMBL database and then applied to the virtual screening of the ZINC library (~12 million molecules). The compounds selected for their best fit to the pharmacophore models were tested for their in silico binding at the BRAFV600E active site using a series of molecular docking algorithms (Glide HTVS and SP, and the IFD protocol), and the filters applied also included a study of their ADME properties. The procedure yielded a final group of compounds able to form strong interactions with the critical amino acids of the protein's active site and with satisfactory ADME properties. Procuring the compounds will allow in vitro evaluation of their inhibitory activity and selectivity against BRAFV600E, and the most promising molecules will be tested in cell lines for their efficacy.
The mitogen-activated protein kinase (MAPK) signaling pathway, which affects cell proliferation, apoptosis, migration and differentiation, has attracted the attention of anticancer research since abnormal activation of the pathway components is often identified in human cancers. BRAF belongs to the RAF family of serine/threonine protein kinases, which are key regulators of the MAPK cascade. Activating BRAF mutations are harbored in certain cancers, such as melanoma (50%), thyroid cancer (35-70%), colorectal cancer (5-20%), liver cancer (~14%) and ovarian cancer (~30%).
BRAF-V600E is the most frequent mutation, leading to multiple and uncontrolled amplification of the downstream signal and, as a result, tumorigenesis. Two selective BRAFV600E inhibitors, Vemurafenib (Zelboraf) and Dabrafenib (Tafinlar), have already been approved for the treatment of unresectable and metastatic BRAF-mutated melanoma. However, their efficacy is limited due to intrinsic resistance or the development of acquired resistance. Besides, in the context of wild-type BRAF cells bearing upstream activation (RAS, receptor tyrosine kinase), treatment with BRAF-V600E inhibitors leads to the paradoxical enhancement of MAPK signaling, resulting in enhanced growth of wt-BRAF tumours and adverse effects. For that reason, combined treatments are being tested, with very good clinical outcomes. A new generation of BRAF V600E inhibitors (PLX7904 and PLX8394), capable of overcoming the paradoxical MAPK activation, has been discovered recently and is currently in clinical investigation. In this thesis, we conducted a virtual screening approach, utilizing structure-based pharmacophore modeling and in silico docking, towards the identification of novel, selective BRAFV600E inhibitors which could potentially be less prone to resistance and avoid the paradoxical enhancement of the MAPK pathway in wt-BRAF cells. Pharmacophore model generation was based on the top-ranked features extracted from the respective models derived from the crystal complexes of BRAFV600E with the paradox breaker PLX7904 (pdb: 4xv1) and Dabrafenib (pdb: 4xv2). We validated our models using a library of actives and inactives retrieved from the ChEMBL database. The ZINC database (12M compounds) was queried against the generated pharmacophore models, and the compounds selected on the basis of their pharmacophore fit score were filtered according to a defined set of physicochemical properties.
The filtered compounds were evaluated for their in silico binding at the BRAFV600E active site using Glide HTVS and SP and the Induced Fit Docking protocol. The best-ranked molecules were further analysed for their drug-likeness properties. The process qualified a final dataset of molecules capable of developing strong interactions with the crucial amino acids of the binding site and predicted to bear a satisfactory ADME profile. Our future plans include the purchase of the qualified molecules and the in vitro assessment of their inhibitory activity and selectivity against BRAFV600E

    Computational Prediction and Experimental Validation of ADMET Properties for Potential Therapeutics

    The drug development process in the United States is expensive and lengthy, usually taking a decade or more to gain approval for a drug candidate. The majority of proposed early-stage therapeutics fail, even though the typical process narrows from hundreds or thousands of small molecules down to one late-stage candidate. One reason for failure is the drug's poor or unexpected absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. Researchers attempt to predict ADMET properties as a way to help prioritize compounds for lead development and so minimize expense and time. The overall goal of this project was to further the prediction of two ADMET properties (absorption and distribution) through the development and application of quantitative structure-activity relationship (QSAR) computational models predicting human intestinal absorption (HIA), Caco-2 permeability (in vivo & in vitro measurements of absorption), and protein binding (measurement of distribution). These combined models would then be paired with additional experimental methods to help prioritize compounds for future ligand discovery efforts in our lab group and for our collaborators. Five computational QSAR models for each of these three properties were created using different molecular descriptor types and solvation models in an effort to examine which approach resulted in optimal performance. The model development process and validation stages of these QSAR models are outlined herein, along with analysis and discussion of commonly mispredicted compounds. Performance was similar across all models (independent of the molecular descriptor used and the solvation models applied). Future efforts at model development will depend on the size of the dataset to be analyzed. If the dataset is small, the i3D-Born solvation models will be used because these models better represent physiological conditions and performed slightly better than the other models.
However, if the dataset is large, the 2D descriptor models will be used, as these models do not require a time- and resource-intensive conformational search and because they performed nearly as well as the i3D-Born solvation models. No common structural features were consistently associated with mispredicted structures. As such, we are unable, at this time, to pinpoint classes of compounds to avoid in future efforts. The experimental methods outlined in this work focused on determining protein binding, specifically on finding a fast, inexpensive workflow to distinguish between high and low protein binding small molecules. Two techniques were used to determine protein binding of small molecules to bovine serum albumin (BSA): fluorescence polarization (FP) competition and Nano Differential Scanning Fluorimetry (NanoDSF). FP assays quantify the change in polarization of a target fluorophore between its protein-bound and free states, an equilibrium that can be shifted by the presence of small molecule competitors. This method can be performed in a quantitative manner, but it also requires more time and more expensive and specialized instrumentation. In contrast, NanoDSF determines the melting temperature of BSA in the presence (higher) or in the absence (lower) of small molecules by measuring the intrinsic fluorescence of tryptophan and tyrosine residues while applying a temperature gradient. This method is qualitative, at least in our approach, but is very fast and requires much less expensive instrumentation. In our hands, both techniques were successful in distinguishing between small molecules exhibiting low and high BSA binding.
In summary, this project was successful in that we 1) developed computational tools capable of correctly predicting ADMET properties including HIA, Caco-2 permeability, and protein binding and 2) developed experimental workflows to quantitatively and qualitatively separate small molecules into low and high affinity BSA binders. With these in silico models and in vitro methods established, future research in our group and with our collaborators can make use of these tools to help prioritize compounds in ligand/inhibitor discovery efforts

    Kernel Methods in Computer-Aided Constructive Drug Design

    A drug is typically a small molecule that interacts with the binding site of some target protein. Drug design involves the optimization of this interaction so that the drug effectively binds with the target protein while not binding with other proteins (an event that could produce dangerous side effects). Computational drug design involves the geometric modeling of drug molecules, with the goal of generating similar molecules that will be more effective drug candidates. It is necessary that algorithms incorporate strategies to measure molecular similarity by comparing molecular descriptors that may involve dozens to hundreds of attributes. We use kernel-based methods to define these measures of similarity. Kernels are general functions that can be used to formulate similarity comparisons. The overall goal of this thesis is to develop effective and efficient computational methods that rely on transparent mathematical descriptors of molecules, with applications to affinity prediction, detection of multiple binding modes, and generation of new drug leads. While in this thesis we derive computational strategies for the discovery of new drug leads, our approach differs from the traditional ligand-based approach. We have developed novel procedures to calculate inverse mappings and subsequently recover the structure of a potential drug lead. The contributions of this thesis are the following: 1. We propose a vector space model molecular descriptor (VSMMD) based on a vector space model that is suitable for kernel studies in QSAR modeling. Our experiments have provided convincing comparative empirical evidence that our descriptor formulation in conjunction with kernel-based regression algorithms can provide sufficient discrimination to predict various biological activities of a molecule with reasonable accuracy. 2. We present a new component selection algorithm, KACS (Kernel Alignment Component Selection), based on kernel alignment for a QSAR study.
Kernel alignment has been developed as a measure of similarity between two kernel functions. In our algorithm, we refine kernel alignment as an evaluation tool, using recursive component elimination to eventually select the most important components for classification. We have demonstrated empirically and proven theoretically that our algorithm works well for finding the most important components in different QSAR data sets. 3. We extend the VSMMD in conjunction with a kernel-based clustering algorithm to the prediction of multiple binding modes, a challenging area of research that has previously been studied by means of time-consuming docking simulations. The results reported in this study provide strong empirical evidence that our strategy has enough resolving power to distinguish multiple binding modes through the use of a standard k-means algorithm. 4. We develop a set of reverse engineering strategies for QSAR modeling based on our VSMMD. These strategies include: (a) The use of a kernel feature space algorithm to design or modify descriptor image points in a feature space. (b) The deployment of a pre-image algorithm to map the newly defined descriptor image points in the feature space back to the input space of the descriptors. (c) The design of a probabilistic strategy to convert new descriptors to meaningful chemical graph templates. The most important aspect of these contributions is the presentation of strategies that actually generate the structure of a new drug candidate. While the training set is still used to generate a new image point in the feature space, the reverse engineering strategies just described allow us to develop a new drug candidate that is independent of issues related to probability distribution constraints placed on test set molecules
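Kernel alignment, mentioned above as the basis of KACS, is usually defined as the normalized Frobenius inner product of two kernel matrices. The sketch below shows only that standard measure, applied to an RBF kernel and the "ideal" label kernel; it is not the KACS algorithm, whose recursive component elimination is not described in the abstract. The choice of RBF kernel, the `gamma` value and the toy data are illustrative assumptions.

```python
import math


def frobenius_inner(a, b):
    """Sum of element-wise products of two equally sized square matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def kernel_alignment(k1, k2):
    """Alignment in [-1, 1]: <K1, K2>_F / sqrt(<K1, K1>_F * <K2, K2>_F)."""
    return frobenius_inner(k1, k2) / math.sqrt(
        frobenius_inner(k1, k1) * frobenius_inner(k2, k2)
    )


def rbf_kernel(points, gamma=1.0):
    """Gram matrix of exp(-gamma * squared Euclidean distance)."""
    return [
        [math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b))) for b in points]
        for a in points
    ]


def ideal_kernel(labels):
    """Target kernel y y^T for class labels in {-1, +1}."""
    return [[yi * yj for yj in labels] for yi in labels]


# Toy descriptors (one component each) and class labels, illustration only
toy_points = [[0.0], [0.1], [5.0], [5.1]]
toy_labels = [1, 1, -1, -1]
toy_alignment = kernel_alignment(rbf_kernel(toy_points), ideal_kernel(toy_labels))
```

A component selection scheme in this spirit could recompute the alignment after dropping each descriptor component and discard the one whose removal hurts alignment least, though the thesis's actual procedure may differ.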