8,828 research outputs found

    NOVEL ALGORITHMS AND TOOLS FOR LIGAND-BASED DRUG DESIGN

    Computer-aided drug design (CADD) has become an indispensable component of modern drug discovery projects. Predicting the physicochemical and pharmacological properties of candidate compounds increases the probability that drug candidates pass the later phases of clinical trials. Ligand-based virtual screening has advantages over structure-based drug design in its wide applicability and high computational efficiency. The established chemical repositories and reported bioassays form a gigantic knowledge base from which to derive quantitative structure-activity relationships (QSAR) and quantitative structure-property relationships (QSPR). In addition, the rapid advance of machine learning techniques suggests new solutions for data-mining huge compound databases. In this thesis, a novel ligand classification algorithm, Ligand Classifier of Adaptively Boosting Ensemble Decision Stumps (LiCABEDS), was reported for the prediction of diverse categorical pharmacological properties. LiCABEDS was successfully applied to model 5-HT1A ligand functionality, ligand selectivity of cannabinoid receptor subtypes, and blood-brain-barrier (BBB) passage. LiCABEDS was implemented and integrated with a graphical user interface, data import/export, automated model training/prediction, and project management. In addition, a non-linear ligand classifier was proposed that uses a novel Topomer kernel function in a support vector machine. With the emphasis on green high-performance computing, graphics processing units are alternative platforms for computationally expensive tasks. A novel GPU algorithm was designed and implemented to accelerate the calculation of chemical similarities with dense-format molecular fingerprints. Finally, a compound acquisition algorithm was reported that constructs a structurally diverse screening library in order to enhance hit rates in high-throughput screening.
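The idea of boosting decision stumps over fingerprint bits can be illustrated with a minimal AdaBoost sketch. This is not the LiCABEDS implementation: the function names `fit_stumps` and `predict`, the one-bit stump design, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def fit_stumps(X, y, rounds=5):
    """AdaBoost over one-bit decision stumps (illustrative sketch).

    X: (n, d) array of 0/1 fingerprint bits; y: labels in {-1, +1}.
    Returns a list of (bit index, polarity, weight) tuples.
    """
    n, _ = X.shape
    S = 2 * X - 1                      # stump vote: +1 if bit set, else -1
    w = np.full(n, 1.0 / n)            # sample weights
    model = []
    for _ in range(rounds):
        # Weighted error of the stump "predict +1 when bit j is set", per bit.
        err = ((S != y[:, None]) * w[:, None]).sum(axis=0)
        # The most informative bit is the one farthest from chance (0.5).
        j = int(np.argmax(np.abs(0.5 - err)))
        polarity = 1 if err[j] <= 0.5 else -1   # flip stumps worse than chance
        e = np.clip(min(err[j], 1.0 - err[j]), 1e-10, 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - e) / e)
        pred = polarity * S[:, j]
        w = w * np.exp(-alpha * y * pred)        # up-weight misclassified samples
        w = w / w.sum()
        model.append((j, polarity, alpha))
    return model

def predict(model, X):
    """Sign of the weighted vote of all stumps."""
    S = 2 * X - 1
    score = sum(alpha * polarity * S[:, j] for j, polarity, alpha in model)
    return np.where(score >= 0, 1, -1)

# Toy data (hypothetical): the label is fully determined by bit 0.
X = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 0]])
y = np.array([1, -1, 1, -1])
model = fit_stumps(X, y)
labels = predict(model, X)
```

On this toy set the first stump selected is bit 0, since it separates the two classes without error.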

    Visual and computational analysis of structure-activity relationships in high-throughput screening data

    Novel analytic methods are required to assimilate the large volumes of structural and bioassay data generated by combinatorial chemistry and high-throughput screening programmes in the pharmaceutical and agrochemical industries. This paper reviews recent work in visualisation and data mining that can be used to develop structure-activity relationships from such chemical/biological datasets.

    Syntax-directed documentation for PL360

    PL360 is a phrase-structured programming language which provides the facilities of a symbolic machine language for the IBM 360 computers. An automatic process, syntax-directed documentation, is described which acquires programming documentation through the syntactical analysis of a program, followed by the interrogation of the originating programmer. This documentation can be dispensed through reports of file query replies when other programmers later need to know the program structure and its details. A key principle of the programming documentation process is that it is managed solely on the basis of the syntax of programs.

    Intelligent data acquisition for drug design through combinatorial library design

    A problem that occurs in machine learning methods for drug discovery is a need for standardized data. Methods and interest exist for producing new data, but due to material and budget constraints it is desirable that each iteration of producing data is as efficient as possible. In this thesis, we present two papers whose methods address different problems in selecting which data to produce. We investigate Active Learning for models that use the margin in model decisiveness to measure the model uncertainty to guide data acquisition. We demonstrate that the models perform better with Active Learning than with random acquisition of data, independent of machine learning model and starting knowledge. We also study the multi-objective optimization problem of combinatorial library design. Here we present a framework that could process the output of generative models for molecular design and give an optimized library design. The results show that the framework successfully optimizes a library based on molecule availability, which the framework also attempts to identify using retrosynthesis prediction. We conclude that the next step in intelligent data acquisition is to combine the two methods and create a library design model that uses the information of previous libraries to guide subsequent designs.
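One common reading of margin-based acquisition is margin sampling: query the pool items whose two most probable classes are closest in predicted probability. The sketch below assumes that reading; the function name `margin_query` and the toy probabilities are hypothetical and not taken from the thesis.

```python
import numpy as np

def margin_query(proba, batch_size):
    """Pick the pool items the model is least decisive about.

    proba: (n_pool, n_classes) predicted class probabilities, n_classes >= 2.
    Returns indices of the batch_size items with the smallest margin between
    the top two class probabilities.
    """
    ordered = np.sort(proba, axis=1)            # ascending within each row
    margin = ordered[:, -1] - ordered[:, -2]    # best minus second best
    return np.argsort(margin, kind="stable")[:batch_size]

# Toy pool (hypothetical): four unlabeled compounds, binary classifier output.
proba = np.array([[0.90, 0.10],
                  [0.55, 0.45],
                  [0.60, 0.40],
                  [0.50, 0.50]])
to_label = margin_query(proba, batch_size=2)
```

In an acquisition loop, the selected items would be sent for measurement, added to the training set, and the model refit before the next query.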

    tackling malaria

    Malaria is an infectious disease that affects over 216 million people worldwide, killing over 445,000 patients annually. Due to the constant emergence of parasitic resistance to the current antimalarial drugs, the discovery of new drug candidates is a major global health priority. Aiming to make the drug discovery processes faster and less expensive, we developed binary and continuous Quantitative Structure-Activity Relationships (QSAR) models implementing deep learning for predicting antiplasmodial activity and cytotoxicity of untested compounds. Then, we applied the best models for a virtual screening of a large database of chemical compounds. The top computational predictions were evaluated experimentally against asexual blood stages of both sensitive and multi-drug-resistant Plasmodium falciparum strains. Among them, two compounds, LabMol-149 and LabMol-152, showed potent antiplasmodial activity at low nanomolar concentrations (EC50 <500 nM) and low cytotoxicity in mammalian cells. Therefore, the computational approach employing deep learning developed here allowed us to discover two new families of potential next-generation antimalarial agents, which are in compliance with the guidelines and criteria for antimalarial target candidates.

    Development and Application of Virtual Screening Methods for G Protein-Coupled Receptors

    G protein-coupled receptors (GPCRs) constitute one of the largest families of transmembrane proteins; they have been implicated in a multitude of diseases, including cancer and diabetes, and have been an important target in drug development. While experiment-based high-throughput screening for novel chemical compounds remains the de facto standard for drug discovery, computer-based virtual screening has been gaining acceptance as an important complementary method due to its high speed and low cost. This dissertation is aimed at the development of virtual screening algorithms as applied to GPCRs, in addition to the construction of GPCR-related databases (GPCR-EXP, GLASS). MAGELLAN is a ligand-based virtual screening algorithm that makes inferences about what a GPCR would potentially bind based on sequence- and structure-based alignments. Building on top of this work, a sequential virtual screening pipeline combining MAGELLAN with AutoDock Vina was constructed for the discovery of novel, bifunctional opioids with mu opioid receptor (MOR) agonist and delta opioid receptor (DOR) antagonist activity. In the process of developing the virtual screening algorithms, two GPCR-related databases were constructed to provide necessary data for the study. GPCR-EXP is a database of experimentally-validated and predicted GPCR structures. Important features include semi-manual curation of data, weekly updates, a user-friendly web interface, and high-resolution structure models with GPCR-I-TASSER, which many of the other GPCR-related databases lack. Additionally, the GLASS database was developed in response to the absence of databases dedicated to GPCR experimental data.
As a result, pharmacological data were pooled and integrated into a single source, resulting in over 500,000 unique GPCR-ligand associations; this made it the most comprehensive database of its kind thus far, providing the community with an accessible web interface, freely-available data, and ligands ready for docking. MAGELLAN utilized pharmacological data from GLASS to infer from the ligands of sequence- and structure-based homologues what a target GPCR would bind. It was tested on two public virtual screening databases (DUD-E and GPCR-Bench) and achieved an average enrichment factor (EF) of 9.75 and 13.70, respectively, which compared favorably with AutoDock Vina (1.48/3.16), DOCK 6 (2.12/3.47), and PoLi (2.2). Lastly, case studies with the mu opioid and motilin receptors demonstrated its applicability to virtual screening in general, as well as GPCR de-orphanization. Subsequently, MAGELLAN was combined with AutoDock Vina into a novel, sequential virtual screening pipeline against both MOR and DOR to compensate for the weaknesses of each algorithm. Retrospective virtual screens with both MAGELLAN and AutoDock Vina were established for both receptors, and both methods were reported to have better-than-random discrimination between actives and decoys using the GPCR-Bench dataset. In conclusion, structure (GPCR-EXP) and pharmacological data (GLASS) databases were constructed to provide users with a comprehensive source of GPCR data. Moreover, GLASS made it possible for MAGELLAN to be developed, providing it a rich source of experimental data. In turn, this resulted in greater performance than competing algorithms. Lastly, a prospective sequential virtual screening pipeline was established for the discovery of novel bifunctional opioids, in which the models for both methods were validated to perform well.
In future studies, cAMP and β-arrestin assays will be run on a subset of compounds from a prospective virtual screen in the hopes of discovering a novel opioid with reduced tolerance and withdrawal.
PhD, Biological Chemistry, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/147623/1/wallakin_1.pd
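The EF values quoted in this abstract are enrichment factors. A minimal sketch of the standard definition follows: the hit rate among the top-ranked fraction of a screen divided by the hit rate of the whole library. The cutoff fraction used in the dissertation is not stated in the abstract, so the `fraction` parameter, the function name, and the toy data below are assumptions.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor of a ranked screen at a given top fraction.

    scores: predicted scores, higher means more likely active.
    is_active: matching sequence of truthy/falsy activity labels.
    """
    n = len(scores)
    total_actives = sum(1 for a in is_active if a)
    if n == 0 or total_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_top = max(1, int(round(fraction * n)))
    # Rank compounds by score, best first, and count actives in the top slice.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(1 for _, a in ranked[:n_top] if a)
    return (hits_top / n_top) / (total_actives / n)

# Toy screen (hypothetical): 10 compounds, 2 actives, both ranked first.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
is_active = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
ef = enrichment_factor(scores, is_active, fraction=0.2)
```

An EF of 1 corresponds to random selection, so values such as 9.75 mean the top of the ranking holds roughly ten times more actives than chance would give.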

    EROS is a selective chaperone regulating the phagocyte NADPH oxidase and purinergic signalling

    EROS (essential for reactive oxygen species) protein is indispensable for expression of gp91phox, the catalytic core of the phagocyte NADPH oxidase. EROS deficiency in humans is a novel cause of the severe immunodeficiency, chronic granulomatous disease, but its mechanism of action was unknown until now. We elucidate the role of EROS, showing it acts at the earliest stages of gp91phox maturation. It binds the immature 58 kDa gp91phox directly, preventing gp91phox degradation and allowing glycosylation via the oligosaccharyltransferase machinery and the incorporation of the heme prosthetic groups essential for catalysis. EROS also regulates the purine receptors P2X7 and P2X1 through direct interactions, and P2X7 is almost absent in EROS-deficient mouse and human primary cells. Accordingly, lack of murine EROS results in markedly abnormal P2X7 signalling, inflammasome activation, and T cell responses. The loss of both ROS and P2X7 signalling leads to resistance to influenza infection in mice. Our work identifies EROS as a highly selective chaperone for key proteins in innate and adaptive immunity and a rheostat for immunity to infection. It has profound implications for our understanding of immune physiology, ROS dysregulation, and possibly gene therapy.

    Efficient access methods for very large distributed graph databases

    Subgraph searching is an essential problem in graph databases, but it is also challenging because it involves the NP-complete subgraph isomorphism sub-problem. Filter-Then-Verify (FTV) methods mitigate performance overheads by using an index to prune out graphs that do not fit the query in a filtering stage, reducing the number of subgraph isomorphism evaluations in a subsequent verification stage. Subgraph searching has to be applied to very large databases (tens of millions of graphs) in real applications such as molecular substructure searching. Previous surveys have identified the FTV solutions GraphGrepSX (GGSX) and CT-Index as the best ones for large databases (thousands of graphs); however, they cannot reach reasonable performance on very large ones (tens of millions of graphs). This paper proposes a generic approach for the distributed implementation of FTV solutions. In addition, three previous methods that improve the performance of GGSX and CT-Index are adapted to be executed in clusters. The evaluation shows that the resulting solutions provide a large performance improvement (between 70% and 90% reduction in filtering time) in a centralized configuration and that they may be used to achieve efficient subgraph searching over very large databases in cluster configurations. This work has been co-funded by the Ministerio de Economía y Competitividad of the Spanish government, and by Mestrelab Research S.L. through the project NEXTCHROM (RTC-2015-3812-2) of the call Retos-Colaboración of the program Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad. The authors wish to thank the financial support provided by Xunta de Galicia under the Project ED431B 2018/28S
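The two-stage Filter-Then-Verify idea can be sketched on a toy scale. The index features used here (counts of node labels and of labelled edges) are an assumption chosen for brevity and are much simpler than the features indexed by GGSX or CT-Index. All function names and the three toy graphs are hypothetical.

```python
from collections import Counter

def features(labels, adj):
    """Index features: counts of node labels and of labelled edges."""
    f = Counter(labels.values())
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v:  # count each undirected edge once
                f[tuple(sorted((labels[u], labels[v])))] += 1
    return f

def passes_filter(query_f, graph_f):
    """A graph can contain the query only if it has every query feature."""
    return all(graph_f[k] >= c for k, c in query_f.items())

def contains(q_labels, q_adj, g_labels, g_adj):
    """Verification: backtracking search for a label-preserving embedding."""
    q_nodes = list(q_labels)

    def extend(mapping):
        if len(mapping) == len(q_nodes):
            return True
        u = q_nodes[len(mapping)]
        for v in g_labels:
            if v in mapping.values() or g_labels[v] != q_labels[u]:
                continue
            # Every already-mapped neighbour of u must map to a neighbour of v.
            if all(mapping[w] in g_adj[v] for w in q_adj[u] if w in mapping):
                mapping[u] = v
                if extend(mapping):
                    return True
                del mapping[u]
        return False

    return extend({})

def search(query, database, index):
    q_labels, q_adj = query
    qf = features(q_labels, q_adj)
    candidates = [name for name in database if passes_filter(qf, index[name])]
    hits = [name for name in candidates if contains(q_labels, q_adj, *database[name])]
    return candidates, hits

# Toy database (hypothetical): graphs as (node labels, adjacency sets).
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
database = {
    "ethanol": ({0: "C", 1: "C", 2: "O"}, path3),
    "ethane": ({0: "C", 1: "C"}, {0: {1}, 1: {0}}),
    "decoy": ({0: "C", 1: "C", 2: "N", 3: "C", 4: "O"},
              {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}),
}
index = {name: features(*g) for name, g in database.items()}  # built once, offline
query = ({0: "C", 1: "C", 2: "O"}, path3)                      # C-C-O fragment
candidates, hits = search(query, database, index)
```

Here the filter discards "ethane" without any isomorphism test, while "decoy" survives filtering (it has all the query's features) and is rejected only in the more expensive verification stage.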