1,175 research outputs found

    The evolution of an on-line chemical search system for an industrial research unit.

    The objectives of this study were to design an information system, using modern computer technology, to meet a research chemist's need for chemical structural information, to quantify the effects of increasing degrees of computer technology on the use made of the facilities, and to relate the use of the service back to the individual chemist, his performance and background. A computer system was developed based on Wiswesser Line Notation and molecular formula as the chemical structure descriptors. Systems design and analysis were performed so that access to the information could be obtained directly for individual compounds and more generally for classes of compounds. As the system was being developed, its use by information staff was monitored by constant interaction with the people concerned. Where appropriate, the system was modified to meet information staff's requirements, but a number of precautions had to be introduced to prevent misuse. The research chemists' use of the information services was studied retrospectively over a two-year period. In addition to the use made, several other factors were observed for each chemist. These included performance measures and background information on the chemists' research role. The data showed a steady increase in the demand for the services by the research chemist as the degree of computerisation increased. The use made of the services related closely to the number of compounds prepared by each chemist, but there was no significant correlation between a chemist's success in preparing biologically active compounds and his information use. The very individual way in which chemists conduct their research was highlighted by the wide range of use of the information facilities and the low correlation with background factors. This makes the design of on-line systems for use by chemists themselves complex and justifies the existence of the information scientist as an interface.

    Development and prospective application of chemoinformatic tools to explore new ligand chemistry and protein biology

    Drug discovery and design is a tedious and expensive process whose small chances of success necessitate the development of novel chemoinformatic approaches and concepts. Their common goal is the efficient and robust identification of promising chemical matter and the reliable prediction of its properties. Computer-aided drug discovery and design (CADDD) and its multifarious installments throughout the different phases of the drug discovery pipeline contribute significantly to the expansion of the hits, the understanding of their structure-activity relationship and their rational diversification. They reduce development costs and time demands and thus support the search for the needle in the haystack – a potent hit. The HTS-driven brute-force nature of current and past decades’ discovery and design strategies compelled researchers to develop ideas and algorithms in order to intervene in the pipeline and prevent its frequent failures. In the introduction, I describe the drug discovery and design pipeline and point out interfaces where CADDD contributes to its success. In Part 1 of this thesis, I present a novel methodology that supports the early-stage hit discovery processes through a fragment-based reduced graph similarity approach (RedFrag). It is a chimeric algorithm that combines fingerprint-based similarity calculation with scaffold-hopping-enabling graph isomorphism. We thoroughly investigated its performance retro- and prospectively. It uses a new type of reduced graph that does not suffer from information loss during its construction and bypasses the necessity of feature definitions. Built upon chemical epitopes resulting from molecule fragmentation, the reduced graph embodies physico-chemical and 2D-structural properties of a molecule. Reduced graphs are compared with a continuous-similarity-distance-driven maximal common subgraph algorithm, which calculates similarity at the fragmental and topological levels.
The second chapter, Part 2, is dedicated to PrenDB, a digital compendium of the reaction space of prenyltransferases of the dimethylallyltryptophan synthase (DMATS) superfamily. Their catalytic transformations represent a major skeletal diversification step in the biosynthesis of secondary metabolites including the indole alkaloids. DMATS enzymes thus contribute significantly to the biological and pharmacological diversity of small molecule metabolites. The attachment of the prenyl donor to lead- or drug-like molecules renders the prenyltransferases useful for accessing chemical space that is difficult to reach by conventional synthesis. In PrenDB, we collected the substrates, enzymes and products. We then used a newly developed algorithm based on molecular fragmentation to automatically extract reactive chemical epitopes. The analysis of the collected data sheds light on the thus far explored substrate space of DMATS enzymes. We supplemented the browsable database with algorithmic prediction routines in order to assess the prenylability of novel compounds and did so for a set of 38 molecules. In a case study, Part 3, we investigated the regioselectivity of five prenyltransferases in the presence of unnatural prenyl donors. Detailed biochemical investigations revealed the acceptance of these dimethylallyl pyrophosphate (DMAPP) analogs by all tested enzymes with different relative activities and regioselectivities. In order to understand the activity profiles and their differences on a molecular level we investigated the interaction within the enzyme-prenyl donor-substrate system with molecular dynamics. Our experiments show that the reactivity of a prenyl donor strongly correlates with the distance between its electrophilic, reactive atom and the nucleophilic center of the substrate molecule.
It renders the first step towards a better mechanistic understanding of the reactivity of prenyltransferases and expands significantly the potential usage and rational design of tryptophan prenylating enzymes as biocatalysts for Friedel–Crafts alkylation. Lastly, in Part 4, we present the synergistic potential of combined ligand- and structure-based drug discovery methodologies applied to the β2-adrenergic receptor (β2AR). The β2AR is a G protein-coupled receptor (GPCR) and a well-explored target. By the joint application of fingerprint-based similarity, substructure-based searches and docking we discovered 13 ligands – ten of which were novel – of this particular GPCR. Of note, two of the molecules used as starting points for the similarity and substructure searches distinguish themselves from other β2AR antagonists by their unique scaffold. Thus, the usage of a multistep hierarchical or parallel screening approach enabled us to use these unique structural features and discover novel chemical matter beyond the bounds of the ligand space known so far and emphasize the intrinsic complementarity of ligand- and structure-based approaches. The molecules described in this work allow us to explore the ligand space around the previously reported molecules in greater detail, leading to insights into their structure-activity relationship. In addition, we also characterized our hits with experimental binding and selectivity data and discussed it based on their putative binding modes derived by docking.

    Special Libraries, May-June 1978

    Volume 69, Issue 5-6

    Similarity Methods in Chemoinformatics


    Matching algorithms for handling three dimensional molecular co-ordinate data.


    Data integration for biological network databases: MetNetDB labeled graph model and graph matching algorithm

    To understand the cellular functions of genes requires investigating a variety of biological data, including experimental data, annotation from online databases and the literature, information about cellular interactions, and domain knowledge from biologists. These requirements demand a flexible and powerful biological data management system. MetNetDB is the biological database component of the MetNet platform (http://metnetdb.org/), a software platform for Arabidopsis systems biology. This work describes a labeled graph model that addresses the challenges associated with biological network databases, and discusses the implementation of this model in MetNetDB. MetNetDB integrates the most recent data from various sources, including biological networks, gene annotation, metabolite information, and protein localization data. The integration contains four steps: data model transformation and integration; semantic mapping; data conversion and integration; and conflict resolution. MetNetDB is established as a labeled graph model. The graph structure supports network data storage and the application of graph analysis algorithms. The node and edge labels have the same extension capability as an object data model. In addition, rules are used to guarantee the biological network data integrity; operations are defined for graph editing and comparison. To facilitate the integration of network data, which is often inaccurate or incomplete, a subgraph extraction algorithm is designed for MetNetDB. This algorithm allows subgraph querying based on user-specified biomolecules. Both exact matching and approximate matching with biomolecules in networks are supported. The similarity among biomolecules is inferred from expression patterns, gene ontology, chemical ontology, and protein-gene relationships. Combined with the implementation of Messmer's approximate subgraph isomorphism algorithm, MetNetDB supports exact and approximate graph matching.
Based on the MetNetDB labeled graph model and the graph matching algorithms, the MetNetDB curator tool is built with several innovative features, including active biological rule checking during network curation, tracking data change history, and a biologist-friendly visual graph query system.

    Information retrieval and text mining technologies for chemistry

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field. A.V. and M.K. acknowledge funding from the European Community’s Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain).
This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo García-Yoldi for useful feedback and discussions during the preparation of the manuscript.