105 research outputs found

    Artificial intelligence in biological activity prediction

    Artificial intelligence has become an indispensable resource in chemoinformatics. Numerous machine learning algorithms for activity prediction have recently emerged, offering a key approach for mining chemical information from large compound datasets. These approaches enable the automation of compound discovery to find biologically active molecules with important properties. Here, we present a review of some of the main machine learning studies in biological activity prediction of compounds, in particular for sweetness prediction. We discuss some of the most used compound featurization techniques and the major databases of chemical compounds relevant to these tasks. This study was supported by the European Commission through project SHIKIFACTORY100 - Modular cell factories for the production of 100 compounds from the shikimate pathway (Reference 814408), and by the Portuguese FCT under the scope of the strategic funding of UID/BIO/04469/2019 unit and BioTecNorte operation (NORTE-01-0145-FEDER-000004) funded by the European Regional Development Fund under the scope of Norte2020.

    BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology.

    BindingDB, www.bindingdb.org, is a publicly accessible database of experimental protein-small molecule interaction data. Its collection of over a million data entries derives primarily from scientific articles and, increasingly, US patents. BindingDB provides many ways to browse and search for data of interest, including an advanced search tool that can combine searches of multiple query types, including text, chemical structure, protein sequence and numerical affinities. The PDB and PubMed provide links to data in BindingDB, and vice versa; and BindingDB provides links to pathway information, the ZINC catalog of available compounds, and other resources. The BindingDB website offers specialized tools that take advantage of its large data collection, including ones to generate hypotheses for the protein targets bound by a bioactive compound, and for the compounds bound by a new protein of known sequence; and virtual compound screening by maximal chemical similarity, binary kernel discrimination, and support vector machine methods. Specialized data sets are also available, such as binding data for hundreds of congeneric series of ligands, drawn from BindingDB and organized for use in validating drug design methods. BindingDB offers several forms of programmatic access, and comes with extensive background material and documentation. Here, we provide the first update of BindingDB since 2007, focusing on new and unique features and highlighting directions of importance to the field as a whole.
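
    The "maximal chemical similarity" screening mentioned in this abstract can be illustrated with a minimal sketch. It is not BindingDB's implementation: the fingerprint representation (a set of on-bit indices), the Tanimoto coefficient as the similarity measure, and the names `tanimoto` and `max_similarity_screen` are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is modelled as the set of its "on" bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit-set fingerprints."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_similarity_screen(
    actives: List[Fingerprint],
    candidates: Dict[str, Fingerprint],
) -> List[Tuple[str, float]]:
    """Score each candidate by its highest similarity to any known active.

    Returns (candidate name, score) pairs sorted from most to least similar.
    """
    scored = []
    for name, fp in candidates.items():
        best = max((tanimoto(fp, ref) for ref in actives), default=0.0)
        scored.append((name, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: one known active and three candidate compounds.
actives = [frozenset({1, 2, 3, 4})]
candidates = {
    "cand_a": frozenset({1, 2, 3, 4}),
    "cand_b": frozenset({1, 2}),
    "cand_c": frozenset({9}),
}
ranking = max_similarity_screen(actives, candidates)
```

    In practice the fingerprints would come from a cheminformatics toolkit; the ranking logic stays the same regardless of how the bits are generated.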

    3D Molecular Representations Based on the Wave Transform for Convolutional Neural Networks

    © 2018 American Chemical Society. Convolutional neural networks (CNNs) have been successfully used to handle three-dimensional data and are a natural match for data with spatial structure such as 3D molecular structures. However, a direct 3D representation of a molecule with atoms localized at voxels is too sparse, which leads to poor performance of the CNNs. In this work, we present a novel approach where atoms are extended to fill other nearby voxels with a transformation based on the wave transform. Experimenting on 4.5 million molecules from the ZINC database, we show that our proposed representation leads to better performance of CNN-based autoencoders than either the voxel-based representation or the previously used Gaussian blur of atoms. We then successfully apply the new representation to classification tasks such as MACCS fingerprint prediction.
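
    The sparsity problem described in this abstract can be made concrete with a small sketch that contrasts a one-voxel-per-atom grid with a grid where each atom is smeared over nearby voxels. The Gaussian kernel corresponds to the "Gaussian blur" baseline named above. The damped-cosine kernel is only an assumed stand-in for a wave-like kernel; the abstract does not give the paper's actual transform. Grid size, `sigma`, `omega` and all function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Atom = Tuple[float, float, float]
Grid = List[List[List[float]]]


def empty_grid(size: int) -> Grid:
    """Cubic grid of zeros with `size` voxels per side."""
    return [[[0.0] * size for _ in range(size)] for _ in range(size)]


def one_hot_voxels(atoms: Sequence[Atom], size: int) -> Grid:
    """Direct representation: each atom marks only its nearest voxel."""
    grid = empty_grid(size)
    for x, y, z in atoms:
        i, j, k = round(x), round(y), round(z)
        if 0 <= i < size and 0 <= j < size and 0 <= k < size:
            grid[i][j][k] = 1.0
    return grid


def gaussian_kernel(r: float, sigma: float = 1.0) -> float:
    """Gaussian blur: density decays smoothly with distance r."""
    return math.exp(-r * r / (2.0 * sigma * sigma))


def damped_wave_kernel(r: float, sigma: float = 1.0, omega: float = 0.5) -> float:
    """Assumed wave-like kernel: a Gaussian envelope times a cosine."""
    return gaussian_kernel(r, sigma) * math.cos(2.0 * math.pi * omega * r)


def smeared_voxels(
    atoms: Sequence[Atom], size: int, kernel: Callable[[float], float]
) -> Grid:
    """Each voxel sums the kernel value of its distance to every atom."""
    grid = empty_grid(size)
    for i in range(size):
        for j in range(size):
            for k in range(size):
                grid[i][j][k] = sum(
                    kernel(math.dist((i, j, k), atom)) for atom in atoms
                )
    return grid


def count_nonzero(grid: Grid, eps: float = 1e-12) -> int:
    """Number of voxels whose absolute value exceeds eps."""
    return sum(abs(v) > eps for plane in grid for row in plane for v in row)


# Toy input: a single atom at the centre of a 5 x 5 x 5 grid.
atoms = [(2.0, 2.0, 2.0)]
sparse = one_hot_voxels(atoms, 5)
blurred = smeared_voxels(atoms, 5, gaussian_kernel)
waved = smeared_voxels(atoms, 5, damped_wave_kernel)
```

    With one atom, the direct grid has a single occupied voxel, while the blurred grid carries signal in every voxel. That extra signal is what gives a convolutional filter something to respond to away from the atom centres.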

    InChI isotopologue and isotopomer specifications

    This work presents a proposed extension to the International Union of Pure and Applied Chemistry (IUPAC) International Chemical Identifier (InChI) standard that allows the representation of isotopically-resolved chemical entities at varying levels of ambiguity in isotope location. This extension includes an improved interpretation of the current isotopic layer within the InChI standard and a new isotopologue layer specification for representing chemical entities with ambiguous isotope localization. Both improvements support the unique isotopically-resolved chemical identification of features detected and measured in analytical instrumentation, specifically nuclear magnetic resonance and mass spectrometry. Scientific contribution: This new extension to the InChI standard would enable improved annotation of analytical datasets characterizing chemical entities, supporting the FAIR (Findable, Accessible, Interoperable, and Reusable) guiding principles of data stewardship for chemical datasets, ultimately promoting Open Science in chemistry.
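
    For orientation, the existing isotopic layer that this abstract refers to is the `/i` layer of an InChI string. The sketch below only locates that layer and splits it into comma-separated entries. It does not implement the proposed isotopologue layer, whose syntax the abstract does not give. The helper names are hypothetical, and the example string (methane with a +1 mass shift on atom 1) is an illustrative input.

```python
from typing import List, Optional


def split_layers(inchi: str) -> List[str]:
    """Split an InChI string into its slash-separated layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    return inchi[len(prefix):].split("/")


def isotopic_layer(inchi: str) -> Optional[str]:
    """Return the body of the first '/i' layer, or None if absent.

    The first two layers (version and formula) carry no prefix letter,
    so the search starts after them.
    """
    for layer in split_layers(inchi)[2:]:
        if layer.startswith("i"):
            return layer[1:]
    return None


def isotopic_entries(inchi: str) -> List[str]:
    """Comma-separated entries of the isotopic layer, e.g. ['1+1']."""
    layer = isotopic_layer(inchi)
    return layer.split(",") if layer else []


# Illustrative input: atom 1 carries a +1 mass shift.
labelled = "InChI=1S/CH4/h1H4/i1+1"
unlabelled = "InChI=1S/CH4/h1H4"
entries = isotopic_entries(labelled)
```

    Each entry names an atom by its canonical number followed by an isotope specification, so a specific atom position is required. That requirement is the limitation the proposed isotopologue layer is meant to relax.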

    The City of the Sun, Open Access, and Institutional Repositories (Grad Sunca, otvorenog pristupa i institucijskih repozitorija)

    Ideal conditions for scientific research include, among other things, unhindered access to all work (especially recent work) in the form of scientific papers and research results related to a given field of scientific activity. They likewise include the ability to publish one's own output (that of an individual or an institution) so that it is visible and accessible to the entire scientific community. Until recently, only parts of the total output of a scientific field were available to an individual scientist, depending on the amount of funds allocated at the national level, at the institutional level, or personally for journal subscriptions. With advances in technology and the development of internet services, and with the gradual transition from print to electronic editions, new possibilities for scientific publishing have emerged, and alongside them the idea of "open access": freely available, free-of-charge access to the results of scientific research. Below, we describe the ways in which a higher level of availability of scientific research output is being pursued and give examples of available solutions.