20 research outputs found

    Switching ion binding selectivity of thiacalix[4]arene monocrowns at liquid–liquid and 2D-confined interfaces

    Understanding the interaction of ions with organic receptors in confined space is of fundamental importance and could advance nanoelectronics and sensor design. In this work, metal ion complexation of conformationally varied thiacalix[4]monocrowns bearing lower-rim hydroxy (type I), dodecyloxy (type II), or methoxy (type III) fragments was evaluated. At the liquid–liquid interface, alkylated thiacalixcrowns-5(6) selectively extract alkali metal ions according to the induced-fit concept, whereas crown-4 receptors were ineffective due to distortion of the crown-ether cavity, as predicted by quantum-chemical calculations. In type-I ligands, alkali-metal ion extraction by the solvent-accessible crown-ether cavity was prevented, which resulted in competitive Ag⁺ extraction by sulfide bridges. Surprisingly, amphiphilic type-I/II conjugates moderately extracted other metal ions, which was attributed to calixarene aggregation in the aqueous salt phase and supported by dynamic light scattering measurements. Cation–monolayer interactions at the air–water interface were monitored by surface pressure/potential measurements and UV/visible reflection–absorption spectroscopy. Topology-varied selectivity was observed, towards Sr²⁺ (crown-4), K⁺ (crown-5), and Ag⁺ (crown-6) in type-I receptors and Na⁺ (crown-4), Ca²⁺ (crown-5), and Cs⁺ (crown-6) in type-II receptors. Nuclear magnetic resonance and electronic absorption spectroscopy revealed exocyclic coordination in type-I ligands and cation–π interactions in type-II ligands.

    Bidirectional Graphormer for Reactivity Understanding: neural network trained to reaction atom-to-atom mapping task

    This work introduces GraphormerMapper, a new algorithm for reaction atom-to-atom mapping (AAM) based on a distance-aware BERT neural network. In benchmarking studies against IBM RxnMapper, the best AAM algorithm according to our previous study, we demonstrate that our AAM algorithm is superior on our “Golden” benchmarking dataset. The mapper is implemented in the Chython [https://github.com/chython/chython] and Chytorch [https://github.com/chython/chytorch, https://github.com/chython/chytorch-rxnmap] Python packages, which are freely available for out-of-the-box use. Chython is a cheminformatics library with a simple interface for processing reaction and molecular data. The key features of Chython are standardization of chemical functional groups, checking for atom valence errors, substructure search, and advanced reaction manipulation, for example, generating products from reactants and reaction atom-to-atom mapping. Chytorch provides a PyTorch-like interface for graph-based neural networks developed specifically for chemical tasks.

    QSAR Modeling Based on Conformation Ensembles Using a Multi-Instance Learning Approach

    Modern QSAR approaches have wide practical applications in drug discovery for screening potentially bioactive molecules before their experimental testing. Most models predicting the bioactivity of compounds are based on molecular descriptors derived from the 2D structure, which lose explicit information about the spatial structure of molecules that is important for protein-ligand recognition. The major problem in constructing models using 3D descriptors is the choice of a probable bioactive conformation, which affects the predictive performance. A multi-instance (MI) learning approach, which considers multiple conformations during model training, can be a reasonable solution to this problem. Here, we compared MI-QSAR with the classical single-instance QSAR (SI-QSAR) approach, where each molecule was encoded by either 2D descriptors or 3D descriptors derived from the single lowest-energy conformation. The calculations were carried out on a sample of 175 datasets extracted from the ChEMBL23 database. It was demonstrated that (i) MI-QSAR outperforms SI-QSAR in numerous cases and (ii) MI algorithms can automatically identify plausible bioactive conformations. An instance-attention-based network can be applied to select the most important conformer, which was shown to correspond to the PDB conformer in 50-84% of molecules.
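
    The idea of instance attention over a bag of conformers can be illustrated with a minimal sketch. It assumes a plain linear scoring vector followed by a softmax; the function name `attention_pool`, the vector `w`, and the toy descriptors are hypothetical, and the network used in the paper is likely more elaborate.

```python
import math

def attention_pool(instances, w):
    """Softmax-attention pooling over a bag of instance vectors.

    instances: list of equal-length descriptor vectors, one per conformer
    w: scoring vector (hypothetical; in a real model it would be learned)
    Returns (weights, pooled): one attention weight per conformer and the
    attention-weighted average descriptor vector of the molecule.
    """
    # One scalar score per conformer (dot product with the scoring vector)
    scores = [sum(x_i * w_i for x_i, w_i in zip(x, w)) for x in instances]
    # Numerically stable softmax over the scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted average of the instance vectors
    dim = len(instances[0])
    pooled = [sum(a * x[d] for a, x in zip(weights, instances)) for d in range(dim)]
    return weights, pooled

# Toy bag: three conformers of one molecule, two descriptors each (made-up values)
bag = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.4]]
w = [2.0, -1.0]
weights, pooled = attention_pool(bag, w)
# The conformer with the largest weight is the one the model "attends" to most
top = max(range(len(bag)), key=lambda i: weights[i])
print(top, [round(a, 3) for a in weights])
```

    The conformer with the highest attention weight plays the role of the "most important" conformer mentioned in the abstract.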

    Cross-validation strategies in QSPR modelling of chemical reactions

    In this article, we consider cross-validation of quantitative structure-property relationship models for reactions and show that the conventional k-fold cross-validation (CV) procedure gives an 'optimistically' biased assessment of prediction performance. To address this issue, we suggest two strategies of model cross-validation: 'transformation-out' CV and 'solvent-out' CV. Unlike the conventional k-fold cross-validation approach, which does not consider the nature of objects, the proposed procedures provide an unbiased estimation of the predictive performance of the models for novel types of structural transformations in chemical reactions and for reactions proceeding under new conditions. Both suggested strategies have been applied to predict the rate constants of bimolecular elimination and nucleophilic substitution reactions, and of Diels-Alder cycloaddition. All suggested cross-validation methodologies and a tutorial are implemented in the open-source software package CIMtools (https://github.com/cimm-kzn/CIMtools).
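
    The principle behind a 'solvent-out' split can be sketched in a few lines of plain Python: all reactions sharing a solvent label go to the same fold, so every test fold contains only solvents unseen in training. This is an illustrative sketch, not the CIMtools implementation; the function name `group_out_folds` and the toy solvent labels are hypothetical.

```python
from collections import defaultdict

def group_out_folds(groups, n_folds):
    """Yield (train_idx, test_idx) pairs in which no group label
    (e.g. a solvent or a transformation type) occurs on both sides."""
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)
    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(by_group[label])
    for i in range(n_folds):
        test = sorted(folds[i])
        train = sorted(j for k in range(n_folds) if k != i for j in folds[k])
        yield train, test

# Toy data: one solvent label per reaction (made-up values)
solvents = ["water", "water", "ethanol", "DMSO", "ethanol", "DMSO", "water", "acetone"]
splits = list(group_out_folds(solvents, n_folds=3))
for train, test in splits:
    print("test solvents:", sorted({solvents[i] for i in test}))
```

    Passing transformation labels instead of solvent labels gives the analogous 'transformation-out' split.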

    Comprehensive Analysis of Applicability Domains of QSPR Models for Chemical Reactions

    The definition of a model's applicability domain (AD) is an active research topic in chemoinformatics. Although many AD definitions for models predicting properties of molecules (Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models) have been described in the literature, none for chemical reactions (Quantitative Reaction-Property Relationships (QRPR)) has been reported to date. The point is that a chemical reaction is a much more complex object than an individual molecule, and its yield and its thermodynamic and kinetic characteristics depend not only on the structures of reactants and products but also on experimental conditions. The performance of QRPR models largely depends on the way the chemical transformation is encoded. In this study, various AD definition methods extensively used in QSAR/QSPR studies of individual molecules, as well as several novel approaches suggested in this work for reactions, were benchmarked on several reaction datasets. Their ability to exclude wrong reaction types, increase coverage, improve model performance, and detect Y-outliers was tested. As a result, several "best" AD definitions for QRPR models predicting reaction characteristics have been identified and tested on a previously published external dataset with a clear AD definition problem.
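
    As a concrete illustration of what an AD definition does, here is a minimal sketch of a bounding-box AD, a simple method commonly used for molecular QSAR models. Whether it is among the methods benchmarked in this study is an assumption; the function names and descriptor values are hypothetical.

```python
def fit_bounding_box(train):
    """Record the per-descriptor minimum and maximum over the training set."""
    lo = [min(col) for col in zip(*train)]
    hi = [max(col) for col in zip(*train)]
    return lo, hi

def in_domain(x, box):
    """A query is inside the AD if every descriptor lies within the training range."""
    lo, hi = box
    return all(l <= v <= h for v, l, h in zip(x, lo, hi))

# Toy descriptor vectors (made-up values)
train = [[0.0, 1.0], [2.0, 3.0], [1.0, 2.0]]
box = fit_bounding_box(train)

queries = [[1.5, 1.5], [3.0, 2.0], [0.5, 2.5]]
flags = [in_domain(q, box) for q in queries]
# Coverage: the fraction of queries for which the model is considered applicable
coverage = sum(flags) / len(flags)
print(flags, round(coverage, 2))
```

    Predictions for queries flagged as outside the domain would be discarded or reported as unreliable, which trades coverage for model performance.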

    Global Reactivity Models are Impactful in Industrial Synthesis Applications

    Artificial Intelligence is revolutionizing many aspects of the pharmaceutical industry. Deep learning models are now routinely applied to guide drug discovery projects, leading to faster and improved findings, but there are still many tasks with enormous unrealized potential. One such task is reaction yield prediction. Every year more than one fifth of all synthesis attempts result in product yields which are either zero or too low. This equates to chemical and human resources being spent on activities which ultimately do not progress the programs, leading to a triple loss when accounting for the opportunity cost of the time wasted. In this work we pre-train a BERT model on more than 16 million reactions from 4 different data sources, and fine-tune it to achieve an uncertainty-calibrated global yield prediction model. This model improves upon the state of the art not just through the increase in pre-training data but also by introducing a new embedding layer which solves a few limitations of SMILES and enables integration of additional information, such as equivalents and molecule role, into the reaction encoding. The model is called BERT Enriched Embedding (BEE). The model is benchmarked on an open-source dataset against a state-of-the-art synthesis-focused BERT, showing a near 20-point improvement in R² score. The model is fine-tuned and tested on an internal company data benchmark, and a prospective study shows that applying the model can reduce the total number of negative reactions (yield under 5%) run at Janssen by at least 34%. Lastly, we corroborate the previous results through experimental validation, by directly deploying the model in an ongoing drug discovery project and showing that it can also be used successfully as a reagent recommender due to its fast inference speed and reliable confidence estimation, a critical feature for industry application.

    Discovery of Novel Chemical Reactions by Deep Generative Recurrent Neural Network

    Here, we report an application of Artificial Intelligence techniques to generate novel chemical reactions of a given type. A sequence-to-sequence autoencoder was trained on the USPTO reaction database. Each reaction was converted into a single Condensed Graph of Reaction (CGR), which was then translated into a purpose-built SMILES/CGR text string. The autoencoder latent space was visualized on a two-dimensional generative topographic map, from which some zones populated by Suzuki coupling reactions were targeted. These served for the generation of novel reactions by sampling the latent space points and decoding them to SMILES/CGR.