What the Heck? Automated Regioselectivity Calculations of Palladium-Catalyzed Heck Reactions Using Quantum Chemistry

Author: Andreas H. Göller (1353264)
Jan H. Jensen (111108)
Nicolai Ree (4580002)
Publication venue: 'American Chemical Society (ACS)'
Publication date: 29/09/2022

We present a quantum chemistry (QM)-based method that computes the relative energies of intermediates in the Heck reaction that relate to the regioselective reaction outcome: branched (α), linear (β), or a mix of the two. The calculations are done for two different reaction pathways (neutral and cationic) and are based on r2SCAN-3c single-point calculations on GFN2-xTB geometries that, in turn, derive from a GFNFF-xTB conformational search. The method is completely automated and is sufficiently efficient to allow for the calculation of thousands of reaction outcomes. The method can mostly reproduce systematic experimental studies where the ratios of regioisomers are carefully determined. For a larger dataset extracted from Reaxys, the results are somewhat worse with accuracies of 63% for β-selectivity using the neutral pathway and 29% for α-selectivity using the cationic pathway. Our analysis of the dataset suggests that only the major or desired regioisomer is reported in the literature in many cases, which makes accurate comparisons difficult. The code is freely available on GitHub under the MIT open-source license: https://github.com/jensengroup/HeckQM
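To illustrate how a relative energy between the branched (α) and linear (β) intermediates could be turned into a predicted outcome, here is a minimal sketch. It assumes a simple Boltzmann weighting of the two energies and a hypothetical 90% cutoff for calling a result selective. Neither assumption comes from the abstract, and the HeckQM code may map energies to outcomes differently. The function names `branched_fraction` and `classify_outcome` are illustrative only.

```python
import math

# Gas constant in kcal/(mol*K); energies below are assumed to be in kcal/mol.
R_KCAL = 0.0019872041


def branched_fraction(e_alpha, e_beta, temperature=298.15):
    """Fraction of the alpha pathway under a two-state Boltzmann weighting.

    Assumption for illustration: the two relative energies can be compared
    directly through exp(-dE / RT).
    """
    x = (e_alpha - e_beta) / (R_KCAL * temperature)
    # Numerically stable logistic form of 1 / (1 + exp(x)).
    if x >= 0:
        w = math.exp(-x)
        return w / (1.0 + w)
    return 1.0 / (1.0 + math.exp(x))


def classify_outcome(fraction_alpha, cutoff=0.9):
    """Label the outcome; the 0.9 cutoff is a hypothetical choice."""
    if fraction_alpha >= cutoff:
        return "alpha"
    if fraction_alpha <= 1.0 - cutoff:
        return "beta"
    return "mix"


if __name__ == "__main__":
    # Alpha intermediate lower in energy by 2 kcal/mol (made-up numbers).
    frac = branched_fraction(e_alpha=0.0, e_beta=2.0)
    print(round(frac, 3), classify_outcome(frac))
```

With equal energies the sketch returns a 50:50 split, which the classifier labels as a mix.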

PubMed Central

ChemRxiv

FigShare

Reliable and Performant Identification of Low-Energy Conformers in the Gas Phase and Water

Author: Alexander Hillisch (1353261)
Andreas H. Göller (1353264)
Anna Theresa Cavasin (5226596)
Felix Uellendahl (5226599)
Sebastian Schneckener (436626)

Prediction of compound properties from structure via quantitative structure–activity relationship and machine-learning approaches is an important computational chemistry task in small-molecule drug research. Though many such properties are dependent on three-dimensional structures or even conformer ensembles, the majority of models are based on descriptors derived from two-dimensional structures. Here we present results from a thorough benchmark study of force field, semiempirical, and density functional methods for the calculation of conformer energies in the gas phase and water solvation as a foundation for the correct identification of relevant low-energy conformers. We find that the tight-binding ansatz GFN-xTB shows the lowest error metrics and highest correlation to the benchmark PBE0-D3(BJ)/def2-TZVP in the gas phase for the computationally fast methods and that in solvent OPLS3 becomes comparable in performance. MMFF94, AM1, and DFTB+ perform worse, whereas the performance-optimized but far more expensive functional PBEh-3c yields energies almost perfectly correlated to the benchmark and should be used whenever affordable. On the basis of our findings, we have implemented a reliable and fast protocol for the identification of low-energy conformers of drug-like molecules in water that can be used for the quantification of strain energy and entropy contributions to target binding as well as for the derivation of conformer-ensemble-dependent molecular descriptors

FigShare

Predictive Modeling of PROTAC Cell Permeability with Machine Learning

Author: Andreas H. Göller (1353264)
Anja Giese (4243621)
Daniel Meibom (9473231)
Florian Kölling (9044819)
Jan Kihlberg (26018)
Lutz Lehmann (2153491)
Vasanthanathan Poongavanam (459234)
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2023

Approaches for predicting proteolysis targeting chimera (PROTAC) cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeable PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binary classification models developed using 17 simple descriptors for large and structurally diverse sets of cereblon (CRBN) and von Hippel–Lindau (VHL) PROTACs. For the VHL PROTAC set, k-nearest neighbor and random forest models performed best and predicted the permeability of a blinded test set with >80% accuracy (κ ≥ 0.57). Models retrained by combining the original training and the blinded test set performed equally well for a second blinded VHL set. However, models for CRBN PROTACs were less successful, mainly due to the imbalanced nature of the CRBN datasets. All descriptors contributed to the models, but size and lipophilicity were the most important. We conclude that properly trained machine learning models can be integrated as effective filters in the PROTAC design process.
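The abstract reports accuracy together with an agreement statistic in parentheses, assumed here to be Cohen's kappa. Kappa matters for imbalanced sets such as the CRBN data because it corrects raw accuracy for the agreement expected by chance. A minimal sketch of both metrics follows. The function name `accuracy_and_kappa` and the example labels are made up for illustration and are not taken from the paper.

```python
import math


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(y_true)
    labels = set(y_true) | set(y_pred)

    # Observed agreement is plain accuracy.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement from the marginal label frequencies.
    expected = sum(
        (y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels
    )
    if expected == 1.0:
        # Only one class present in both lists: kappa is undefined.
        return observed, math.nan
    return observed, (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical labels: 1 = permeable, 0 = low permeability.
    measured = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    acc, kappa = accuracy_and_kappa(measured, predicted)
    print(f"accuracy={acc:.2f} kappa={kappa:.2f}")
```

In this made-up example, 80% accuracy corresponds to a kappa of about 0.58 because roughly half of the agreement would be expected by chance.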

Publikationer från Uppsala Universitet

Digitala Vetenskapliga Arkivet - Academic Archive On-line

FigShare

Best of Both Worlds: Combining Pharma Data and State of the Art Modeling Technology To Improve in Silico pKa Prediction

Author: Alexander Hillisch (1353261)
Andreas H. Göller (1353264)
Mario Lobell (1353267)
Robert D. Clark (1353270)
Robert Fraczkiewicz (1353258)
Rolf Schoenneis (1353255)
Ursula Krenz (1353252)

In a unique collaboration between a software company and a pharmaceutical company, we were able to develop a new in silico pKa prediction tool with outstanding prediction quality. An existing pKa prediction method from Simulations Plus based on artificial neural network ensembles (ANNE), microstates analysis, and literature data was retrained with a large homogeneous data set of drug-like molecules from Bayer. The new model was thus built with curated sets of ∼14,000 literature pKa values (∼11,000 compounds, representing literature chemical space) and ∼19,500 pKa values experimentally determined at Bayer Pharma (∼16,000 compounds, representing industry chemical space). Model validation was performed with several test sets consisting of a total of ∼31,000 new pKa values measured at Bayer. For the largest and most difficult test set with >16,000 pKa values that were not used for training, the original model achieved a mean absolute error (MAE) of 0.72, root-mean-square error (RMSE) of 0.94, and squared correlation coefficient (R²) of 0.87. The new model achieves significantly improved prediction statistics, with MAE = 0.50, RMSE = 0.67, and R² = 0.93. It is commercially available as part of the Simulations Plus ADMET Predictor release 7.0. Good predictions are only of value when delivered effectively to those who can use them. The new pKa prediction model has been integrated into Pipeline Pilot and the PharmacophorInformatics (PIx) platform used by scientists at Bayer Pharma. Different output formats allow customized application by medicinal chemists, physical chemists, and computational chemists.
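For readers unfamiliar with the three validation statistics quoted above, the following minimal sketch computes MAE, RMSE, and the squared Pearson correlation coefficient from paired measured and predicted pKa values. The function name `regression_stats` and the example numbers are hypothetical, and the sketch assumes the reported squared correlation coefficient is the squared Pearson r, which the abstract does not spell out.

```python
import math


def regression_stats(measured, predicted):
    """Return (MAE, RMSE, squared Pearson r) for paired value lists."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Squared Pearson correlation between measured and predicted values.
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined for constant values")
    r_squared = cov * cov / (var_m * var_p)
    return mae, rmse, r_squared


if __name__ == "__main__":
    # Made-up pKa values, for illustration only.
    measured_pka = [4.0, 7.0, 9.0]
    predicted_pka = [4.5, 6.5, 9.5]
    mae, rmse, r2 = regression_stats(measured_pka, predicted_pka)
    print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.3f}")
```

RMSE is never smaller than MAE, and a gap between them, as in the reported 0.50 versus 0.67, indicates that some predictions carry larger errors than the typical case.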

FigShare