5 research outputs found

Application of Multivariate Adaptive Regression Splines (MARSplines) for Predicting Hansen Solubility Parameters Based on 1D and 2D Molecular Descriptors Computed from SMILES String

Author: Cysewski Piotr
Jeliński Tomasz
Przybyłek Maciej
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2019

A new method for predicting Hansen solubility parameters (HSPs) was developed by combining the multivariate adaptive regression splines (MARSplines) methodology with a simple multivariable regression involving 1D and 2D PaDEL molecular descriptors. In order to adapt the MARSplines approach to QSPR/QSAR problems, several optimization procedures were proposed and tested. The effectiveness of the obtained models was checked via standard QSPR/QSAR internal validation procedures provided by the QSARINS software and by predicting the solubility classification of polymers and drug-like solid solutes in collections of solvents. By utilizing information derived only from SMILES strings, the obtained models allow for computing all three Hansen solubility parameters: dispersion, polarization, and hydrogen bonding. Although several descriptors are required for proper parameter estimation, the proposed procedure is simple and straightforward and does not require molecular geometry optimization. The obtained HSP values are highly correlated with experimental data, and their application to solubility problems yields essentially the same quality as the original parameters. Based on the provided models, it is possible to characterize any solvent and liquid solute for which HSP data are unavailable.

arXiv.org e-Print Archive

Directory of Open Access Journals

A confidence predictor for logD using conformal regression and a support-vector machine

Author: Arvid Berg
Jonathan Alvarsson
Maris Lapins
Ola Spjuth
Samuel Lampa
Staffan Arvidsson
Wesley Schaal
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Lipophilicity is a major determinant of ADMET properties and of the overall suitability of drug candidates. We have developed large-scale models to predict the water-octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text], with the best-performing nonconformity measure having a median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor. We also publish predicted values at the 90% confidence level for 91 M PubChem structures in RDF format, both for download and through a URI resolver service.

Crossref

ZENODO

Publikationer från Uppsala Universitet

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line

A confidence predictor for logD using conformal regression and a support-vector machine

Author: Arvid Berg
Jonathan Alvarsson
Maris Lapins
Ola Spjuth
Samuel Lampa
Staffan Arvidsson
Wesley Schaal
Publication venue: 'Springer Science and Business Media LLC'

Crossref

RDF Dataset for article: A confidence predictor for logD using conformal regression and a support-vector machine

Author: Alvarsson Jonathan
Arvidsson Staffan
Berg Arvid
Lampa Samuel
Lapins Maris
Schaal Wesley
Spjuth Ola

RDF dataset described in the article "A confidence predictor for logD using conformal regression and a support-vector machine" (Manuscript in preparation). The dataset contains conformal logD values at the 90% confidence level, computed for 91M compounds from PubChem, in RDF format. The .hdt.gz version contains the dataset in RDF HDT format (http://www.rdfhdt.org/), compressed with tar and gzip. The archive contains both the .hdt file and an index file generated by the hdtSearch C++ tool. The .ttl.gz file is a gzipped file in RDF Turtle format (https://www.w3.org/TR/turtle/).

ZENODO

Uncertainty estimation for QSAR models using machine learning methods

Author: Founti Christina Maria
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 01/09/2019

White Rose E-theses Online