
2,086 research outputs found

Activity, assay and target data curation and quality in the ChEMBL database

Publication venue: Springer
Publication date: 01/09/2015

Chemical databases: curation or integration by user-defined equivalence?

Author: Bellis Louisa
Chambers Jon
Gaulton Anna
Hersey Anne
Overington John P.
Patrícia Bento A.
Publication venue: The Authors. Published by Elsevier Ltd.
Publication date: 31/07/2015

There is a wealth of valuable chemical information in publicly available databases for use by scientists undertaking drug discovery. However, finite curation resource, limitations of chemical structure software and differences in individual database applications mean that exact chemical structure equivalence between databases is unlikely ever to be a reality. Identifying compound equivalence has been made significantly easier by the International Chemical Identifier (InChI), a non-proprietary line notation for describing a chemical structure. More importantly, advances in methods to identify compounds that are the same at various levels of similarity, such as those containing the same parent component or having the same connectivity, are now enabling related compounds to be linked between databases where the structure matches are not exact.
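
As a rough illustration of matching "at various levels of similarity", the sketch below compares two InChI strings first exactly and then on the formula and connectivity (`/c`) layers only. It is not the method of the paper: the function names `inchi_layers` and `match_level` are hypothetical, the layer handling is simplified, and the two example strings are assumed to be the standard InChIs of the alanine enantiomers.

```python
def inchi_layers(inchi):
    """Split an InChI string into layers keyed by their prefix letter.

    Simplified: assumes the second field is the molecular formula, which
    does not hold for every InChI (e.g. proton-only identifiers).
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi.split("/")
    layers = {"version": parts[0][len("InChI="):]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # Keep the first occurrence of each layer prefix (c, h, t, m, s, ...).
            layers.setdefault(part[0], part[1:])
    return layers


def match_level(a, b):
    """Return 'exact', 'connectivity' or 'none' for two InChI strings."""
    if a == b:
        return "exact"
    la, lb = inchi_layers(a), inchi_layers(b)
    same_skeleton = all(la.get(k) == lb.get(k) for k in ("formula", "c"))
    return "connectivity" if same_skeleton else "none"


# Example strings (assumed, for illustration only).
L_ALA = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
D_ALA = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

print(match_level(L_ALA, L_ALA))    # identical strings
print(match_level(L_ALA, D_ALA))    # differ only in the stereo layers
print(match_level(L_ALA, ETHANOL))  # different skeletons
```

Comparing whole strings first and falling back to a subset of layers keeps the exact match cheap while still linking records that differ only in stereochemistry or other later layers.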

Elsevier - Publisher Connector

Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty.

Author: Afzal Avid M
Barrett Ian P
Bender Andreas
Engkvist Ola
Mervin Lewis H
Trapotsi Maria-Anna
Publication venue: J Cheminform
Publication date: 01/01/2021

Measurements of protein-ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have such unavoidable errors influencing its performance, which should ideally be factored into modelling and output predictions, such as the actual standard deviation of experimental measurements (σ) or the associated comparability of activity values between the aggregated heterogeneous activity units (i.e., Ki versus IC50 values) during dataset assimilation. However, experimental errors are usually a neglected aspect of model generation. In order to improve upon the current state-of-the-art, we herein present a novel approach toward predicting protein-ligand interactions using a Probabilistic Random Forest (PRF) classifier. The PRF algorithm was applied toward in silico protein target prediction across ~ 550 tasks from ChEMBL and PubChem. Predictions were evaluated by taking into account various scenarios of experimental standard deviations in both training and test sets, and performance was assessed using fivefold stratified shuffled splits for validation. The largest benefit in incorporating the experimental deviation in PRF was observed for data points close to the binary threshold boundary, when such information was not considered in any way in the original RF algorithm. For example, in cases when σ ranged between 0.4 and 0.6 log units and when ideal probability estimates were between 0.4 and 0.6, the PRF outperformed RF with a median absolute error margin of ~ 17%. In comparison, the baseline RF outperformed PRF for cases with high confidence to belong to the active class (far from the binary decision threshold), although the RF models gave errors smaller than the experimental uncertainty, which could indicate that they were overtrained and/or over-confident.
Finally, the PRF models trained with putative inactives decreased the performance compared to PRF models without putative inactives, and this could be because putative inactives were not assigned an experimental pXC50 value, and therefore they were considered inactives with a low uncertainty (which in practice might not be true). In conclusion, PRF can be useful for target prediction models, in particular for data where class boundaries overlap with the measurement uncertainty, and where a substantial part of the training data is located close to the classification threshold.
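
One way to see why uncertainty matters most near the threshold is to turn a measured activity and its σ into a probability of being active. The sketch below assumes a Gaussian error model around the measured pXC50; this formula is an assumption made here for illustration and is not quoted from the abstract, and `prob_active` is a hypothetical name.

```python
import math


def prob_active(pxc50, threshold, sigma):
    """Probability that the true activity lies above the threshold.

    Assumes (for illustration) that the true value is normally distributed
    around the measured pxc50 with standard deviation sigma (log units).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (pxc50 - threshold) / sigma
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


threshold = 5.0
for value in (5.0, 5.2, 7.0):
    for sigma in (0.2, 0.6):
        p = prob_active(value, threshold, sigma)
        print(f"pXC50={value:.1f} sigma={sigma:.1f} P(active)={p:.3f}")
```

Under this model a measurement exactly at the threshold gives a probability of 0.5 whatever the σ, a measurement far above it stays close to 1, and only values near the boundary change noticeably as σ grows, which matches the region where the abstract reports the largest benefit.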

Directory of Open Access Journals

Chalmers Research

Apollo (Cambridge)

FigShare

A maximum common substructure-based algorithm for searching and predicting drug-like compounds

Author: Abt
Altman
Barrow
Berretti
Blower
Bunke
Bunke
Burden
Carhart
Chen
Cheng
Christianini
Cone
Conte
Cordella
Cordella
Dean
Deshpande
Dimitriadou
Dobson
Dumay
Engels
Garey
Girke
Gonzalez
Hagadone
Holder
James
Johnson
King
Koch
Levi
Luo
McGregor
Moriguchi
Provost
Raymond
Raymond
Sheridan
Tao Jiang
Thomas Girke
Tsai
Wang
Wheeler
Willett
Wilson
Yan
Yiqun Cao
Publication venue: Oxford University Press
Publication date: 01/07/2008

Motivation: The prediction of biologically active compounds is of great importance for high-throughput screening (HTS) approaches in drug discovery and chemical genomics. Many computational methods in this area focus on measuring the structural similarities between chemical structures. However, traditional similarity measures are often too rigid or consider only global similarities between structures. The maximum common substructure (MCS) approach provides a more promising and flexible alternative for predicting bioactive compounds.

Crossref

PubMed Central

eScholarship - University of California

Predicting drug metabolism: experiment and/or computation?

Author: Glen Robert C
Göller Andreas H
Kirchmair Johannes
Kunze Jens
Lang Dieter
Schneider Gisbert
Testa Bernard
Wilson Ian D
Publication venue: Nat Rev Drug Discov
Publication date: 01/01/2015

Drug metabolism can produce metabolites with physicochemical and pharmacological properties that differ substantially from those of the parent drug, and consequently has important implications for both drug safety and efficacy. To reduce the risk of costly clinical-stage attrition due to the metabolic characteristics of drug candidates, there is a need for efficient and reliable ways to predict drug metabolism in vitro, in silico and in vivo. In this Perspective, we provide an overview of the state of the art of experimental and computational approaches for investigating drug metabolism. We highlight the scope and limitations of these methods, and indicate strategies to harvest the synergies that result from combining measurement and prediction of drug metabolism. This is the accepted manuscript of a paper published in Nature Reviews Drug Discovery (Kirchmair J, Göller AH, Lang D, Kunze J, Testa B, Wilson ID, Glen RC, Schneider G, Nature Reviews Drug Discovery, 2015, 14, 387–404, doi:10.1038/nrd4581). The final version is available at http://dx.doi.org/10.1038/nrd458

Serveur académique lausannois

Spiral - Imperial College Digital Repository

Apollo (Cambridge)

Investigating dietary vitamin D in Australia

Author: Dunlop Eleanor Shu-ying
Publication venue: Curtin University
Publication date: 01/01/2022

The high prevalence of vitamin D deficiency in Australia is concerning, as vitamin D is essential for bone health. This thesis provides the evidence required to explore food-based strategies to improve vitamin D status in the Australian population by: developing Australia’s first comprehensive vitamin D food composition database; generating Australia’s first population-representative estimate of usual intakes of vitamin D; and evaluating the effect of vitamin D food fortification on vitamin D status.

espace@Curtin

Stochastic modeling of near-field exposure to parabens in personal care products

Author: Csiszar Susan A.
Ernstoff Alexi
Fantke Peter
Jolliet Olivier
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Online Research Database In Technology

VB-MK-LMF: Fusion of drugs, targets and interactions using Variational Bayesian Multiple Kernel Logistic Matrix Factorization

Author: Antal Péter
Bolgár Bence
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Background: Computational fusion approaches to drug-target interaction (DTI) prediction, capable of utilizing multiple sources of background knowledge, were reported to achieve superior predictive performance in multiple studies. Other studies showed that specificities of the DTI task, such as weighting the observations and focusing the side information, are also vital for reaching top performance. Method: We present Variational Bayesian Multiple Kernel Logistic Matrix Factorization (VB-MK-LMF), which unifies the advantages of (1) multiple kernel learning, (2) weighted observations, (3) graph Laplacian regularization, and (4) explicit modeling of probabilities of binary drug-target interactions. Results: VB-MK-LMF achieves significantly better predictive performance in standard benchmarks compared to state-of-the-art methods, which can be traced back to multiple factors. The systematic evaluation of the effect of multiple kernels confirms their benefits, but also highlights the limitations of linear kernel combinations, already recognized in other fields. The analysis of the effect of prior kernels using varying sample sizes sheds light on the balance of data and knowledge in DTI tasks and on the rate at which the effect of priors vanishes. This also shows the existence of "small sample size" regions where using side information offers significant gains. Alongside favorable predictive performance, a notable property of MF methods is that they provide a unified space for drugs and targets using latent representations. Compared to earlier studies, the dimensionality of this space proved to be surprisingly low, which makes the latent representations constructed by VB-MK-LMF especially well-suited for visual analytics. The probabilistic nature of the predictions allows the calculation of the expected values of hits in functionally relevant sets, which we demonstrate by predicting drug promiscuity.
The variational Bayesian approximation is also implemented for general purpose graphics processing units, yielding significantly improved computational time. Conclusion: In standard benchmarks, VB-MK-LMF shows significantly improved predictive performance in a wide range of settings. Beyond these benchmarks, another contribution of our work is highlighting and providing estimates for further pharmaceutically relevant quantities, such as promiscuity, druggability and total number of interactions. Availability: Data and code are available at http://bioinformatics.mit.bme.hu
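
The core idea of logistic matrix factorization, and of summing interaction probabilities to get an expected number of hits, can be sketched in a few lines. This is a minimal point-estimate illustration only: it leaves out the variational Bayesian inference, multiple kernels, observation weights and Laplacian regularization named in the abstract, and the latent vectors and function names below are made up for the example.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def interaction_prob(drug_vec, target_vec):
    """P(interaction) as the sigmoid of the latent inner product."""
    score = sum(d * t for d, t in zip(drug_vec, target_vec))
    return sigmoid(score)


def expected_hits(drug_vec, target_vecs):
    """Expected number of interacting targets in a set.

    By linearity of expectation this is the sum of the per-target
    probabilities; it can serve as a simple promiscuity estimate.
    """
    return sum(interaction_prob(drug_vec, t) for t in target_vecs)


# Hypothetical two-dimensional latent vectors (not taken from the paper).
drug = [1.0, -0.5]
targets = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]

for t in targets:
    print(t, round(interaction_prob(drug, t), 3))
print("expected hits:", round(expected_hits(drug, targets), 3))
```

Because drugs and targets share one low-dimensional latent space, the same vectors used for prediction can also be plotted directly, which is the property the abstract highlights for visual analytics.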

Directory of Open Access Journals

Repository of the Academy's Library