Search CORE

2,658 research outputs found

Evaluation of a Bayesian inference network for ligand-based virtual screening

Author: A Abdo
A Bender
AG Maldonado
AN Jain
AR Leach
AR Leach
Beining Chen
Christoph Mueller
CX Zhai
D Metzler
EJ Gardiner
EM Voorhees
G Salton
GW Bemis
H Eckert
H Turtle
J Bajorath
J Hert
J Hert
J-F Truchon
JA Grant
JD Holliday
JP Callan
JP Callan
JR Fischer
K Spärck Jones
K Spärck Jones
N Nikolova
P Prathipati
P Willett
P Willett
P Willett
P Willett
P Willett
Peter Willett
RC Glen
RD Brown
RP Sheridan
RP Sheridan
S Siegel
SJ Edgar
T Lengauer
T Strohman
TI Oprea
WR Greiff
X Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009
Field of study

Background Bayesian inference networks enable the computation of the probability that an event will occur. They have been used previously to rank textual documents in order of decreasing relevance to a user-defined query. Here, we modify the approach to enable a Bayesian inference network to be used for chemical similarity searching, where a database is ranked in order of decreasing probability of bioactivity. Results Bayesian inference networks were implemented using two different types of network and four different types of belief function. Experiments with the MDDR and WOMBAT databases show that a Bayesian inference network can be used to provide effective ligand-based screening, especially when the active molecules being sought have a high degree of structural homogeneity; in such cases, the network substantially out-performs a conventional, Tanimoto-based similarity searching system. However, the effectiveness of the network is much less when structurally heterogeneous sets of actives are being sought. Conclusion A Bayesian inference network provides an interesting alternative to existing tools for ligand-based virtual screening

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

White Rose Research Online

Evaluation of machine-learning methods for ligand-based virtual screening

Author: A Bender
A Bender
A Bender
A Ormerod
A Ormerod
AE Klon
AM Capelli
AR Leach
B Chen
Beining Chen
C Williams
D Hand
D Rogers
D Wilton
DA Cosgrove
David J. Wood
DB Kitchen
DE Clark
DJ Hand
DJ Wilton
DM Hawkins
E Parzen
FL Stahura
G Harper
G Redl
G Schneider
George Papadatos
H Eckert
H Kubinyi
HM Berman
J Aitchison
J Bajorath
J Delaney
J Hert
J Hert
J Hert
JC Saeh
L Hodes
L Hodes
L Hodes
M Congreve
M Glick
M Glick
M Wagener
M Whittle
N Christianini
N Nikolova
Nikolaus Stiefl
P Constans
P Domingos
P Willett
P Willett
P Willett
P Willett
P Willett
Paulette Greenidge
Peter Willett
Q Zhang
R P Sheridan
RD Brown
RD Brown
RD Cramer
RE Carhart
RO Duda
Robert F. Harrison
S Anzali
TJ McNeany
TM Mitchell
Xiao Qing Lewell
XY Xia
YC Martin
YC Martin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

Machine-learning methods can be used for virtual screening by analysing the structural characteristics of molecules of known (in)activity, and we here discuss the use of kernel discrimination and naive Bayesian classifier (NBC) methods for this purpose. We report a kernel method that allows the processing of molecules represented by binary, integer and real-valued descriptors, and show that it is little different in screening performance from a previously described kernel that had been developed specifically for the analysis of binary fingerprint representations of molecular structure. We then evaluate the performance of an NBC when the training-set contains only a very few active molecules. In such cases, a simpler approach based on group fusion would appear to provide superior screening performance, especially when structurally heterogeneous datasets are to be processed

Crossref

White Rose Research Online

VB-MK-LMF: Fusion of drugs, targets and interactions using Variational Bayesian Multiple Kernel Logistic Matrix Factorization

Author: Antal Péter
Bolgár Bence
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

Background Computational fusion approaches to drug-target interaction (DTI) prediction, capable of utilizing multiple sources of background knowledge, were reported to achieve superior predictive performance in multiple studies. Other studies showed that specificities of the DTI task, such as weighting the observations and focusing the side information are also vital for reaching top performance. Method We present Variational Bayesian Multiple Kernel Logistic Matrix Factorization (VB-MK-LMF), which unifies the advantages of (1) multiple kernel learning, (2) weighted observations, (3) graph Laplacian regularization, and (4) explicit modeling of probabilities of binary drug-target interactions. Results VB-MK-LMF achieves significantly better predictive performance in standard benchmarks compared to state-of-the-art methods, which can be traced back to multiple factors. The systematic evaluation of the effect of multiple kernels confirm their benefits, but also highlights the limitations of linear kernel combinations, already recognized in other fields. The analysis of the effect of prior kernels using varying sample sizes sheds light on the balance of data and knowledge in DTI tasks and on the rate at which the effect of priors vanishes. This also shows the existence of ``small sample size'' regions where using side information offers significant gains. Alongside favorable predictive performance, a notable property of MF methods is that they provide a unified space for drugs and targets using latent representations. Compared to earlier studies, the dimensionality of this space proved to be surprisingly low, which makes the latent representations constructed by VB-ML-LMF especially well-suited for visual analytics. The probabilistic nature of the predictions allows the calculation of the expected values of hits in functionally relevant sets, which we demonstrate by predicting drug promiscuity. The variational Bayesian approximation is also implemented for general purpose graphics processing units yielding significantly improved computational time. Conclusion In standard benchmarks, VB-MK-LMF shows significantly improved predictive performance in a wide range of settings. Beyond these benchmarks, another contribution of our work is highlighting and providing estimates for further pharmaceutically relevant quantities, such as promiscuity, druggability and total number of interactions. Availability Data and code are available at http://bioinformatics.mit.bme.hu

Directory of Open Access Journals

Repository of the Academy's Library

Machine learning-guided directed evolution for protein engineering

Author: Arnold Frances H.
Wu Zachary
Yang Kevin K.
Publication venue
Publication date: 19/04/2019
Field of study

Machine learning (ML)-guided directed evolution is a new paradigm for biological design that enables optimization of complex functions. ML methods use data to predict how sequence maps to function without requiring a detailed model of the underlying physics or biological pathways. To demonstrate ML-guided directed evolution, we introduce the steps required to build ML sequence-function models and use them to guide engineering, making recommendations at each stage. This review covers basic concepts relevant to using ML for protein engineering as well as the current literature and applications of this new engineering paradigm. ML methods accelerate directed evolution by learning from information contained in all measured variants and using that information to select sequences that are likely to be improved. We then provide two case studies that demonstrate the ML-guided directed evolution process. We also look to future opportunities where ML will enable discovery of new protein functions and uncover the relationship between protein sequence and function.Comment: Made significant revisions to focus on aspects most relevant to applying machine learning to speed up directed evolutio

arXiv.org e-Print Archive

Caltech Authors

Challenges Predicting Ligand-Receptor Interactions of Promiscuous Proteins: The Nuclear Receptor PXR

Author: A Bender
A Biswas
A Khandelwal
AE Klon
AK Dunker
AK Ghose
B Blumberg
BL Urquhart
CR Yates
CY Ung
D Gupta
D Rogers
D Schuster
DG Teotico
DG Teotico
Erica J. Reschly
G Bertilsson
G Jones
IV Tetko
IV Tetko
J Feng
J Zhou
James M. Briggs
JE Chrencik
JT Metz
K Azzaoui
K Bachmann
K Yasuda
M Hassan
M Suarez
MA Lill
MA Lill
MA Lill
Manisha Iyer
Markus A. Lill
Matthew D. Krasowski
Matthew R. Redinbo
MD Krasowski
MD Krasowski
MN Jacobs
Nidhi
P Prathipati
RD Cramer
RE Watkins
RE Watkins
RE Watkins
S Ekins
S Ekins
S Ekins
S Ekins
S Ekins
S Ekins
S Kortagere
S Kortagere
S Kortagere
S Mani
S Verma
SA Kliewer
Sandhya Kortagere
Sean Ekins
W Mnif
WL DeLano
Y Xue
Y Xue
YD Gao
Publication venue: Public Library of Science
Publication date: 01/01/2009
Field of study

Transcriptional regulation of some genes involved in xenobiotic detoxification and apoptosis is performed via the human pregnane X receptor (PXR) which in turn is activated by structurally diverse agonists including steroid hormones. Activation of PXR has the potential to initiate adverse effects, altering drug pharmacokinetics or perturbing physiological processes. Reliable computational prediction of PXR agonists would be valuable for pharmaceutical and toxicological research. There has been limited success with structure-based modeling approaches to predict human PXR activators. Slightly better success has been achieved with ligand-based modeling methods including quantitative structure-activity relationship (QSAR) analysis, pharmacophore modeling and machine learning. In this study, we present a comprehensive analysis focused on prediction of 115 steroids for ligand binding activity towards human PXR. Six crystal structures were used as templates for docking and ligand-based modeling approaches (two-, three-, four- and five-dimensional analyses). The best success at external prediction was achieved with 5D-QSAR. Bayesian models with FCFP_6 descriptors were validated after leaving a large percentage of the dataset out and using an external test set. Docking of ligands to the PXR structure co-crystallized with hyperforin had the best statistics for this method. Sulfated steroids (which are activators) were consistently predicted as non-activators while, poorly predicted steroids were docked in a reverse mode compared to 5α-androstan-3β-ol. Modeling of human PXR represents a complex challenge by virtue of the large, flexible ligand-binding cavity. This study emphasizes this aspect, illustrating modest success using the largest quantitative data set to date and multiple modeling approaches

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Carolina Digital Repository

D-Scholarship@Pitt

Purdue E-Pubs

Data-driven modelling of biological multi-scale processes

Author: Hasenauer Jan
Hross Sabrina
Jagiella Nick
Theis Fabian J.
Publication venue
Publication date: 01/01/2015
Field of study

Biological processes involve a variety of spatial and temporal scales. A holistic understanding of many biological processes therefore requires multi-scale models which capture the relevant properties on all these scales. In this manuscript we review mathematical modelling approaches used to describe the individual spatial scales and how they are integrated into holistic models. We discuss the relation between spatial and temporal scales and the implication of that on multi-scale modelling. Based upon this overview over state-of-the-art modelling approaches, we formulate key challenges in mathematical and computational modelling of biological multi-scale and multi-physics processes. In particular, we considered the availability of analysis tools for multi-scale models and model-based multi-scale data integration. We provide a compact review of methods for model-based data integration and model-based hypothesis testing. Furthermore, novel approaches and recent trends are discussed, including computation time reduction using reduced order and surrogate models, which contribute to the solution of inference problems. We conclude the manuscript by providing a few ideas for the development of tailored multi-scale inference methods.Comment: This manuscript will appear in the Journal of Coupled Systems and Multiscale Dynamics (American Scientific Publishers

arXiv.org e-Print Archive

PuSH

An Optimal Likelihood Free Method for Biological Model Selection

Author: Hui Elliot E.
Zaballa Vincent D.
Publication venue
Publication date: 03/08/2022
Field of study

Systems biology seeks to create math models of biological systems to reduce inherent biological complexity and provide predictions for applications such as therapeutic development. However, it remains a challenge to determine which math model is correct and how to arrive optimally at the answer. We present an algorithm for automated biological model selection using mathematical models of systems biology and likelihood free inference methods. Our algorithm shows improved performance in arriving at correct models without a priori information over conventional heuristics used in experimental biology and random search. This method shows promise to accelerate biological basic science and drug discovery.Comment: 2022 International Conference on Machine Learning Workshop on Computational Biolog

arXiv.org e-Print Archive