47 research outputs found

Structural similarity assessment for drug sensitivity prediction in cancer

Author: A Monks
DS Gilmour
DT Ross
E Sayers
J Khan
J Perret
JE Staunton
JK Lee
LM Shi
LM Shi
LN Harris
Michael Krauthammer
Pavithra Shivakumar
SJ Swamidass
T Sørlie
TR Golub
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract
Background: The ability to predict drug sensitivity in cancer is one of the exciting promises of pharmacogenomic research. Several groups have demonstrated the ability to predict drug sensitivity by integrating chemo-sensitivity data and associated gene expression measurements from large anti-cancer drug screens such as NCI-60. The general approach is based on comparing gene expression measurements from sensitive and resistant cancer cell lines and deriving drug sensitivity profiles consisting of lists of genes whose expression is predictive of response to a drug. Importantly, it has been shown that such profiles are generic and can be applied to cancer cell lines that are not part of the anti-cancer screen. However, one limitation is that profiles cannot be generated for untested drugs (i.e., drugs that are not part of an anti-cancer drug screen). In this work, we propose using an existing drug sensitivity profile for drug A as a substitute for an untested drug B, given high structural similarity between drugs A and B.
Results: We first show that structural similarity between pairs of compounds in the NCI-60 dataset highly correlates with the similarity between their activities across the cancer cell lines. This result shows that structurally similar drugs can be expected to have a similar effect on cancer cell lines. We next set out to test our hypothesis that we can use existing drug sensitivity profiles as substitute profiles for untested drugs. In a cross-validation experiment, we found that the use of substitute profiles is possible without a significant loss of prediction accuracy if the substitute profile was generated from a compound with high structural similarity to the untested compound.
Conclusion: Anti-cancer drug screens are a valuable resource for generating omics-based drug sensitivity profiles. We show that it is possible to extend the usefulness of existing screens to untested drugs by deriving substitute sensitivity profiles from structurally similar drugs that are part of the screen.
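As a rough illustration of the substitute-profile idea in this abstract, the sketch below picks, for an untested compound, the screened compound with the highest structural similarity. The abstract does not name a similarity measure; the Tanimoto coefficient over sets of substructure features, the function names, and the data layout are all assumptions made for illustration, not the authors' method.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two sets of structural features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(a | b)


def best_substitute(untested_features, screened):
    """Return (name, similarity) of the screened compound closest to the untested one.

    `screened` maps a compound name to its feature set (hypothetical layout).
    """
    return max(
        ((name, tanimoto(untested_features, feats)) for name, feats in screened.items()),
        key=lambda pair: pair[1],
    )
```

In practice one would only reuse the substitute's sensitivity profile when the returned similarity is high; the abstract does not give a threshold, so none is assumed here.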

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

WENDI: A tool for finding non-obvious relationships between compounds and biological properties, genes, diseases and scholarly publications

Author: B Chen
David J Wild
DJ Wild
DJ Wild
E Willighagen
F Belleau
GM Cramer
H Wang
J Hur
JL Durant
MA Johnson
Michael S Lajiness
PJ Ballester
Qian Zhu
R Mullin
SJ Swamidass
X Dong
X Dong
Ying Ding
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract
Background: In recent years, there has been a huge increase in the amount of publicly available and proprietary information pertinent to drug discovery. However, there is a distinct lack of data mining tools available to harness this information, in particular for knowledge discovery across multiple information sources. At Indiana University we have an ongoing project with Eli Lilly to develop web-service-based tools for integrative mining of chemical and biological information. In this paper, we report on the first of these tools, called WENDI (Web Engine for Non-obvious Drug Information), which attempts to find non-obvious relationships between a query compound and scholarly publications, biological properties, genes and diseases using multiple information sources.
Results: We have created an aggregate web service that takes a query compound as input, calls multiple web services for computation and database search, and returns an XML file that aggregates this information. We have also developed a client application that provides an easy-to-use interface to this web service. Both the service and client are publicly available.
Conclusions: Initial testing indicates that this tool is useful in identifying potential biological applications of compounds that are not obvious, and in identifying corroborating and conflicting information from multiple sources. We encourage feedback on the tool to help us refine it further. We are now developing further tools based on this model.

Crossref

Springer - Publisher Connector

IUScholarWorks (Indiana University)

Directory of Open Access Journals

PubMed Central

Interpreting linear support vector machine models with heat map molecule coloring

Author: A Bender
Andreas Jahn
Andreas Zell
B Schölkopf
C Steinbeck
D Bossemeyer
D Fourches
D Rogers
D Weininger
G Hinselmann
Georg Hinselmann
H Kubinyi
I Guyon
J Bajorath
J Kazius
J Mohr
J Orts
K Hasegawa
KD Freeman-Cook
KH Bleicher
L Han
L Prade
L Ralaivola
Lars Rosenbaum
MS Buchanan
N Fechner
P Jonathan
RE Fan
SG Rohrer
SJ Swamidass
SM Free
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract
Background: Model-based virtual screening plays an important role in the early drug discovery stage. The outcomes of high-throughput screenings are a valuable source from which machine learning algorithms can infer such models. Besides strong performance, interpretability of a machine learning model is a desired property for guiding the optimization of a compound in later drug discovery stages. Linear support vector machines have been shown to perform convincingly on large-scale data sets. The goal of this study is to present a heat map molecule coloring technique to interpret linear support vector machine models. Based on the weights of a linear model, the visualization approach colors each atom and bond of a compound according to its importance for activity.
Results: We evaluated our approach on a toxicity data set, a chromosome aberration data set, and the maximum unbiased validation data sets. The experiments show that our method sensibly visualizes structure-property and structure-activity relationships of a linear support vector machine model. The coloring of ligands in the binding pocket of several crystal structures of a maximum unbiased validation data set target indicates that our approach helps to determine the correct ligand orientation in the binding pocket. Additionally, the heat map coloring enables the identification of substructures important for the binding of an inhibitor.
Conclusions: In combination with heat map coloring, linear support vector machine models can help to guide the modification of a compound in later stages of drug discovery. In particular, substructures identified as important by our method might be a starting point for optimization of a lead compound. The heat map coloring should be considered complementary to structure-based modeling approaches. As such, it helps to get a better understanding of the binding mode of an inhibitor.
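The weight-based coloring described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: it assumes each model feature is a substructure covering a known list of atom indices, and that an atom's importance is the sum of the weights of the features covering it. The names `atom_scores` and `normalize`, and both input dictionaries, are hypothetical.

```python
from collections import defaultdict


def atom_scores(feature_weights, feature_atoms):
    """Sum linear-model weights over the atoms that each feature covers.

    feature_weights: feature id -> weight from the linear model (assumed layout)
    feature_atoms:   feature id -> atom indices of that substructure in the molecule
    """
    scores = defaultdict(float)
    for feature, atoms in feature_atoms.items():
        weight = feature_weights.get(feature, 0.0)  # features unknown to the model add nothing
        for atom in atoms:
            scores[atom] += weight
    return dict(scores)


def normalize(scores):
    """Scale scores to [-1, 1] so they can be mapped onto a diverging color scale."""
    max_abs = max((abs(v) for v in scores.values()), default=0.0) or 1.0
    return {atom: value / max_abs for atom, value in scores.items()}
```

A renderer could then map negative values to one color and positive values to another; the color scheme itself is left out because the abstract does not specify it.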

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Prediction of chemical compounds properties using a deep learning model

Author: A Agrawal
A Agrawal
A Koutsoukas
A Lusci
A Mayr
A Shivanyuk
AG Gagorik
AP Bento
AP Bradley
B Cox
C Zhang
D Bajusz
D Butina
D Weininger
D Wishart
FP Miller
HH Aghdam
J Irwin
J Ker
M Davies
M Klose
M Mozaffar
M Popova
MD Hoffman
O Kramer
R Gómez-Bombarelli
S Simplified
SG Rohrer
SJ Swamidass
SM Kearnes
T Dietterich
Y Zhang
Z Wu
Z Wu
Publication venue: Springer Science and Business Media LLC
Publication date: 31/10/2021

Crossref

Ulster University's Research Portal

ROC Curves for the Statistical Analysis of Microarray Data

Author: C Liu
CA Tsai
CA Tsai
CE Metz
D Berrar
D Ghosh
DM Green
DV Zaykin
HM Hsueh
HP Chan
JA Hanley
JA Swets
JD Mari
JJ Chen
L Kang
M Greiner
MS Pepe
MS Pepe
MS Pepe
MS Pepe
MS Pepe
MS Pepe
PH Westfall
R Tibshirani
RR Delongchamp
SG Baker
SJ Swamidass
T Hastie
T Hastie
T Rankinen
TA Alonzo
VG Tusher
W Gu
Y Benjamini
Y Benjamini
Y Hochberg
Y Huang
Publication venue: Springer Nature
Publication date: 01/01/2019

This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/978-1-4939-9442-7_11
Abstract: A receiver operating characteristic (ROC) curve is a graphical plot that illustrates the diagnostic ability of a binary classifier as a function of its discrimination threshold. This chapter is an overview of the use of ROC curves for microarray data. The notion of the ROC curve and its motivation are introduced in Subheading 1. Relevant scientific contributions concerning the use of ROC curves for microarray data are briefly reviewed in Subheading 2. The special case with covariates is considered in Subheading 3. Two relevant aspects are reviewed in this section: the use of LASSO techniques for selecting and combining relevant markers, and how to correct for multiple testing when a large number of markers are available. Finally, some conclusions are included.
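To make the definition of a ROC curve in this abstract concrete, here is a small self-contained sketch that sweeps the discrimination threshold over a marker's scores and records the false and true positive rates at each step. It is a generic textbook construction, not code from the chapter; the function names and the trapezoidal area estimate are choices made for illustration.

```python
def roc_points(labels, scores):
    """Return ROC points (false positive rate, true positive rate).

    labels: 1 for positive samples, 0 for negative samples
    scores: marker values, higher meaning "more likely positive"
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

For a single gene, `labels` would be the class of each array and `scores` its expression values; an area near 0.5 indicates a marker that does not discriminate between the classes.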

Repositorio da Universidade da Coruña

Crossref

A constructive approach for discovering new drug leads: Using a kernel methodology for the inverse-QSAR problem

Author: A Tatsuya
A Tatsuya
AC Good
AC Good
B Mak
BB Masek
C Steinbeck
C Steinbeck
CA Azencott
CJ Churchwell
DB Reitz
FJ Burkowski
Forbes J Burkowski
GH Bakir
HC Huang
J Shawe-Taylor
JJ Sutherland
JL Faulon
JL Faulon
JL Faulon
JL Faulon
JTY Kwok
JW Robin
K-R Müller
KA Sharp
L Ralaivola
LB Kier
LH Hall
LH Hall
MI Skvortsova
N Brown
P Chavatte
P Mahe
P Mahe
PA Pevzner
R Todeschini
RA Lewis
RC Glenn
RP Sheridan
S Mika
SJ Swamidass
V Kvasnicka
V Venkatasubramanian
VJ Gillet
William WL Wong
X Leval
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract
Background: The inverse-QSAR problem seeks to find a new molecular descriptor from which one can recover the structure of a molecule that possesses a desired activity or property. Surprisingly, there are very few papers providing solutions to this problem. It is a difficult problem because the molecular descriptors involved with the inverse-QSAR algorithm must adequately address the forward QSAR problem for a given biological activity if the subsequent recovery phase is to be meaningful. In addition, one should be able to construct a feasible molecule from such a descriptor. The difficulty of recovering the molecule from its descriptor is the major limitation of most inverse-QSAR methods.
Results: In this paper, we describe the reversibility of our previously reported descriptor, the vector space model molecular descriptor (VSMMD), which is based on a vector space model that is suitable for kernel studies in QSAR modeling. Our inverse-QSAR approach can be described using five steps: (1) generate the VSMMD for the compounds in the training set; (2) map the VSMMD in the input space to the kernel feature space using an appropriate kernel function; (3) design or generate a new point in the kernel feature space using a kernel feature space algorithm; (4) map the feature space point back to the input space of descriptors using a pre-image approximation algorithm; (5) build the molecular structure template using our VSMMD molecule recovery algorithm.
Conclusion: The empirical results reported in this paper show that our strategy of using kernel methodology for an inverse quantitative structure-activity relationship is sufficiently powerful to find a meaningful solution for practical problems.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Structure-activity models of oral clearance, cytotoxicity, and LD50: a screen for promising anticancer compounds

Author: A Hoskuldsson
AE Soffers
AK Saxena
AL Boulesteix
CW Andrews
D Thai
D Zmuidinavicius
DM Hawkins
DV Nguyen
F Yoshida
G Fort
G Lou
G Wang
H Gonzalez-Diaz
H Wold
HJ Pieniaszek Jr.
IO Juranic
J Ghasemi
J Tunkel
J Wegelin
JC Boik
JC Madden
John C Boik
JR Votano
JV Turner
K Yu
L Eriksson
L Ralaivola
M Ashton
M Momma
M Olah
M Pintore
M Zahouily
MA Perez
MD Wessel
N Brown
O Isayev
P Buchwald
R Burgos-Vargas
R Caruana
R Rosipal
RK Ando
Robert A Newman
S Ben-David
S Rannar
SJ Swamidass
T Evgeniou
T Hou
T Niwa
T Wajima
TE Yen
TI Oprea
W Deng
W Halle
WJ Hunter
Y Xue
YH Zhao
YH Zhao
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

Confidence limits, error bars and method comparison in molecular modeling. Part 2: comparing methods

Author: A Nicholls
A. Nicholls
C Woolston
GY Zou
H Motulsky
H Theil
I Rivals
J Cohen
J Tukey
K Pearson
K Pearson
L Wasserman
M Keuls
MWL Cheung
OJ Dunn
PF Sullivan
R Fisher
S Holm
SJ Swamidass
ST Ziliak
VE Johnson
Y Hochberg
Publication venue: Springer Science and Business Media LLC

Crossref

All-atom/coarse-grained hybrid predictions of distribution coefficients in SAMPL5

Author: A Jakalian
AV Marenich
C Abrams
D Perez
D Shivakumar
DL Mobley
DL Mobley
DL Mobley
HJC Berendsen
J Wang
J Zhang
J Zhang
J-P Ryckaert
JG Kirkwood
JL Knight
Jonathan W. Essex
JP Guthrie
M Orsi
M Orsi
M Orsi
MG Saunders
MR Shirts
MT Geballe
MT Geballe
P Maragakis
PH Hünenberger
RO Dror
RW Hockney
S Genheden
S Genheden
Samuel Genheden
SJ Marrink
SJ Swamidass
T Schlick
WG Noid
X Periole
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

We present blind predictions submitted to the SAMPL5 challenge on calculating distribution coefficients. The predictions were based on estimating the solvation free energies in water and cyclohexane of the 53 compounds in the challenge. These free energies were computed using alchemical free energy simulations based on a hybrid all-atom/coarse-grained model. The compounds were treated with the general Amber force field, whereas the solvent molecules were treated with the Elba coarse-grained model. Considering the simplicity of the solvent model and that we approximate the distribution coefficient with the partition coefficient of the neutral species, the predictions are of good accuracy. The correlation coefficient R is 0.64, 82% of the predictions have the correct sign, and the mean absolute deviation is 1.8 log units. This is on a par with or better than the other simulation-based predictions in the challenge. We present an analysis of the deviations from experiment and compare the predictions to another submission that used all-atom solvent. Electronic supplementary material: The online version of this article (doi:10.1007/s10822-016-9926-z) contains supplementary material, which is available to authorized users.
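The step from solvation free energies to a partition coefficient mentioned in this abstract follows the standard transfer free energy relation, log P = (ΔG_water − ΔG_cyclohexane) / (RT ln 10). The sketch below applies it and computes the two summary statistics the abstract reports. The units (kcal/mol), the default temperature of 298.15 K, and all function names are assumptions; the abstract does not state the conditions used.

```python
import math

GAS_CONSTANT = 0.0019872041  # kcal/(mol*K); unit choice is an assumption


def log_partition(dg_water, dg_cyclohexane, temperature=298.15):
    """Cyclohexane/water log P of the neutral species from solvation free energies.

    A more negative (more favorable) solvation free energy in cyclohexane
    than in water gives a positive log P.
    """
    return (dg_water - dg_cyclohexane) / (GAS_CONSTANT * temperature * math.log(10))


def mean_absolute_deviation(predicted, experimental):
    """Mean absolute difference between predictions and experiment, in log units."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def fraction_correct_sign(predicted, experimental):
    """Fraction of predictions whose sign matches the experimental value."""
    matches = sum(1 for p, e in zip(predicted, experimental) if (p > 0) == (e > 0))
    return matches / len(predicted)
```

At the assumed temperature, a 2 kcal/mol preference for cyclohexane corresponds to roughly 1.5 log units, which gives a sense of scale for the reported mean absolute deviation of 1.8 log units.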

Crossref

Springer - Publisher Connector

PubMed Central

Swepub

Beyond new chemical entities

Author: Cascorbi I
Shao L
Swamidass SJ
Publication venue: Informa UK Limited

Crossref