178 research outputs found
Computer-Aided Multi-Objective Optimization in Small Molecule Discovery
Molecular discovery is a multi-objective optimization problem that requires
identifying a molecule or set of molecules that balance multiple, often
competing, properties. Multi-objective molecular design is commonly addressed
by combining properties of interest into a single objective function using
scalarization, which imposes assumptions about relative importance and uncovers
little about the trade-offs between objectives. In contrast to scalarization,
Pareto optimization does not require knowledge of relative importance and
reveals the trade-offs between objectives. However, it introduces additional
considerations in algorithm design. In this review, we describe pool-based and
de novo generative approaches to multi-objective molecular discovery with a
focus on Pareto optimization algorithms. We show how pool-based molecular
discovery is a relatively direct extension of multi-objective Bayesian
optimization and how the plethora of different generative models extend from
single-objective to multi-objective optimization in similar ways using
non-dominated sorting in the reward function (reinforcement learning) or to
select molecules for retraining (distribution learning) or propagation (genetic
algorithms). Finally, we discuss some remaining challenges and opportunities in
the field, emphasizing the opportunity to adopt Bayesian optimization
techniques into multi-objective de novo design.
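The contrast between scalarization and Pareto optimization described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the review: the two objectives (named potency and solubility here), the candidate scores, the weights, and the helper names `dominates`, `non_dominated_sort`, and `weighted_sum` are all hypothetical. Both objectives are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Group point indices into successive Pareto fronts.

    Front 0 holds the non-dominated points; each later front is
    non-dominated once the earlier fronts are removed.
    Simple O(n^2)-per-front version, kept short for readability.
    """
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def weighted_sum(point: Sequence[float], weights: Sequence[float]) -> float:
    """Scalarize several objectives into one score with fixed weights."""
    return sum(w * x for w, x in zip(weights, point))


# Hypothetical candidates scored as (potency, solubility); illustrative values only.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]

fronts = non_dominated_sort(candidates)

# Scalarization returns a single winner, and the winner depends on the weights.
best_equal = max(range(len(candidates)),
                 key=lambda i: weighted_sum(candidates[i], (0.5, 0.5)))
best_potency = max(range(len(candidates)),
                   key=lambda i: weighted_sum(candidates[i], (0.9, 0.1)))

print("Pareto fronts:", fronts)
print("Weighted-sum winner, equal weights:", best_equal)
print("Weighted-sum winner, potency-heavy weights:", best_potency)
```

In this toy set the first front contains three mutually non-dominated candidates, which exposes the trade-off directly. Each weighted sum picks only one of them, and a different one when the weights change. The front ranking computed here is the kind of quantity the abstract describes feeding into a reward function, a retraining selection step, or genetic-algorithm propagation.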
Modeling reactivity to biological macromolecules with a deep multitask network
Most small-molecule drug candidates fail before entering the market,
frequently because of unexpected toxicity. Often, toxicity is detected
only late in drug development, because many types of toxicities, especially
idiosyncratic adverse drug reactions (IADRs), are particularly hard
to predict and detect. Moreover, drug-induced liver injury (DILI)
is the most frequent reason drugs are withdrawn from the market and
causes 50% of acute liver failure cases in the United States. A common
mechanism often underlies many types of drug toxicities, including
both DILI and IADRs. Drugs are bioactivated by drug-metabolizing enzymes
into reactive metabolites, which then conjugate to sites in proteins
or DNA to form adducts. DNA adducts are often mutagenic and may alter
the reading and copying of genes and their regulatory elements, causing
gene dysregulation and even triggering cancer. Similarly, protein
adducts can disrupt their normal biological functions and induce harmful
immune responses. Unfortunately, reactive metabolites are not reliably
detected by experiments, and it is also expensive to test drug candidates
for potential to form DNA or protein adducts during the early stages
of drug development. In contrast, computational methods have the potential
to quickly screen for covalent binding potential, thereby flagging
problematic molecules and reducing the total number of necessary experiments.
Here, we train a deep convolution neural network, the XenoSite
reactivity model, using literature data to accurately predict
both sites and probability of reactivity for molecules with glutathione,
cyanide, protein, and DNA. On the site level, cross-validated predictions
had area under the curve (AUC) performances of 89.8% for DNA and 94.4%
for protein. Furthermore, the model separated molecules electrophilically
reactive with DNA and protein from nonreactive molecules with cross-validated
AUC performances of 78.7% and 79.8%, respectively. On both the site-
and molecule-level, the model significantly
outperformed reactivity indices derived from quantum simulations that
are reported in the literature. Moreover, we developed and applied
a selectivity score to assess preferential reactions with the macromolecules
as opposed to the common screening traps. For the entire data set
of 2803 molecules, this approach yielded totals of 257 (9.2%) and
227 (8.1%) molecules predicted to be reactive only with DNA and protein,
respectively, and hence those that would be missed by standard reactivity
screening experiments. Site of reactivity data is an underutilized
resource that can be used to not only predict if molecules are reactive,
but also show where they might be modified to reduce toxicity while
retaining efficacy. The XenoSite reactivity model is available at http://swami.wustl.edu/xenosite/p/reactivity
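The AUC figures quoted above can be read as the probability that a randomly chosen reactive site (or molecule) is scored higher than a randomly chosen nonreactive one. The following is a minimal sketch of that rank-based definition. The labels and scores are made-up toy values, not data from the XenoSite study, and `roc_auc` is a hypothetical helper name, not part of the XenoSite software.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve computed as a pairwise rank statistic.

    Counts the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = reactive, 0 = nonreactive (illustrative values only).
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The same function applies at either level mentioned in the abstract: with one label and score per atom it gives a site-level AUC, and with one per molecule it gives a molecule-level AUC.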
Chemical Information Bulletin
Periodic supplement for "the regular journals of the American Chemical Society," containing annotated bibliographies of chemical documentation literature as well as information about meetings, conferences, awards, scholarships, and other news from the American Chemical Society (ACS) Division of Chemical Literature
Chemical Information Bulletin
Created as a supplement for "the regular journals of the American Chemical Society," this publication contains annotated bibliographies of chemical documentation literature as well as information about meetings, conferences, awards, scholarships, and other news from the American Chemical Society (ACS) Division of Chemical Information (CINF)
Development and implementation of in silico molecule fragmentation algorithms for the cheminformatics analysis of natural product spaces
Computational methodologies extracting specific substructures like functional groups or molecular scaffolds from input molecules can be grouped under the term “in silico molecule fragmentation”. They can be used to investigate what specifically characterises a heterogeneous compound class, like pharmaceuticals or Natural Products (NP), and in which aspects they are similar or dissimilar. The aim is to determine what specifically characterises NP structures to transfer patterns favourable for bioactivity to drug development. As part of this thesis, the first algorithmic approach to in silico deglycosylation, the removal of glycosidic moieties for the study of aglycones, was developed with the Sugar Removal Utility (SRU) (Publication A). The SRU has also proven useful for investigating NP glycoside space. It was applied to one of the largest open NP databases, COCONUT (COlleCtion of Open Natural prodUcTs), for this purpose (Publication B). A contribution was made to the Chemistry Development Kit (CDK) by developing the open Scaffold Generator Java library (Publication C). Scaffold Generator can extract different scaffold types and dissect them into smaller parent scaffolds following the scaffold tree or scaffold network approach. Publication D describes the OngLai algorithm, the first automated method to identify homologous series in input datasets, group the member structures of each series, and extract their common core. To support the development of new fragmentation algorithms, the open Java rich client graphical user interface application MORTAR (MOlecule fRagmenTAtion fRamework) was developed as part of this thesis (Publication E). MORTAR allows users to quickly execute the steps of importing a structural dataset, applying a fragmentation algorithm, and visually inspecting the results in different ways. All software developed as part of this thesis is freely and openly available (see https://github.com/JonasSchaub).