Deep Archetypal Analysis
"Deep Archetypal Analysis" generates latent representations of
high-dimensional datasets in terms of fractions of intuitively understandable
basic entities called archetypes. The proposed method is an extension of linear
"Archetypal Analysis" (AA), an unsupervised method to represent multivariate
data points as sparse convex combinations of extremal elements of the dataset.
Unlike the original formulation of AA, "Deep AA" can also handle side
information and provides the ability for data-driven representation learning
which reduces the dependence on expert knowledge. Our method is motivated by
studies of evolutionary trade-offs in biology where archetypes are species
highly adapted to a single task. Along these lines, we demonstrate that "Deep
AA" also lends itself to the supervised exploration of chemical space, marking
a distinct starting point for de novo molecular design. In the unsupervised
setting we show how "Deep AA" is used on CelebA to identify archetypal faces.
These can then be superimposed in order to generate new faces which inherit
dominant traits of the archetypes they are based on.
Comment: Published at the German Conference on Pattern Recognition 2019 (GCPR)
Deep Archetypal Analysis
Deep Archetypal Analysis (Deep AA) generates latent representations of high-dimensional datasets in terms of intuitively understandable basic entities called archetypes. The proposed method extends linear Archetypal Analysis (AA), an unsupervised method that represents multivariate data points as convex combinations of extremal data points. Unlike the original formulation, Deep AA is generative and capable of handling side information. In addition, our model provides the ability for data-driven representation learning, which reduces the dependence on expert knowledge. We empirically demonstrate the applicability of our approach by exploring the chemical space of small organic molecules. In doing so, we employ the archetype constraint to learn two different latent archetype representations for the same dataset, with respect to two chemical properties. This type of supervised exploration marks a distinct starting point and lets us steer de novo molecular design.
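The idea of representing a data point as a convex combination of archetypes can be illustrated with a minimal sketch. It covers only the linear mixing step shared by AA-style models, not the deep model described in the abstracts; the function name `mix_archetypes` and the toy coordinates are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def mix_archetypes(archetypes: Sequence[Sequence[float]],
                   weights: Sequence[float],
                   tol: float = 1e-9) -> List[float]:
    """Return the convex combination sum_i weights[i] * archetypes[i].

    The weights must be non-negative and sum to 1, which is what makes
    them readable as "fractions" of each archetype.
    """
    if len(archetypes) != len(weights):
        raise ValueError("need exactly one weight per archetype")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")

    dim = len(archetypes[0])
    point = [0.0] * dim
    for archetype, w in zip(archetypes, weights):
        if len(archetype) != dim:
            raise ValueError("all archetypes must have the same dimension")
        for j, value in enumerate(archetype):
            point[j] += w * value
    return point


# Toy example (hypothetical data): three archetypes at the corners of a triangle.
corners = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
print(mix_archetypes(corners, [0.5, 0.25, 0.25]))
```

Any point produced this way lies inside the convex hull spanned by the archetypes, which is why the weights can be read as mixing fractions.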
Discovery of Potent Positive Allosteric Modulators of the α3β2 Nicotinic Acetylcholine Receptor by a Chemical Space Walk in ChEMBL
While a plethora of ligands are known for the well-studied α7 and α4β2 nicotinic acetylcholine receptor (nAChR), only very few ligands address the related α3β2 nAChR expressed in the central nervous system and at the neuromuscular junction. Starting with the public database ChEMBL organized in the chemical space of Molecular Quantum Numbers (MQN, a series of 42 integer value descriptors of molecular structure), a visual survey of nearest neighbors of the α7 nAChR partial agonist N-(3R)-1-azabicyclo[2.2.2]oct-3-yl-4-chlorobenzamide (PNU-282,987) pointed to N-(2-halobenzyl)-3-aminoquinuclidines as possible nAChR modulators. This simple "chemical space walk" was performed using a web-browser available at www.gdb.unibe.ch. Electrophysiological recordings revealed that these ligands represent a new and to date most potent class of positive allosteric modulators (PAMs) of the α3β2 nAChR, which also exert significant effects in vivo. The present discovery highlights the value of surveying chemical space neighbors of known drugs within public databases to uncover new pharmacology. © 2014 American Chemical Society
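A nearest-neighbor survey over integer descriptor vectors can be sketched as follows. The abstract does not state which distance the browser uses, so the city-block (Manhattan) distance here is an assumption. The compound names and the 4-component vectors are hypothetical stand-ins for real 42-component MQN vectors.

```python
from typing import Dict, List, Sequence, Tuple


def city_block(a: Sequence[int], b: Sequence[int]) -> int:
    """City-block (Manhattan) distance between two integer descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbors(query: Sequence[int],
                      library: Dict[str, Sequence[int]],
                      k: int = 3) -> List[Tuple[str, int]]:
    """Return the k library entries closest to the query, nearest first.

    Ties are broken alphabetically by name so the result is deterministic.
    """
    scored = [(name, city_block(query, vec)) for name, vec in library.items()]
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:k]


# Hypothetical toy library: 4 descriptors per compound instead of 42.
toy_library = {
    "compound_a": (1, 2, 3, 5),
    "compound_b": (0, 0, 0, 0),
    "compound_c": (1, 2, 3, 4),
}
print(nearest_neighbors((1, 2, 3, 4), toy_library, k=2))
```

A linear scan like this is only practical for small libraries; for a database the size of ChEMBL, some form of pre-sorting or indexing would be needed.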
Deep Molecular Representation in Cheminformatics
It is clear that the molecular representations are clustered by the corresponding E_LUMO values. Conclusion: In this work the applications of machine learning in cheminformatics are outlined together with the background of quantum-chemical ..
Chemogenomic Profiling Provides Insights into the Limited Activity of Irreversible EGFR Inhibitors in Tumor Cells Expressing the T790M EGFR Resistance Mutation
Copyright © 2010 by the American Association for Cancer Research
Expanding the fragrance chemical space for virtual screening
The properties of fragrance molecules in the public databases SuperScent and Flavornet were analyzed to define a “fragrance-like” (FL) property range (Heavy Atom Count ≤ 21, only C, H, O, S, (O + S) ≤ 3, Hydrogen Bond Donor ≤ 1) and the corresponding chemical space including FL molecules from PubChem (NIH repository of molecules), ChEMBL (bioactive molecules), ZINC (drug-like molecules), and GDB-13 (all possible organic molecules up to 13 atoms of C, N, O, S, Cl). The FL subsets of these databases were classified by MQN (Molecular Quantum Numbers, a set of 42 integer value descriptors of molecular structure) and formatted for fast MQN-similarity searching and interactive exploration of color-coded principal component maps in the form of the FL-mapplet and FL-browser applications, freely available at http://www.gdb.unibe.ch. MQN-similarity is shown to efficiently recover 15 different fragrance molecule families from the different FL subsets, demonstrating the relevance of the MQN-based tool to explore the fragrance chemical space.
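The FL property range quoted above translates directly into a small filter. The sketch below assumes that element counts and the hydrogen-bond-donor count have already been computed elsewhere, for example with a cheminformatics toolkit. The function name `is_fragrance_like` is hypothetical, and the example molecules are entered by hand from their molecular formulas.

```python
from typing import Dict

# Elements permitted by the FL definition quoted in the abstract.
ALLOWED_ELEMENTS = {"C", "H", "O", "S"}


def is_fragrance_like(atom_counts: Dict[str, int], h_bond_donors: int) -> bool:
    """Check the FL rules: HAC <= 21, only C/H/O/S, (O + S) <= 3, HBD <= 1."""
    # Reject any molecule containing an element outside C, H, O, S.
    for element, count in atom_counts.items():
        if count > 0 and element not in ALLOWED_ELEMENTS:
            return False
    # Heavy atom count: every atom that is not hydrogen.
    heavy_atoms = sum(n for element, n in atom_counts.items() if element != "H")
    oxygen_plus_sulfur = atom_counts.get("O", 0) + atom_counts.get("S", 0)
    return heavy_atoms <= 21 and oxygen_plus_sulfur <= 3 and h_bond_donors <= 1


# Hand-entered examples: vanillin C8H8O3 (one OH donor), caffeine C8H10N4O2.
print(is_fragrance_like({"C": 8, "H": 8, "O": 3}, h_bond_donors=1))           # vanillin
print(is_fragrance_like({"C": 8, "H": 10, "N": 4, "O": 2}, h_bond_donors=0))  # caffeine
```

Vanillin passes all four rules, while caffeine is rejected because it contains nitrogen.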