456 research outputs found

    Computational Analysis of Structure-Activity Relationships : From Prediction to Visualization Methods

    Understanding how structural modifications affect the biological activity of small molecules is one of the central themes in medicinal chemistry. By no means is structure-activity relationship (SAR) analysis a priori dependent on computational methods. However, as molecular data sets grow in size, we quickly reach the limits of our ability to access and compare structures and associated biological properties, so that computational data processing and analysis often become essential. Here, different types of approaches of varying complexity for the analysis of SAR information are presented, which can be applied in the context of screening and chemical optimization projects. The first part of this thesis is dedicated to machine-learning strategies that aim at de novo ligand prediction and the preferential detection of potent hits in virtual screening. Strong emphasis is placed on benchmarking different strategies and thoroughly evaluating their utility in practical applications. However, an often-claimed disadvantage of these prediction methods is their "black box" character, because they do not necessarily reveal which structural features are associated with biological activity. Therefore, these methods are complemented by more descriptive SAR analysis approaches showing a higher degree of interpretability. Concepts from information theory are adapted to identify activity-relevant structure-derived descriptors. Furthermore, compound data mining methods exploring prespecified properties of available bioactive compounds on a large scale are designed to systematically relate molecular transformations to activity changes. Finally, these approaches are complemented by graphical methods that primarily help to access and visualize SAR data in congeneric series of compounds and allow the formulation of intuitive SAR rules applicable to the design of new compounds. The compendium of SAR analysis tools introduced in this thesis investigates SARs from different perspectives

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates led to a long period of stagnation in drug approvals. Due to the extreme costs associated with introducing a drug to the market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between 'drug-like' molecules and the proteins they bind. Secondly, two deep-learning-based binding site comparison tools were developed, competing with the state of the art on benchmark datasets. The models are able to predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships and has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, the contributions made will aid in the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster

    The varieties of the psychedelic experience: A preliminary study of the association between the reported subjective effects and the binding affinity profiles of substituted phenethylamines and tryptamines

    Classic psychedelics are substances of paramount cultural and neuroscientific importance. A distinctive feature of psychedelic drugs is the wide range of potential subjective effects they can elicit, known to be deeply influenced by the internal state of the user (“set”) and the surroundings (“setting”). The observation of cross-tolerance and a series of empirical studies in humans and animal models support agonism at the serotonin (5-HT)2A receptor as a common mechanism for the action of psychedelics. The diversity of subjective effects elicited by different compounds has been attributed to the variables of “set” and “setting,” to the binding affinities for other 5-HT receptor subtypes, and to the heterogeneity of transduction pathways initiated by conformational receptor states as they interact with different ligands (“functional selectivity”). Here we investigate the complementary (i.e., not mutually exclusive) possibility that such variety is also related to the binding affinity for a range of neurotransmitters and monoamine transporters including (but not limited to) 5-HT receptors. Building on two independent binding affinity datasets (compared to “in silico” estimates) in combination with natural language processing tools applied to a large repository of reports of psychedelic experiences (Erowid’s Experience Vaults), we obtained preliminary evidence supporting that the similarity between the binding affinity profiles of psychoactive substituted phenethylamines and tryptamines is correlated with the semantic similarity of the associated reports. We also showed that the highest correlation was achieved by considering the combined binding affinity for the 5-HT, dopamine (DA), glutamate, muscarinic and opioid receptors and for the Ca+ channel. Applying dimensionality reduction techniques to the reports, we linked the compounds, receptors, transporters and the Ca+ channel to distinct fingerprints of the reported subjective effects. 
To the extent that the existing binding affinity data are based on a low number of displacement curves and require further replication, our analysis produced preliminary evidence consistent with the involvement of different binding sites in the reported subjective effects elicited by psychedelics. Beyond the study of this particular class of drugs, we provide a methodological framework to explore the relationship between the binding affinity profiles and the reported subjective effects of other psychoactive compounds.
Authors and affiliations: Federico Zamberlan (Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Física de Buenos Aires; Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales; Argentina). Camila Sanz (Consejo Nacional de Investigaciones Científicas y Técnicas; Universidad de Buenos Aires; Argentina). Rocío Martínez Vivot (Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Investigaciones Biomédicas; Universidad de Buenos Aires, Facultad de Medicina; Instituto de Física de Buenos Aires, Facultad de Ciencias Exactas y Naturales; Argentina). Carla Pallavicini (Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Física de Buenos Aires; Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales; Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina). Fire Erowid (Grass Valley, United States). Earth Erowid (Grass Valley, United States). Enzo Rodolfo Tagliazucchi (Institut du cerveau et de la moelle épinière, France; Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Física de Buenos Aires; Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales; Argentina).
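The central comparison in this abstract, relating the similarity of binding affinity profiles to the semantic similarity of the associated reports, can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the compound names, affinity vectors, and report-similarity values are invented placeholders, and the use of cosine similarity and a Pearson correlation over compound pairs is an assumption.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def profile_vs_report_correlation(affinity, report_similarity):
    """Correlate pairwise affinity-profile similarity with report similarity."""
    pairs = list(combinations(sorted(affinity), 2))
    aff_sim = [cosine(affinity[a], affinity[b]) for a, b in pairs]
    rep_sim = [report_similarity[(a, b)] for a, b in pairs]
    return pearson(aff_sim, rep_sim)


# Hypothetical data: one affinity vector per compound (one value per binding site)
affinity = {
    "compound_A": [3.0, 1.0, 0.5],
    "compound_B": [2.8, 1.2, 0.4],
    "compound_C": [0.5, 3.0, 2.0],
}
# Hypothetical semantic similarity between the reports of each compound pair
report_similarity = {
    ("compound_A", "compound_B"): 0.90,
    ("compound_A", "compound_C"): 0.30,
    ("compound_B", "compound_C"): 0.35,
}

r = profile_vs_report_correlation(affinity, report_similarity)
print(f"correlation over compound pairs: {r:.3f}")
```

With real data, the report-similarity values would come from a text model applied to the experience reports, and the significance of the correlation would need a permutation test or similar, since the pairwise values are not independent.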

    Multi-faceted Structure-Activity Relationship Analysis Using Graphical Representations

    A core focus in medicinal chemistry is the interpretation of structure-activity relationships (SARs) of small molecules. SAR analysis is typically carried out on a case-by-case basis for compound sets that share activity against a given target. Although SAR investigations are not a priori dependent on computational approaches, limitations imposed by the steady rise in activity information have necessitated the use of such methodologies. Moreover, understanding SARs in multi-target space is extremely difficult. Conceptually different computational approaches are reported in this thesis for graphical SAR analysis in single- as well as multi-target space. Activity landscape models are often used to describe the underlying SAR characteristics of compound sets. Theoretical activity landscapes that are reminiscent of topological maps intuitively represent distributions of pair-wise similarity and potency difference information as three-dimensional surfaces. These models make it easy to identify various SAR features. Therefore, such landscapes for actual data sets are generated and compared with graph-based representations. Existing graphical data structures are adapted to include mechanism of action information for receptor ligands to facilitate simultaneous SAR and mechanism-related analyses with the objective of identifying structural modifications responsible for switching molecular mechanisms of action. Typically, SAR analysis focuses on systematic pair-wise relationships of compound similarity and potency differences. Therefore, an approach is reported to calculate SAR feature probabilities on the basis of these pair-wise relationships for individual compounds in a ligand set. The consequent expansion of feature categories improves the analysis of local SAR environments. Graphical representations are designed to avoid a dependence on preconceived SAR models. Such representations are suitable for systematic large-scale SAR exploration. 
Methods for the navigation of SARs in multi-target space using simple and interpretable data structures are introduced. In summary, multi-faceted SAR analysis aided by computational means forms the primary objective of this dissertation
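The pair-wise similarity and potency-difference relationships that this abstract builds on can be illustrated with a small sketch. It is not the thesis's method: the fingerprints (sets of feature indices), potency values, compound names, and both thresholds are hypothetical, and Tanimoto similarity is an assumed choice of similarity measure.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pairwise_landscape(compounds):
    """Return (name1, name2, similarity, potency difference) for every pair."""
    rows = []
    for (n1, (fp1, p1)), (n2, (fp2, p2)) in combinations(compounds.items(), 2):
        rows.append((n1, n2, tanimoto(fp1, fp2), abs(p1 - p2)))
    return rows


def discontinuous_pairs(rows, min_similarity=0.7, min_potency_gap=2.0):
    """Pairs that are structurally similar but differ strongly in potency.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    return [(a, b) for a, b, sim, gap in rows
            if sim >= min_similarity and gap >= min_potency_gap]


# Hypothetical compounds: (fingerprint as a set of on-bits, potency as pKi)
compounds = {
    "cpd_1": ({1, 2, 3, 4, 5, 6, 7, 8}, 8.5),
    "cpd_2": ({1, 2, 3, 4, 5, 6, 7, 9}, 6.0),
    "cpd_3": ({20, 21, 22}, 6.2),
}

rows = pairwise_landscape(compounds)
cliffs = discontinuous_pairs(rows)
print(cliffs)
```

Plotting similarity against potency difference for all rows gives a simple two-dimensional view of the same information that the three-dimensional landscape models represent as a surface.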

    Prediction of drug-drug interaction potential using machine learning approaches

    Drug discovery is a long, expensive, and complex, yet crucial process for the benefit of society. Selecting potential drug candidates requires an understanding of how well a compound will perform at its task and, more importantly, how safely the compound will act in patients. A key safety insight is understanding a molecule's potential for drug-drug interactions. The metabolism of many drugs is mediated by members of the cytochrome P450 superfamily, notably the CYP3A4 enzyme. Inhibition of these enzymes can alter the bioavailability of other drugs, potentially increasing their levels to toxic amounts. Four models were developed to predict CYP3A4 inhibition: logistic regression, random forests, support vector machine, and neural network. Two novel convolutional approaches were explored for data featurization: SMILES string auto-extraction and 2D structure auto-extraction. The logistic regression model achieved an accuracy of 83.2%, the random forests model 83.4%, the support vector machine model 81.9%, and the neural network model 82.3%. Additionally, the model built with SMILES string auto-extraction had an accuracy of 82.3%, and the model with 2D structure auto-extraction 76.4%. The advantage of the novel featurization methods is their ability to learn relevant features from compound SMILES strings, eliminating feature engineering. The developed methodologies can be extended towards predicting any structure-activity relationship and fitted for other areas of drug discovery and development
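As a rough illustration of how SMILES strings might be prepared for a convolutional model that learns its own features, the sketch below one-hot encodes each string character by character. The abstract does not describe the encoding actually used, so everything here is an assumption: the function names are hypothetical, and character-level tokenization is a simplification that splits two-letter atoms such as Cl or Br.

```python
def build_vocabulary(smiles_list):
    """Map every character seen in the SMILES strings to a column index."""
    chars = sorted({ch for smiles in smiles_list for ch in smiles})
    return {ch: i for i, ch in enumerate(chars)}


def one_hot_encode(smiles, vocab, max_len):
    """Encode one SMILES string as a max_len x len(vocab) 0/1 matrix.

    Strings shorter than max_len are padded with all-zero rows;
    longer strings are truncated.
    """
    matrix = [[0] * len(vocab) for _ in range(max_len)]
    for pos, ch in enumerate(smiles[:max_len]):
        matrix[pos][vocab[ch]] = 1
    return matrix


# Example molecules: ethanol, benzene, acetic acid
smiles_list = ["CCO", "c1ccccc1", "CC(=O)O"]
vocab = build_vocabulary(smiles_list)
max_len = max(len(s) for s in smiles_list)
encoded = [one_hot_encode(s, vocab, max_len) for s in smiles_list]

print(len(vocab), "symbols;", max_len, "positions per molecule")
```

Each resulting matrix could then be passed to a one-dimensional convolutional layer, which slides filters along the position axis to pick up recurring substrings.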

    Optical High Content Nanoscopy of Epigenetic Marks Decodes Phenotypic Divergence in Stem Cells

    While distinct stem cell phenotypes follow global changes in chromatin marks, single-cell chromatin technologies are unable to resolve or predict stem cell fates. We propose the first such use of optical high content nanoscopy of histone epigenetic marks (epi-marks) in stem cells to classify emergent cell states. By combining nanoscopy with epi-mark textural image informatics, we developed a novel approach, termed EDICTS (Epi-mark Descriptor Imaging of Cell Transitional States), to discern chromatin organizational changes, demarcate lineage gradations across a range of stem cell types and robustly track lineage restriction kinetics. We demonstrate the utility of EDICTS by predicting the lineage progression of stem cells cultured on biomaterial substrates with graded nanotopographies and mechanical stiffness, thus parsing the role of specific biophysical cues as sensitive epigenetic drivers. We also demonstrate the unique power of EDICTS to resolve cellular states based on epi-marks that cannot be detected via mass spectrometry based methods for quantifying the abundance of histone posttranslational modifications. Overall, EDICTS represents a powerful new methodology to predict single cell lineage decisions by integrating high content super-resolution nanoscopy and imaging informatics of the nuclear organization of epi-marks. Funding: National Institutes of Health (U.S.) (Grant GM110174

    Big-Data Science in Porous Materials: Materials Genomics and Machine Learning

    By combining metal nodes with organic linkers we can potentially synthesize millions of possible metal-organic frameworks (MOFs). At present, we have libraries of over ten thousand synthesized materials and millions of in-silico predicted materials. The fact that we have so many materials opens many exciting avenues to tailor-make a material that is optimal for a given application. However, from an experimental and computational point of view we simply have too many materials to screen using brute-force techniques. In this review, we show that having so many materials allows us to use big-data methods as a powerful technique to study these materials and to discover complex correlations. The first part of the review gives an introduction to the principles of big-data science. We emphasize the importance of data collection, methods to augment small data sets, and how to select appropriate training sets. An important part of this review is the different approaches that are used to represent these materials in feature space. The review also includes a general overview of the different ML techniques, but as most applications in porous materials use supervised ML, our review is focused on the different approaches for supervised ML. In particular, we review the different methods to optimize the ML process and how to quantify the performance of the different methods. In the second part, we review how the different approaches of ML have been applied to porous materials. In particular, we discuss applications in the field of gas storage and separation, the stability of these materials, their electronic properties, and their synthesis. The range of topics illustrates the large variety of topics that can be studied with big-data science. Given the increasing interest of the scientific community in ML, we expect this list to rapidly expand in the coming years.

    Leveraging 3D chemical similarity, target and phenotypic data in the identification of drug-protein and drug-adverse effect associations

    Additional file 5: Figure S4. Number of side effects and targets for each drug in the target-phenotype model

    A triple helix model of medical innovation: supply, demand, and technological capabilities in terms of medical subject headings

    We develop a model of innovation that enables us to trace the interplay among three key dimensions of the innovation process: (i) demand for and (ii) supply of innovation, and (iii) technological capabilities available to generate innovation in the forms of products, processes, and services. Building on triple helix research, we use entropy statistics to elaborate an indicator of mutual information among these dimensions that can indicate a reduction of uncertainty. To do so, we focus on the medical context, where uncertainty poses significant challenges to the governance of innovation. We use the Medical Subject Headings (MeSH) of MEDLINE/PubMed to identify publications classified within the categories "Diseases" (C), "Drugs and Chemicals" (D), and "Analytic, Diagnostic, and Therapeutic Techniques and Equipment" (E), and use these as knowledge representations of demand, supply, and technological capabilities, respectively. Three case studies of medical research areas are used as representative 'entry perspectives' of the medical innovation process. These are: (i) human papilloma virus, (ii) RNA interference, and (iii) magnetic resonance imaging. We find statistically significant periods of synergy among demand, supply, and technological capabilities (C-D-E) that point to three-dimensional interactions as a fundamental perspective for the understanding and governance of the uncertainty associated with medical innovation. Among the pairwise configurations in these contexts, the demand-technological capabilities (C-E) channel provided the strongest link, followed by the supply-demand (D-C) and the supply-technological capabilities (D-E) channels
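The entropy-based indicator mentioned in this abstract can be sketched with the standard definition of mutual information in three dimensions, T(CDE) = H(C) + H(D) + H(E) - H(CD) - H(CE) - H(DE) + H(CDE). The code below is a minimal illustration under assumptions, not the authors' implementation: each publication is reduced to a hypothetical triple of 0/1 flags saying whether it is indexed under categories C, D, and E, and the example records are invented. In the triple helix literature, negative values of this quantity are commonly read as synergy, that is, a reduction of uncertainty.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a frequency table."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in counts.values() if n > 0)


def mutual_information_3d(records):
    """T(CDE) = H_C + H_D + H_E - H_CD - H_CE - H_DE + H_CDE.

    `records` is a list of (c, d, e) tuples, one per publication.
    """
    def joint_entropy(dims):
        return entropy(Counter(tuple(r[i] for i in dims) for r in records))

    singles = sum(joint_entropy((i,)) for i in range(3))
    pairs = sum(joint_entropy(p) for p in [(0, 1), (0, 2), (1, 2)])
    return singles - pairs + joint_entropy((0, 1, 2))


# Hypothetical records: each dimension is balanced and every pair of
# dimensions is independent, but the three together are fully determined.
synergy_records = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
# Hypothetical records in which the three flags always coincide.
redundant_records = [(0, 0, 0), (1, 1, 1)]

t_synergy = mutual_information_3d(synergy_records)
t_redundant = mutual_information_3d(redundant_records)
print(t_synergy, t_redundant)
```

The two toy cases mark the extremes: a negative value when the dimensions only constrain each other jointly, and a positive value when they simply duplicate one another.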