
    Interactive Exploration of Chemical Space with Scaffold Hunter

    The supporting information is composed of the following files:
    I. pyruvatekinasedata.zip: The pyruvate kinase data set used for the analysis described in the referenced publication. The analysis is based on the Pyruvate Kinase Screen as published in PubChem under the assay ID 361. The file contains all compounds checked in this screen together with the scaffold tree generated from them. Scaffold Hunter can be used to query the database and interactively display the scaffold tree. This file is a dump from a MySQL 5.1 database and was generated with MySQL Administrator 1.2.5. It can be restored with the same program.
    II. scaffoldhunter_profiles.zip: Scaffold Hunter saves the user profiles either on the hard disk or in a database. The corresponding database schema is contained in this zip file. This schema must be present in the MySQL database before Scaffold Hunter can be run. This file is a dump from a MySQL 5.1 database and was generated with MySQL Administrator 1.2.5. It can be restored with the same program.
    III. InstallationGuide_Databases.pdf: This document describes the installation of a local MySQL database server and the graphical user interface MySQL Administrator. Restoration of the profiles and sample databases is also described.
    IV. run_ScaffoldHunter.bat: Windows batch file to run Scaffold Hunter with 1024 MByte of memory.
    V. run_ScaffoldTreeGenerator.bat: Windows batch file to run ScaffoldTreeGenerator with 1024 MByte of memory.
    VI. ScaffoldHunter_readme.txt: Text file with advice for the installation of Scaffold Hunter.
    VII. ScaffoldTreeGenerator_readme.txt: Text file with advice for the installation of ScaffoldTreeGenerator.

    HiTSEE KNIME: a visualization tool for hit selection and analysis in high-throughput screening experiments for the KNIME platform

    We present HiTSEE (High-Throughput Screening Exploration Environment), a visualization tool for the analysis of large chemical screens used to examine biochemical processes. The tool supports the investigation of structure-activity relationships (SAR analysis) and, through a flexible interaction mechanism, the navigation of large chemical spaces. Our approach is based on the projection of one or a few molecules of interest and the expansion around their neighborhood, and it allows for the exploration of large chemical libraries without the need to create an all-encompassing overview of the whole library. We describe the requirements we collected during our collaboration with biologists and chemists, the design rationale behind the tool, and two case studies on different datasets. The described integration (HiTSEE KNIME) into the KNIME platform allows additional flexibility in adapting our approach to a wide range of different biochemical problems and enables other research groups to use HiTSEE.

    Exploration of scaffolds from natural products with antiplasmodial activities, currently registered antimalarial drugs and public malarial screen data

    In light of current resistance to antimalarial drugs, there is a need to discover new classes of antimalarial agents with unique mechanisms of action. Identification of unique scaffolds from natural products with in vitro antiplasmodial activities may be the starting point for such new classes of antimalarial agents. We therefore conducted scaffold diversity and comparison analysis of natural products with in vitro antiplasmodial activities (NAA), currently registered antimalarial drugs (CRAD) and malaria screen data from Medicines for Malaria Venture (MMV). The scaffold diversity analyses on the three datasets were performed using scaffold counts and cumulative scaffold frequency plots. Scaffolds from the NAA were compared to those from CRAD and MMV. A Scaffold Tree was also generated for each of the datasets, and the scaffold diversity of NAA was found to be higher than that of MMV. Among the NAA compounds, we identified unique scaffolds that were not contained in any of the other compound datasets. These scaffolds from NAA also possess desirable drug-like properties, making them ideal starting points for antimalarial drug design considerations. The Scaffold Tree showed the preponderance of ring systems in NAA and identified virtual scaffolds, which may be potential bioactive compounds.
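The abstract above mentions scaffold counts and cumulative scaffold frequency plots. As a rough illustration of what such a curve contains, the sketch below computes its points from a list with one scaffold identifier per compound. The function names and the toy data are assumptions made for this example, not taken from the paper, and the scaffold identifiers are presumed to come from a separate cheminformatics toolkit.

```python
from collections import Counter


def cumulative_scaffold_frequency(scaffolds):
    """Return (fraction_of_scaffolds, fraction_of_compounds) points.

    `scaffolds` holds one scaffold identifier per compound, e.g. a
    canonical SMILES string (hypothetical input for this sketch).
    Scaffolds are ranked from most to least frequent.
    """
    counts = sorted(Counter(scaffolds).values(), reverse=True)
    total_compounds = sum(counts)
    points = []
    covered = 0
    for rank, count in enumerate(counts, start=1):
        covered += count
        points.append((rank / len(counts), covered / total_compounds))
    return points


def scaffolds_to_cover(scaffolds, fraction=0.5):
    """Smallest number of top-ranked scaffolds covering `fraction` of compounds."""
    counts = sorted(Counter(scaffolds).values(), reverse=True)
    needed = fraction * sum(counts)
    covered = 0
    for rank, count in enumerate(counts, start=1):
        covered += count
        if covered >= needed:
            return rank
    return len(counts)


# Toy data set: 10 compounds spread over 3 scaffolds.
toy = ["A"] * 6 + ["B"] * 3 + ["C"]
curve = cumulative_scaffold_frequency(toy)
```

A curve that rises steeply means a few scaffolds account for most compounds (low diversity); a curve close to the diagonal means compounds are spread evenly over many scaffolds.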

    ScaffoldGraph: an open-source library for the generation and analysis of molecular scaffold networks and scaffold trees

    SUMMARY: ScaffoldGraph (SG) is an open-source Python library and command-line tool for the generation and analysis of molecular scaffold networks and trees, with the capability of processing large sets of input molecules. With the increase in high-throughput screening (HTS) data, scaffold graphs have proven useful for the navigation and analysis of chemical space, being used for visualisation, clustering, scaffold-diversity analysis and active-series identification. Built on RDKit and NetworkX, SG integrates scaffold graph analysis into the growing scientific/cheminformatics Python stack, increasing the flexibility and extendibility of the tool compared to existing software. AVAILABILITY AND IMPLEMENTATION: SG is freely available and released under the MIT license at https://github.com/UCLCheminformatics/ScaffoldGraph

    Automated exploration of prebiotic chemical reaction space: progress and perspectives

    Prebiotic chemistry often involves the study of complex systems of chemical reactions that form large networks with a large number of diverse species. Such complex systems may have given rise to emergent phenomena that ultimately led to the origin of life on Earth. The environmental conditions and processes involved in this emergence may not be fully recapitulable, making it difficult for experimentalists to study prebiotic systems in laboratory simulations. Computational chemistry offers efficient ways to study such chemical systems and identify the ones most likely to display complex properties associated with life. Here, we review tools and techniques for modelling prebiotic chemical reaction networks and outline possible ways to identify self-replicating features that are central to many origin-of-life models

    Development and implementation of in silico molecule fragmentation algorithms for the cheminformatics analysis of natural product spaces

    Computational methodologies extracting specific substructures like functional groups or molecular scaffolds from input molecules can be grouped under the term “in silico molecule fragmentation”. They can be used to investigate what specifically characterises a heterogeneous compound class, like pharmaceuticals or Natural Products (NP) and in which aspects they are similar or dissimilar. The aim is to determine what specifically characterises NP structures to transfer patterns favourable for bioactivity to drug development. As part of this thesis, the first algorithmic approach to in silico deglycosylation, the removal of glycosidic moieties for the study of aglycones, was developed with the Sugar Removal Utility (SRU) (Publication A). The SRU has also proven useful for investigating NP glycoside space. It was applied to one of the largest open NP databases, COCONUT (COlleCtion of Open Natural prodUcTs), for this purpose (Publication B). A contribution was made to the Chemistry Development Kit (CDK) by developing the open Scaffold Generator Java library (Publication C). Scaffold Generator can extract different scaffold types and dissect them into smaller parent scaffolds following the scaffold tree or scaffold network approach. Publication D describes the OngLai algorithm, the first automated method to identify homologous series in input datasets, group the member structures of each group, and extract their common core. To support the development of new fragmentation algorithms, the open Java rich client graphical user interface application MORTAR (MOlecule fRagmenTAtion fRamework) was developed as part of this thesis (Publication E). MORTAR allows users to quickly execute the steps of importing a structural dataset, applying a fragmentation algorithm, and visually inspecting the results in different ways. All software developed as part of this thesis is freely and openly available (see https://github.com/JonasSchaub)

    C-SPADE : a web-tool for interactive analysis and visualization of drug screening experiments through compound-specific bioactivity dendrograms

    The advent of the polypharmacology paradigm in drug discovery calls for novel chemoinformatic tools for analyzing compounds' multi-targeting activities. Such tools should provide an intuitive representation of the chemical space by capturing and visualizing underlying patterns of compound similarities linked to their polypharmacological effects. Most of the existing compound-centric chemoinformatics tools lack interactive options and user interfaces that are critical for the real-time needs of chemical biologists carrying out compound screening experiments. Toward that end, we introduce C-SPADE, an open-source exploratory web-tool for interactive analysis and visualization of drug profiling assays (biochemical, cell-based or cell-free) using compound-centric similarity clustering. C-SPADE allows users to visually map the chemical diversity of a screening panel, explore investigational compounds in terms of their similarity to the screening panel, perform polypharmacological analyses and guide drug-target interaction predictions. C-SPADE requires only the raw drug profiling data as input, and it automatically retrieves the structural information and constructs the compound clusters in real time, thereby reducing the time required for manual analysis in drug development or repurposing applications. The web-tool provides a customizable visual workspace that can either be downloaded as a figure or Newick tree file or shared as a hyperlink with other users. C-SPADE is freely available at http://cspade.fimm.fi/.
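The abstract above mentions export of the compound dendrogram as a Newick tree file. The sketch below shows how a nested cluster hierarchy can be serialized to that format. It is a minimal illustration only and not C-SPADE's actual exporter: the nested-tuple representation, the function names and the compound labels are assumptions for this example, and labels are assumed to contain no characters that Newick would require quoting (such as commas, parentheses or whitespace).

```python
def to_newick(node):
    """Serialize a nested cluster tree to a Newick fragment (no trailing ';').

    A leaf is a compound label (str); an internal node is a tuple or list
    of child nodes. Branch lengths are omitted in this sketch.
    """
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


def newick(tree):
    """Return the complete Newick string, terminated by a semicolon."""
    return to_newick(tree) + ";"


# Hypothetical clustering: cmpdA and cmpdB are most similar, cmpdC joins later.
example_tree = (("cmpdA", "cmpdB"), "cmpdC")
example_newick = newick(example_tree)
```

Here `example_newick` is the string `((cmpdA,cmpdB),cmpdC);`, which standard tree viewers accept as a topology-only tree.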

    Interactive graph drawing with constraints

    This thesis investigates the requirements for graph drawing stemming from practical applications, and presents both theoretical and practical results and approaches to handle them. Many approaches to compute graph layouts in various drawing styles exist, but the results are often not sufficient for use in practice. Drawing conventions, graphical notation standards, and user-defined requirements restrict the set of admissible drawings. These restrictions can be formalized as constraints for the layout computation. We investigate the requirements and give an overview and categorization of the corresponding constraints. Of main importance for the readability of a graph drawing is the number of edge crossings. If the graph is planar, it should be drawn without crossings; otherwise we should aim for the minimum number of crossings possible. However, several types of constraints may impose restrictions on the way the graph can be embedded in the plane. These restrictions may have a strong impact on crossing minimization. For two types of such constraints we present specific solutions for considering them in layout computation: We introduce the class of so-called embedding constraints, which restrict the order of the edges around a vertex. For embedding constraints we describe approaches for planarity testing, embedding, and edge insertion with the minimum number of crossings. These problems can be solved in linear time with our approaches. The second constraint type that we tackle is clusters. Clusters describe a hierarchical grouping of the graph's vertices that has to be reflected in the drawing. The complexity of the corresponding clustered planarity testing problem for clustered graphs is unknown so far. We describe a technique to compute a maximum clustered planar subgraph of a clustered graph. 
Our solution is based on an Integer Linear Program (ILP) formulation and also includes the first practical clustered planarity test for general clustered graphs. The resulting subgraph can be used within the first step of the planarization approach for clustered graphs. In addition, we describe how to improve the performance of pure clustered planarity testing by applying a branch-and-price approach. Large and complex graphs nowadays arise in many application domains. These graphs require interaction and navigation techniques to allow exploration of the underlying data. The corresponding concepts are presented and solutions for three practical applications are proposed: First, we describe Scaffold Hunter, a tool for the exploration of chemical space. We show how to use a hierarchical classification of molecules for the visual navigation in chemical space. The resulting visualization is embedded into an interactive environment that allows visual analysis of chemical compound databases. Finally, two interactive visualization approaches for two types of biological networks, protein-domain networks and residue interaction networks, are presented.
    In numerous application domains, information is modeled as graphs and visualized with the help of these graphs. A clear drawing aids analysis and supports understanding when information is presented by means of graph-based diagrams. Besides general aesthetic criteria, such a drawing is subject to requirements that arise from the characteristics of the data, established drawing conventions, and the concrete question at hand. In addition, users often want to adapt the drawing individually. These requirements can be formulated as constraints for the layout computation. Despite a multitude of different requirements from numerous application domains, most requirements can be formulated by means of a few generic constraints. In this thesis we investigate the requirements arising in practice and describe a mapping to constraints for the layout computation. We give an overview of the current state of the art in handling constraints in graph drawing and categorize these constraints by their basic properties. Of particular importance for the quality of a drawing is the number of crossings. Planar graphs should be drawn without crossings; for non-planar graphs the minimum number of crossings should be attained. Some constraints, however, restrict the ways in which the graph can be embedded in the plane. This can have a strong impact on the result of crossing minimization. Two important types of such constraints are examined more closely in this thesis. With embedding constraints we introduce a class of constraints that restrict the possible order of the edges around a vertex. For this class we present linear-time algorithms for planarity testing and for optimal edge insertion subject to the embedding restrictions. The second type of constraint is clusters, which prescribe a hierarchical grouping of vertices. The complexity of testing clustered planarity under such constraints is unknown so far. We describe a method to compute a maximum clustered planar subgraph. For this we use a formulation as an integer linear program together with a branch-and-cut approach to solve it. The method also allows clustered planarity to be decided and thus constitutes the first practical approach for testing general clustered graphs. In addition, we describe an improvement for the case in which only clustered planarity has to be tested and the maximum clustered planar subgraph is not of interest. For this scenario we give a simplified formulation and present a solution method based on a branch-and-price approach. In practice, very large or complex graphs often have to be examined. This requires suitable interaction and navigation methods. We describe the corresponding concepts and present solutions for three application areas: First, we describe Scaffold Hunter, a software tool for navigating chemical structure space. Scaffold Hunter uses a hierarchical classification of molecules as the basis for visual navigation. The visualization is embedded in an interactive interface that allows visual analysis of chemical structure databases. For two types of biological networks, protein-domain networks and residue interaction networks, we present approaches for interactive visualization. The corresponding layout methods are subject to a number of constraints for a meaningful drawing.
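The abstract describes embedding constraints as restrictions on the order of the edges around a vertex. To make that notion concrete, the sketch below takes one fully fixed ordering (a rotation system) and checks whether it describes a planar embedding by tracing faces and applying Euler's formula V - E + F = 2. This is a textbook check and not the thesis's linear-time algorithm for embedding constraints; the function names and the K4 example are assumptions for illustration, and the graph is assumed to be simple, connected, and to have at least one edge.

```python
def count_faces(rotation):
    """Count the faces of the embedding defined by a rotation system.

    `rotation[v]` lists the neighbours of v in cyclic order around v.
    Each undirected edge {u, v} is treated as two directed darts.
    """
    # successor[(v, u)] is the neighbour that follows u in the order at v.
    successor = {}
    for v, nbrs in rotation.items():
        for i, u in enumerate(nbrs):
            successor[(v, u)] = nbrs[(i + 1) % len(nbrs)]

    unvisited = {(u, v) for u, nbrs in rotation.items() for v in nbrs}
    faces = 0
    while unvisited:
        start = next(iter(unvisited))
        u, v = start
        # Follow the face boundary until the starting dart is reached again.
        while True:
            unvisited.discard((u, v))
            u, v = v, successor[(v, u)]
            if (u, v) == start:
                break
        faces += 1
    return faces


def is_planar_embedding(rotation):
    """True if the rotation system satisfies Euler's formula V - E + F = 2."""
    n_vertices = len(rotation)
    n_edges = sum(len(nbrs) for nbrs in rotation.values()) // 2
    return n_vertices - n_edges + count_faces(rotation) == 2


# K4 with vertex 3 drawn inside triangle 0-1-2 (counter-clockwise orders).
k4_planar = {0: [2, 3, 1], 1: [0, 3, 2], 2: [1, 3, 0], 3: [0, 2, 1]}
# Same graph, but the edge order around vertex 3 is reversed.
k4_twisted = {0: [2, 3, 1], 1: [0, 3, 2], 2: [1, 3, 0], 3: [0, 1, 2]}
```

The two K4 examples differ only in the order at one vertex, yet the first yields four faces and is planar while the second yields two faces and is not, which illustrates why constraints on edge order can affect planarity and crossing minimization.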