151 research outputs found

    The PubChem chemical structure sketcher

    Get PDF
    PubChem is an important public, Web-based information source for chemical and bioactivity information. In order to provide convenient structure search methods on compounds stored in this database, one mandatory component is a Web-based drawing tool for interactive sketching of chemical query structures. Web-enabled chemical structure sketchers are not new, being in existence for years; however, solutions available rely on complex technology like Java applets or platform-dependent plug-ins. Due to general policy and support incident rate considerations, Java-based or platform-specific sketchers cannot be deployed as a part of public NCBI Web services. Our solution: a chemical structure sketching tool based exclusively on CGI server processing, client-side JavaScript functions, and image sequence streaming. The PubChem structure editor does not require the presence of any specific runtime support libraries or browser configurations on the client. It is completely platform-independent and verified to work on all major Web browsers, including older ones without support for Web2.0 JavaScript objects

    A Cross-Disciplinary Process Modelling Language for Validating Reconfigured Production Processes

    Get PDF
    Modelling and reconfiguration of production processes require knowledge across different domains. This in-depth knowledge is necessary to avoid possible side effects that could threaten the production plant, the workpiece or the worker. Therefore, process modelling approaches allow adding additional data to the steps of a process. Such additions can be constraints, which need to be fulfilled before a step can be executed. Upon reconfiguration of production processes, these constraints need to be validated to ensure that the objective of the process is still met. However, this task demands expertise in the field of process modelling as well as in the domain of the production process and the production plant. To the best of our knowledge, state-of-the-art production process modelling approaches are unable to determine the semantic validity of a reconfigured production process. In this paper, we introduce a domain-specific modelling language dedicated to model and validate constraints between production steps. With this approach, we aim to assist the operator in reconfiguring production processes. We evaluate this approach in three case studies and show that our approach can detect violated constraints in production processes

    In silico assessment of potential druggable pockets on the surface of α1-Antitrypsin conformers

    Get PDF
    The search for druggable pockets on the surface of a protein is often performed on a single conformer, treated as a rigid body. Transient druggable pockets may be missed in this approach. Here, we describe a methodology for systematic in silico analysis of surface clefts across multiple conformers of the metastable protein α1-antitrypsin (A1AT). Pathological mutations disturb the conformational landscape of A1AT, triggering polymerisation that leads to emphysema and hepatic cirrhosis. Computational screens for small molecule inhibitors of polymerisation have generally focused on one major druggable site visible in all crystal structures of native A1AT. In an alternative approach, we scan all surface clefts observed in crystal structures of A1AT and in 100 computationally produced conformers, mimicking the native solution ensemble. We assess the persistence, variability and druggability of these pockets. Finally, we employ molecular docking using publicly available libraries of small molecules to explore scaffold preferences for each site. Our approach identifies a number of novel target sites for drug design. In particular one transient site shows favourable characteristics for druggability due to high enclosure and hydrophobicity. Hits against this and other druggable sites achieve docking scores corresponding to a Kd in the µM–nM range, comparing favourably with a recently identified promising lead. Preliminary ThermoFluor studies support the docking predictions. In conclusion, our strategy shows considerable promise compared with the conventional single pocket/single conformer approach to in silico screening. Our best-scoring ligands warrant further experimental investigation

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.

    Get PDF

    Functional Group and Substructure Searching as a Tool in Metabolomics

    Get PDF
    BACKGROUND: A direct link between the names and structures of compounds and the functional groups contained within them is important, not only because biochemists frequently rely on literature that uses a free-text format to describe functional groups, but also because metabolic models depend upon the connections between enzymes and substrates being known and appropriately stored in databases. METHODOLOGY: We have developed a database named "Biochemical Substructure Search Catalogue" (BiSSCat), which contains 489 functional groups, >200,000 compounds and >1,000,000 different computationally constructed substructures, to allow identification of chemical compounds of biological interest. CONCLUSIONS: This database and its associated web-based search program (http://bisscat.org/) can be used to find compounds containing selected combinations of substructures and functional groups. It can be used to determine possible additional substrates for known enzymes and for putative enzymes found in genome projects. Its applications to enzyme inhibitor design are also discussed

    Quantitative assessment of the expanding complementarity between public and commercial databases of bioactive compounds

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Since 2004 public cheminformatic databases and their collective functionality for exploring relationships between compounds, protein sequences, literature and assay data have advanced dramatically. In parallel, commercial sources that extract and curate such relationships from journals and patents have also been expanding. This work updates a previous comparative study of databases chosen because of their bioactive content, availability of downloads and facility to select informative subsets.</p> <p>Results</p> <p>Where they could be calculated, extracted compounds-per-journal article were in the range of 12 to 19 but compound-per-protein counts increased with document numbers. Chemical structure filtration to facilitate standardised comparisons typically reduced source counts by between 5% and 30%. The pair-wise overlaps between 23 databases and subsets were determined, as well as changes between 2006 and 2008. While all compound sets have increased, PubChem has doubled to 14.2 million. The 2008 comparison matrix shows not only overlap but also unique content across all sources. Many of the detailed differences could be attributed to individual strategies for data selection and extraction. While there was a big increase in patent-derived structures entering PubChem since 2006, GVKBIO contains over 0.8 million unique structures from this source. Venn diagrams showed extensive overlap between compounds extracted by independent expert curation from journals by GVKBIO, WOMBAT (both commercial) and BindingDB (public) but each included unique content. In contrast, the approved drug collections from GVKBIO, MDDR (commercial) and DrugBank (public) showed surprisingly low overlap. Aggregating all commercial sources established that while 1 million compounds overlapped with PubChem 1.2 million did not.</p> <p>Conclusion</p> <p>On the basis of chemical structure content <it>per se </it>public sources have covered an increasing proportion of commercial databases over the last two years. However, commercial products included in this study provide links between compounds and information from patents and journals at a larger scale than current public efforts. They also continue to capture a significant proportion of unique content. Our results thus demonstrate not only an encouraging overall expansion of data-supported bioactive chemical space but also that both commercial and public sources are complementary for its exploration.</p

    Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry

    Get PDF
    BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas. RESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphor, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reducing the complement all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Thirdly 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80–99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries. CONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65–81%. Corresponding software and supplemental data are available for downloads from the authors' website

    Identification of Anti-Malarial Compounds as Novel Antagonists to Chemokine Receptor CXCR4 in Pancreatic Cancer Cells

    Get PDF
    Despite recent advances in targeted therapies, patients with pancreatic adenocarcinoma continue to have poor survival highlighting the urgency to identify novel therapeutic targets. Our previous investigations have implicated chemokine receptor CXCR4 and its selective ligand CXCL12 in the pathogenesis and progression of pancreatic intraepithelial neoplasia and invasive pancreatic cancer; hence, CXCR4 is a promising target for suppression of pancreatic cancer growth. Here, we combined in silico structural modeling of CXCR4 to screen for candidate anti-CXCR4 compounds with in vitro cell line assays and identified NSC56612 from the National Cancer Institute's (NCI) Open Chemical Repository Collection as an inhibitor of activated CXCR4. Next, we identified that NSC56612 is structurally similar to the established anti-malarial drugs chloroquine and hydroxychloroquine. We evaluated these compounds in pancreatic cancer cells in vitro and observed specific antagonism of CXCR4-mediated signaling and cell proliferation. Recent in vivo therapeutic applications of chloroquine in pancreatic cancer mouse models have demonstrated decreased tumor growth and improved survival. Our results thus provide a molecular target and basis for further evaluation of chloroquine and hydroxychloroquine in pancreatic cancer. Historically safe in humans, chloroquine and hydroxychloroquine appear to be promising agents to safely and effectively target CXCR4 in patients with pancreatic cancer

    Structure-based classification and ontology in chemistry

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving <it>relevant </it>results from the available information, and <it>organising </it>those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies.</p> <p>Results</p> <p>We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches.</p> <p>Conclusion</p> <p>Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational utilities including algorithmic, statistical and logic-based tools. For the task of automatic structure-based classification of chemical entities, essential to managing the vast swathes of chemical data being brought online, systems which are capable of hybrid reasoning combining several different approaches are crucial. We provide a thorough review of the available tools and methodologies, and identify areas of open research.</p
    corecore