114 research outputs found
Tautomerism in large databases
We have used the Chemical Structure DataBase (CSDB) of the NCI CADD Group, an aggregated collection of over 150 small-molecule databases totaling 103.5 million structure records, to conduct tautomerism analyses on one of the largest currently existing sets of real (i.e. not computer-generated) compounds. This analysis was carried out using calculable chemical structure identifiers developed by the NCI CADD Group, based on hash codes available in the chemoinformatics toolkit CACTVS and a newly developed scoring scheme to define a canonical tautomer for any encountered structure. CACTVS’s tautomerism definition, a set of 21 transform rules expressed in SMIRKS line notation, was used, which takes a comprehensive stance as to the possible types of tautomeric interconversion included. Tautomerism was found to be possible for more than 2/3 of the unique structures in the CSDB. A total of 680 million tautomers were calculated from, and including, the original structure records. Tautomerism overlap within the same individual database (i.e. at least one other entry was present that was really only a different tautomeric representation of the same compound) was found at an average rate of 0.3% of the original structure records, with values as high as nearly 2% for some of the databases in CSDB. Projected onto the set of unique structures (by FICuS identifier), this still occurred in about 1.5% of the cases. Tautomeric overlap across all constituent databases in CSDB was found for nearly 10% of the records in the collection
Functional Group and Substructure Searching as a Tool in Metabolomics
BACKGROUND: A direct link between the names and structures of compounds and the functional groups contained within them is important, not only because biochemists frequently rely on literature that uses a free-text format to describe functional groups, but also because metabolic models depend upon the connections between enzymes and substrates being known and appropriately stored in databases. METHODOLOGY: We have developed a database named "Biochemical Substructure Search Catalogue" (BiSSCat), which contains 489 functional groups, >200,000 compounds and >1,000,000 different computationally constructed substructures, to allow identification of chemical compounds of biological interest. CONCLUSIONS: This database and its associated web-based search program (http://bisscat.org/) can be used to find compounds containing selected combinations of substructures and functional groups. It can be used to determine possible additional substrates for known enzymes and for putative enzymes found in genome projects. Its applications to enzyme inhibitor design are also discussed
Identification of Anti-Malarial Compounds as Novel Antagonists to Chemokine Receptor CXCR4 in Pancreatic Cancer Cells
Despite recent advances in targeted therapies, patients with pancreatic adenocarcinoma continue to have poor survival, highlighting the urgency to identify novel therapeutic targets. Our previous investigations have implicated chemokine receptor CXCR4 and its selective ligand CXCL12 in the pathogenesis and progression of pancreatic intraepithelial neoplasia and invasive pancreatic cancer; hence, CXCR4 is a promising target for suppression of pancreatic cancer growth. Here, we combined in silico structural modeling of CXCR4 to screen for candidate anti-CXCR4 compounds with in vitro cell line assays and identified NSC56612 from the National Cancer Institute's (NCI) Open Chemical Repository Collection as an inhibitor of activated CXCR4. Next, we identified that NSC56612 is structurally similar to the established anti-malarial drugs chloroquine and hydroxychloroquine. We evaluated these compounds in pancreatic cancer cells in vitro and observed specific antagonism of CXCR4-mediated signaling and cell proliferation. Recent in vivo therapeutic applications of chloroquine in pancreatic cancer mouse models have demonstrated decreased tumor growth and improved survival. Our results thus provide a molecular target and basis for further evaluation of chloroquine and hydroxychloroquine in pancreatic cancer. Historically safe in humans, chloroquine and hydroxychloroquine appear to be promising agents to safely and effectively target CXCR4 in patients with pancreatic cancer
Structure-based classification and ontology in chemistry
BACKGROUND: Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies. RESULTS: We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches. CONCLUSION: Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational utilities including algorithmic, statistical and logic-based tools. For the task of automatic structure-based classification of chemical entities, essential to managing the vast swathes of chemical data being brought online, systems which are capable of hybrid reasoning combining several different approaches are crucial. We provide a thorough review of the available tools and methodologies, and identify areas of open research.
A taxonomic backbone for the global synthesis of species diversity in the angiosperm order Caryophyllales
The Caryophyllales constitute a major lineage of flowering plants with approximately 12500 species in 39 families. A taxonomic backbone at the genus level is provided that reflects the current state of knowledge and accepts 749 genera for the order. A detailed review of the literature of the past two decades shows that enormous progress has been made in understanding overall phylogenetic relationships in Caryophyllales. The process of re-circumscribing families so that they are monophyletic appears to be largely complete and has led to the recognition of eight new families (Anacampserotaceae, Kewaceae, Limeaceae, Lophiocarpaceae, Macarthuriaceae, Microteaceae, Montiaceae and Talinaceae), while the phylogenetic evaluation of generic concepts is still well underway. As a result of this, the number of genera has increased by more than ten percent in comparison to the last complete treatments in the “Families and genera of vascular plants” series. A checklist with all currently accepted genus names in Caryophyllales, as well as nomenclatural references, type names and synonymy is presented. Notes indicate how extensively the respective genera have been studied in a phylogenetic context. The most diverse families at the generic level are Cactaceae and Aizoaceae, but 28 families comprise only one to six genera. This synopsis represents a first step towards the aim of creating a global synthesis of the species diversity in the angiosperm order Caryophyllales integrating the work of numerous specialists around the world
Digital Watermarking of Chemical Structure Sets
The information about 3D atomic coordinates of chemical structures is valuable knowledge in many respects. For large sets of different structures, the computation or measurement of these coordinates is an expensive process. Therefore, the originator of such a data set is interested in enforcing their intellectual property rights. In this paper, a method for copyright protection of chemical structure sets based on digital watermarking is proposed. A complete watermarking system including synchronization of the watermark detector and verification of the decoded watermark message is presented. The basic embedding scheme, denoted SCS (Scalar Costa Scheme) watermarking, is based on considering watermarking as a communications problem with side information at the encoder
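For readers unfamiliar with SCS, the following is a minimal sketch of the embedding and decoding idea in its commonly published scalar form, applied to a flat list of coordinate values. It is not taken from the paper: the function names, the step size, the scaling factor `alpha`, and the use of per-sample keys in [0, 1) are illustrative assumptions, and the synchronization and verification stages mentioned in the abstract are not modeled.

```python
import math
import random


def quantize(value, step):
    """Uniform scalar quantizer: snap value to the nearest multiple of step."""
    return step * math.floor(value / step + 0.5)


def scs_embed(host, bits, keys, step, alpha):
    """Embed one bit per host sample (assumed textbook SCS rule).

    host  : original values, e.g. atomic coordinates (hypothetical input)
    bits  : watermark bits, 0 or 1, one per sample
    keys  : secret dither values in [0, 1), one per sample
    step  : quantizer step size
    alpha : fraction of the quantization error that is added back (0..1]
    """
    marked = []
    for x, d, k in zip(host, bits, keys):
        offset = step * (d / 2 + k)  # bit- and key-dependent lattice shift
        q = quantize(x - offset, step) - (x - offset)
        marked.append(x + alpha * q)  # move only part of the way to the lattice
    return marked


def scs_decode(received, keys, step):
    """Hard-decision decoding: small distance to the key-shifted lattice means bit 0."""
    bits = []
    for r, k in zip(received, keys):
        shifted = r - k * step
        y = quantize(shifted, step) - shifted
        bits.append(0 if abs(y) < step / 4 else 1)
    return bits


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    host = [rng.uniform(-10.0, 10.0) for _ in range(8)]
    bits = [rng.randint(0, 1) for _ in range(8)]
    keys = [rng.random() for _ in range(8)]
    marked = scs_embed(host, bits, keys, step=0.01, alpha=0.8)
    print(scs_decode(marked, keys, step=0.01) == bits)
```

In this sketch the embedder knows the host value and adapts the watermark to it, which is the "side information at the encoder" view named in the abstract. Each value moves by at most `alpha * step / 2`, so a small step keeps coordinate distortion small, while a larger step tolerates more noise before decoding errors appear.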