4,964 research outputs found
Ontologies in medicinal chemistry: current status and future challenges
[Abstract] Recent years have seen a dramatic increase in the amount and availability of data in the diverse areas of medicinal chemistry, making it possible to achieve significant advances in fields such as the design, synthesis and biological evaluation of compounds. However, with this data explosion, the storage, management and analysis of available data to extract relevant information has become even a more complex task that offers challenging research issues to Artificial Intelligence (AI) scientists. Ontologies have emerged in AI as a key tool to formally represent and semantically organize aspects of the real world. Beyond glossaries or thesauri, ontologies facilitate communication between experts and allow the application of computational techniques to extract useful information from available data. In medicinal chemistry, multiple ontologies have been developed during the last years which contain knowledge about chemical compounds and processes of synthesis of pharmaceutical products. This article reviews the principal standards and ontologies in medicinal chemistry, analyzes their main applications and suggests future directions.Instituto de Salud Carlos III; FIS-PI10/02180Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo; 209RT0366Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; CN2012/217Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; CN2011/034Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; CN2012/21
Structure-based classification and ontology in chemistry
<p>Abstract</p> <p>Background</p> <p>Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving <it>relevant </it>results from the available information, and <it>organising </it>those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies.</p> <p>Results</p> <p>We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches.</p> <p>Conclusion</p> <p>Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational utilities including algorithmic, statistical and logic-based tools. For the task of automatic structure-based classification of chemical entities, essential to managing the vast swathes of chemical data being brought online, systems which are capable of hybrid reasoning combining several different approaches are crucial. We provide a thorough review of the available tools and methodologies, and identify areas of open research.</p
ClassyFire: automated chemical classification with a comprehensive, computable taxonomy
Additional file 5. Use cases. Text-based search on the ClassyFire web server. (A) Building the query. (B) Sparteine, one of the returned compounds
Recommended from our members
Computational Toxinology
Venoms are complex mixtures of biological macromolecules and other compounds that are used for predatory and defensive purposes by hundreds of thousands of known species worldwide. Throughout human history, venoms and venom components have been used to treat a vast array of illnesses, causing them to be of great clinical, economic, and academic interest to the drug discovery and toxinology communities. In spite of major computational advances that facilitate data-driven drug discovery, most therapeutic venom effects are still discovered via tedious trial-and-error, or simply by accident. In this dissertation, I describe a body of work that aims to establish a new subdiscipline of translational bioinformatics, which I name “computational toxinology”.
To accomplish this goal, I present three integrated components that span a wide range of informatics techniques: (1) VenomKB, (2) VenomSeq, and (3) VenomKB’s Semantic API. To provide a platform for structuring, representing, retrieving, and integrating venom data relevant to drug discovery, VenomKB provides a database-backed web application and knowledge base for computational toxinology. VenomKB is structured according to a fully-featured ontology of venoms, and provides data aggregated from many popular web re- sources. VenomSeq is a biotechnology workflow that is designed to generate new high-throughput sequencing data for incorporation into VenomKB. Specifically, we expose human cells to controlled doses of crude venoms, conduct RNA-Sequencing, and build profiles of differential gene expression, which we then compare to publicly-available differential expression data for known dis- eases and drugs with known effects, and use those comparisons to hypothesize ways that the venoms could act in a therapeutic manner, as well. These data are then integrated into VenomKB, where they can be effectively retrieved and evaluated using existing data and known therapeutic associations. VenomKB’s Semantic API further develops this functionality by providing an intelligent, powerful, and user-friendly interface for querying the complex underlying data in VenomKB in a way that reflects the intuitive, human-understandable mean- ing of those data. The Semantic API is designed to cater to the needs of advanced users as well as laypersons and bench scientists without previous expertise in computational biology and semantic data analysis.
In each chapter of the dissertation, I describe how we evaluated these 3 components through various approaches. We demonstrate the utility of VenomKB and the Semantic API by testing a number of practical use-cases for each, designed to highlight their ability to rediscover existing knowledge as well as suggesting potential areas for future exploration. We use statistics and data science techniques to evaluate VenomSeq on 25 diverse species of venomous animals, and propose biologically feasible explanations for significant findings. In evaluating the Semantic API, I show how observations on VenomSeq data can be interpreted and placed into the context of past research by members of the larger toxinology community.
Computational toxinology is a toolbox designed to be used by multiple stakeholders (toxinologists, computational biologists, and systems pharmacologists, among others) to improve the return rate of clinically-significant findings from manual experimentation. It aims to achieve this goal by enabling access to data, providing means for easy validation of results, and suggesting specific hypotheses that are preliminarily supported by rigorous inferential statistics. All components of the research I describe are open-access and publicly available, to improve reproducibility and encourage widespread adoptio
Comparative analysis of knowledge representation and reasoning requirements across a range of life sciences textbooks.
BackgroundUsing knowledge representation for biomedical projects is now commonplace. In previous work, we represented the knowledge found in a college-level biology textbook in a fashion useful for answering questions. We showed that embedding the knowledge representation and question-answering abilities in an electronic textbook helped to engage student interest and improve learning. A natural question that arises from this success, and this paper's primary focus, is whether a similar approach is applicable across a range of life science textbooks. To answer that question, we considered four different textbooks, ranging from a below-introductory college biology text to an advanced, graduate-level neuroscience textbook. For these textbooks, we investigated the following questions: (1) To what extent is knowledge shared between the different textbooks? (2) To what extent can the same upper ontology be used to represent the knowledge found in different textbooks? (3) To what extent can the questions of interest for a range of textbooks be answered by using the same reasoning mechanisms?ResultsOur existing modeling and reasoning methods apply especially well both to a textbook that is comparable in level to the text studied in our previous work (i.e., an introductory-level text) and to a textbook at a lower level, suggesting potential for a high degree of portability. Even for the overlapping knowledge found across the textbooks, the level of detail covered in each textbook was different, which requires that the representations must be customized for each textbook. We also found that for advanced textbooks, representing models and scientific reasoning processes was particularly important.ConclusionsWith some additional work, our representation methodology would be applicable to a range of textbooks. The requirements for knowledge representation are common across textbooks, suggesting that a shared semantic infrastructure for the life sciences is feasible. Because our representation overlaps heavily with those already being used for biomedical ontologies, this work suggests a natural pathway to include such representations as part of the life sciences curriculum at different grade levels
Developing the Quantitative Histopathology Image Ontology : A case study using the hot spot detection problem
Interoperability across data sets is a key challenge for quantitative histopathological imaging. There is a need for an ontology that can support effective merging of pathological image data with associated clinical and demographic data. To foster organized, cross-disciplinary, information-driven collaborations in the pathological imaging field, we propose to develop an ontology to represent imaging data and methods used in pathological imaging and analysis, and call it Quantitative Histopathological Imaging Ontology – QHIO. We apply QHIO to breast cancer hot-spot detection with the goal of enhancing reliability of detection by promoting the sharing of data between image analysts
- …