Search CORE

27,890 research outputs found

Structure-based classification and ontology in chemistry

Abstract Background Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving <it>relevant </it>results from the available information, and <it>organising </it>those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies. Results We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches. Conclusion Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational utilities including algorithmic, statistical and logic-based tools. For the task of automatic structure-based classification of chemical entities, essential to managing the vast swathes of chemical data being brought online, systems which are capable of hybrid reasoning combining several different approaches are crucial. We provide a thorough review of the available tools and methodologies, and identify areas of open research.</p

Repository for Publications and Research Data

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

The University of Manchester - Institutional Repository

Ontology of core data mining entities

Author: A Bernstein
A Golbraikh
A Karalic
B Smith
B Smith
B Smith
C Silla
C Vens
D Demšar
D Kocev
D Kocev
D Qi
D Young
DJ Hand
F Serban
G Madjarov
G Tsoumakas
GH Bakir
H Mannila
HP Kriegel
I Slavkov
J Vanschoren
K Button
Larisa Soldatova
LN Soldatova
M Courtot
M Ford
M Žáková
MA Avery
MA Avery
MF López
O Spjuth
P Robinson
Panče Panov
Q Yang
R Caruana
R Guha
R Guha
RD King
RD King
RR Brinkman
Sašo Džeroski
T Dietterich
V Podpečan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/07/2014
Field of study

In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines themost essential datamining entities in a three-layered ontological structure comprising of a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend

Crossref

Brunel University Research Archive

Towards automatic classification within the ChEBI ontology

Author: Christoph Steinbeck
Janna Hastings
Marcus Ennis
Paula de Matos
Publication venue
Publication date: 31/07/2009
Field of study

*Background*
Appearing in a wide variety of contexts, biochemical 'small molecules' are a core element of biomedical data. Chemical ontologies, which provide stable identifiers and a shared vocabulary for use in referring to such biochemical small molecules, are crucial to enable the interoperation of such data. One such chemical ontology is ChEBI (Chemical Entities of Biological Interest), a candidate member ontology of the OBO Foundry. ChEBI is a publicly available, manually annotated database of chemical entities and contains around 18000 annotated entities as of the last release (May 2009). ChEBI provides stable unique identifiers for chemical entities; a controlled vocabulary in the form of recommended names (which are unique and unambiguous), common synonyms, and systematic chemical names; cross-references to other databases; and a structural and role-based classification within the ontology. ChEBI is widely used for annotation of chemicals within biological databases, text-mining, and data integration. ChEBI can be accessed online at "http://www.ebi.ac.uk/chebi/":http://www.ebi.ac.uk/chebi/ and the full dataset is available for download in various formats including SDF and OBO.

*Automated Classification*
The selection of chemical entities for inclusion in the ChEBI database is user-driven. As the use of ChEBI has grown, so too has the backlog of user-requested entries. Inevitably, the annotation backlog creates a bottleneck, and to speed up the annotation process, ChEBI has recently released a submission tool which allows community submissions of chemical entities, groups, and classes. However, classification of chemical entities within the ontology is a difficult and niche activity, and it is unlikely that the community as a whole will be able or willing to correctly and consistently classify each submitted entity, creating required classes where they are missing. As a result, it is likely that while the size of the database grows, the ontological classification will become less sophisticated, unless the classification of new entities is assisted computationally. In addition, the ChEBI database is expecting substantial size growth in the next year, so automatic classification, which has up till now not been possible, is urgently required. Automatic classification would also enable the ChEBI ontology classes to be applied to other compound databases such as PubChem. 

*Description Logic Reasoning*
Description logic based reasoning technology is a prime candidate for development of such an automatic classification system as it allows the rules of the classification system to be encoded within the knowledgebase. Already at 18000 entities, ChEBI is a fair size for a real-world application of description logic reasoning technology, and as the ontology is enhanced with a richer density of asserted relationships, the classification will become more complex and challenging. We have successfully tested a description logic-based classification of chemical entities based on specified structural properties using the hypertableaux-based HermiT reasoner, and found it to be sufficiently efficient to be feasible for use in a production environment on a database of the size that ChEBI is now. However, much work still remains to enrich the ChEBI knowledgebase itself with the properties needed to provide the formal class definitions for use in the automated classification, and to assess the efficiency of the available description logic reasoning technology on a database the size of ChEBI's forecast future growth.

*Acknowledgements*
ChEBI is funded by the European Commission under SLING, grant agreement number 226073 (Integrating Activity) within Research Infrastructures of the FP7 Capacities Specific Programme, and by the BBSRC, grant agreement number BB/G022747/1 within the “Bioinformatics and biological resources” fund

Crossref

Nature Precedings

The RNA Ontology (RNAO): An ontology for integrating RNA sequence and structure data

Author: Chris Mungall
Colin Batchelor
Craig Zirbel
Eric Westhof
Jane Richardson
Jesse Stombaugh
Karen Eilbeck
Neocles Leontis
Rob Knight
Thomas Bittner
Publication venue
Publication date: 06/08/2009
Field of study

Biomedical Ontologies are intended to integrate diverse biomedical data to enable intelligent data-mining and facilitate translation of basic research into useful clinical knowledge. We present the first version of RNAO, an ontology for integrating RNA 3D structural, biochemical and sequence data. While each 3D data file depicts the structure of a specific molecule, such data have broader significance as representatives of classes of homologous molecules, which, while differing in sequence, generally share core structural features of functional importance. Thus, 3D structure data gain value by being linked to homologous sequences in genomic data and databases of sequence alignments. Likewise genomic data can increase in value by annotation of shared structural features, especially when these can be linked to specific functions. The RNAO is being developed in line with the developing standards of the Open Biomedical Ontologies (OBO) Consortium

Nature Precedings

Towards OWL-based Knowledge Representation in Petrology

Author: Kudryavtsev Dmitry
Ryakhovsky Vladimir
Shkotin Alex
Publication venue
Publication date: 08/06/2011
Field of study

This paper presents our work on development of OWL-driven systems for formal representation and reasoning about terminological knowledge and facts in petrology. The long-term aim of our project is to provide solid foundations for a large-scale integration of various kinds of knowledge, including basic terms, rock classification algorithms, findings and reports. We describe three steps we have taken towards that goal here. First, we develop a semi-automated procedure for transforming a database of igneous rock samples to texts in a controlled natural language (CNL), and then a collection of OWL ontologies. Second, we create an OWL ontology of important petrology terms currently described in natural language thesauri. We describe a prototype of a tool for collecting definitions from domain experts. Third, we present an approach to formalization of current industrial standards for classification of rock samples, which requires linear equations in OWL 2. In conclusion, we discuss a range of opportunities arising from the use of semantic technologies in petrology and outline the future work in this area.Comment: 10 pages. The paper has been accepted by OWLED2011 as a long presentatio

arXiv.org e-Print Archive

CiteSeerX

Bottom-up construction of ontologies

Author: Mars N.J.I.
Vet P.E. van der
Publication venue: IEEE
Publication date: 01/01/1998
Field of study

Presents a particular way of building ontologies that proceeds in a bottom-up fashion. Concepts are defined in a way that mirrors the way their instances are composed out of smaller objects. The smaller objects themselves may also be modeled as being composed. Bottom-up ontologies are flexible through the use of implicit and, hence, parsimonious part-whole and subconcept-superconcept relations. The bottom-up method complements current practice, where, as a rule, ontologies are built top-down. The design method is illustrated by an example involving ontologies of pure substances at several levels of detail. It is not claimed that bottom-up construction is a generally valid recipe; indeed, such recipes are deemed uninformative or impossible. Rather, the approach is intended to enrich the ontology developer's toolki

University of Twente Research Information

How to Find Suitable Ontologies Using an Ontology-based WWW Broker

Author: Arpirez J.C.
Gómez-Pérez A.
Lozano-Tello A.
Pinto S.
Publication venue: Facultad de Informática (UPM)
Publication date: 01/07/1999
Field of study

Knowledge reuse by means of outologies now faces three important problems: (1) there are no standardized identifying features that characterize ontologies from the user point of view; (2) there are no web sites using the same logical organization, presenting relevant information about ontologies; and (3) the search for appropriate ontologies is hard, time-consuming and usually fruitless. To solve the above problems, we present: (1) a living set of features that allow us to characterize ontologies from the user point of view and have the same logical organization; (2) a living domain ontology about ontologies (called ReferenceOntology) that gathers, describes and has links to existing ontologies; and (3) (ONTO)2Agent, the ontology-based www broker about ontologies that uses the Reference Ontology as a source of its knowledge and retrieves descriptions of ontologies that satisfy a given set of constraints. (ONTO)~Agent is available at http://delicias.dia.fi.upm.es/REFERENCE ONTOLOGY

Archivo Digital UPM

Ontology: Towards a new synthesis

Author: Smith Barry
Welty Chris
Publication venue
Publication date: 01/01/2001
Field of study

This introduction to the second international conference on Formal Ontology and Information Systems presents a brief history of ontology as a discipline spanning the boundaries of philosophy and information science. We sketch some of the reasons for the growth of ontology in the information science field, and offer a preliminary stocktaking of how the term ‘ontology’ is currently used. We conclude by suggesting some grounds for optimism as concerns the future collaboration between philosophical ontologists and information scientists

PhilPapers