
    Ontology of core data mining entities

    In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications/use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.

    The Infectious Disease Ontology in the Age of COVID-19

    The Infectious Disease Ontology (IDO) is a suite of interoperable ontology modules that aims to provide coverage of all aspects of the infectious disease domain, including biomedical research, clinical care, and public health. IDO Core is designed to be a disease- and pathogen-neutral ontology, covering just those types of entities and relations that are relevant to infectious diseases generally. IDO Core is then extended by a collection of ontology modules focusing on specific diseases and pathogens. In this paper we present applications of IDO Core within various areas of infectious disease research, together with an overview of all IDO extension ontologies and the methodology on the basis of which they are built. We also survey recent developments involving IDO, including the creation of IDO Virus; the Coronaviruses Infectious Disease Ontology (CIDO); and an extension of CIDO focused on COVID-19 (IDO-CovID-19). We also discuss how these ontologies might assist in information-driven efforts to deal with the ongoing COVID-19 pandemic, to accelerate data discovery in the early stages of future pandemics, and to promote reproducibility of infectious disease research.

    The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration

    The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is through the annotation of multiple bodies of data using common controlled vocabularies or ‘ontologies’. Unfortunately, the very success of this approach has led to a proliferation of ontologies, which itself creates obstacles to integration. The Open Biomedical Ontologies (OBO) consortium has set in train a strategy to overcome this problem. Existing OBO ontologies, including the Gene Ontology, are undergoing a process of coordinated reform, and new ontologies are being created, on the basis of an evolving set of shared principles governing ontology development. The result is an expanding family of ontologies designed to be interoperable, logically well-formed, and to incorporate accurate representations of biological reality. We describe the OBO Foundry initiative, and provide guidelines for those who might wish to become involved in the future.

    A review of the state of the art in Machine Learning on the Semantic Web: Technical Report CSTR-05-003

    Utilising ontology-based modelling for learning content management

    Learning content management needs to support a variety of open, multi-format Web-based software applications. We propose multidimensional, model-based semantic annotation as a way to support the management of access to and change of learning content. We introduce an information architecture model as the central contribution that supports multi-layered learning content structures. We discuss interactive query access, but also change management for multi-layered learning content management. An ontology-enhanced traceability approach is the solution.

    The devices, experimental scaffolds, and biomaterials ontology (DEB): a tool for mapping, annotation, and analysis of biomaterials' data

    The size and complexity of the biomaterials literature make systematic data analysis an excruciating manual task. A practical solution is creating databases and information resources. Implant design and biomaterials research can greatly benefit from an open database for systematic data retrieval. Ontologies are pivotal to knowledge base creation, serving to represent and organize domain knowledge. To name but two examples, GO, the Gene Ontology, and ChEBI, the Chemical Entities of Biological Interest ontology, and their associated databases are central resources to their respective research communities. The creation of the devices, experimental scaffolds, and biomaterials ontology (DEB), an open resource for organizing information about biomaterials, their design, manufacture, and biological testing, is described. It is developed using text analysis for identifying ontology terms from a biomaterials gold standard corpus, systematically curated to represent the domain's lexicon. Topics covered are validated by members of the biomaterials research community. The ontology may be used for searching terms, performing annotations for machine learning applications, standardized meta-data indexing, and other cross-disciplinary data exploitation. The input of the biomaterials community to this effort to create data-driven open-access research tools is encouraged and welcomed.

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

    Automatically linking MEDLINE abstracts to the Gene Ontology

    Much has been written recently about the need for effective tools and methods for mining the wealth of information present in biomedical literature (Mack and Hehenberger, 2002; Blagosklonny and Pardee, 2001; Rindflesch et al., 2002), the activity of conceptual biology. Keyword search engines operating over large electronic document stores (such as PubMed and the PNAS) offer some help, but there are fundamental obstacles that limit their effectiveness. In the first instance, there is no general consensus among scientists about the vernacular to be used when describing research about genes, proteins, drugs, diseases, tissues and therapies, making it very difficult to formulate a search query that retrieves the right documents. Secondly, finding relevant articles is just one aspect of the investigative process. A more fundamental goal is to establish links and relationships between facts existing in published literature in order to “validate current hypotheses or to generate new ones” (Barnes and Robertson, 2002), something keyword search engines do little to support.