20 research outputs found

    Logic-based assessment of the compatibility of UMLS ontology sources

    Background: The UMLS Metathesaurus (UMLS-Meta) is currently the most comprehensive effort for integrating independently-developed medical thesauri and ontologies. UMLS-Meta is being used in many applications, including PubMed and ClinicalTrials.gov. The integration of new sources combines automatic techniques, expert assessment, and auditing protocols. The automatic techniques currently in use, however, are mostly based on lexical algorithms and often disregard the semantics of the sources being integrated. Results: In this paper, we argue that UMLS-Meta’s current design and auditing methodologies could be significantly enhanced by taking into account the logic-based semantics of the ontology sources. We provide empirical evidence suggesting that UMLS-Meta in its 2009AA version contains a significant number of errors; these errors become immediately apparent if the rich semantics of the ontology sources is taken into account, manifesting themselves as unintended logical consequences that follow from the ontology sources together with the information in UMLS-Meta. We then propose general principles and specific logic-based techniques to effectively detect and repair such errors. Conclusions: Our results suggest that the methodologies employed in the design of UMLS-Meta are not only very costly in terms of human effort, but also error-prone. The techniques presented here can be useful for both reducing human effort in the design and maintenance of UMLS-Meta and improving the quality of its contents.

    Using natural language processing techniques to inform research on nanotechnology

    The literature on nanotechnology is growing exponentially as more and more engineered nanomaterials are created, characterized, and tested for performance and safety. With this deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics.

    Revision in networks of ontologies

    Networks of ontologies are made of a collection of logic theories, called ontologies, related by alignments. They arise naturally in distributed contexts in which theories are developed and maintained independently, such as the semantic web. In networks of ontologies, inconsistency can come from two different sources: local inconsistency in a particular ontology or alignment, and global inconsistency between them. Belief revision is well-defined for dealing with ontologies; we investigate how it can apply to networks of ontologies. We formulate revision postulates for alignments and networks of ontologies based on an abstraction of existing semantics of networks of ontologies. We show that revision operators cannot be simply based on local revision operators on both ontologies and alignments. We adapt the partial meet revision framework to networks of ontologies and show that it indeed satisfies the revision postulates. Finally, we consider strategies based on network characteristics for designing concrete revision operators.

    Ontology matching in a distributed environment

    Building high-quality merged ontologies from multiple sources with requirements customization

    Ontologies are the prime way of organizing data in the Semantic Web. Often, it is necessary to combine several independently developed ontologies to obtain a knowledge graph that fully represents a domain of interest. Existing approaches scale rather poorly to the merging of multiple ontologies because they use a binary merge strategy. We therefore investigate the extent to which an n-ary strategy can solve the scalability problem. This thesis contributes to the following important aspects: 1. Our n-ary merge strategy takes as input a set of source ontologies and their mappings and generates a merged ontology. For efficient processing, rather than successively merging complete ontologies pairwise, we group related concepts across ontologies into partitions and merge first within and then across those partitions. 2. We take a step towards parameterizable merge methods. We have identified a set of Generic Merge Requirements (GMRs) that merged ontologies might be expected to meet, and we have investigated the compatibility of the GMRs with a graph-based method. 3. When multiple ontologies are merged, inconsistencies can occur due to the different world views encoded in the source ontologies. To this end, we propose a novel Subjective Logic-based method for handling the inconsistencies that occur while merging ontologies. We apply this logic to rank and estimate the trustworthiness of the conflicting axioms that cause inconsistencies within a merged ontology. 4. To assess the quality of the merged ontologies systematically, we provide a comprehensive set of criteria in an evaluation framework. The proposed criteria cover a variety of characteristics of each individual aspect of the merged ontology in structural, functional, and usability dimensions. 5. The final contribution of this research is the development of the CoMerger tool, which implements all of the aforementioned aspects and makes them accessible via a unified interface.

    Perinnetiedon semanttinen annotointi [Semantic annotation of legacy data]

    Organizations' legacy systems contain a vast amount of data stored in relational databases. This legacy data is often critical to the organization's business. Legacy systems and legacy data are typically not described adequately. As a result, anyone searching for information must know where to look and which methods to use, be familiar in advance with the material being searched, and be able to interpret the search results. The situation changes if the legacy data is semantically annotated, that is, described with semantic, machine-understandable metadata. This enables the use of query languages more expressive than those in current use. With semantic metadata, machines can understand the meaning of the content. Consequently, semantic annotation allows machines to be harnessed to help people retrieve information from relational databases in ways significantly better than at present. The Semantic Web is a vision of an extension of the web in which content is described semantically and in machine-understandable form. To realize it, Semantic Web research has developed a set of methods and standards. This thesis describes how these methods and standards can be used for the semantic annotation of legacy data. It compares different approaches and shows how to choose the one most appropriate for a given situation. Semantic annotation is best implemented by transforming data that follows the relational model into the Semantic Web data model, as instances of the structures of an ontology describing the data's domain. Architecturally, this can be done with two different approaches. In the first, the ontology is generated automatically from the structures of the relational database. In the second, an external mapping language is used to specify which structures of externally defined ontologies the relational database structures correspond to. The transformed data model need not be maintained separately; the transformation can be performed at run time. Legacy systems can then continue to maintain the data in the original relational databases, while users of the semantic metadata can run versatile and highly expressive queries over it.