44 research outputs found
Harnessing the power of unified metadata in an ontology repository: The case of AgroPortal
As any resources, ontologies, thesaurus, vocabularies and terminologies need to be described with relevant metadata to facilitate their identification, selection and reuse. For ontologies to be FAIR, there is a need for metadata authoring guidelines and for harmonization of existing metadata vocabulariesâtaken independently none of them can completely describe an ontology. Ontology libraries and repositories also have to play an important role. Indeed, some metadata properties are intrinsic to the ontology (name, license, description); other information, such as community feedbacks or relations to other ontologies are typically information that an ontology library shall capture, populate and consolidate to facilitate the processes of identifying and selecting the right ontology(ies) to use. We have studied ontology metadata practices by: (1) analyzing metadata annotations of 805 ontologies; (2) reviewing the most standard and relevant vocabularies (23 totals) currently available to describe metadata for ontologies (such as Dublin Core, Ontology Metadata Vocabulary, VoID, etc.); (3) comparing different metadata implementation in multiple ontology libraries or repositories. We have then built a new metadata model for our AgroPortal vocabulary and ontology repository, a platform dedicated to agronomy based on the NCBO BioPortal technology. AgroPortal now recognizes 346 properties from existing metadata vocabularies that could be used to describe different aspects of ontologies: intrinsic descriptions, people, date, relations, content, metrics, community, administration, and access. We use them to populate an internal model of 127 properties implemented in the portal and harmonized for all the ontologies. Weâand AgroPortal's usersâhave spent a significant amount of time to edit and curate the metadata of the ontologies to offer a better synthetized and harmonized information and enable new ontology identification features. Our goal was also to facilitate the comprehension of the agronomical ontology landscape by displaying diagrams and charts about all the ontologies on the portal. We have evaluated our work with a user appreciation survey which confirms the new features are indeed relevant and helpful to ease the processes of identification and selection of ontologies. This paper presents how to harness the potential of a complete and unified metadata model with dedicated features in an ontology repository; however, the new AgroPortal's model is not a new vocabulary as it relies on preexisting ones. A generalization of this work is studied in a community-driven standardization effort in the context of the RDA Vocabulary and Semantic Services Interest Group
Development and Evaluation of an Ontology-Based Quality Metrics Extraction System
The Institute of Medicine reports a growing demand in recent years for quality improvement within the healthcare industry. In response, numerous organizations have been involved in the development and reporting of quality measurement metrics. However, disparate data models from such organizations shift the burden of accurate and reliable metrics extraction and reporting to healthcare providers. Furthermore, manual abstraction of quality metrics and diverse implementation of Electronic Health Record (EHR) systems deepens the complexity of consistent, valid, explicit, and comparable quality measurement reporting within healthcare provider organizations.
The main objective of this research is to evaluate an ontology-based information extraction framework to utilize unstructured clinical text for defining and reporting quality of care metrics that are interpretable and comparable across different healthcare institutions.
All clinical transcribed notes (48,835) from 2,085 patients who had undergone surgery in 2011 at MD Anderson Cancer Center were extracted from their EMR system and pre- processed for identification of section headers. Subsequently, all notes were analyzed by MetaMap v2012 and one XML file was generated per each note. XML outputs were converted into Resource Description Framework (RDF) format. We also developed three ontologies: section header ontology from extracted section headers using RDF standard, concept ontology comprising entities representing five quality metrics from SNOMED (Diabetes, Hypertension, Cardiac Surgery, Transient Ischemic Attack, CNS tumor), and a clinical note ontology that represented clinical note elements and their relationships. All ontologies (Web Ontology Language format) and patient notes (RDFs) were imported into a triple store (AllegroGraph?) as classes and instances respectively. SPARQL information retrieval protocol was used for reporting extracted concepts under four settings: base Natural Language Processing (NLP) output, inclusion of concept ontology, exclusion of negated concepts, and inclusion of section header ontology. Existing manual abstraction data from surgical clinical reviewers, on the same set of patients and documents, was considered as the gold standard. Micro-average results of statistical agreement tests on the base NLP output showed an increase from 59%, 81%, and 68% to 74%, 91%, and 82% (Precision, Recall, F-Measure) respectively after incremental addition of ontology layers. Our study introduced a framework that may contribute to advances in âcomplementaryâ components for the existing information extraction systems. The application of an ontology-based approach for natural language processing in our study has provided mechanisms for increasing the performance of such tools. The pivot point for extracting more meaningful quality metrics from clinical narratives is the abstraction of contextual semantics hidden in the notes. We have defined some of these semantics and quantified them in multiple complementary layers in order to demonstrate the importance and applicability of an ontology-based approach in quality metric extraction. The application of such ontology layers introduces powerful new ways of querying context dependent entities from clinical texts.
Rigorous evaluation is still necessary to ensure the quality of these âcomplementaryâ NLP systems. Moreover, research is needed for creating and updating evaluation guidelines and criteria for assessment of performance and efficiency of ontology-based information extraction in healthcare and to provide a consistent baseline for the purpose of comparing alternative approaches
Evaluating FAIR Digital Object and Linked Data as distributed object systems
FAIR Digital Object (FDO) is an emerging concept that is highlighted by
European Open Science Cloud (EOSC) as a potential candidate for building a
ecosystem of machine-actionable research outputs. In this work we
systematically evaluate FDO and its implementations as a global distributed
object system, by using five different conceptual frameworks that cover
interoperability, middleware, FAIR principles, EOSC requirements and FDO
guidelines themself.
We compare the FDO approach with established Linked Data practices and the
existing Web architecture, and provide a brief history of the Semantic Web
while discussing why these technologies may have been difficult to adopt for
FDO purposes. We conclude with recommendations for both Linked Data and FDO
communities to further their adaptation and alignment.Comment: 40 pages, submitted to PeerJ C
The evaluation of ontologies: quality, reuse and social factors
Finding a âgoodâ or the ârightâ ontology is a growing challenge in the ontology domain, where one of the main aims is to share and reuse existing semantics and knowledge. Before reusing an ontology, knowledge engineers not only have to find a set of appropriate ontologies for their search query, but they should also be able to evaluate those ontologies according to different internal and external criteria. Therefore, ontology evaluation is at the heart of ontology selection and has received a considerable amount of attention in the literature.Despite the importance of ontology evaluation and selection and the widespread research on these topics, there are still many unanswered questions and challenges when it comes to evaluating and selecting ontologies for reuse. Most of the evaluation metrics and frameworks in the literature are mainly based on a limited set of internal characteristics, e.g., content and structure of ontologies and ignore how they are used and evaluated by communities. This thesis aimed to investigate the notion of quality and reusability in the ontology domain and to explore and identify the set of metrics that can affect the process of ontology evaluation and selection for reuse. [Continues.
Focused categorization power of ontologies: General framework and study on simple existential concept expressions
When reusing existing ontologies for publishing a dataset in RDF (or developing a new ontology), preference may be given to those providing extensive subcategorization for important classes (denoted as focus classes). The subcategories may consist not only of named classes but also of compound class expressions. We define the notion of focused categorization power of a given ontology, with respect to a focus class and a concept expression language, as the (estimated) weighted count of the categories that can be built from the ontologyâs signature, conform to the language, and are subsumed by the focus class. For the sake of tractable initial experiments we then formulate a restricted concept expression language based on existential restrictions, and heuristically map it to syntactic patterns over ontology axioms (so-called FCE patterns). The characteristics of the chosen concept expression language and associated FCE patterns are investigated using three different empirical sources derived from ontology collections: first, the concept expression pattern frequency in class definitions; second, the occurrence of FCE patterns in the Tbox of ontologies; and last, for class expressions generated from the Tbox of ontologies (through the FCE patterns); their âmeaningfulnessâ was assessed by different groups of users, yielding a âquality orderingâ of the concept expression patterns. The complementary analyses are then compared and summarized. To allow for further experimentation, a web-based prototype was also implemented, which covers the whole process of ontology reuse from keyword-based ontology search through the FCP computation to the selection of ontologies and their enrichment with new concepts built from compound expressions
Proceedings of the 15th ISWC workshop on Ontology Matching (OM 2020)
15th International Workshop on Ontology Matching co-located with the 19th International Semantic Web Conference (ISWC 2020)International audienc