87 research outputs found

    Designing novel abstraction networks for ontology summarization and quality assurance

    Get PDF
    Biomedical ontologies are complex knowledge representation systems. Biomedical ontologies support interdisciplinary research, interoperability of medical systems, and Electronic Healthcare Record (EHR) encoding. Ontologies represent knowledge using concepts (entities) linked by relationships. Ontologies may contain hundreds of thousands of concepts and millions of relationships. For users, the size and complexity of ontologies make it difficult to comprehend “the big picture” of an ontology\u27s content. For ontology editors, size and complexity make it difficult to uncover errors and inconsistencies. Errors in an ontology will ultimately affect applications that utilize the ontology. In prior studies abstraction networks (AbNs) were developed to provide a compact summary of an ontology\u27s content and structure. AbNs have been shown to successfully support ontology summarization and quality assurance (QA), e.g., for SNOMED CT and NCIt. Despite the success of these previous studies, several major, unaddressed issues affect the applicability and usability of AbNs. This thesis is broken into five major parts, each addressing one issue. The first part of this dissertation addresses the scalability of AbN-based QA techniques to large SNOMED CT hierarchies. Previous studies focused on relatively small hierarchies. The QA techniques developed for these small hierarchies do not scale to large hierarchies, e.g., Procedure and Clinical finding. A new type of AbN, called a subtaxonomy, is introduced to address this problem. Subtaxonomies summarize a subset of an ontology\u27s content. Several types of subtaxonomies and subtaxonomy-based QA studies are discussed. The second part of this dissertation addresses the need for summarization and QA methods for the twelve SNOMED CT hierarchies with no lateral relationships. Previously developed SNOMED CT AbN derivation methodologies, which require lateral relationships, cannot be applied to these hierarchies. The Tribal Abstraction Network (TAN) is a new type of AbN derived using only hierarchical relationships. A TAN-based QA methodology is introduced and the results of a QA review of the Observable entity hierarchy are reported. The third part focuses on the development of generic AbN derivation methods that are applicable to groups of structurally similar ontologies, e.g., those developed in the Web Ontology Language (OWL) format. Previously, AbN derivation techniques were applicable to only a single ontology at a time. AbNs that are applicable to many OWL ontologies are introduced, a preliminary study on OWL AbN granularity is reported on, and the results of several QA studies are presented. The fourth part describes Diff Abstraction Networks, which summarize and visualize the structural differences between two ontology releases. Diff Area Taxonomy and Diff Partial-area Taxonomy derivation methodologies are introduced and Diff Partial-area taxonomies are derived for three OWL ontologies. The Diff Abstraction Network approach is compared to the traditional ontology diff approach. Lastly, tools for deriving and visualizing AbNs are described. The Biomedical Layout Utility Framework is introduced to support the automatic creation, visualization, and exploration of abstraction networks for SNOMED CT and OWL ontologies

    XML: aplicações e tecnologias associadas: 6th National Conference

    Get PDF
    This volume contains the papers presented at the Sixth Portuguese XML Conference, called XATA (XML, Aplicações e Tecnologias Associadas), held in Évora, Portugal, 14-15 February, 2008. The conference followed on from a successful series held throughout Portugal in the last years: XATA2003 was held in Braga, XATA2004 was held in Porto, XATA2005 was held in Braga, XATA2006 was held in Portalegre and XATA2007 was held in Lisboa. Dued to research evaluation criteria that are being used to evaluate researchers and research centers national conferences are becoming deserted. Many did not manage to gather enough submissions to proceed in this scenario. XATA made it through. However with a large decrease in the number of submissions. In this edition a special meeting will join the steering committee with some interested attendees to discuss XATA's future: internationalization, conference model, ... We think XATA is important in the national context. It has succeeded in gathering and identifying a comunity that shares the same research interests and has promoted some colaborations. We want to keep "the wheel spinning"... This edition has its program distributed by first day's afternoon and next day's morning. This way we are facilitating travel arrangements and we will have one night to meet

    Emerging semantics to link phenotype and environment

    Get PDF
    abstract: Understanding the interplay between environmental conditions and phenotypes is a fundamental goal of biology. Unfortunately, data that include observations on phenotype and environment are highly heterogeneous and thus difficult to find and integrate. One approach that is likely to improve the status quo involves the use of ontologies to standardize and link data about phenotypes and environments. Specifying and linking data through ontologies will allow researchers to increase the scope and flexibility of large-scale analyses aided by modern computing methods. Investments in this area would advance diverse fields such as ecology, phylogenetics, and conservation biology. While several biological ontologies are well-developed, using them to link phenotypes and environments is rare because of gaps in ontological coverage and limits to interoperability among ontologies and disciplines. In this manuscript, we present (1) use cases from diverse disciplines to illustrate questions that could be answered more efficiently using a robust linkage between phenotypes and environments, (2) two proof-of-concept analyses that show the value of linking phenotypes to environments in fishes and amphibians, and (3) two proposed example data models for linking phenotypes and environments using the extensible observation ontology (OBOE) and the Biological Collections Ontology (BCO); these provide a starting point for the development of a data model linking phenotypes and environments.The final version of this article, as published in PeerJ, can be viewed online at: https://peerj.com/articles/1470

    Semantic Publishing: issues, solutions and new trends in scholarly publishing within the Semantic Web era

    Get PDF
    This work is concerned with the increasing relationships between two distinct multidisciplinary research fields, Semantic Web technologies and scholarly publishing, that in this context converge into one precise research topic: Semantic Publishing. In the spirit of the original aim of Semantic Publishing, i.e. the improvement of scientific communication by means of semantic technologies, this thesis proposes theories, formalisms and applications for opening up semantic publishing to an effective interaction between scholarly documents (e.g., journal articles) and their related semantic and formal descriptions. In fact, the main aim of this work is to increase the users' comprehension of documents and to allow document enrichment, discovery and linkage to document-related resources and contexts, such as other articles and raw scientific data. In order to achieve these goals, this thesis investigates and proposes solutions for three of the main issues that semantic publishing promises to address, namely: the need of tools for linking document text to a formal representation of its meaning, the lack of complete metadata schemas for describing documents according to the publishing vocabulary, and absence of effective user interfaces for easily acting on semantic publishing models and theories

    Sustainability of systems interoperability in dynamic business networks

    Get PDF
    Dissertação para obtenção do Grau de Doutor em Engenharia Electrotécnica e de ComputadoresCollaborative networked environments emerged with the spread of the internet, contributing to overcome past communication barriers, and identifying interoperability as an essential property to support businesses development. When achieved seamlessly, efficiency is increased in the entire product life cycle support. However, due to the different sources of knowledge, models and semantics, enterprise organisations are experiencing difficulties exchanging critical information, even when they operate in the same business environments. To solve this issue, most of them try to attain interoperability by establishing peer-to-peer mappings with different business partners, or use neutral data and product standards as the core for information sharing, in optimized networks. In current industrial practice, the model mappings that regulate enterprise communications are only defined once, and most of them are hardcoded in the information systems. This solution has been effective and sufficient for static environments, where enterprise and product models are valid for decades. However, more and more enterprise systems are becoming dynamic, adapting and looking forward to meet further requirements; a trend that is causing new interoperability disturbances and efficiency reduction on existing partnerships. Enterprise Interoperability (EI) is a well established area of applied research, studying these problems, and proposing novel approaches and solutions. This PhD work contributes to that research considering enterprises as complex and adaptive systems, swayed to factors that are making interoperability difficult to sustain over time. The analysis of complexity as a neighbouring scientific domain, in which features of interoperability can be identified and evaluated as a benchmark for developing a new foundation of EI, is here proposed. This approach envisages at drawing concepts from complexity science to analyse dynamic enterprise networks and proposes a framework for sustaining systems interoperability, enabling different organisations to evolve at their own pace, answering the upcoming requirements but minimizing the negative impact these changes can have on their business environment

    Biological knowledge management and gene network analysis: a heuristic road to System Biology

    Get PDF
    In order to understand the molecular basis of living cells and organisms, biologists over the past decades have been studying life's core molecular players: the genes. Most genes have a specific function, a role they play in the collective task of developing a cell and supporting all the aspects of keeping it alive. These genes do not perform their function randomly. Instead, after billions of years of evolution, nature's trial-and-error process, they have become parts of an utterly complex and intricate network, an interconnected mesh of genes that comprises signal detection cascades, enzymatic reactions, control mechanisms, etc. Over several past decades, experimental molecular biologists have sought mainly to study these genes via a one-by-one approach. However, with the advent of high-throughput experimental techniques, the number-crunching power of computers, and the realisation that many biological functions are the result of interactions between genes or their proteins, Biology's related field of Systems Biology has emerged. Here, one tries to combine the dispersed information produced by many researchers, in integrated assemblies called gene networks. Our research comprises the development of two new methods for improved information integration in the field of molecular Systems Biology. The first one aims to support an approach to acquire insights in the dynamics of gene networks (the behaviour of gene activities over time), called 'modelling and simulation' of genetic regulatory networks. Our second new method approaches the problem of how to collect and manage the information necessary to compose such genetic networks in the first place, based on scattered information in a dispersed and increasingly fast growing body of publications. These two methods form two separate parts in this thesis (chapters 2-4, and chapters 5-7). Chapter 1, section 1.3 provides an introductory, complete overview of this thesis. It is intended as a light introduction to my doctoral research, presented in an informal and entertaining way, and mainly addressed to my friends and family. It forms an introduction for the laymen to our work and the concepts that are important for this thesis. Chapters 2, 3 and 4 constitute Part 1 of this thesis. Chapter 2 gives a review of the various formalisms for modelling and simulation of gene networks, as a thorough background for our work presented in the following chapter. Chapter 3 describes SIM-plex, our new software tool that forms a bridge between a mathematical gene network modelling formalism, and the biologist, who usually is more an expert in the biology behind the gene network than a mathematician can ever be. It shields off the mathematics in a new way so as to enable biologists to experiment with modelling and simulation themselves. Chapter 4 describes the various applications that SIM-plex was used for. The research described in Part 2 of this thesis, chapters 5, 6 and 7, emerged from our own need for a better management of biological information. We experienced this necessity while we were building a larger genetic network for the Arabidopsis cell cycle, and it forms a general problem in biology. Chapter 5 gives a background of the currently existing methods for harvesting literature information, but comes to the conclusion that no existing automated or manual method displays sufficient potential to capture the largest part of information from literature in a structured way. In chapter 6, we describe our bold proposal of a new method to tackle this problem: MineMap, a community-based manual text-curation initiative. We describe the various aspects required to make such a project possible, based on our own experiences with our prototype application MineMap. This research is organised in a 'heuristic' way, in the sense that we built a first sketch and a working solution that also generated experiences for improvements in a next design. While chapter 6 describes our new ideas and concrete implementations in considerable detail, chapter 7 then illustrates the core concept behind MineMap

    Combining SOA and BPM Technologies for Cross-System Process Automation

    Get PDF
    This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing, custom-built, solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed solution. This includes a general approach, consisting of four distinct steps, as well as specific action items that are to be performed for every step. The discussion also covers language and tool support and challenges arising from the transformation

    Semantic adaptability for the systems interoperability

    Get PDF
    In the current global and competitive business context, it is essential that enterprises adapt their knowledge resources in order to smoothly interact and collaborate with others. However, due to the existent multiculturalism of people and enterprises, there are different representation views of business processes or products, even inside a same domain. Consequently, one of the main problems found in the interoperability between enterprise systems and applications is related to semantics. The integration and sharing of enterprises knowledge to build a common lexicon, plays an important role to the semantic adaptability of the information systems. The author proposes a framework to support the development of systems to manage dynamic semantic adaptability resolution. It allows different organisations to participate in a common knowledge base building, letting at the same time maintain their own views of the domain, without compromising the integration between them. Thus, systems are able to be aware of new knowledge, and have the capacity to learn from it and to manage its semantic interoperability in a dynamic and adaptable way. The author endorses the vision that in the near future, the semantic adaptability skills of the enterprise systems will be the booster to enterprises collaboration and the appearance of new business opportunities
    corecore