9 research outputs found

    Ontology-based instance data validation for high-quality curated biological pathways

    Background: Modeling in systems biology is vital for understanding the complexity of biological systems across scales and predicting system-level behaviors. To obtain high-quality pathway databases, it is essential to improve the efficiency of model validation and model update based on appropriate feedback. Results: We have developed a new rule-based validation method to guide the creation of high-quality biological pathways. Rules are defined to check models against biological semantics and to improve models for dynamic simulation. In this work, we defined 40 rules that constrain event-specific participants and their related features, and that add missing processes based on biological events. The approach is applied to data in the Cell System Ontology, a comprehensive ontology that represents complex biological pathways with dynamics and visualization. The experimental results show that relatively simple rules can efficiently detect errors made during curation, such as misassignment and misuse of ontology concepts and terms in curated models. Conclusions: A new rule-based approach has been developed to facilitate model validation and model complementation. Our rule-based validation, which embeds biological semantics, enables us to provide high-quality curated biological pathways. The approach can serve as a preprocessing step for model integration, exchange, data extraction, and simulation.
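    A minimal sketch of the kind of event-participant rule described above, assuming a hypothetical in-memory pathway model; the paper's 40 CSO rules are not reproduced in the abstract, so the event types and role names below are invented placeholders.

```python
# Hedged sketch of one event-participant validation rule. Event types,
# roles, and the model layout are hypothetical, not the paper's CSO rules.
RULES = {
    # event type -> roles its participants must include
    "phosphorylation": {"kinase", "substrate"},
    "transcription": {"template_DNA", "mRNA_product"},
}

def validate(events):
    """Flag curated events whose participants violate a role constraint,
    e.g. a misassigned ontology term leaving a required role unfilled."""
    errors = []
    for event in events:
        required = RULES.get(event["type"])
        if required is None:
            continue  # no rule defined for this event type
        roles = {p["role"] for p in event["participants"]}
        missing = required - roles
        if missing:
            errors.append((event["id"], sorted(missing)))
    return errors
```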

    ORE - A Tool for Repairing and Enriching Knowledge Bases


    Building high-quality merged ontologies from multiple sources with requirements customization

    Ontologies are the prime way of organizing data in the Semantic Web. Often, it is necessary to combine several independently developed ontologies to obtain a knowledge graph that fully represents a domain of interest. Existing approaches scale rather poorly to the merging of multiple ontologies because they use a binary merge strategy. We therefore investigate the extent to which an n-ary strategy can solve this scalability problem. This thesis makes the following contributions:
    1. Our n-ary merge strategy takes as input a set of source ontologies and their mappings and generates a merged ontology. For efficient processing, rather than successively merging complete ontologies pairwise, we group related concepts across ontologies into partitions and merge first within and then across those partitions (see the sketch after this list).
    2. We take a step towards parameterizable merge methods. We have identified a set of Generic Merge Requirements (GMRs) that merged ontologies might be expected to meet, and have investigated and developed compatibilities of the GMRs using a graph-based method.
    3. When multiple ontologies are merged, inconsistencies can occur due to the different world views encoded in the source ontologies. To address this, we propose a novel Subjective Logic-based method for handling the inconsistencies that arise while merging ontologies. We apply this logic to rank and estimate the trustworthiness of the conflicting axioms that cause inconsistencies within a merged ontology.
    4. To assess the quality of merged ontologies systematically, we provide a comprehensive set of criteria in an evaluation framework. The proposed criteria cover a variety of characteristics of the merged ontology in the structural, functional, and usability dimensions.
    5. The final contribution of this research is the CoMerger tool, which implements all of the aforementioned aspects behind a unified interface.
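    As a rough illustration of the partition step in contribution 1, the sketch below groups concepts that are connected through equivalence mappings using a union-find pass. The grouping criteria actually used by CoMerger are richer than this, so treat it as an assumption-laden simplification.

```python
# Hedged sketch of the partition step behind an n-ary merge: mappings are
# assumed to be equivalence pairs of concept IDs across source ontologies.
from collections import defaultdict

def partitions(concepts, mappings):
    """Union-find over mapping pairs: each group of concepts connected by
    mappings becomes one merge unit, instead of merging ontologies pairwise."""
    parent = {c: c for c in concepts}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in mappings:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for c in concepts:
        groups[find(c)].append(c)
    return list(groups.values())

# Usage: partitions({"A1", "A2", "B1", "B2"}, [("A1", "B1")])
# -> [["A1", "B1"], ["A2"], ["B2"]]
```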

    Framework para suporte à evolução de ontologias biomédicas (A framework to support the evolution of biomedical ontologies)

    Biomedical ontologies grow continuously and therefore need to be refined throughout their life cycle. In biomedical ontologies, small changes can cause unexpected effects on large parts of the ontology, which can be explained by their high complexity and by the fact that they are, in general, larger than ontologies in other domains. The process of refining ontologies so that they adapt to changes in their knowledge sources is called ontology evolution. In this context, the present work proposes a conceptual framework to assist developers of biomedical ontologies during the evolution process. Its main objective is to offer validated suggestions of activities and tools to support the evolution of biomedical ontologies. The framework is divided into five comprehensive, well-defined phases: Evolution Planning, Implementation of Changes, Detection of Changes, Treatment of Inconsistencies, and Audit of Changes. The framework was validated through a case study in which it was applied to an ontology on Chronic Kidney Disease that evolved along three perspectives, based on three reasons for evolution: a change in the focus of the ontology, changes reflecting updates to its domain guidelines, and changes related to the ontology's reference terminology. The consistency of the ontology used in the case study was verified by applying SPARQL queries based on Competency Questions predefined by the ontology's domain expert (a minimal sketch follows below). The main benefit of the framework is that it provides resources to support the whole evolution process, from representing a change to verifying that the evolved ontology conforms to the evolution plan. Based on the results of the experiments, we conclude that the framework kept the evolution process of the ontology organized and understandable. The organization and comprehensiveness provided by the framework are important because they make it easier to retrieve information if a change needs to be redone. The reports produced by the activities that make up the framework's subprocesses, besides feeding the activities that take them as input, can be used to predict changes: recurring changes can be identified, which helps in defining strategies in advance for applying changes and for correcting errors identified during ontology validation. This work was supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).
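    A minimal sketch of that consistency check, assuming a hypothetical CKD namespace and competency question (the thesis's actual queries are not given): a SELECT that returns the individuals violating an expected property.

```python
# Hedged sketch: run a SPARQL query derived from a competency question
# against the evolved ontology. Namespace, class, and property names are
# hypothetical placeholders for the thesis's actual CKD ontology.
from rdflib import Graph

CQ_VIOLATIONS = """
PREFIX ckd: <http://example.org/ckd#>
SELECT ?stage WHERE {
    ?stage a ckd:CKDStage .
    FILTER NOT EXISTS { ?stage ckd:hasGFRRange ?range }
}
"""

def violations(ontology_path):
    """Return CKD stages lacking a GFR range; an empty list means the
    evolved ontology still answers this competency question."""
    g = Graph().parse(ontology_path)
    return [str(row.stage) for row in g.query(CQ_VIOLATIONS)]
```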

    Evolution cohérente des ressources termino-ontologiques et des annotations sémantiques (Consistent evolution of termino-ontological resources and semantic annotations)

    One of the challenges of the Semantic Web is to produce high-quality formal representations that characterize document content, also called semantic annotations. In a dynamic environment, termino-ontological resources and the semantic annotations built from them must be modified regularly and consistently to adapt to the evolution of the domain they relate to and of the annotated document collections. Our contribution is a method to jointly manage the evolution of a termino-ontological resource and of the semantic annotations built with this resource. The method defines the applicable change types (elementary or complex) as well as evolution strategies for both the resource and the annotations. The method is implemented in the EvOnto system, which is integrated into the TextViz platform for ontology-based automatic document annotation developed in the ANR DYNAMO project. The originality of EvOnto is that it adapts to several evolution scenarios, depending on whether the ontology, the document collection, or the annotations are modified, while preserving the consistency between the termino-ontological resource and the semantic annotations. EvOnto supports the ontologist by guiding him interactively to formulate a change request, to assess its impact (additional effects) on the termino-ontological resource and on the semantic annotations, and then to decide on its implementation. It provides the ontologist with the information needed to decide on an evolution knowing its consequences, and to adapt it so as to minimize the negative effects, undesirable impacts, and corresponding costs on the resource itself and on its use in annotations.
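    As a rough illustration of the elementary/complex change types and the impact assessment described above; EvOnto's actual taxonomy and strategies are not detailed in the abstract, so all names here are assumptions.

```python
# Hedged sketch: elementary vs. complex change types and a naive impact
# check over annotations. Illustrative names only, not EvOnto's API.
from dataclasses import dataclass
from enum import Enum, auto

class ChangeKind(Enum):
    ADD_CONCEPT = auto()       # elementary
    DELETE_CONCEPT = auto()    # elementary
    MERGE_CONCEPTS = auto()    # complex: composed of elementary changes

@dataclass
class Change:
    kind: ChangeKind
    target: str  # identifier of the affected concept

def impacted_annotations(change, annotations):
    """Return the semantic annotations that reference the changed concept,
    so the ontologist can review side effects before applying the change."""
    return [a for a in annotations if change.target in a["concepts"]]
```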

    ScaleSem (model checking and the Semantic Web)

    The increasing development of networks, and of the Internet in particular, has considerably widened the gap between heterogeneous information systems. Reviewing the studies on the interoperability of heterogeneous information systems, we find that all work in this area tends toward solving problems of semantic heterogeneity. The W3C (World Wide Web Consortium) proposes standards for representing semantics through ontologies. Ontologies are becoming an indispensable support for the interoperability of information systems, in particular at the semantic level. The structure of an ontology is a combination of concepts, properties, and relations; this combination is also called a semantic graph. Several languages have been developed in the context of the Semantic Web, most of them using XML (eXtensible Markup Language) syntax. OWL (Web Ontology Language) and RDF (Resource Description Framework), both based on XML, are the most important languages of the Semantic Web. RDF is the first W3C standard for enriching resources on the Web with detailed descriptions, and it eases the automatic processing of Web resources. Descriptions may be characteristics of resources, such as the author or the content of a website; such descriptions are metadata. Enriching the Web with metadata enables the development of the so-called Semantic Web. RDF is also used to represent semantic graphs corresponding to a specific knowledge model. RDF files are typically stored in a relational database and manipulated using SQL or derived languages such as SPARQL. This solution is well suited to small RDF graphs, but unfortunately not to large ones. These graphs evolve rapidly, and adapting them to change can introduce inconsistencies. Applying changes while maintaining the consistency of a semantic graph is a crucial task that is costly in terms of time and complexity, so an automated process is essential. For these large RDF graphs, we propose a new approach based on formal verification, namely model checking. Model checking is a verification technique that explores all possible states of a system; in this way, one can show that a model of a given system satisfies a given property. This thesis provides a new method for checking and querying semantic graphs. We propose an approach, named ScaleSem, which transforms semantic graphs into graphs understood by the model checker (the verification tool of the model-checking method). This requires software tools that translate a graph described in one formalism into the same graph (or an adaptation of it) described in another formalism.
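    The abstract does not spell out ScaleSem's translation, but the general idea of mapping an RDF graph onto a model-checker-friendly structure can be sketched as follows, with states taken from RDF nodes and transitions from triples; this is an assumption-laden simplification, not ScaleSem's actual encoding.

```python
# Hedged sketch: map an RDF graph onto a naive transition system.
# States are subject/object nodes, transitions follow predicates, and
# each state is labelled with the predicates leaving it.
from collections import defaultdict
from rdflib import Graph

def rdf_to_transition_system(g: Graph):
    states = set()
    transitions = []           # (source, predicate label, target)
    labels = defaultdict(set)  # state -> outgoing predicate labels
    for s, p, o in g:
        states.add(s)
        states.add(o)
        transitions.append((s, p, o))
        labels[s].add(p)
    return states, transitions, labels
```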

    Knowledge sharing framework for sustainability of knowledge capital

    Knowledge sharing is one of the most critical elements in a knowledge-based society. With the huge concentration on communication facilities, there has been a major shift in worldwide access to codified knowledge. Although communication technologies have made great strides in developing instruments for accessing required knowledge and improving the level of knowledge sharing, many obstacles still diminish the effectiveness of knowledge sharing in an organization or a community. The current challenges include: identifying the most important variables in knowledge sharing, developing an effective knowledge sharing measurement model, developing an effective mechanism for knowledge sharing reporting, and calculating the knowledge capital that knowledge sharing can create. The ability and willingness of individuals to share both their codified and uncodified knowledge have emerged as significant variables in an environment where all people have access to communication instruments and can choose between sharing their own knowledge and keeping it to themselves. This thesis addresses knowledge sharing variables and identifies the key ones as: willingness to share or gain knowledge, ability to share or gain knowledge, and the complexity or transferability of the shared knowledge. Different mechanisms are used to measure these key variables. Trust mechanisms are used to measure the willingness and ability of individuals to share or acquire knowledge: by rating the behavior of the parties engaged in knowledge sharing, one can assign a value to each party's willingness and ability to share or obtain knowledge. Ontology mechanisms are used to measure the complexity and transferability of a given piece of knowledge in the sharing process: the level of similarity between sender and receiver ontologies measures how transferable the knowledge is between them, and the ontology structure measures the complexity of the knowledge transmitted. The knowledge sharing framework provides a measurement model that calculates knowledge sharing levels numerically from these trust and ontology mechanisms (a sketch follows below), and uses a Business Intelligence Simulation Model (BISIM) to simulate a community and report the knowledge sharing level between its members. The simulated model can calculate and report the knowledge sharing and knowledge acquisition levels of each member, in addition to the total knowledge sharing level in the community. Finally, to determine the advantages of knowledge sharing for a community, the capital that knowledge sharing can create is calculated using intellectual capital measurement mechanisms. The created capital is knowledge-based and relates to the role of knowledge sharing in increasing the embedded knowledge of individuals (human capital), improving connections and embedding knowledge within connections (social capital), and, since market components such as customers play a major role in business, improving the knowledge embedded within market components, defined in this thesis as market capital. All these categories of intellectual capital are measured and reported in the knowledge sharing framework.
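    A minimal sketch of the measurement idea: willingness and ability scores come from trust ratings, and transferability from ontology overlap. The thesis's actual aggregation formula is not given in the abstract, so the simple product over [0, 1] scores below is an assumption.

```python
# Hedged sketch of the knowledge sharing measurement; the real model in
# the framework is richer than this illustrative combination.
def ontology_similarity(sender_concepts: set, receiver_concepts: set) -> float:
    """Jaccard overlap between sender and receiver ontologies, used here
    as a proxy for how transferable a piece of knowledge is."""
    union = len(sender_concepts | receiver_concepts)
    return len(sender_concepts & receiver_concepts) / union if union else 0.0

def sharing_level(willingness: float, ability: float,
                  sender_concepts: set, receiver_concepts: set) -> float:
    """Combine the three key variables (trust-derived willingness and
    ability, ontology-derived transferability) into one [0, 1] score."""
    return willingness * ability * ontology_similarity(
        sender_concepts, receiver_concepts)
```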

    Semantic resources in pharmacovigilance: a corpus and an ontology for drug-drug interactions

    Nowadays, with the increasing use of several drugs for the treatment of one or more diseases (polytherapy) in large populations, the risk of drug combinations that have not been studied in pre-authorization clinical trials has increased. This provides a favourable setting for the occurrence of drug-drug interactions (DDIs), a common adverse drug reaction (ADR) that represents an important risk to patient safety and an increase in healthcare costs. Their early detection is therefore a main concern in the clinical setting. Although different databases support healthcare professionals in the detection of DDIs, the quality of these databases is very uneven and the consistency of their content is limited. Furthermore, these databases do not scale well to the large and growing body of pharmacovigilance literature of recent years. In addition, large amounts of current and valuable information are hidden in published articles, scientific journals, books, and technical reports. The large number of DDI information sources has thus overwhelmed most healthcare professionals, because it is not possible to stay up to date on everything published about DDIs. Computational methods can play a key role in the identification, explanation, and prediction of DDIs on a large scale, since they can be used to collect, analyze, and manipulate large amounts of biological and pharmacological data. Natural language processing (NLP) techniques can be used to retrieve and extract DDI information from pharmacological texts, supporting researchers and healthcare professionals in the challenging task of searching for DDI information across different and heterogeneous sources. However, these methods rely on the availability of specific resources providing the domain knowledge, such as databases, terminological vocabularies, corpora, and ontologies, which are necessary to address Information Extraction (IE) tasks. In this thesis, we have developed two semantic resources for the DDI domain that make an important contribution to the research and development of IE systems for DDIs. We reviewed and analyzed the existing corpora and ontologies relevant to this domain and, based on their strengths and weaknesses, developed the DDI corpus and an ontology for drug-drug interactions named DINTO. The DDI corpus has proven to fulfil the characteristics of a high-quality gold standard and has demonstrated its usefulness as a benchmark for training and testing different IE systems in the SemEval-2013 DDIExtraction shared task. DINTO, in turn, has been used and evaluated in two applications. First, it has been shown that the knowledge represented in the ontology can be used to infer DDIs and their different mechanisms. Second, we have provided a proof of concept of DINTO's contribution to NLP by supplying the domain knowledge exploited by a pilot IE prototype. From these results, we believe that these two semantic resources will encourage further research into the application of computational methods to the early detection of DDIs. This work has been partially supported by the Regional Government of Madrid under the Research Network MA2VICMR [S2009/TIC-1542], by the Spanish Ministry of Education under the project MULTIMEDICA [TIN2010-20644-C03-01], and by the European Commission Seventh Framework Programme under the TrendMiner project [FP7-ICT287863].
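    The ontology-based inference mentioned above can be illustrated with a small sketch that flags a candidate DDI when one drug inhibits an enzyme that metabolizes another; the vocabulary below is a hypothetical stand-in, not DINTO's actual property names.

```python
# Hedged sketch: infer candidate pharmacokinetic DDIs from a
# mechanism-level ontology. The ex: terms are invented placeholders.
from rdflib import Graph

CANDIDATE_DDIS = """
PREFIX ex: <http://example.org/ddi#>
SELECT ?drugA ?drugB ?enzyme WHERE {
    ?drugA ex:inhibits ?enzyme .
    ?drugB ex:metabolizedBy ?enzyme .
    FILTER (?drugA != ?drugB)
}
"""

def candidate_ddis(ontology_path):
    """Return (drugA, drugB, enzyme) triples where drugA inhibits the
    enzyme that clears drugB, a classic interaction mechanism."""
    g = Graph().parse(ontology_path)
    return [(str(a), str(b), str(e)) for a, b, e in g.query(CANDIDATE_DDIS)]
```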