
    Application of Semantics to Solve Problems in Life Sciences

    Thesis defense date: 10 December 2018. The amount of information generated on the Web has increased in recent years. Most of this information is accessible as text, with humans being the main users of the Web. However, despite all the advances made in the field of natural language processing, computers have difficulty processing this textual information. In this context, there are application domains, such as the Life Sciences, in which large amounts of information are being published as structured data. The analysis of these data is of vital importance not only for the advancement of science but also for progress in healthcare. However, these data are located in different repositories and stored in different formats, which makes their integration difficult. Here, the Linked Data paradigm has emerged as a technology built on the application of standards proposed by the W3C community, such as HTTP URIs and the RDF and OWL standards. Using this technology, this doctoral thesis was developed to cover the following main objectives: 1) to promote the use of Linked Data by the user community in the Life Sciences; 2) to facilitate the design of SPARQL queries by discovering the model underlying RDF repositories; 3) to create a collaborative environment that facilitates the consumption of Linked Data by end users; 4) to develop an algorithm that automatically discovers the OWL semantic model of an RDF repository; and 5) to develop an OWL representation of ICD-10-CM, called Dione, that offers an automatic methodology for classifying patients' diseases and validating the classification using an OWL reasoner
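    As a hedged illustration of objectives 2 and 4 (discovering the model underlying an RDF repository), the sketch below shows the kind of introspective SPARQL query such an approach automates. The endpoint URL is a placeholder; this is not the thesis's actual algorithm, only a minimal example of probing which classes and properties a repository uses.

```python
# Minimal sketch: probing the model underlying an RDF repository via SPARQL.
# The endpoint URL is hypothetical; the query only illustrates the kind of
# schema introspection the thesis automates.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://example.org/sparql")  # placeholder endpoint
endpoint.setQuery("""
    SELECT ?class ?property (COUNT(*) AS ?uses)
    WHERE {
        ?s a ?class ;
           ?property ?o .
    }
    GROUP BY ?class ?property
    ORDER BY DESC(?uses)
    LIMIT 50
""")
endpoint.setReturnFormat(JSON)

# Each row pairs a class with a property its instances use, ranked by frequency.
for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["class"]["value"], row["property"]["value"], row["uses"]["value"])
```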

    NCBO Ontology Recommender 2.0: An Enhanced Approach for Biomedical Ontology Recommendation

    Biomedical researchers use ontologies to annotate their data with ontology terms, enabling better data integration and interoperability. However, the number, variety and complexity of current biomedical ontologies make it cumbersome for researchers to determine which ones to reuse for their specific needs. To overcome this problem, in 2010 the National Center for Biomedical Ontology (NCBO) released the Ontology Recommender, which is a service that receives a biomedical text corpus or a list of keywords and suggests ontologies appropriate for referencing the indicated terms. We developed a new version of the NCBO Ontology Recommender. Called Ontology Recommender 2.0, it uses a new recommendation approach that evaluates the relevance of an ontology to biomedical text data according to four criteria: (1) the extent to which the ontology covers the input data; (2) the acceptance of the ontology in the biomedical community; (3) the level of detail of the ontology classes that cover the input data; and (4) the specialization of the ontology to the domain of the input data. Our evaluation shows that the enhanced recommender provides higher quality suggestions than the original approach, providing better coverage of the input data, more detailed information about their concepts, increased specialization for the domain of the input data, and greater acceptance and use in the community. In addition, it provides users with more explanatory information, along with suggestions of not only individual ontologies but also groups of ontologies. It also can be customized to fit the needs of different scenarios. Ontology Recommender 2.0 combines the strengths of its predecessor with a range of adjustments and new features that improve its reliability and usefulness. Ontology Recommender 2.0 recommends over 500 biomedical ontologies from the NCBO BioPortal platform, where it is openly available
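    A hedged sketch of querying the Recommender through BioPortal's public REST API follows. The endpoint and authentication scheme follow the BioPortal documentation as I recall it; the response field names are assumptions to verify against http://data.bioontology.org/documentation before use.

```python
# Hedged sketch: querying the NCBO Ontology Recommender via BioPortal's REST API.
# Endpoint, auth header, and response fields are assumed from the public docs;
# verify them before relying on this.
import requests

API_KEY = "your-bioportal-api-key"  # obtained from a free BioPortal account
resp = requests.get(
    "http://data.bioontology.org/recommender",
    params={"input": "melanoma is a malignant tumor of melanocytes"},
    headers={"Authorization": f"apikey token={API_KEY}"},
    timeout=30,
)
resp.raise_for_status()

# Each recommendation is assumed to carry the evaluated ontology set and an
# aggregate score reflecting the four criteria (coverage, acceptance,
# detail, specialization).
for item in resp.json():
    acronyms = [o["acronym"] for o in item["ontologies"]]
    print(acronyms, item["evaluationScore"])
```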

    SIFR BioPortal: An Open and Generic Portal of French Biomedical Ontologies and Terminologies for Semantic Annotation

    Context: The volume of data in biomedicine keeps growing. Despite the wide adoption of English, a significant amount of these data is in French. In the field of data integration, terminologies and ontologies play a central role in structuring biomedical data and making them interoperable. However, while many resources exist in English, there are far fewer ontologies in French, and there is a crucial lack of tools and services to exploit them. This gap contrasts with the considerable amount of biomedical data produced in French, particularly in the clinical world (e.g., electronic health records). Methods & Results: In this article, we present some results of the Semantic Indexing of French Biomedical Resources (SIFR) project, in particular the SIFR BioPortal, an open and generic platform for hosting French biomedical ontologies and terminologies, based on the technology of the National Center for Biomedical Ontology. The portal facilitates the use and dissemination of the domain's ontologies by offering a set of services (search, alignments, metadata, versioning, visualization, recommendation), including for semantic annotation. Indeed, the SIFR Annotator is an ontology-based annotation tool for processing French textual data. A preliminary evaluation shows that the web service achieves results equivalent to those previously reported, while being public, functional, and oriented toward Semantic Web standards. We also present new features of the ontology-based services for English and French
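    Since the SIFR BioPortal is built on the NCBO technology, its Annotator is expected to mirror the NCBO Annotator REST interface. The sketch below is an assumption-laden illustration: the endpoint URL and response fields are taken from the NCBO API conventions and should be checked against the SIFR portal documentation.

```python
# Hedged sketch: annotating French clinical text with the SIFR Annotator.
# The endpoint URL is an assumption based on the SIFR BioPortal deployment of
# the NCBO technology; response fields mirror the NCBO Annotator JSON.
import requests

API_KEY = "your-sifr-bioportal-api-key"
resp = requests.get(
    "http://data.bioportal.lirmm.fr/annotator",  # assumed SIFR endpoint
    params={"text": "Le patient présente une hypertension artérielle sévère."},
    headers={"Authorization": f"apikey token={API_KEY}"},
    timeout=30,
)
resp.raise_for_status()

# Each annotation is assumed to link a matched text span to an ontology class.
for ann in resp.json():
    cls = ann["annotatedClass"]
    print(cls["@id"], [a["text"] for a in ann["annotations"]])
```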

    A Method for Forming Digital Twin Structures of Domain-Oriented Objects in the Space of Open Sources, Based on the Formalisms of Set Theory, Graph Theory, Category Theory, and Chomsky's Theory of Generative Grammars

    For the purposes of this article, open sources can be presented as information posted on the Internet for repeated and unrestricted use in the form of machine-readable, systematized data, in formats that allow their separate automated processing. Any open source is partially structured content, characterized by the fact that it consists of fuzzy overlaps and connections formalized by a number of stable rules. Aim. The purpose of the study is to produce a mathematical description of the rules for decomposing the connections of virtual images of content, based on statistical comparison data, in order to formulate a digital analogue of the verbal structure. The research is accompanied by the development of a modular cross-platform predictive analytics system for distributed open sources of the social digital environment, based on multi-stream data processing technologies. The system rests on the creation of a digital twin prototype that allows monitoring and subsequent analysis of open sources. A cross-platform digital twin can be created for various open sources; currently, social networks are the priority among them for the analysis of the studied data. Materials and methods. When the code content is of a uniform type, it becomes necessary to formalize the rules for categorizing the semantic meanings of the structure of connections of the object of study; these can be described in the language of graph theory, have a suitable structure in Lie matrices, obey the laws of transitivity, and have properties that allow connectivity to be recreated in a quasi-projection. A method is proposed that makes it possible to carry out sequential calculi that do not contradict one another under the basic rules of the axiomatics. Results. The method allows real-time analysis of the data flow from open sources, identification of the digital traces of the objects of study, and identification of the structure of their connections. Conclusion. With the help of the proposed group of algorithmized mathematical iterations, it becomes possible to create a combination of local systems with feedback subsystems for the predictive analytics of open Internet sources and of local systems for various purposes
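    The abstract's graph-theoretic framing (connections that obey transitivity, with connectivity recoverable in a projection) can be illustrated with a small sketch. The data and node names below are invented placeholders, not the paper's method or material.

```python
# Illustrative sketch only, with invented placeholder data: content items
# become nodes, observed links become edges, and transitivity is applied to
# recover the implied connectivity structure.
import networkx as nx

# Observed links between virtual images of content (hypothetical digital traces).
observed_links = [("post_A", "profile_1"), ("profile_1", "group_X"),
                  ("group_X", "post_B")]

g = nx.DiGraph(observed_links)

# Transitive closure: every connection implied by a chain of links becomes an
# explicit edge, approximating the "quasi-projection" the abstract mentions.
closure = nx.transitive_closure(g)
print(sorted(closure.edges()))
# ('post_A', 'group_X') and ('post_A', 'post_B') now appear explicitly.
```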

    Linked Registries: Connecting Rare Diseases Patient Registries through a Semantic Web Layer

    Patient registries are an essential tool for increasing current knowledge of rare diseases. Understanding these data is a vital step toward improving patient treatments and creating the most adequate tools for personalized medicine. However, the growing number of disease-specific patient registries also brings new technical challenges. Usually, these systems are developed as closed data silos, with independent formats and models, lacking comprehensive mechanisms for data sharing. To tackle these challenges, we developed a Semantic Web based solution that connects distributed and heterogeneous registries, enabling the federation of knowledge between multiple independent environments. This semantic layer creates a holistic view over a set of anonymised registries, supporting semantic data representation, integrated access, and querying. The implemented system gave us the opportunity to answer challenging questions across disparate rare disease patient registries. Because the registries are interconnected through Semantic Web technologies, the final solution can query single or multiple instances according to need. The outcome is a unique semantic layer, connecting miscellaneous registries and delivering a lightweight holistic perspective over the wealth of knowledge stemming from linked rare disease patient registries
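    A federated query in the spirit of this semantic layer might look like the sketch below. The registry endpoints and vocabulary (the registry: prefix and its properties) are invented placeholders; only the SPARQL SERVICE federation mechanism itself is standard.

```python
# Hedged sketch of a federated query over two hypothetical registry endpoints.
# Endpoint URLs and the registry: vocabulary are placeholders, not the paper's.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://registry-a.example.org/sparql")  # hypothetical
sparql.setQuery("""
    PREFIX registry: <http://example.org/registry#>
    SELECT ?patient ?disease ?code WHERE {
        ?patient registry:diagnosedWith ?disease .
        # SERVICE federates the pattern out to a second, independent registry.
        SERVICE <http://registry-b.example.org/sparql> {
            ?disease registry:orphaCode ?code .
        }
    }
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["patient"]["value"], row["disease"]["value"], row["code"]["value"])
```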

    Ambient-aware continuous care through semantic context dissemination

    Background: The ultimate ambient-intelligent care room contains numerous sensors and devices to monitor the patient, sense and adjust the environment and support the staff. This sensor-based approach results in a large amount of data, which can be processed by current and future applications, e.g., task management and alerting systems. Today, nurses are responsible for coordinating all these applications and supplied information, which reduces the added value and slows down the adoption rate. The aim of the presented research is the design of a pervasive and scalable framework that is able to optimize continuous care processes by intelligently reasoning on the large amount of heterogeneous care data. Methods: The developed Ontology-based Care Platform (OCarePlatform) consists of modular components that perform a specific reasoning task. Consequently, they can easily be replicated and distributed. Complex reasoning is achieved by combining the results of different components. To ensure that the components only receive information, which is of interest to them at that time, they are able to dynamically generate and register filter rules with a Semantic Communication Bus (SCB). This SCB semantically filters all the heterogeneous care data according to the registered rules by using a continuous care ontology. The SCB can be distributed and a cache can be employed to ensure scalability. Results: A prototype implementation is presented consisting of a new-generation nurse call system supported by a localization and a home automation component. The amount of data that is filtered and the performance of the SCB are evaluated by testing the prototype in a living lab. The delay introduced by processing the filter rules is negligible when 10 or fewer rules are registered. Conclusions: The OCarePlatform allows disseminating relevant care data for the different applications and additionally supports composing complex applications from a set of smaller independent components. This way, the platform significantly reduces the amount of information that needs to be processed by the nurses. The delay resulting from processing the filter rules is linear in the amount of rules. Distributed deployment of the SCB and using a cache allows further improvement of these performance results
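    The register-and-filter pattern the abstract describes can be sketched in a few lines. This toy bus is not the OCarePlatform code: the class, rule shape, and event data are invented to show how components receive only the care data matching their registered rules.

```python
# Toy sketch (not the OCarePlatform implementation) of the SCB pattern:
# components register filter rules with a bus, which forwards only matching
# care data. Rules here are plain predicates over (subject, predicate, object)
# triples rather than ontology-backed semantic filters.
from typing import Callable

Triple = tuple[str, str, str]  # (subject, predicate, object)

class SemanticCommunicationBus:
    def __init__(self) -> None:
        self._rules: list[tuple[Callable[[Triple], bool],
                                Callable[[Triple], None]]] = []

    def register(self, rule: Callable[[Triple], bool],
                 handler: Callable[[Triple], None]) -> None:
        """A component registers a filter rule and a callback for matches."""
        self._rules.append((rule, handler))

    def publish(self, triple: Triple) -> None:
        """Incoming heterogeneous care data is checked against every rule."""
        for rule, handler in self._rules:
            if rule(triple):
                handler(triple)

bus = SemanticCommunicationBus()
# The nurse call component only wants patient call events.
bus.register(lambda t: t[1] == "hasCallStatus",
             lambda t: print("nurse call component received:", t))
bus.publish(("room101:bed1", "hasCallStatus", "urgent"))  # forwarded
bus.publish(("room101:light", "hasBrightness", "40"))     # filtered out
```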

    Structured Representation of Scientific Evidence in the Biomedical Domain Using Semantic Web Techniques

    Background: Accounts of evidence are vital to evaluate and reproduce scientific findings and integrate data on an informed basis. Currently, such accounts are often inadequate, unstandardized and inaccessible for computational knowledge engineering even though computational technologies, among them those of the semantic web, are ever more employed to represent, disseminate and integrate biomedical data and knowledge. Results: We present SEE (Semantic EvidencE), an RDF/OWL based approach for detailed representation of evidence in terms of the argumentative structure of the supporting background for claims, even in complex settings. We derive design principles and identify minimal components for the representation of evidence. We specify the Reasoning and Discourse Ontology (RDO), an OWL representation of the model of scientific claims, their subjects, their provenance and their argumentative relations underlying the SEE approach. We demonstrate the application of SEE and illustrate its design patterns in a case study by providing an expressive account of the evidence for certain claims regarding the isolation of the enzyme glutamine synthetase. Conclusions: SEE is suited to provide coherent and computationally accessible representations of evidence-related information such as the materials, methods, assumptions, reasoning and information sources used to establish a scientific finding by adopting a consistently claim-based perspective on scientific results and their evidence. SEE allows for extensible evidence representations, in which the level of detail can be adjusted and which can be extended as needed. It supports representation of arbitrarily many consecutive layers of interpretation and attribution and different evaluations of the same data. SEE and its underlying model could be a valuable component in a variety of use cases that require careful representation or examination of evidence for data presented on the semantic web or in other formats
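    A claim-based evidence record in the spirit of SEE might be built as below. The namespace and term names are placeholders, not the actual IRIs of the published RDO; only the RDF machinery is real.

```python
# Hedged sketch of a claim-based evidence record in RDF, in the spirit of SEE.
# The rdo: namespace and its terms are placeholders, not the published
# Reasoning and Discourse Ontology IRIs.
from rdflib import Graph, Namespace, Literal, RDF

RDO = Namespace("http://example.org/rdo#")   # placeholder namespace
EX = Namespace("http://example.org/data#")

g = Graph()
g.bind("rdo", RDO)
g.bind("ex", EX)

# A claim, the experiment supporting it, and the provenance of that experiment.
g.add((EX.claim1, RDF.type, RDO.Claim))
g.add((EX.claim1, RDO.statementText,
       Literal("Glutamine synthetase was isolated from sheep brain.")))
g.add((EX.exp1, RDF.type, RDO.Experiment))
g.add((EX.claim1, RDO.isSupportedBy, EX.exp1))      # argumentative relation
g.add((EX.exp1, RDO.describedIn, EX.publication1))  # provenance layer

print(g.serialize(format="turtle"))
```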