51 research outputs found
Application of Semantics to Solve Problems in Life Sciences
Thesis defense date: 10 December 2018. The amount of information generated on the Web has increased in recent years. Most of this information is accessible as text, with humans being the main users of the Web. However, despite all the advances in natural language processing, computers still have difficulty processing this textual information. At the same time, there are application domains, such as the Life Sciences, in which large amounts of information are being published as structured data. Analyzing these data is vitally important not only for the advancement of science but also for producing advances in healthcare. However, these data are located in different repositories and stored in different formats, which makes their integration difficult. In this context, the Linked Data paradigm has emerged as a technology built on standards proposed by the W3C community, such as HTTP URIs and the RDF and OWL standards. Building on this technology, this doctoral thesis pursues the following main objectives: 1) to promote the use of Linked Data by the Life Sciences user community; 2) to facilitate the design of SPARQL queries by discovering the underlying model of RDF repositories; 3) to create a collaborative environment that makes it easier for end users to consume Linked Data; 4) to develop an algorithm that automatically discovers the OWL semantic model of an RDF repository; and 5) to develop an OWL representation of ICD-10-CM, called Dione, that offers an automatic methodology for classifying patients' diseases and subsequently validating the classification with an OWL reasoner
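The thesis's model-discovery algorithm is not described in the abstract; the following is only a minimal sketch of the underlying idea it names: inspecting the triples of an RDF repository to recover the classes and the properties their instances use, which can then guide SPARQL query design. The triple data, prefixes, and function name are invented for illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

def discover_schema(triples):
    """Infer a lightweight schema (each class and the properties used by
    its instances) from (subject, predicate, object) triples, as a
    model-discovery tool for an RDF repository might do."""
    types = {}                       # subject -> class
    class_props = defaultdict(set)   # class -> properties used
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s] = o
    for s, p, o in triples:
        if p != RDF_TYPE and s in types:
            class_props[types[s]].add(p)
    return dict(class_props)

# Toy repository: two diseases described with illustrative predicates.
triples = [
    (":d1", RDF_TYPE, ":Disease"),
    (":d1", ":hasCode", "E11"),
    (":d1", ":label", "Type 2 diabetes"),
    (":d2", RDF_TYPE, ":Disease"),
    (":d2", ":hasCode", "I10"),
]

schema = discover_schema(triples)
print({cls: sorted(props) for cls, props in schema.items()})
# → {':Disease': [':hasCode', ':label']}
```

Knowing that `:Disease` instances carry `:hasCode` and `:label` is exactly what a user needs to write a well-formed SPARQL query against an unfamiliar repository.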
NCBO Ontology Recommender 2.0: An Enhanced Approach for Biomedical Ontology Recommendation
Biomedical researchers use ontologies to annotate their data with ontology
terms, enabling better data integration and interoperability. However, the
number, variety and complexity of current biomedical ontologies make it
cumbersome for researchers to determine which ones to reuse for their specific
needs. To overcome this problem, in 2010 the National Center for Biomedical
Ontology (NCBO) released the Ontology Recommender, which is a service that
receives a biomedical text corpus or a list of keywords and suggests ontologies
appropriate for referencing the indicated terms. We developed a new version of
the NCBO Ontology Recommender. Called Ontology Recommender 2.0, it uses a new
recommendation approach that evaluates the relevance of an ontology to
biomedical text data according to four criteria: (1) the extent to which the
ontology covers the input data; (2) the acceptance of the ontology in the
biomedical community; (3) the level of detail of the ontology classes that
cover the input data; and (4) the specialization of the ontology to the domain
of the input data. Our evaluation shows that the enhanced recommender provides
higher quality suggestions than the original approach, providing better
coverage of the input data, more detailed information about their concepts,
increased specialization for the domain of the input data, and greater
acceptance and use in the community. In addition, it provides users with more
explanatory information, along with suggestions of not only individual
ontologies but also groups of ontologies. It also can be customized to fit the
needs of different scenarios. Ontology Recommender 2.0 combines the strengths
of its predecessor with a range of adjustments and new features that improve
its reliability and usefulness. Ontology Recommender 2.0 recommends over 500
biomedical ontologies from the NCBO BioPortal platform, where it is openly
available.
Comment: 29 pages, 8 figures, 11 tables
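The abstract does not give the recommender's actual scoring formulas; purely as an illustration, a recommender of this kind can aggregate the four criteria into one relevance score with configurable weights, which also shows how it could be "customized to fit the needs of different scenarios". The weights and criterion values below are invented.

```python
def recommend(ontologies, weights):
    """Rank candidate ontologies by a weighted sum of four criterion
    scores in [0, 1]: coverage of the input data, community acceptance,
    level of detail, and domain specialization (hypothetical weighting)."""
    def score(criteria):
        return sum(weights[k] * criteria[k] for k in weights)
    return sorted(ontologies, key=lambda o: score(o["criteria"]), reverse=True)

# Invented example scores for three candidate ontologies.
candidates = [
    {"name": "ONT-A", "criteria": {"coverage": 0.9, "acceptance": 0.4,
                                   "detail": 0.6, "specialization": 0.7}},
    {"name": "ONT-B", "criteria": {"coverage": 0.5, "acceptance": 0.9,
                                   "detail": 0.8, "specialization": 0.3}},
    {"name": "ONT-C", "criteria": {"coverage": 0.7, "acceptance": 0.7,
                                   "detail": 0.7, "specialization": 0.7}},
]
# A coverage-heavy configuration; other scenarios could reweight.
weights = {"coverage": 0.55, "acceptance": 0.15,
           "detail": 0.15, "specialization": 0.15}

ranking = [o["name"] for o in recommend(candidates, weights)]
print(ranking)  # → ['ONT-A', 'ONT-C', 'ONT-B']
```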
SIFR BioPortal : Un portail ouvert et générique d’ontologies et de terminologies biomédicales françaises au service de l’annotation sémantique
National audience. Context – The volume of biomedical data keeps growing. Despite the wide adoption of English, a significant amount of these data is in French. In the field of data integration, terminologies and ontologies play a central role in structuring biomedical data and making them interoperable. However, in contrast to the many resources available in English, there are far fewer ontologies in French, and there is a crucial lack of tools and services to exploit them. This gap contrasts with the considerable amount of biomedical data produced in French, particularly in the clinical world (e.g., electronic health records). Methods & Results – In this article, we present some results of the Semantic Indexing of French Biomedical Resources (SIFR) project, in particular the SIFR BioPortal, an open and generic platform for hosting French biomedical ontologies and terminologies, based on the technology of the National Center for Biomedical Ontology. The portal facilitates the use and dissemination of domain ontologies by offering a set of services (search, mappings, metadata, versioning, visualization, recommendation), including services for semantic annotation. Indeed, the SIFR Annotator is an ontology-based annotation tool for processing French textual data. A preliminary evaluation shows that the web service obtains results equivalent to those previously reported, while being public, functional, and aligned with Semantic Web standards. We also present new ontology-based service features for English and French
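The SIFR Annotator's actual implementation is not detailed in the abstract; the sketch below only illustrates the general principle of ontology-based annotation it relies on: recognizing ontology labels in free text and returning the matched concepts with their positions. The mini-ontology, concept IRIs, and sentence are invented.

```python
import re

# Invented mini-ontology: concept IRI -> French labels.
ONTOLOGY = {
    "onto:Diabete": ["diabète", "diabète de type 2"],
    "onto:Hypertension": ["hypertension", "hypertension artérielle"],
}

def annotate(text):
    """Return (concept, matched label, start offset) for every ontology
    label found in the text, preferring longer labels at the same spot."""
    hits = []
    lowered = text.lower()
    for concept, labels in ONTOLOGY.items():
        for label in sorted(labels, key=len, reverse=True):
            for m in re.finditer(re.escape(label), lowered):
                # Skip spans already covered by a longer label of this concept.
                covered = any(c == concept and s <= m.start() < s + len(l)
                              for c, l, s in hits)
                if not covered:
                    hits.append((concept, label, m.start()))
    return sorted(hits, key=lambda h: h[2])

annotations = annotate("Patient suivi pour diabète de type 2 et hypertension artérielle.")
for a in annotations:
    print(a)
```

A production annotator additionally handles lemmatization, accents, and overlap resolution across concepts; this sketch only shows the dictionary-matching core.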
A Method for Forming Digital Twin Structures of Domain-Oriented Objects in the Open-Source Space Based on the Formalisms of Set Theory, Graph Theory, Category Theory, and Chomsky's Theory of Generative Languages
For the purposes of this article, open sources can be regarded as information posted on the Internet for multiple and unlimited use in the form of machine-readable, systematized data, in formats that allow their separate automated processing. Any open source is partially structured content, characterized by fuzzy overlaps and links formalized by a number of stable rules.
Aim. The purpose of the study is to produce a mathematical description of the rules for decomposing the links of virtual images of content, based on statistical comparison data, in order to formulate a digital analogue of the verbal structure. The research is accompanied by the development of a modular, cross-platform predictive-analytics system for distributed open sources of the social digital environment, based on multi-stream data-processing technologies. The system is built on a prototype digital twin that allows monitoring and subsequent analysis of open sources. A cross-platform digital twin can be created for various open sources; currently, social networks are the priority among them for analyzing the data under study.
Materials and methods. For content of the same code type, it becomes necessary to formalize the rules for categorizing the semantic meanings of the link structure of the object of study. These links can be described in the language of graph theory, have a suitable structure in Lie matrices, obey the laws of transitivity, and have properties that allow connectivity to be recreated in a quasi-projection. A method is proposed that makes it possible to produce sequential calculi that do not contradict each other under the basic rules of the axiomatics.
Results. The method allows real-time analysis of the data flow of open sources, identification of the digital traces of research objects, and identification of the structure of their links.
Conclusion. With the help of the proposed group of algorithmized mathematical iterations, it becomes possible to create a combination of local systems with feedback subsystems for the predictive analytics of open Internet sources and local systems for various purposes
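The article's formalism is only summarized above; as a toy illustration of the transitivity law it invokes for link structures, the following computes the transitive closure of a directed link graph, i.e., which objects are reachable from which through chains of links. The graph and node names are invented.

```python
def transitive_closure(edges):
    """Compute the reachability relation of a directed graph given as a
    set of (source, target) links, by repeatedly applying transitivity:
    if (a, b) and (b, c) are links, then (a, c) is in the closure."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Invented link graph between three open-source profiles.
links = {("profile1", "profile2"), ("profile2", "profile3")}
print(sorted(transitive_closure(links)))
# → [('profile1', 'profile2'), ('profile1', 'profile3'), ('profile2', 'profile3')]
```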
Linked Registries: Connecting Rare Diseases Patient Registries through a Semantic Web Layer
Patient registries are an essential tool to increase current knowledge
regarding rare diseases. Understanding these data is a vital step to improve
patient treatments and to create the most adequate tools for personalized
medicine. However, the growing number of disease-specific patient registries
also brings new technical challenges. Usually, these systems are developed as
closed data silos, with independent formats and models, lacking comprehensive
mechanisms to enable data sharing. To tackle these challenges, we developed a
Semantic Web based solution that allows connecting distributed and
heterogeneous registries, enabling the federation of knowledge between
multiple independent environments. This semantic layer creates a holistic view
over a set of anonymised registries, supporting semantic data representation,
integrated access, and querying. The implemented system gave us the
opportunity to answer challenging questions across dispersed rare disease
patient registries. Interconnecting those registries with Semantic Web
technologies allows our solution to query single or multiple instances
according to our needs. The outcome is a
unique semantic layer, connecting miscellaneous registries and delivering a
lightweight holistic perspective over the wealth of knowledge stemming from
linked rare disease patient registries
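The implemented system is only described at a high level; the sketch below illustrates the federation idea in plain Python: the same query is dispatched to several independent registry "endpoints" and the results are merged into one holistic answer. In practice this would be a federated SPARQL query over the semantic layer; the registry contents here are invented.

```python
# Invented contents of two independent rare-disease registries.
registry_a = [
    {"patient": "A-001", "disease": "Fabry disease", "country": "PT"},
    {"patient": "A-002", "disease": "Pompe disease", "country": "PT"},
]
registry_b = [
    {"patient": "B-104", "disease": "Fabry disease", "country": "NL"},
]

def federated_query(registries, predicate):
    """Run the same selection against every registry and merge the
    results, mimicking a federated query over a semantic layer."""
    return [row for reg in registries for row in reg if predicate(row)]

fabry = federated_query([registry_a, registry_b],
                        lambda r: r["disease"] == "Fabry disease")
print([r["patient"] for r in fabry])  # → ['A-001', 'B-104']
```

The point of the semantic layer is that both registries answer the same question even though, internally, they may store their data in different formats and models.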
Ambient-aware continuous care through semantic context dissemination
Background: The ultimate ambient-intelligent care room contains numerous sensors and devices to monitor the patient, sense and adjust the environment and support the staff. This sensor-based approach results in a large amount of data, which can be processed by current and future applications, e.g., task management and alerting systems. Today, nurses are responsible for coordinating all these applications and supplied information, which reduces the added value and slows down the adoption rate. The aim of the presented research is the design of a pervasive and scalable framework that is able to optimize continuous care processes by intelligently reasoning on the large amount of heterogeneous care data.
Methods: The developed Ontology-based Care Platform (OCarePlatform) consists of modular components that perform a specific reasoning task. Consequently, they can easily be replicated and distributed. Complex reasoning is achieved by combining the results of different components. To ensure that the components only receive information, which is of interest to them at that time, they are able to dynamically generate and register filter rules with a Semantic Communication Bus (SCB). This SCB semantically filters all the heterogeneous care data according to the registered rules by using a continuous care ontology. The SCB can be distributed and a cache can be employed to ensure scalability.
Results: A prototype implementation is presented consisting of a new-generation nurse call system supported by a localization and a home automation component. The amount of data that is filtered and the performance of the SCB are evaluated by testing the prototype in a living lab. The delay introduced by processing the filter rules is negligible when 10 or fewer rules are registered.
Conclusions: The OCarePlatform allows disseminating relevant care data for the different applications and additionally supports composing complex applications from a set of smaller independent components. This way, the platform significantly reduces the amount of information that needs to be processed by the nurses. The delay resulting from processing the filter rules is linear in the amount of rules. Distributed deployment of the SCB and using a cache allows further improvement of these performance results
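The paper's SCB filters heterogeneous care data against dynamically registered rules using a continuous care ontology; the toy sketch below captures only the register-and-filter mechanism, with ontology-based matching reduced to simple predicates. The component names and the event are invented.

```python
class SemanticBus:
    """Minimal publish/filter bus: components register filter rules
    (predicates) and only receive events matching one of their rules."""
    def __init__(self):
        self.rules = {}  # component name -> list of predicates

    def register(self, component, rule):
        self.rules.setdefault(component, []).append(rule)

    def publish(self, event):
        """Return the components a matching event is delivered to."""
        return [c for c, rules in self.rules.items()
                if any(rule(event) for rule in rules)]

bus = SemanticBus()
# The nurse-call component only cares about call-button events.
bus.register("nurse_call", lambda e: e["type"] == "call_button")
# The localization component cares about any event carrying a room.
bus.register("localization", lambda e: "room" in e)

delivered = bus.publish({"type": "call_button", "room": "204"})
print(delivered)  # → ['nurse_call', 'localization']
```

Because each component declares what it wants to see, adding or replicating components does not require the others to change, which is the scalability property the platform aims for.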
Structured representation of scientific evidence in the biomedical domain using Semantic Web techniques
Background: Accounts of evidence are vital to evaluate and reproduce scientific
findings and integrate data on an informed basis. Currently, such accounts are
often inadequate, unstandardized and inaccessible for computational knowledge
engineering even though computational technologies, among them those of the
semantic web, are ever more employed to represent, disseminate and integrate
biomedical data and knowledge. Results: We present SEE (Semantic EvidencE), an
RDF/OWL based approach for detailed representation of evidence in terms of the
argumentative structure of the supporting background for claims even in
complex settings. We derive design principles and identify minimal components
for the representation of evidence. We specify the Reasoning and Discourse
Ontology (RDO), an OWL representation of the model of scientific claims, their
subjects, their provenance and their argumentative relations underlying the
SEE approach. We demonstrate the application of SEE and illustrate its design
patterns in a case study by providing an expressive account of the evidence
for certain claims regarding the isolation of the enzyme glutamine synthetase.
Conclusions: SEE is suited to provide coherent and computationally accessible
representations of evidence-related information such as the materials,
methods, assumptions, reasoning and information sources used to establish a
scientific finding by adopting a consistently claim-based perspective on
scientific results and their evidence. SEE allows for extensible evidence
representations, in which the level of detail can be adjusted and which can be
extended as needed. It supports representation of arbitrarily many consecutive
layers of interpretation and attribution and different evaluations of the same
data. SEE and its underlying model could be a valuable component in a variety
of use cases that require careful representation or examination of evidence
for data presented on the semantic web or in other formats
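RDO itself is an OWL ontology and is not reproduced in the abstract; the following is only an illustrative encoding, in plain Python triples, of the kind of claim-based structure SEE describes: a claim, its content, its provenance, and an argumentative support relation. All term names are invented, not actual RDO vocabulary.

```python
# Invented triples sketching a claim-based evidence structure.
triples = [
    ("claim:1", "rdf:type", "rdo:Claim"),
    ("claim:1", "rdo:states", "glutamine synthetase was isolated"),
    ("claim:1", "rdo:reportedIn", "source:paper42"),
    ("exp:1", "rdf:type", "rdo:Experiment"),
    ("exp:1", "rdo:supports", "claim:1"),
]

def support_for(claim, triples):
    """Collect everything that argumentatively supports a claim."""
    return [s for s, p, o in triples
            if p == "rdo:supports" and o == claim]

print(support_for("claim:1", triples))  # → ['exp:1']
```

Because support is itself expressed as data, further layers of interpretation (e.g., a claim about an experiment's validity) can be added as more triples, which is the extensibility the abstract emphasizes.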