
    METADATA MANAGEMENT FOR CLINICAL DATA INTEGRATION

    Clinical data have been continuously collected and have grown with the wide adoption of electronic health records (EHR). Clinical data have provided the foundation for state-of-the-art research such as artificial intelligence in medicine. At the same time, it has become a challenge to integrate, access, and explore study-level patient data from large volumes of data in heterogeneous databases. Effective, fine-grained, cross-cohort data exploration and semantically enabled approaches and systems are needed. To build semantically enabled systems, we need to leverage existing terminology systems and ontologies. Numerous ontologies have been developed recently and they play an important role in semantically enabled applications. Because they contain valuable codified knowledge, the management of these ontologies, as metadata, also requires systematic approaches. Moreover, in most clinical settings, patient data are collected with the help of a data dictionary. Knowledge of the relationships between an ontology and a related data dictionary is important for semantic interoperability. Such relationships are represented and maintained by mappings. Mappings store how data source elements and domain ontology concepts are linked, as well as how domain ontology concepts are linked between different ontologies. While mappings are crucial to the maintenance of relationships between an ontology and a related data dictionary, they are commonly captured in CSV files with limited capabilities for sharing, tracking, and visualization. The management of mappings requires an innovative, interactive, and collaborative approach. Metadata management serves to organize data that describes other data. In computer science and information science, an ontology is metadata consisting of the representation, naming, and definition of the hierarchies, properties, and relations between concepts.
A structured, scalable, and computer-understandable way to manage metadata is critical to developing systems with fine-grained data exploration capabilities. This dissertation presents a systematic approach called MetaSphere that uses metadata and ontologies to support the management and integration of clinical research data through an ontology-based metadata management system for multiple domains. MetaSphere is a general framework that aims to manage domain-specific metadata, provide a fine-grained data exploration interface, and store patient data in data warehouses. Moreover, MetaSphere provides a dedicated mapping interface called the Interactive Mapping Interface (IMI) to map a data dictionary to well-recognized, standardized ontologies. MetaSphere has been applied successfully to three domains: sleep (X-search), pressure ulcer injuries and deep tissue pressure (SCIPUDSphere), and cancer. Specifically, MetaSphere stores each domain ontology in structured form in databases. Patient data in the corresponding domains are also stored in databases as data warehouses. MetaSphere provides a powerful query interface that lets researchers interact with actual patient data by composing complex queries to pinpoint specific cohorts within a large amount of patient data. The detailed results are as follows. X-search is publicly available at https://www.x-search.net with nine sleep domain datasets consisting of over 26,000 unique subjects. The canonical data dictionary contains over 900 common data elements across the datasets. X-search has received over 1800 cross-cohort queries by users from 16 countries. SCIPUDSphere has integrated a total of 268,562 records containing 282 ICD9 codes related to pressure ulcer injuries among 36,626 individuals with spinal cord injuries. IMI is publicly available at http://epi-tome.com/.
Using IMI, we have successfully mapped the North American Association of Central Cancer Registries (NAACCR) data dictionary to National Cancer Institute Thesaurus (NCIt) concepts.
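
The abstract describes mappings as links between data dictionary elements and ontology concepts that are commonly kept in CSV files. As a minimal sketch of that idea, and not the MetaSphere or IMI implementation, the Python below loads such a CSV, indexes it by source element, and reports elements that have no mapping yet. The column names, the element names, the `EX:` concept identifiers, and the relation labels are all hypothetical placeholders invented for illustration.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source_element: str  # element name from the data dictionary
    concept_id: str      # identifier of the linked ontology concept
    relation: str        # kind of link, e.g. "exact" or "broader"


# Hypothetical mapping file; the rows are made up, not real NAACCR/NCIt data.
CSV_TEXT = """source_element,concept_id,relation
primary_site,EX:0001,exact
tumor_grade,EX:0002,exact
tumor_grade,EX:0003,broader
"""


def load_mappings(text):
    """Parse CSV text into a list of Mapping records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Mapping(row["source_element"], row["concept_id"], row["relation"])
        for row in reader
    ]


def index_by_element(mappings):
    """Group mappings by data dictionary element, preserving file order."""
    index = defaultdict(list)
    for mapping in mappings:
        index[mapping.source_element].append(mapping)
    return dict(index)


def unmapped(elements, index):
    """Return the elements that have no mapping in the index."""
    return [element for element in elements if element not in index]


if __name__ == "__main__":
    index = index_by_element(load_mappings(CSV_TEXT))
    print([m.concept_id for m in index["tumor_grade"]])
    print(unmapped(["primary_site", "laterality"], index))
```

Holding each mapping as an explicit record, rather than as a bare CSV row, is one way to make the sharing and tracking limitations mentioned above easier to address, for example by adding author or timestamp fields.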

    A Semantics-based User Interface Model for Content Annotation, Authoring and Exploration

    The Semantic Web and Linked Data movements, which aim to create, publish, and interconnect machine-readable information, have gained traction in recent years. However, the majority of information is still contained in and exchanged using unstructured documents, such as Web pages, text documents, images and videos. This also cannot be expected to change, since text, images and videos are the natural way in which humans interact with information. Semantic structuring of content, on the other hand, provides a wide range of advantages compared to unstructured information. Semantically enriched documents facilitate information search and retrieval, presentation, integration, reusability, interoperability and personalization. Looking at the life-cycle of semantic content on the Web of Data, we see considerable progress on the backend side in storing structured content and in linking data and schemata. Nevertheless, from our point of view, the currently least developed aspect of the semantic content life-cycle is the user-friendly manual and semi-automatic creation of rich semantic content. In this thesis, we propose a semantics-based user interface model, which aims to reduce the complexity of the underlying technologies for semantic enrichment of content by Web users. By surveying existing tools and approaches for semantic content authoring, we extracted a set of guidelines for designing efficient and effective semantic authoring user interfaces. We applied these guidelines to devise a semantics-based user interface model called WYSIWYM (What You See Is What You Mean), which enables integrated authoring, visualization and exploration of unstructured and (semi-)structured content. To assess the applicability of our proposed WYSIWYM model, we incorporated the model into four real-world use cases comprising two general and two domain-specific applications.
These use cases address four aspects of the WYSIWYM implementation: 1) integrating it into existing user interfaces, 2) using it for lightweight text analytics to incentivize users, 3) crowdsourcing semi-structured e-learning content, and 4) using it to author semantic medical prescriptions.

    A Learning Health System for Radiation Oncology

    The proposed research aims to address the challenges that clinical data science researchers in radiation oncology face when accessing, integrating, and analyzing heterogeneous data from various sources. The research presents a scalable intelligent infrastructure, called the Health Information Gateway and Exchange (HINGE), which captures and structures data from multiple sources into a knowledge base with semantically interlinked entities. This infrastructure enables researchers to mine novel associations and gather relevant knowledge for personalized clinical outcomes. The dissertation discusses the design framework and implementation of HINGE, which abstracts structured data from treatment planning systems, treatment management systems, and electronic health records. It utilizes disease-specific smart templates for capturing clinical information in a discrete manner. HINGE performs data extraction, aggregation, and quality and outcome assessment functions automatically, connecting seamlessly with local IT/medical infrastructure. Furthermore, the research presents a knowledge graph-based approach to map radiotherapy data to an ontology-based data repository using FAIR (Findable, Accessible, Interoperable, Reusable) concepts. This approach ensures that the data is easily discoverable and accessible for clinical decision support systems. The dissertation explores the ETL (Extract, Transform, Load) process, data model frameworks, and ontologies, and provides a real-world clinical use case for this data mapping. To improve the efficiency of retrieving information from large clinical datasets, a search engine was developed that combines ontology-based keyword searching with a synonym-based term matching tool. The hierarchical nature of ontologies is leveraged to retrieve patient records based on parent and children classes.
Additionally, patient similarity analysis is conducted using vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) to identify similar patients based on text corpus creation methods. Results from the analysis using these models are presented. The implementation of a learning health system for predicting radiation pneumonitis following stereotactic body radiotherapy is also discussed. 3D convolutional neural networks (CNNs) are utilized with radiographic and dosimetric datasets to predict the likelihood of radiation pneumonitis. DenseNet-121 and ResNet-50 models are employed for this study, along with integrated gradient techniques to identify salient regions within the input 3D image dataset. The predictive performance of the 3D CNN models is evaluated based on clinical outcomes. Overall, the proposed Learning Health System (LHS) provides a comprehensive solution for capturing, integrating, and analyzing heterogeneous data in a knowledge base. It offers researchers the ability to extract valuable insights and associations from diverse sources, ultimately leading to improved clinical outcomes. This work can serve as a model for implementing LHS in other medical specialties, advancing personalized and data-driven medicine.
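
The abstract mentions ontology-based keyword search with synonym matching, where the class hierarchy is used to retrieve patient records. A minimal sketch of that general technique, and not the HINGE search engine itself, is given below in Python: a query term is normalized through a synonym table, expanded to the matching class plus all of its descendant classes, and then matched against record annotations. The tiny hierarchy, the synonym table, and the patient records are hypothetical, and the sketch expands only downward to child classes.

```python
from collections import deque

# Hypothetical class hierarchy: parent class -> direct child classes.
CHILDREN = {
    "lung_disease": ["pneumonitis"],
    "pneumonitis": ["radiation_pneumonitis"],
    "radiation_pneumonitis": [],
}

# Hypothetical synonym table: free-text variant -> canonical class name.
SYNONYMS = {
    "rp": "radiation_pneumonitis",
    "lung disorder": "lung_disease",
}

# Hypothetical patient records, each annotated with one ontology class.
RECORDS = [
    {"patient": "p1", "concept": "radiation_pneumonitis"},
    {"patient": "p2", "concept": "pneumonitis"},
    {"patient": "p3", "concept": "fracture"},
]


def normalize(term):
    """Map a query term to a canonical class name via the synonym table."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key.replace(" ", "_"))


def descendants(concept):
    """Return the concept together with all of its descendants (breadth-first)."""
    seen = {concept}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for child in CHILDREN.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def search(term):
    """Return patients annotated with the queried class or any subclass of it."""
    wanted = descendants(normalize(term))
    return [record["patient"] for record in RECORDS if record["concept"] in wanted]


if __name__ == "__main__":
    print(search("Lung disorder"))
    print(search("RP"))
```

Expanding the query rather than the records keeps the record annotations unchanged, so the hierarchy can be revised without re-indexing patient data.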

    Linked Data based Health Information Representation, Visualization and Retrieval System on the Semantic Web

    Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies. To better facilitate health information dissemination, flexible ways to represent, query, and visualize health data are becoming increasingly important. Semantic Web technologies, which provide a common framework that allows data to be shared and reused between applications, can be applied to the management of health data. Linked open data, a new Semantic Web standard for publishing and linking heterogeneous data, allows not only humans but also machines to browse data without restriction. Through a use case of World Health Organization HIV data for sub-Saharan Africa, which is severely affected by the HIV epidemic, this thesis built a linked data based system for health information representation, querying, and visualization. All the data were represented in RDF and interlinked with other related datasets that are already on the cloud. Overall, the system has more than 21,000 triples, with a SPARQL endpoint where users can download and use the data and a SPARQL query interface where users can submit different types of queries and retrieve the results. Additionally, it has a visualization interface where users can visualize the SPARQL results with a tool of their preference. Users who are not familiar with SPARQL queries can use the linked data search engine interface to search and browse the data. From this system we can conclude that current linked open data technologies have great potential to represent heterogeneous health data in a flexible and reusable manner and that they can serve intelligent queries that support decision-making. However, to get the best from these technologies, improvements are needed both in triple store performance and in domain-specific ontological vocabularies.
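
The abstract describes health data represented as RDF triples and queried through SPARQL. To illustrate the underlying idea without assuming any particular triple store, the Python sketch below holds a few subject-predicate-object triples in memory and answers a query by matching triple patterns with wildcards, which is roughly what a SPARQL basic graph pattern does. It is a plain-Python stand-in, not SPARQL and not the thesis system; the `ex:` identifiers and all numeric values are invented placeholders, not WHO figures.

```python
# Hypothetical triples (subject, predicate, object); values are placeholders.
TRIPLES = [
    ("ex:obs1", "ex:country", "ex:CountryA"),
    ("ex:obs1", "ex:year", 2010),
    ("ex:obs1", "ex:prevalence", 1.5),
    ("ex:obs2", "ex:country", "ex:CountryB"),
    ("ex:obs2", "ex:year", 2010),
    ("ex:obs2", "ex:prevalence", 3.0),
    ("ex:obs3", "ex:country", "ex:CountryA"),
    ("ex:obs3", "ex:year", 2011),
    ("ex:obs3", "ex:prevalence", 1.4),
]


def match(pattern, triples=TRIPLES):
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(p is None or p == value for p, value in zip(pattern, triple)):
            yield triple


def prevalence_by_country(year):
    """Join three patterns on the observation node, like a small graph query."""
    result = {}
    for observation, _, _ in match((None, "ex:year", year)):
        country = next(match((observation, "ex:country", None)))[2]
        value = next(match((observation, "ex:prevalence", None)))[2]
        result[country] = value
    return result


if __name__ == "__main__":
    print(prevalence_by_country(2010))
```

A real deployment would store the triples in an RDF store and express the same join as a SPARQL `SELECT` query; the linear scans here are only meant to show the pattern-matching idea.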

    A Knowledge Graph Framework for Dementia Research Data

    Dementia disease research encompasses diverse data modalities, including advanced imaging, deep phenotyping, and multi-omics analysis. However, integrating these disparate data sources has historically posed a significant challenge, obstructing the unification and comprehensive analysis of collected information. In recent years, knowledge graphs have emerged as a powerful tool to address such integration issues by enabling the consolidation of heterogeneous data sources into a structured, interconnected network of knowledge. In this context, we introduce DemKG, an open-source framework designed to facilitate the construction of a knowledge graph integrating dementia research data, comprising three core components: a KG-builder that integrates diverse domain ontologies and data annotations, an extensions ontology providing necessary terms tailored for dementia research, and a versatile transformation module for incorporating study data. In contrast with other current solutions, our framework provides a stable foundation by leveraging established ontologies and community standards and simplifies study data integration while delivering solid ontology design patterns, broadening its usability. Furthermore, the modular approach of its components enhances flexibility and scalability. We showcase how DemKG might aid and improve multi-modal data investigations through a series of proof-of-concept scenarios focused on relevant Alzheimer’s disease biomarkers.