METADATA MANAGEMENT FOR CLINICAL DATA INTEGRATION

Author: Zeng Ningzhou
Publication venue: UKnowledge
Publication date: 01/01/2020

Clinical data have been continuously collected and have grown with the wide adoption of electronic health records (EHR). Clinical data have provided the foundation for state-of-the-art research such as artificial intelligence in medicine. At the same time, it has become a challenge to integrate, access, and explore study-level patient data from large volumes of data in heterogeneous databases. Effective, fine-grained, cross-cohort data exploration and semantically enabled approaches and systems are needed. To build semantically enabled systems, we need to leverage existing terminology systems and ontologies. Numerous ontologies have been developed recently and they play an important role in semantically enabled applications. Because they contain valuable codified knowledge, the management of these ontologies, as metadata, also requires systematic approaches. Moreover, in most clinical settings, patient data are collected with the help of a data dictionary. Knowledge of the relationships between an ontology and a related data dictionary is important for semantic interoperability. Such relationships are represented and maintained by mappings. Mappings store how data source elements and domain ontology concepts are linked, as well as how domain ontology concepts are linked between different ontologies. While mappings are crucial to the maintenance of relationships between an ontology and a related data dictionary, they are commonly captured in CSV files with limited capabilities for sharing, tracking, and visualization. The management of mappings requires an innovative, interactive, and collaborative approach. Metadata management serves to organize data that describes other data. In computer science and information science, an ontology is metadata consisting of the representation, naming, and definition of the hierarchies, properties, and relations between concepts. 
A structured, scalable, and computer-understandable way of managing metadata is critical to developing systems with fine-grained data exploration capabilities. This dissertation presents a systematic approach called MetaSphere, which uses metadata and ontologies to support the management and integration of clinical research data through our ontology-based metadata management system for multiple domains. MetaSphere is a general framework that aims to manage domain-specific metadata, provide a fine-grained data exploration interface, and store patient data in data warehouses. Moreover, MetaSphere provides a dedicated mapping interface called the Interactive Mapping Interface (IMI) to map a data dictionary to well-recognized and standardized ontologies. MetaSphere has been applied successfully to three domains: sleep (X-search), pressure ulcer injuries and deep tissue pressure (SCIPUDSphere), and cancer. Specifically, MetaSphere stores each domain ontology structurally in databases. Patient data in the corresponding domains are also stored in databases as data warehouses. MetaSphere provides a powerful query interface to enable interaction between humans and actual patient data. The query interface is a mechanism that allows researchers to compose complex queries to pinpoint specific cohorts within a large amount of patient data. The MetaSphere framework has been instantiated in three domains, with the following results. X-search is publicly available at https://www.x-search.net with nine sleep domain datasets consisting of over 26,000 unique subjects. The canonical data dictionary contains over 900 common data elements across the datasets. X-search has received over 1800 cross-cohort queries by users from 16 countries. SCIPUDSphere has integrated a total of 268,562 records containing 282 ICD9 codes related to pressure ulcer injuries among 36,626 individuals with spinal cord injuries. IMI is publicly available at http://epi-tome.com/. 
Using IMI, we have successfully mapped the North American Association of Central Cancer Registries (NAACCR) data dictionary to the National Cancer Institute Thesaurus (NCIt) concepts.
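
The abstract above describes mappings as links between data dictionary elements and ontology concepts that need better support for sharing and tracking than a plain CSV file offers. A minimal sketch of what one such mapping record might look like follows; the field names, element names, and concept identifiers are hypothetical placeholders and are not taken from MetaSphere or IMI.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Mapping:
    """One link from a data dictionary element to an ontology concept."""
    source_element: str   # data dictionary item (illustrative name)
    ontology: str         # target ontology, e.g. "NCIt"
    concept_id: str       # placeholder identifier, not a real concept code
    author: str           # who created the mapping (supports tracking)
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def index_by_element(mappings):
    """Group mappings by source element; one element may link to many concepts."""
    index = {}
    for m in mappings:
        index.setdefault(m.source_element, []).append(m)
    return index


# Illustrative records only; identifiers are made up for this sketch.
mappings = [
    Mapping("Primary Site", "NCIt", "CONCEPT_0001", "reviewer_a"),
    Mapping("Primary Site", "NCIt", "CONCEPT_0002", "reviewer_b"),
    Mapping("Laterality", "NCIt", "CONCEPT_0003", "reviewer_a"),
]
index = index_by_element(mappings)
```

Keeping the author and timestamp on each record is one simple way to support the tracking that the abstract says CSV files lack.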

University of Kentucky

Web-Based Interactive Mapping from Data Dictionaries to Ontologies, with an Application to Cancer Registry

Author: Cui Licong
Durbin Eric B.
Hands Isaac
Hurt-Mueller Joseph
Tao Shiqiang
Zeng Ningzhou
Zhang Guoqiang
Publication venue: UKnowledge
Publication date: 15/12/2020

BACKGROUND: The Kentucky Cancer Registry (KCR) is a central cancer registry for the state of Kentucky that receives data about incident cancer cases from all healthcare facilities in the state within 6 months of diagnosis. Like all other U.S. and Canadian cancer registries, KCR uses a data dictionary provided by the North American Association of Central Cancer Registries (NAACCR) for standardized data entry. The NAACCR data dictionary is not an ontological system. Mapping between the NAACCR data dictionary and the National Cancer Institute (NCI) Thesaurus (NCIt) will facilitate the enrichment, dissemination and utilization of cancer registry data. We introduce a web-based system, called Interactive Mapping Interface (IMI), for creating mappings from data dictionaries to ontologies, in particular from NAACCR to NCIt. METHOD: IMI has been designed as a general approach with three components: (1) ontology library; (2) mapping interface; and (3) recommendation engine. The ontology library provides a list of ontologies as targets for building mappings. The mapping interface consists of six modules: project management, mapping dashboard, access control, logs and comments, hierarchical visualization, and result review and export. The built-in recommendation engine automatically identifies a list of candidate concepts to facilitate the mapping process. RESULTS: We report the architecture design and interface features of IMI. To validate our approach, we implemented an IMI prototype and pilot-tested features using the IMI interface to map a sample set of NAACCR data elements to NCIt concepts. Of 301 NAACCR data elements, 47 have been mapped to NCIt concepts. Five branches of the hierarchical tree have been identified from these mapped concepts for visual inspection. CONCLUSIONS: IMI provides an interactive, web-based interface for building mappings from data dictionaries to ontologies. 
Although our pilot-testing scope is limited, our results demonstrate the feasibility of using IMI for semantic enrichment of cancer registry data by mapping NAACCR data elements to NCIt concepts.
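
The abstract does not say how the recommendation engine ranks candidate concepts. As one possible illustration only, the sketch below ranks concept labels by lexical similarity to a data element name using `difflib` from the Python standard library. The function name and the concept labels are hypothetical, and this is not presented as IMI's actual algorithm.

```python
from difflib import SequenceMatcher


def recommend_concepts(element_name, concept_labels, top_k=3):
    """Rank ontology concept labels by string similarity to a data element name."""
    query = element_name.lower()
    scored = [
        (SequenceMatcher(None, query, label.lower()).ratio(), label)
        for label in concept_labels
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [label for _, label in scored[:top_k]]


# Hypothetical concept labels, used only to exercise the function.
labels = ["Tumor Size", "Primary Site", "Date of Diagnosis", "Histologic Type"]
candidates = recommend_concepts("primary site", labels, top_k=2)
```

A human reviewer would still confirm or reject each candidate, which matches the interactive workflow the abstract describes.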

University of Kentucky

X-search: An Open Access Interface for Cross-Cohort Exploration of the National Sleep Research Resource

Author: Cui Licong
Hankosky Emily Ruth
Kim Matthew
Mueller Remo
Redline Susan
Zeng Ningzhou
Zhang Guo-Qiang
Publication venue: UKnowledge
Publication date: 13/11/2018

Background: The National Sleep Research Resource (NSRR) is a large-scale, openly shared data repository of de-identified, highly curated clinical sleep data from multiple NIH-funded epidemiological studies. Although many data repositories allow users to browse their content, few support fine-grained, cross-cohort query and exploration at study-subject level. We introduce a cross-cohort query and exploration system, called X-search, to enable researchers to query patient cohort counts across a growing number of completed, NIH-funded studies in NSRR and explore the feasibility or likelihood of reusing the data for research studies. Methods: X-search has been designed as a general framework with two loosely coupled components: a semantically annotated data repository and a cross-cohort exploration engine. The semantically annotated data repository comprises a canonical data dictionary, data sources with a data dictionary, and mappings between each individual data dictionary and the canonical data dictionary. The cross-cohort exploration engine consists of five modules: query builder, graphical exploration, case-control exploration, query translation, and query execution. The canonical data dictionary serves as the unified metadata to drive the visual exploration interfaces and facilitate query translation through the mappings. Results: X-search is publicly available at https://www.x-search.net/ with nine NSRR datasets consisting of over 26,000 unique subjects. The canonical data dictionary contains over 900 common data elements across the datasets. X-search has received over 1800 cross-cohort queries by users from 16 countries. Conclusions: X-search provides a powerful cross-cohort exploration interface for querying and exploring heterogeneous datasets in the NSRR data repository, so as to enable researchers to evaluate the feasibility of potential research studies and generate potential hypotheses using the NSRR data.
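
The query translation step described in the Methods can be illustrated with a small sketch: a query written against a canonical data element is rewritten for each dataset through that dataset's mapping. The dataset names, variable names, and function below are hypothetical and do not reflect X-search's actual implementation.

```python
# Hypothetical mappings: canonical element -> variable name in each dataset.
MAPPINGS = {
    "dataset_a": {"age": "age_s1", "bmi": "bmi_s1"},
    "dataset_b": {"age": "AGE_AT_VISIT"},
}


def translate_query(canonical_element, low, high, mappings=MAPPINGS):
    """Translate a canonical range query into one predicate per dataset.

    Datasets that have no mapping for the element are skipped.
    The predicate is built as a plain string for readability only;
    a real system should use parameterized queries.
    """
    translated = {}
    for dataset, element_map in mappings.items():
        local_name = element_map.get(canonical_element)
        if local_name is not None:
            translated[dataset] = f"{local_name} BETWEEN {low} AND {high}"
    return translated


age_query = translate_query("age", 40, 65)
bmi_query = translate_query("bmi", 25, 30)
```

Because researchers write queries only against canonical elements, adding a dataset in this sketch means adding one mapping entry rather than changing any query.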

University of Kentucky

Individualized Clinical Practice Guidelines for Pressure Injury Management: Development of an Integrated Multi-Modal Biomedical Information Resource

Author: Bloostein Arielle L.
Bogie Kathie M.
Roggenkamp Steven K.
Seton Jacinta
Sun Jiayang
Tao Shiqiang
Zeng Ningzhou
Zhang Guo-Qiang
Publication venue: UKnowledge
Publication date: 06/09/2018

Background: Pressure ulcers (PU) and deep tissue injuries (DTI), collectively known as pressure injuries, are serious complications causing staggering costs and human suffering, with over 200 reported risk factors from many domains. Primary pressure injury prevention seeks to prevent the first incidence, while secondary PU/DTI prevention aims to decrease chronic recurrence. Clinical practice guidelines (CPG) combine evidence-based practice and expert opinion to aid clinicians in achieving best practices for primary and secondary prevention. The correction of all risk factors can be both overwhelming and impractical to implement in clinical practice. There is a need to develop practical clinical tools to prioritize the multiple recommendations of CPG, but there is limited guidance on how to prioritize based on individual cases. Bioinformatics platforms enable data management for clinical decision support and user-interface development for complex clinical challenges such as pressure injury prevention care planning. Objective: The central hypothesis of the study is that the individual’s risk factor profile can provide the basis for adaptive, personalized care planning for PU prevention based on CPG prioritization. The study objective is to develop the Spinal Cord Injury Pressure Ulcer and Deep Tissue Injury (SCIPUD+) Resource to support personalized care planning for primary and secondary PU/DTI prevention. Methods: The study is employing a retrospective electronic health record (EHR) chart review of over 75 factors known to be relevant for pressure injury risk in individuals with a spinal cord injury (SCI) and routinely recorded in the EHR. We also perform tissue health assessments of a selected sub-group. A systems approach is being used to develop and validate the SCIPUD+ Resource incorporating the many risk factor domains associated with PU/DTI primary and secondary prevention, ranging from the individual’s environment to local tissue health. 
Our multiscale approach will leverage the strength of bioinformatics applied to an established national EHR system. A comprehensive model is being used to relate the primary outcome of interest (PU/DTI development) with over 75 PU/DTI risk factors using a retrospective chart review of 5000 individuals selected from the study cohort of more than 36,000 persons with SCI. A Spinal Cord Injury Pressure Ulcer and Deep Tissue Injury Ontology (SCIPUDO) is being developed to enable robust text-mining for data extraction from free-form notes. Results: The results from this study are pending. Conclusions: PU/DTI remains a highly significant source of morbidity for individuals with SCI. Personalized interactive care plans may decrease both initial PU formation and readmission rates for high-risk individuals. The project is using established EHR data to build a comprehensive, structured model of environmental, social and clinical pressure injury risk factors. The comprehensive SCIPUD+ health care tool will be used to relate the primary outcome of interest (pressure injury development) with covariates including environmental, social, clinical, personal and tissue health profiles as well as possible interactions among some of these covariates. The study will result in a validated tool for personalized implementation of CPG recommendations and has great potential to change the standard of care for pressure injury clinical practice by enabling clinicians to provide personalized application of CPG priorities tailored to the needs of each at-risk individual with SCI.

University of Kentucky

REFORM: REFACTORIZED ELECTRONIC WEB FORMS - LARGE SCALE SURVEY DATA CAPTURE AND WORKFLOW CONTROL FRAMEWORK

Author: Zeng Ningzhou
Publication venue: Case Western Reserve University School of Graduate Studies / OhioLINK
Publication date: 30/08/2017

OhioLINK Electronic Thesis and Dissertation Center

X-search: an open access interface for cross-cohort exploration of the National Sleep Research Resource

Author: Emily R. Hankosky
Guo-Qiang Zhang
Licong Cui
Matthew Kim
Ningzhou Zeng
Remo Mueller
Susan Redline
Publication venue: Springer Science and Business Media LLC
Publication date: 01/11/2018

Directory of Open Access Journals

University of Kentucky

X-search: an open access interface for cross-cohort exploration of the National Sleep Research Resource

Author: Emily R. Hankosky
Guo-Qiang Zhang
Licong Cui
Matthew Kim
Ningzhou Zeng
Remo Mueller
Susan Redline
Publication venue: Springer Science and Business Media LLC

Crossref