15 research outputs found

    A study assessing the characteristics of big data environments that predict high research impact: application of qualitative and quantitative methods

    BACKGROUND: Big data offers new opportunities to enhance healthcare practice. While researchers have shown increasing interest in using it, little is known about what drives research impact. We explored predictors of research impact across three major sources of healthcare big data derived from the government and the private sector.
    METHODS: This study was based on a mixed methods approach. Using quantitative analysis, we first clustered peer-reviewed original research that used data from government sources derived through the Veterans Health Administration (VHA), and private sources of data from IBM MarketScan and Optum, using social network analysis. We analyzed a battery of research impact measures as a function of the data sources. Other main predictors were topic clusters and authors’ social influence. Additionally, we conducted key informant interviews (KII) with a purposive sample of high impact researchers who have knowledge of the data. We then compiled findings of KIIs into two case studies to provide a rich understanding of drivers of research impact.
    RESULTS: Analysis of 1,907 peer-reviewed publications using VHA, IBM MarketScan and Optum found that the overall research enterprise was highly dynamic and growing over time. With less than 4 years of observation, research productivity, use of machine learning (ML), natural language processing (NLP), and the Journal Impact Factor showed substantial growth. Studies that used ML and NLP, however, showed limited visibility. After adjustments, VHA studies had generally higher impact (10% and 27% higher annualized Google citation rates) compared to MarketScan and Optum (p<0.001 for both). Analysis of co-authorship networks showed that no single social actor, either a community of scientists or institutions, was dominating. Other key opportunities to achieve high impact based on KIIs include methodological innovations, under-studied populations and predictive modeling based on rich clinical data.
    CONCLUSIONS: Use of big data for research analytics grew within the three data sources studied between 2013 and 2016. Despite important challenges, the research community is reacting favorably to the opportunities offered both by big data and advanced analytic methods. Big data may be a logical and cost-efficient choice to emulate research initiatives where RCTs are not possible.

    METADATA MANAGEMENT FOR CLINICAL DATA INTEGRATION

    Clinical data have been continuously collected and have grown with the wide adoption of electronic health records (EHR). Clinical data have provided the foundation for state-of-the-art research such as artificial intelligence in medicine. At the same time, it has become a challenge to integrate, access, and explore study-level patient data from large volumes of data in heterogeneous databases. Effective, fine-grained, cross-cohort data exploration and semantically enabled approaches and systems are needed. To build semantically enabled systems, we need to leverage existing terminology systems and ontologies. Numerous ontologies have been developed recently and they play an important role in semantically enabled applications. Because they contain valuable codified knowledge, the management of these ontologies, as metadata, also requires systematic approaches. Moreover, in most clinical settings, patient data are collected with the help of a data dictionary. Knowledge of the relationships between an ontology and a related data dictionary is important for semantic interoperability. Such relationships are represented and maintained by mappings. Mappings store how data source elements and domain ontology concepts are linked, as well as how domain ontology concepts are linked between different ontologies. While mappings are crucial to the maintenance of relationships between an ontology and a related data dictionary, they are commonly captured in CSV files with limited capabilities for sharing, tracking, and visualization. The management of mappings requires an innovative, interactive, and collaborative approach. Metadata management serves to organize data that describes other data. In computer science and information science, an ontology is the metadata consisting of the representation, naming, and definition of the hierarchies, properties, and relations between concepts.
A structural, scalable, and computer-understandable way of managing metadata is critical to developing systems with fine-grained data exploration capabilities. This dissertation presents a systematic approach called MetaSphere, which uses metadata and ontologies to support the management and integration of clinical research data through our ontology-based metadata management system for multiple domains. MetaSphere is a general framework that aims to manage specific domain metadata, provide a fine-grained data exploration interface, and store patient data in data warehouses. Moreover, MetaSphere provides a dedicated mapping interface called the Interactive Mapping Interface (IMI) to map the data dictionary to well-recognized and standardized ontologies. Specifically, MetaSphere stores the domain ontology structurally in databases. Patient data in the corresponding domains are also stored in databases as data warehouses. MetaSphere provides a powerful query interface to enable interaction between humans and actual patient data. The query interface is a mechanism that allows researchers to compose complex queries to pinpoint specific cohorts over a large amount of patient data. The MetaSphere framework has been instantiated successfully in three domains: sleep (X-search), pressure ulcer injuries and deep tissue pressure (SCIPUDSphere), and cancer. The detailed results are as follows. X-search is publicly available at https://www.x-search.net with nine sleep domain datasets consisting of over 26,000 unique subjects. The canonical data dictionary contains over 900 common data elements across the datasets. X-search has received over 1800 cross-cohort queries by users from 16 countries. SCIPUDSphere has integrated a total of 268,562 records containing 282 ICD9 codes related to pressure ulcer injuries among 36,626 individuals with spinal cord injuries. IMI is publicly available at http://epi-tome.com/.
Using IMI, we have successfully mapped the North American Association of Central Cancer Registries (NAACCR) data dictionary to the National Cancer Institute Thesaurus (NCIt) concepts.
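
To make the idea of a mapping concrete, the following is a minimal sketch, not MetaSphere's or IMI's actual data model. It assumes a hypothetical record layout in which each mapping links one data dictionary element to one ontology concept; the `Mapping` class, the `index_by_concept` helper, the element names, the ontology name and the concept identifiers are all invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One link from a data dictionary element to an ontology concept (hypothetical layout)."""
    source_element: str  # invented data dictionary element name
    ontology: str        # invented ontology name
    concept_id: str      # placeholder identifier, not a real concept code


def index_by_concept(mappings):
    """Group source elements by the (ontology, concept) pair they map to."""
    index = defaultdict(list)
    for m in mappings:
        index[(m.ontology, m.concept_id)].append(m.source_element)
    return dict(index)


# Two datasets record the same concept under different element names.
mappings = [
    Mapping("dataset_a.age_at_visit", "ExampleOntology", "EX:0001"),
    Mapping("dataset_b.age", "ExampleOntology", "EX:0001"),
    Mapping("dataset_a.bmi", "ExampleOntology", "EX:0002"),
]
index = index_by_concept(mappings)
```

Grouping by concept is what would let a cross-cohort query address "the same" variable in several datasets at once; a flat CSV file holds the same rows but offers no such lookup without extra tooling.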

    An interoperable electronic medical record-based platform for personalized predictive analytics

    Indiana University-Purdue University Indianapolis (IUPUI)
    Precision medicine refers to the delivery of customized treatment to patients based on their individual characteristics, and aims to reduce adverse events, improve diagnostic methods, and enhance the efficacy of therapies. Among efforts to achieve the goals of precision medicine, researchers have used observational data to develop predictive models that predict health outcomes according to patients’ variables. Although numerous predictive models have been reported in the literature, not all models present high prediction power, and as a result, not all models may reach clinical settings to help healthcare professionals make clinical decisions at the point-of-care. The lack of generalizability stems from the fact that no comprehensive medical data repository exists that has the information of all patients in the target population. Even if the patients’ records were available from other sources, the datasets may need further processing prior to data analysis due to differences in the structure of databases and the coding systems used to record concepts. This project intends to fill the gap by introducing an interoperable solution that receives patient electronic health records via the Health Level Seven (HL7) messaging standard from other data sources, transforms the records to the observational medical outcomes partnership (OMOP) common data model (CDM) for population health research, and applies predictive models on patient data to make predictions about health outcomes. This project comprises three studies. The first study introduces the CCD-TOOMOP parser, and evaluates the ability of the OMOP CDM to accommodate patient data transferred by HL7 consolidated continuity of care documents (CCDs). The second study explores how to adopt predictive model markup language (PMML) for standardizing dissemination of OMOP-based predictive models.
Finally, the third study introduces the Personalized Health Risk Scoring Tool (PHRST), a pilot, interoperable OMOP-based model scoring tool that processes the embedded models and generates risk scores in real time. The final product addresses objectives of precision medicine, and has the potential not only to be employed at the point-of-care to deliver individualized treatment to patients, but also to contribute to health outcome research by easing the collection of clinical outcomes across diverse medical centers independent of system specifications.

    Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records

    Early prediction of patient outcomes is important for targeting preventive care. This protocol describes a practical workflow for developing deep-learning risk models that can predict various clinical and operational outcomes from structured electronic health record (EHR) data. The protocol comprises five main stages: formal problem definition, data pre-processing, architecture selection, calibration and uncertainty, and generalizability evaluation. We have applied the workflow to four endpoints (acute kidney injury, mortality, length of stay and 30-day hospital readmission). The workflow can enable continuous (e.g., triggered every 6 h) and static (e.g., triggered at 24 h after admission) predictions. We also provide an open-source codebase that illustrates some key principles in EHR modeling. This protocol can be used by interdisciplinary teams with programming and clinical expertise to build deep-learning prediction models with alternate data sources and prediction tasks.
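
The difference between continuous and static predictions can be illustrated with a small sketch of when each kind of prediction would be triggered during an admission. This is not taken from the protocol's codebase: the function names, the choice to start continuous triggers one step after admission, and the example timestamps are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def continuous_trigger_times(admission, discharge, step_hours=6):
    """Times at which a continuous model would be run, every `step_hours` until discharge.

    Assumption: the first trigger falls one full step after admission.
    """
    times = []
    t = admission + timedelta(hours=step_hours)
    while t <= discharge:
        times.append(t)
        t += timedelta(hours=step_hours)
    return times


def static_trigger_time(admission, discharge, offset_hours=24):
    """Single time at which a static model would be run, or None if the stay is shorter."""
    t = admission + timedelta(hours=offset_hours)
    return t if t <= discharge else None


# Hypothetical 36-hour admission.
admission = datetime(2020, 1, 1, 8, 0)
discharge = datetime(2020, 1, 2, 20, 0)
continuous = continuous_trigger_times(admission, discharge)
static = static_trigger_time(admission, discharge)
```

Under these assumptions a 36-hour stay yields six continuous trigger points and one static trigger point, so a continuous model produces a sequence of risk estimates per admission where a static model produces at most one.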

    A Learning Health Sciences Approach to Understanding Clinical Documentation in Pediatric Rehabilitation Settings

    The work presented in this dissertation provides an analysis of clinical documentation that challenges the concepts and thinking surrounding missingness of data from clinical settings and the factors that influence why data are missing. It also foregrounds the critical role of clinical documentation as infrastructure for creating learning health systems (LHS) for pediatric rehabilitation settings. Although completeness of discrete data is limited, the results presented do not reflect the quality of care or the extent of unstructured data that providers document in other locations of the electronic health record (EHR) interface. While some may view imputation and natural language processing as means to address missingness of clinical data, these practices carry biases in their interpretations and issues of validity in results. The factors that influence missingness of discrete clinical data are rooted not just in technical structures, but also in larger professional, system-level and unobservable phenomena that shape provider practices of clinical documentation. This work has implications for how we view clinical documentation as critical infrastructure for LHS, future studies of data quality and health outcomes research, and EHR design and implementation. The overall research questions for this dissertation are: 1) To what extent can data networks be leveraged to build classifiers of patient functional performance and physical disability? 2) How can discrete clinical data on gross motor function be used to draw conclusions about clinical documentation practices in the EHR for cerebral palsy? 3) Why does missingness of discrete data in the EHR occur? To address these questions, a three-pronged approach will be used to examine data completeness and the factors that influence missingness of discrete clinical data in an exemplar pediatric data learning network.
As a use-case, evaluation of EHR data completeness of gross motor function related data, populated by providers from 2015-2019 for children with cerebral palsy (CP), will be completed. Mixed methods research strategies will be used to achieve the dissertation objectives, including developing an expert-informed and standards-based phenotype model of gross motor function data as a task-based mechanism, conducting quantitative descriptive analyses of completeness of discrete data in the EHR, and performing qualitative thematic analyses to elicit and interpret the latent concepts that contribute to missingness of discrete data in the EHR. The clinical data for this dissertation are sourced from the Shriners Hospitals for Children (SHC) Health Outcomes Network (SHOnet), while qualitative data were collected through interviews and field observations of clinical providers across three care sites in the SHC system.
    PhD, Hlth Infrastr & Lrng Systs, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162994/1/njkoscie_1.pd

    Doctor of Philosophy

    Clinical research plays a vital role in producing knowledge valuable for understanding human disease and improving healthcare quality. Human subject protection is an obligation essential to the clinical research endeavor, much of which is governed by federal regulations and rules. Institutional Review Boards (IRBs) are responsible for overseeing human subject research to protect individuals from harm and to preserve their rights. Researchers are required to submit and maintain an IRB application, which is an important component in the clinical research process that can significantly affect the timeliness and ethical quality of the study. As clinical research has expanded in both volume and scope over recent years, IRBs are facing increasing challenges in providing efficient and effective oversight. The Clinical Research Informatics (CRI) domain has made significant efforts to support various aspects of clinical research through developing information systems and standards. However, information technology use by IRBs has not received much attention from the CRI community. This dissertation project analyzed over 100 IRB application systems currently used at major academic institutions in the United States. The varieties of system types and lack of standardized application forms across institutions are discussed in detail. The need for building an IRB domain analysis model is identified. In this dissertation, I developed an IRB domain analysis model with a special focus on promoting interoperability among CRI systems to streamline the clinical research workflow. The model was evaluated by a comparison with five real-world IRB application systems. Finally, a prototype implementation of the model was demonstrated by the integration of an electronic IRB system with a health data query system. This dissertation project fills a gap in the research of information technology use for the IRB oversight domain.
Adoption of the IRB domain analysis model has potential to enhance efficient and high-quality ethics oversight and to streamline the clinical research workflow.

    Preface


    Health Statistics

    Health statistics have progressed dramatically in Australia since the 1980s when the Australian Government created the (now) Australian Institute of Health and Welfare. The 12 papers in this Special Issue describe developments across a diverse range of topics, as well as providing an overview of the scope of health statistics in Australia and describing some ongoing gaps and problems. The papers will be of interest to international readers seeking to improve statistics about their health systems. Health statistics need to respect individuals’ personal information, be based on common data standards, and have adequate resourcing and committed staffing. The Australian experience provides valuable insights and examples. Australians will benefit from a comprehensive account of what has been achieved and what remains to be addressed. The papers in the Special Issue demonstrate the importance of continuing commitment to the statistical effort. Authors were chosen because of their known expertise in their respective fields.

    Usability analysis of contending electronic health record systems

    In this paper, we report the measured usability of two leading EHR systems during procurement. A total of 18 users participated in paired-usability testing of three scenarios: ordering and managing medications by an outpatient physician, medicine administration by an inpatient nurse and scheduling of appointments by nursing staff. Audio, screen capture, satisfaction rating, task success and error data were collected during testing. We found a clear difference between the systems for percentage of successfully completed tasks, two different satisfaction measures and perceived learnability when looking at the results over all scenarios. We conclude that usability should be evaluated during procurement and that the difference in usability between systems could be revealed even with fewer measures than were used in our study. © 2019 American Psychological Association Inc. All rights reserved. Peer reviewed.