    Electronic health records to facilitate clinical research

    Electronic health records (EHRs) provide opportunities to enhance patient care, embed performance measures in clinical practice, and facilitate clinical research. Concerns have been raised about increasing recruitment challenges in trials, burdensome and obtrusive data collection, and uncertain generalizability of results. Leveraging electronic health records to counterbalance these trends is an area of intense interest. Initial applications envision electronic health records as the primary data source for observational studies, embedded pragmatic or post-marketing registry-based randomized studies, and comparative effectiveness studies. Extending this approach to randomized clinical trials, electronic health records may be used to assess study feasibility, facilitate patient recruitment, and streamline data collection at baseline and follow-up. Ensuring data security and privacy, linking diverse systems, and maintaining infrastructure for repeated use of high-quality data are among the challenges of using electronic health records in clinical research. Collaboration between academia, industry, regulatory bodies, policy makers, patients, and electronic health record vendors is critical for greater use of electronic health records in clinical research. This manuscript identifies the key steps required to advance the role of electronic health records in cardiovascular clinical research.

    The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment

    OBJECTIVE: Coronavirus disease 2019 (COVID-19) poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, these are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many centers. MATERIALS AND METHODS: The Clinical and Translational Science Award Program and scientific community created N3C to overcome technical, regulatory, policy, and governance barriers to sharing and harmonizing individual-level clinical data. We developed solutions to extract, aggregate, and harmonize data across organizations and data models, and created a secure data enclave to enable efficient, transparent, and reproducible collaborative analytics. RESULTS: Organized in inclusive workstreams, we created legal agreements and governance for organizations and researchers; data extraction scripts to identify and ingest positive, negative, and possible COVID-19 cases; a data quality assurance and harmonization pipeline to create a single harmonized dataset; population of the secure data enclave with data, machine learning, and statistical analytics tools; dissemination mechanisms; and a synthetic data pilot to democratize data access. CONCLUSIONS: The N3C has demonstrated that a multisite collaborative learning health network can overcome barriers to rapidly build a scalable infrastructure incorporating multiorganizational clinical data for COVID-19 analytics. We expect this effort to save lives by enabling rapid collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care and thereby reduce the immediate and long-term impacts of COVID-19.
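
    A minimal sketch of the kind of case-classification logic such extraction scripts encode appears below; the code sets, column names, and test label are illustrative assumptions, and the actual N3C phenotype definitions are considerably more elaborate.

        # Sketch: classify one patient as a positive, possible, or negative
        # COVID-19 case from lab results and diagnosis codes. Column names
        # and the PCR test label are hypothetical.
        import pandas as pd

        COVID_DX_CODES = {"U07.1"}                 # confirmed COVID-19 (ICD-10-CM)
        SUSPECTED_DX_CODES = {"U07.2", "Z20.822"}  # virus not identified / exposure
        PCR_TESTS = {"SARS-CoV-2 PCR"}             # hypothetical test label

        def classify_patient(labs: pd.DataFrame, diagnoses: pd.DataFrame) -> str:
            """labs: columns [test_name, result]; diagnoses: columns [icd10]."""
            lab_positive = ((labs["test_name"].isin(PCR_TESTS)) &
                            (labs["result"].str.lower() == "positive")).any()
            if lab_positive or diagnoses["icd10"].isin(COVID_DX_CODES).any():
                return "positive"
            if diagnoses["icd10"].isin(SUSPECTED_DX_CODES).any():
                return "possible"
            return "negative"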

    Med Care

    Background: Hypoglycemia related to anti-diabetic drugs (ADDs) is an important iatrogenic harm in hospitalized patients. Electronic identification of ADD-related hypoglycemia may be an efficient, reliable method to inform quality improvement. Objectives: Develop electronic queries of electronic health records (EHRs) for facility-wide and unit-specific inpatient hypoglycemia event rates and validate query findings with manual chart review. Methods: Electronic queries were created to associate blood glucose (BG) values with ADD administration and inpatient location in three tertiary-care hospitals with Patient Centered Outcomes Research Network (PCORnet) databases. Queries were based on National Quality Forum (NQF) criteria with hypoglycemia thresholds <40 mg/dL and <54 mg/dL, and validated using a stratified random sample of 321 BG events. Sensitivity and specificity were calculated with manual chart review as the reference standard. Results: The sensitivity and specificity of queries for hypoglycemia events were 97.3% (95% CI, 90.5%-99.7%) and 100.0% (95% CI, 92.6%-100.0%), respectively, for BG <40 mg/dL, and 97.7% (95% CI, 93.3%-99.5%) and 100.0% (95% CI, 95.3%-100.0%), respectively, for BG <54 mg/dL. The sensitivity and specificity of the query for identifying ADD days were 91.8% (95% CI, 89.2%-94.0%) and 99.0% (95% CI, 97.5%-99.7%). Of 48 events missed by the queries, 37 (77.1%) were due to incomplete identification of insulin administered by infusion. Facility-wide hypoglycemia rates were 0.4%-0.8% (BG <40 mg/dL) and 1.9%-3.0% (BG <54 mg/dL); rates varied by patient care unit. Conclusions: Electronic queries can accurately identify inpatient hypoglycemia. Implementation in non-PCORnet-participating facilities should be assessed, with particular attention to patient location and insulin infusions.
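
    For illustration, the sketch below captures the core of the query logic described above: flag BG readings below a threshold that fall on a day with documented ADD administration, then validate the flags against chart review. Table layouts and column names are assumptions for this sketch, not the published PCORnet queries.

        import pandas as pd

        def hypoglycemia_events(bg: pd.DataFrame, adds: pd.DataFrame,
                                threshold_mg_dl: int = 54) -> pd.DataFrame:
            """bg:   columns [patient_id, datetime (datetime64), bg_mg_dl]
               adds: columns [patient_id, date] -- days with ADD administration."""
            bg = bg.assign(date=bg["datetime"].dt.date)
            add_days = adds[["patient_id", "date"]].drop_duplicates()
            on_add_day = bg.merge(add_days, on=["patient_id", "date"], how="inner")
            return on_add_day[on_add_day["bg_mg_dl"] < threshold_mg_dl]

        def sensitivity_specificity(query_flags, chart_flags):
            """Validate query flags against manual chart review (reference standard)."""
            pairs = list(zip(query_flags, chart_flags))
            tp = sum(q and c for q, c in pairs)
            tn = sum(not q and not c for q, c in pairs)
            fp = sum(q and not c for q, c in pairs)
            fn = sum(not q and c for q, c in pairs)
            return tp / (tp + fn), tn / (tn + fp)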

    Executive Summary

    pSCANNER: Patient-centered scalable national network for effectiveness research

    This article describes the patient-centered Scalable National Network for Effectiveness Research (pSCANNER), which is part of the recently formed PCORnet, a national network composed of learning healthcare systems and patient-powered research networks funded by the Patient-Centered Outcomes Research Institute (PCORI). It is designed to be a stakeholder-governed federated network that uses a distributed architecture to integrate data from three existing networks covering over 21 million patients in all 50 states: (1) VA Informatics and Computing Infrastructure (VINCI), with data from the Veterans Health Administration's 151 inpatient and 909 ambulatory care and community-based outpatient clinics; (2) the University of California Research eXchange (UC-ReX) network, with data from UC Davis, Irvine, Los Angeles, San Francisco, and San Diego; and (3) SCANNER, a consortium of UCSD, Tennessee VA, and three federally qualified health systems in the Los Angeles area, supplemented with claims and health information exchange data and led by the University of Southern California. Initial use cases will focus on three conditions: (1) congestive heart failure; (2) Kawasaki disease; and (3) obesity. Stakeholders, such as patients, clinicians, and health service researchers, will be engaged to prioritize research questions to be answered through the network. We will use a privacy-preserving distributed computation model with synchronous and asynchronous modes. The distributed system will be based on a common data model that allows the construction and evaluation of distributed multivariate models for a variety of statistical analyses.
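
    As a rough illustration of what a privacy-preserving distributed computation can look like, the sketch below fits a logistic model across sites by exchanging only aggregate gradients, never row-level patient data. This is a generic pattern under stated assumptions, not pSCANNER's actual protocol; the sites are simulated in one process, and a real deployment would exchange the aggregates over the network, synchronously or asynchronously.

        import numpy as np

        def local_gradient(X, y, beta):
            """Gradient of the logistic log-likelihood on one site's data.
            Only this aggregate leaves the site; patient rows never do."""
            p = 1.0 / (1.0 + np.exp(-X @ beta))
            return X.T @ (y - p)

        def fit_distributed(sites, n_features, lr=1e-3, iters=500):
            """sites: list of (X, y) arrays held locally. In each synchronous
            round, the coordinator broadcasts beta, each site returns its
            local gradient, and the coordinator sums them and updates."""
            beta = np.zeros(n_features)
            for _ in range(iters):
                beta += lr * sum(local_gradient(X, y, beta) for X, y in sites)
            return beta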

    An Evaluation of the Use of a Clinical Research Data Warehouse and I2b2 Infrastructure to Facilitate Replication of Research

    Replication of clinical research is requisite for forming effective clinical decisions and guidelines. While rerunning a clinical trial may be unethical and prohibitively expensive, the adoption of EHRs and the infrastructure for distributed research networks provide access to clinical data for observational and retrospective studies. Herein I demonstrate a means of using these tools to validate existing results and extend the findings to novel populations. I describe the process of evaluating published risk models, as well as local data and infrastructure, to assess the replicability of a study. I present an example of a risk model that could not be replicated, as well as a study of in-hospital mortality risk that I replicated using UNMC's clinical research data warehouse. In these examples and other studies we have participated in, some elements are commonly missing or under-developed. One such missing element is a consistent and computable phenotype for pregnancy status based on data recorded in the EHR. I survey local clinical data, identify a number of variables correlated with pregnancy, and demonstrate the data required to identify the temporal bounds of a pregnancy episode. Another common obstacle to replicating risk models is the necessity of linking to alternative data sources while maintaining the data in a de-identified database. I demonstrate a pipeline for linking clinical data to socioeconomic variables and indices obtained from the American Community Survey (ACS). While these data are location-based, I provide a method for storing them in a HIPAA-compliant fashion so as not to identify a patient's location. While full and efficient replication of all clinical studies remains a future goal, the replication demonstrated here, together with the initial development of a computable phenotype for pregnancy and the incorporation of location-based data into a de-identified data warehouse, shows how EHR data and a research infrastructure may be used to facilitate this effort.
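
    A minimal sketch of that linkage pattern, assuming a census-tract key and illustrative column names: join the ACS-derived indices on the geographic unit, then drop the geographic identifier before storage, so the de-identified warehouse retains the indices but not the patient's location.

        import pandas as pd

        def link_acs_deidentified(patients: pd.DataFrame,
                                  acs: pd.DataFrame) -> pd.DataFrame:
            """patients: [patient_id, census_tract]
               acs:      [census_tract, median_income, deprivation_index]"""
            linked = patients.merge(acs, on="census_tract", how="left")
            # Keep the derived socioeconomic indices; discard the location key.
            return linked.drop(columns=["census_tract"])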

    Behavior change interventions: the potential of ontologies for advancing science and practice

    A central goal of behavioral medicine is the creation of evidence-based interventions for promoting behavior change. Scientific knowledge about behavior change could be more effectively accumulated using "ontologies." In information science, an ontology is a systematic method for articulating a "controlled vocabulary" of agreed-upon terms and their inter-relationships. It involves three core elements: (1) a controlled vocabulary specifying and defining existing classes; (2) specification of the inter-relationships between classes; and (3) codification in a computer-readable format to enable knowledge generation, organization, reuse, integration, and analysis. This paper introduces ontologies, reviews current efforts to create ontologies related to behavior change interventions, and suggests future work. This paper was written by behavioral medicine and information science experts and was developed in partnership between the Society of Behavioral Medicine's Technology Special Interest Group (SIG) and the Theories and Techniques of Behavior Change Interventions SIG. In recent years, significant progress has been made in the foundational work needed to develop ontologies of behavior change. Ontologies of behavior change could facilitate a transformation of behavioral science from a field in which data from different experiments are siloed into one in which data across experiments can be compared and integrated. This could facilitate new approaches to hypothesis generation and knowledge discovery in behavioral science.
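
    As a toy illustration of the three core elements (the class names and definitions here are invented for the example, and real ontologies are typically encoded in a standard language such as OWL rather than JSON):

        import json

        ontology = {
            # (1) controlled vocabulary: classes with agreed-upon definitions
            "classes": {
                "BehaviorChangeTechnique": "An observable, replicable component of "
                                           "an intervention designed to alter behavior.",
                "GoalSetting": "Agreeing on a goal defined in terms of the behavior.",
            },
            # (2) inter-relationships between classes
            "relationships": [("GoalSetting", "is_a", "BehaviorChangeTechnique")],
        }

        # (3) computer-readable codification, serialized here as JSON
        print(json.dumps(ontology, indent=2))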

    Secondary use of Structured Electronic Health Records Data: From Observational Studies to Deep Learning-based Predictive Modeling

    With the wide adoption of electronic health records (EHRs), researchers, as well as large healthcare organizations, governmental institutions, and insurance and pharmaceutical companies, have been interested in leveraging this rich clinical data source to extract clinical evidence and develop predictive algorithms. Large vendors have been able to compile structured EHR data from sites all over the United States, de-identify these data, and make them available to data science researchers in a more usable format. For this dissertation, we leveraged one of the earliest and largest secondary EHR data sources and conducted three studies of increasing scope. In the first study, which was of limited scope, we conducted a retrospective observational study to compare the effect of three drugs on a specific population of approximately 3,000 patients. Using a novel statistical method, we found evidence that selecting phenylephrine as the primary vasopressor to induce hypertension for the management of nontraumatic subarachnoid hemorrhage is associated with better outcomes than selecting norepinephrine or dopamine. In the second study, we widened our scope, using a cohort of more than 100,000 patients to train generalizable models for the risk prediction of specific clinical events, such as heart failure in diabetes patients or pancreatic cancer. In this study, we found that recurrent neural network-based predictive models trained on expressive terminologies, which preserve a high level of granularity, achieve better prediction performance than baseline methods such as logistic regression. Finally, we widened our scope again to train Med-BERT, a foundation model, on the diagnosis data of more than 20 million patients. Med-BERT was found to improve prediction performance on downstream tasks with small sample sizes, which would otherwise limit a model's ability to learn good representations. In conclusion, we found that useful information can be extracted from structured EHR data and that helpful deep learning-based predictive models can be trained on it. Given the limitations of secondary EHR data, and considering that the data were originally collected for administrative rather than research purposes, the findings need clinical validation. Clinical trials are therefore warranted to further validate any new evidence extracted from such data sources before clinical practice guidelines are updated. The implementability of the developed predictive models, which are in an early development phase, also warrants further evaluation.
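
    As a rough sketch of the kind of recurrent model the second study describes (the architecture, dimensions, and task head are illustrative, not the dissertation's actual models; requires PyTorch):

        import torch
        import torch.nn as nn

        class CodeSequenceRNN(nn.Module):
            """GRU over a patient's sequence of diagnosis-code IDs -> risk score."""
            def __init__(self, vocab_size, embed_dim=128, hidden_dim=128):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
                self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
                self.head = nn.Linear(hidden_dim, 1)

            def forward(self, codes):                   # codes: (batch, seq_len)
                _, h = self.rnn(self.embed(codes))      # h: (1, batch, hidden_dim)
                return torch.sigmoid(self.head(h[-1]))  # (batch, 1) risk scores

        model = CodeSequenceRNN(vocab_size=20_000)
        risk = model(torch.randint(1, 20_000, (4, 50)))  # 4 patients, 50 codes each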