264 research outputs found

    Design and implementation of serverless architecture for i2b2 on AWS cloud and Snowflake data warehouse

    Informatics for Integrating Biology and the Bedside (i2b2) is an open-source medical tool for cohort discovery that allows researchers to explore and query clinical data. The i2b2 platform is designed to adopt any patient-centric data model and is used at over 400 healthcare institutions worldwide for querying patient data. The platform consists of a webclient, core servers, and a database. Although installation guidelines exist, the system's complex architecture, with its numerous dependencies and configuration parameters, makes it difficult to install a functional i2b2 platform. Maintaining the scalability, security, and availability of the application is also challenging and requires a lot of resources. Our aim was to deploy i2b2 for the University of Missouri (UM) System in the cloud and to reduce the complexity and effort of the installation and maintenance process. Our solution encapsulated the complete installation process of each component using Docker and deployed the containers in the AWS Virtual Private Cloud (VPC) using several AWS PaaS (Platform as a Service) and IaaS (Infrastructure as a Service) services. We deployed the application as a service in AWS Fargate, an on-demand, serverless, auto-scalable compute engine. We also enhanced the functionality of the i2b2 services and developed Snowflake JDBC driver support for the i2b2 backend services, which enabled them to query the Snowflake analytical database directly. In addition, we created the i2b2-data-installer package to load PCORnet CDM and ACT ontology data into the i2b2 database. The i2b2 platform at the University of Missouri holds 1.26B facts for 2.2M patients from UM Cerner Millennium data. Includes bibliographical references.

    Evaluating the informatics for integrating biology and the bedside system for clinical research

    Background: Selecting patient cohorts is a critical, iterative, and often time-consuming aspect of studies involving human subjects; informatics tools that help streamline the process have been identified as important infrastructure components for enabling clinical and translational research. We describe the evaluation of a free and open source cohort selection tool from the Informatics for Integrating Biology and the Bedside (i2b2) group: the i2b2 hive. Methods: Our evaluation covered the usability and functionality of the i2b2 hive, using several real-world examples of research data requests received electronically at the University of Utah Health Sciences Center between 2006 and 2008. The hive server component and the visual query tool application were evaluated for their suitability as a cohort selection tool on the basis of the types of data elements requested, as well as the effort required to fulfill each research data request using the i2b2 hive alone. Results: We found the i2b2 hive to be suitable for obtaining estimates of cohort sizes and generating research cohorts based on simple inclusion/exclusion criteria, which accounted for about 44% of the clinical research data requests sampled at our institution. Data requests that relied on post-coordinated clinical concepts, aggregate values of clinical findings, or temporal conditions in their inclusion/exclusion criteria could not be fulfilled using the i2b2 hive alone, and required one or more intermediate data steps in the form of pre- or post-processing, modifications to the hive metadata, etc. Conclusion: The i2b2 hive was found to be a useful cohort-selection tool for fulfilling common types of requests for research data, especially for estimating initial cohort sizes. For another institution that might want to use the i2b2 hive for clinical research, we recommend that the institution have structured, coded clinical data and metadata available that can be transformed to fit the logical data models of the i2b2 hive, strategies for extracting relevant clinical data from source systems, and the ability to perform substantial pre- and post-processing of these data.

    The Role of Free/Libre and Open Source Software in Learning Health Systems

    OBJECTIVE: To give an overview of the role of Free/Libre and Open Source Software (FLOSS) in the context of secondary use of patient data to enable Learning Health Systems (LHSs). METHODS: We conducted an environmental scan of the academic and grey literature utilising the MedFLOSS database of open source systems in healthcare to inform a discussion of the role of open source in developing LHSs that reuse patient data for research and quality improvement. RESULTS: A wide range of FLOSS is identified that contributes to the information technology (IT) infrastructure of LHSs including operating systems, databases, frameworks, interoperability software, and mobile and web apps. The recent literature around the development and use of key clinical data management tools is also reviewed. CONCLUSIONS: FLOSS already plays a critical role in modern health IT infrastructure for the collection, storage, and analysis of patient data. The collaborative, modular, and modifiable nature of FLOSS systems may make open source approaches appropriate for building the digital infrastructure for an LHS.

    An Evaluation of the Use of a Clinical Research Data Warehouse and I2b2 Infrastructure to Facilitate Replication of Research

    Replication of clinical research is requisite for forming effective clinical decisions and guidelines. While rerunning a clinical trial may be unethical and prohibitively expensive, the adoption of EHRs and the infrastructure for distributed research networks provide access to clinical data for observational and retrospective studies. Herein I demonstrate a means of using these tools to validate existing results and extend the findings to novel populations. I describe the process of evaluating published risk models, as well as local data and infrastructure, to assess the replicability of a study. I use an example of a risk model that could not be replicated as well as a study of in-hospital mortality risk that I replicated using UNMC’s clinical research data warehouse. In these examples and other studies we have participated in, some elements are commonly missing or under-developed. One such missing element is a consistent and computable phenotype for pregnancy status based on data recorded in the EHR. I survey local clinical data and identify a number of variables correlated with pregnancy, and demonstrate the data required to identify the temporal bounds of a pregnancy episode. Another common obstacle to replicating risk models is the necessity of linking to alternative data sources while maintaining data in a de-identified database. I demonstrate a pipeline for linking clinical data to socioeconomic variables and indices obtained from the American Community Survey (ACS). While these data are location-based, I provide a method for storing them in a HIPAA-compliant fashion so as not to identify a patient’s location. While full and efficient replication of all clinical studies is still a future goal, the demonstration of replication, the initial development of a computable phenotype for pregnancy, and the incorporation of location-based data in a de-identified data warehouse show how EHR data and a research infrastructure may be used to facilitate this effort.