954 research outputs found

    The European Institute for Innovation through Health Data

    The European Institute for Innovation through Health Data (i~HD, www.i-hd.eu) has been formed as one of the key sustainable entities arising from the Electronic Health Records for Clinical Research (IMI-JU-115189) and SemanticHealthNet (FP7-288408) projects, in collaboration with several other European projects and initiatives supported by the European Commission. i~HD is a European not-for-profit body, registered in Belgium through Royal Assent. i~HD has been established to tackle areas of challenge in the successful scaling up of innovations that critically rely on high-quality and interoperable health data. It will specifically address obstacles and opportunities to using health data by collating, developing, and promoting best practices in information governance and in semantic interoperability. It will help to sustain and propagate the results of health information and communication technology (ICT) research that enables better use of health data, assessing and optimizing their novel value wherever possible. i~HD has been formed after wide consultation and engagement of many stakeholders to develop methods, solutions, and services that can help to maximize the value obtained by all stakeholders from health data. It will support innovations in health maintenance, health care delivery, and knowledge discovery while ensuring compliance with all legal prerequisites, especially regarding the assurance of patients' privacy protection. It is bringing multiple stakeholder groups together so as to ensure that future solutions serve their collective needs and can be readily adopted affordably and at scale.

    Evaluating openEHR for storing computable representations of electronic health record phenotyping algorithms

    Electronic Health Records (EHR) are data generated during routine clinical care. EHR offer researchers unprecedented phenotypic breadth and depth and have the potential to accelerate the pace of precision medicine at scale. A main EHR use-case is creating phenotyping algorithms to define disease status, onset and severity. Currently, no common machine-readable standard exists for defining phenotyping algorithms, which are often stored in human-readable formats. As a result, the translation of algorithms to implementation code is challenging and sharing across the scientific community is problematic. In this paper, we evaluate openEHR, a formal EHR data specification, for computable representations of EHR phenotyping algorithms. Comment: 30th IEEE International Symposium on Computer-Based Medical Systems - IEEE CBMS 201

    Making EHRs trustable: A quality analysis of EHR-derived datasets for COVID-19 research

    One approach to verifying the quality of research data obtained from EHRs is auditing how complete and correct the data are in comparison with those collected by manual and controlled methods. This study analyzed data quality of an EHR-derived dataset for COVID-19 research, obtained during the pandemic at Hospital Universitario 12 de Octubre. Data were extracted from EHRs and a manually collected research database, and then transformed into the ISARIC-WHO COVID-19 CRF model. Subsequently, a data analysis was performed, comparing both sources through this convergence model. More concepts and records were obtained from EHRs, and PPV (95% CI) was above 85% in most sections. In future studies, a more detailed analysis of data quality will be carried out
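
As a rough illustration of the PPV (95% CI) figure mentioned in this abstract, the sketch below computes a positive predictive value with a Wilson score interval. It assumes the manually collected database is treated as the reference standard, so that an EHR-derived value confirmed by it counts as a true positive. The function name, its arguments, and the choice of the Wilson interval are assumptions made for illustration; the abstract does not state how the interval was computed.

```python
import math


def ppv_with_wilson_ci(true_pos, false_pos, z=1.96):
    """Return (ppv, lower, upper) for a positive predictive value.

    true_pos:  EHR-derived values confirmed by the reference source (assumed).
    false_pos: EHR-derived values contradicted by the reference source (assumed).
    z:         normal quantile; 1.96 corresponds to a 95% interval.
    """
    n = true_pos + false_pos
    if n == 0:
        raise ValueError("PPV is undefined when there are no positive findings")
    p = true_pos / n
    # Wilson score interval for a binomial proportion.
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Made-up counts, for illustration only (not taken from the study).
    ppv, lower, upper = ppv_with_wilson_ci(true_pos=90, false_pos=10)
    print(f"PPV = {ppv:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The Wilson interval is used here because it stays within [0, 1] even when the PPV is close to 1, which is the range the abstract reports for most sections.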

    Development and validation of the DIabetes Severity SCOre (DISSCO) in 139 626 individuals with type 2 diabetes: a retrospective cohort study

    OBJECTIVE: Clinically applicable diabetes severity measures are lacking, with no previous studies comparing their predictive value with glycated hemoglobin (HbA1c). We developed and validated a type 2 diabetes severity score (the DIabetes Severity SCOre, DISSCO) and evaluated its association with risks of hospitalization and mortality, assessing its additional risk information to sociodemographic factors and HbA1c. RESEARCH DESIGN AND METHODS: We used UK primary and secondary care data for 139 626 individuals with type 2 diabetes between 2007 and 2017, aged ≥35 years, and registered in general practices in England. The study cohort was randomly divided into a training cohort (n=111 748, 80%) to develop the severity tool and a validation cohort (n=27 878). We developed baseline and longitudinal severity scores using 34 diabetes-related domains. Cox regression models (adjusted for age, gender, ethnicity, deprivation, and HbA1c) were used for primary (all-cause mortality) and secondary (hospitalization due to any cause, diabetes, hypoglycemia, or cardiovascular disease or procedures) outcomes. Likelihood ratio (LR) tests were fitted to assess the significance of adding DISSCO to the sociodemographics and HbA1c models. RESULTS: A total of 139 626 patients registered in 400 general practices, aged 63±12 years were included, 45% of whom were women, 83% were White, and 18% were from deprived areas. The mean baseline severity score was 1.3±2.0. Overall, 27 362 (20%) people died and 99 951 (72%) had ≥1 hospitalization. In the training cohort, a one-unit increase in baseline DISSCO was associated with higher hazard of mortality (HR: 1.14, 95% CI 1.13 to 1.15, area under the receiver operating characteristics curve (AUROC)=0.76) and cardiovascular hospitalization (HR: 1.45, 95% CI 1.43 to 1.46, AUROC=0.73). The LR tests showed that adding DISSCO to sociodemographic variables significantly improved the predictive value of survival models, outperforming the added value of HbA1c for all outcomes. Findings were consistent in the validation cohort. CONCLUSIONS: Higher levels of DISSCO are associated with higher risks for hospital admissions and mortality. The new severity score had higher predictive value than the proxy used in clinical practice, HbA1c. This reproducible algorithm can help practitioners stratify clinical care of patients with type 2 diabetes.
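
The likelihood ratio test mentioned in this abstract compares two nested models by their maximized log-likelihoods. The sketch below shows the general calculation under the assumption that the extra covariate (here, a severity score) enters the model as a single coefficient, so the statistic is referred to a chi-squared distribution with one degree of freedom. The function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math


def lr_test_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    loglik_reduced: maximized log-likelihood without the extra covariate.
    loglik_full:    maximized log-likelihood with the extra covariate.
    Returns (statistic, p_value) using a chi-squared reference with 1 df.
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    if statistic < 0:
        raise ValueError("the full model cannot fit worse than the reduced model")
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = lr_test_one_df(loglik_reduced=-1000.0, loglik_full=-998.0)
    print(f"LR statistic = {stat:.2f}, p = {p:.4f}")
```

For models that differ by more than one parameter, the same statistic would be compared with a chi-squared distribution whose degrees of freedom equal the number of added parameters, which needs a general chi-squared routine rather than the one-line formula used here.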

    Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress

    Objective: To perform a review of recent research in clinical data reuse or secondary use, and envision future advances in this field. Methods: The review is based on a large literature search in MEDLINE (through PubMed), conference proceedings, and the ACM Digital Library, focusing only on research published between 2005 and early 2016. Each selected publication was reviewed by the authors, and a structured analysis and summarization of its content was developed. Results: The initial search produced 359 publications, reduced after a manual examination of abstracts and full publications. The following aspects of clinical data reuse are discussed: motivations and challenges, privacy and ethical concerns, data integration and interoperability, data models and terminologies, unstructured data reuse, structured data mining, clinical practice and research integration, and examples of clinical data reuse (quality measurement and learning healthcare systems). Conclusion: Reuse of clinical data is a fast-growing field recognized as essential to realize the potentials for high quality healthcare, improved healthcare management, reduced healthcare costs, population health management, and effective clinical research

    EHRtemporalVariability: delineating temporal data-set shifts in electronic health records

    Background: Temporal variability in health-care processes or protocols is intrinsic to medicine. Such variability can potentially introduce dataset shifts, a data quality issue when reusing electronic health records (EHRs) for secondary purposes. Temporal data-set shifts can present as trends, as well as abrupt or seasonal changes in the statistical distributions of data over time. The latter are particularly complicated to address in multimodal and highly coded data. These changes, if not delineated, can harm population and data-driven research, such as machine learning. Given that biomedical research repositories are increasingly being populated with large sets of historical data from EHRs, there is a need for specific software methods to help delineate temporal data-set shifts to ensure reliable data reuse. Results: EHRtemporalVariability is an open-source R package and Shiny app designed to explore and identify temporal data-set shifts. EHRtemporalVariability estimates the statistical distributions of coded and numerical data over time; projects their temporal evolution through non-parametric information geometric temporal plots; and enables the exploration of changes in variables through data temporal heat maps. We demonstrate the capability of EHRtemporalVariability to delineate data-set shifts in three impact case studies, one of which is available for reproducibility. Conclusions: EHRtemporalVariability enables the exploration and identification of data-set shifts, contributing to the broad examination and repurposing of large, longitudinal data sets. Our goal is to help ensure reliable data reuse for a wide range of biomedical data users.
EHRtemporalVariability is designed for technical users who are programmatically utilizing the R package, as well as users who are not familiar with programming via the Shiny user interface. This work was supported by Universitat Politecnica de Valencia grant PAID-00-17, Generalitat Valenciana grant BEST/2018, and projects H2020-SC1-2016-CNECT No. 727560 and H2020-SC1-BHC-2018-2020 No. 825750. Sáez Silvestre, C.; Gutiérrez-Sacristán, A.; Kohane, I.; Garcia-Gomez, JM.; Avillach, P. (2020). EHRtemporalVariability: delineating temporal data-set shifts in electronic health records. GigaScience. 9(8):1-7. https://doi.org/10.1093/gigascience/giaa079
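
The idea of estimating the distribution of a coded variable per time period and then measuring how far consecutive periods drift apart can be sketched in a few lines. The example below is written in Python and is not the API of the EHRtemporalVariability R package. The monthly batching, the function names, and the use of the Jensen-Shannon distance as the dissimilarity measure are all assumptions chosen for illustration, since the abstract does not name the measure.

```python
import math
from collections import Counter, defaultdict


def monthly_distributions(records):
    """Map each month to the relative frequency of every code seen in it.

    records: iterable of (month, code) pairs, e.g. ("2020-01", "A").
    """
    counts = defaultdict(Counter)
    for month, code in records:
        counts[month][code] += 1
    dists = {}
    for month, counter in counts.items():
        total = sum(counter.values())
        dists[month] = {code: n / total for code, n in counter.items()}
    return dists


def js_distance(p, q):
    """Jensen-Shannon distance (base 2, range 0 to 1) between two distributions."""
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence of a from the mixture; zero terms are skipped.
        return sum(a[k] * math.log2(a[k] / mix[k]) for k in a if a[k] > 0)

    divergence = 0.5 * kl(p) + 0.5 * kl(q)
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding


def consecutive_shifts(records):
    """Return (earlier_month, later_month, distance) for each adjacent pair of months."""
    dists = monthly_distributions(records)
    months = sorted(dists)
    return [(a, b, js_distance(dists[a], dists[b])) for a, b in zip(months, months[1:])]


if __name__ == "__main__":
    # Toy data: the coding changes abruptly in the third month.
    toy = [("2020-01", "A"), ("2020-01", "A"), ("2020-01", "B"),
           ("2020-02", "A"), ("2020-02", "A"), ("2020-02", "B"),
           ("2020-03", "C"), ("2020-03", "C"), ("2020-03", "C")]
    for earlier, later, dist in consecutive_shifts(toy):
        print(earlier, later, round(dist, 3))
```

A distance near 0 between adjacent months suggests a stable distribution, while a value near 1 flags an abrupt change such as a switch in coding system. Trends and seasonal patterns would need comparisons across all pairs of months rather than only adjacent ones.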

    Savana: Re-using Electronic Health Records with Artificial Intelligence

    Health information grows exponentially (doubling every 5 years), generating a sort of inflation of science, i.e. more knowledge than we can leverage. In this unprecedented data-driven shift, doctors no longer have time to keep up to date. This explains why only one in every five medical decisions is based strictly on evidence, which inevitably leads to variability. A good solution lies in clinical decision support systems based on big data analysis. As the processing of large amounts of information gains relevance, automatic approaches become increasingly capable of seeing and correlating information further and better than the human mind can. In this context, healthcare professionals are increasingly counting on a new set of tools to deal with the growing volume of information that becomes available to them daily. By allowing the grouping of collective knowledge and prioritizing “mindlines” against “guidelines”, these support systems are among the most promising applications of big data in health. In this demo paper we introduce Savana, an AI-enabled system based on Natural Language Processing (NLP) and Neural Networks, capable of, for instance, the automatic expansion of medical terminologies, thus enabling the re-use of information expressed in natural language in clinical reports. This automated and precise digital extraction allows the generation of a real-time information engine, which is currently being deployed in healthcare institutions, as well as in clinical research and management.

    A modular multipurpose, parameter centered electronic health record architecture

    Health Information Technology is playing a key role in healthcare. Specifically, the use of electronic health records has been found to bring about the most significant improvements in healthcare quality, mainly in patient management, healthcare delivery and research support. The adoption of health record systems has been promoted in many countries to support efficient, high-quality integrated healthcare. The objective of this work is the implementation of an Electronic Health Record system based on a relational database. The system architecture is modular and concentrates the parameters related to a specific pathology in one module, so the system can be easily applied to different pathologies. Several examples of its application are described. The system is intended to be extended to integrate genomic data.