9 research outputs found

    Application of Clinical Concept Embeddings for Heart Failure Prediction in UK EHR data

    Electronic health records (EHR) are increasingly being used for constructing disease risk prediction models. Feature engineering in EHR data, however, is challenging due to their high-dimensional and heterogeneous nature. Low-dimensional representations of EHR data can potentially mitigate these challenges. In this paper, we use global vectors (GloVe) to learn word embeddings for diagnoses and procedures recorded using 13 million ontology terms across 2.7 million hospitalisations in national UK EHR. We demonstrate the utility of these embeddings by evaluating their performance in identifying patients who are at higher risk of being hospitalised for congestive heart failure. Our findings indicate that embeddings can enable the creation of robust EHR-derived disease risk prediction models and address some of the limitations associated with manual clinical feature engineering. Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.0721

    Identifying priorities in methodological research using ICD-9-CM and ICD-10 administrative data: report from an international consortium

    BACKGROUND: Health administrative data are frequently used for health services and population health research. Comparative research using these data has been facilitated by the use of a standard system for coding diagnoses, the International Classification of Diseases (ICD). Research using the data must deal with data quality and validity limitations which arise because the data are not created for research purposes. This paper presents a list of high-priority methodological areas for researchers using health administrative data. METHODS: A group of researchers and users of health administrative data from Canada, the United States, Switzerland, Australia, China and the United Kingdom came together in June 2005 in Banff, Canada to discuss and identify high-priority methodological research areas. The generation of ideas for research focussed not only on matters relating to the use of administrative data in health services and population health research, but also on the challenges created in transitioning from ICD-9 to ICD-10. After the brain-storming session, voting took place to rank-order the suggested projects. Participants were asked to rate the importance of each project from 1 (low priority) to 10 (high priority). Average ranks were computed to prioritise the projects. RESULTS: Thirteen potential areas of research were identified, some of which represented preparatory work rather than research per se. The three most highly ranked priorities were the documentation of data fields in each country's hospital administrative data (average score 8.4), the translation of patient safety indicators from ICD-9 to ICD-10 (average score 8.0), and the development and validation of algorithms to verify the logic and internal consistency of coding in hospital abstract data (average score 7.0). CONCLUSION: The group discussions resulted in a list of expert views on critical international priorities for future methodological research relating to health administrative data. 
The consortium's members welcome contacts from investigators involved in research using health administrative data, especially in cross-jurisdictional collaborative studies or in studies that illustrate the application of ICD-10.

    Vaccine semantics: Automatic methods for recognizing, representing, and reasoning about vaccine-related information

    Post-marketing management and decision-making about vaccines build on the early detection of safety concerns and changes in public sentiment, accurate access to established evidence, and the ability to promptly quantify effects and verify hypotheses about vaccine benefits and risks. A variety of resources provide relevant information, but they use different representations, which makes rapid evidence generation and extraction challenging. This thesis presents automatic methods for interpreting heterogeneously represented vaccine information. Part I evaluates social media messages for monitoring vaccine adverse events and public sentiment, using automatic methods for information recognition. Parts II and III develop and evaluate automatic methods and res

    An experimental study and evaluation of a new architecture for clinical decision support - integrating the openEHR specifications for the Electronic Health Record with Bayesian Networks

    Healthcare informatics still lacks wide-scale adoption of intelligent decision support methods, despite continuous increases in computing power and methodological advances in scalable computation and machine learning over recent decades. The potential has long been recognised, as evidenced in the literature of the domain, which is extensively reviewed. The thesis identifies and explores key barriers to adoption of clinical decision support, through computational experiments encompassing a number of technical platforms. Building on previous research, it implements and tests a novel platform architecture capable of processing and reasoning with clinical data. The key components of this platform are the now widely implemented openEHR electronic health record specifications and Bayesian Belief Networks. Substantial software implementations are used to explore the integration of these components, guided and supplemented by input from clinician experts and using clinical data models derived in hospital settings at Moorfields Eye Hospital. Data quality and quantity issues are highlighted. Insights thus gained are used to design and build a novel graph-based representation and processing model for the clinical data, based on the openEHR specifications. The approach can be implemented using diverse modern database and platform technologies. Computational experiments with the platform, using data from two clinical domains – a preliminary study with published thyroid metabolism data and a substantial study of cataract surgery – explore fundamental barriers that must be overcome when developing intelligent healthcare systems for clinical settings. These have often been neglected, or misunderstood as implementation procedures of secondary importance. The results confirm that the methods developed have the potential to overcome a number of these barriers. 
The findings lead to proposals for improvements to the openEHR specifications, in the context of machine learning applications, and in particular for integrating them with Bayesian Networks. The thesis concludes with a roadmap for future research, building on progress and findings to date.