2,825 research outputs found

    The Laboratory-Based Intermountain Validated Exacerbation (LIVE) Score Identifies Chronic Obstructive Pulmonary Disease Patients at High Mortality Risk.

    Background: Identifying COPD patients at high risk for mortality or healthcare utilization remains a challenge. A robust system for identifying high-risk COPD patients using Electronic Health Record (EHR) data would empower targeting interventions aimed at ensuring guideline compliance and multimorbidity management. The purpose of this study was to empirically derive, validate, and characterize subgroups of COPD patients based on routinely collected clinical data widely available within the EHR. Methods: Cluster analysis was used in 5,006 patients with COPD at Intermountain to identify clusters based on a large collection of clinical variables. Recursive Partitioning (RP) was then used to determine a preferred tree that assigned patients to clusters based on a parsimonious variable subset. The mortality, COPD exacerbations, and comorbidity profile of the identified groups were examined. The findings were validated in an independent Intermountain cohort and in external cohorts from the United States Veterans Affairs (VA) and University of Chicago Medicine systems. Measurements and Main Results: The RP algorithm identified five LIVE Scores based on laboratory values: albumin, creatinine, chloride, potassium, and hemoglobin. The groups were characterized by increasing risk of mortality. The lowest-risk group, LIVE Score 5, had 8% 4-year mortality vs. 56% in the highest-risk group, LIVE Score 1 (p < 0.001). These findings were validated in the VA cohort (n = 83,134), an expanded Intermountain cohort (n = 48,871), and the University of Chicago system (n = 3,236). Higher mortality groups also had higher COPD exacerbation rates and comorbidity rates. Conclusions: In large clinical datasets across different organizations, the LIVE Score utilizes existing laboratory data for COPD patients, and may be used to stratify risk for mortality and COPD exacerbations.

    COPD phenotypes and machine learning cluster analysis : A systematic review and future research agenda

    Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Peer reviewed. Postprint.

    Efficient Replication of Over 180 Genetic Associations with Self-Reported Medical Data

    While the cost and speed of generating genomic data have come down dramatically in recent years, the slow pace of collecting medical data for large cohorts continues to hamper genetic research. Here we evaluate a novel online framework for amassing large amounts of medical information in a recontactable cohort by assessing our ability to replicate genetic associations using these data. Using web-based questionnaires, we gathered self-reported data on 50 medical phenotypes from a generally unselected cohort of over 20,000 genotyped individuals. Of a list of genetic associations curated by NHGRI, we successfully replicated about 75% of the associations that we expected to (based on the number of cases in our cohort and reported odds ratios, and excluding a set of associations with contradictory published evidence). Altogether we replicated over 180 previously reported associations, including many for type 2 diabetes, prostate cancer, cholesterol levels, and multiple sclerosis. We found significant variation across categories of conditions in the percentage of expected associations that we were able to replicate, which may reflect systematic inflation of the effects in some initial reports, or differences across diseases in the likelihood of misdiagnosis or misreport. We also demonstrated that we could improve replication success by taking advantage of our recontactable cohort, offering more in-depth questions to refine self-reported diagnoses. Our data suggest that online collection of self-reported data in a recontactable cohort may be a viable method for both broad and deep phenotyping in large populations.

    Exploring the relationship between age and health conditions using electronic health records: from single diseases to multimorbidities

    Background Two enormous challenges facing healthcare systems are ageing and multimorbidity. Clinicians, policymakers, healthcare providers and researchers need to know “who gets which diseases when” in order to effectively prevent, detect and manage multiple conditions. Identification of ageing-related diseases (ARDs) is a starting point for research into common biological pathways in ageing. Examining multimorbidity clusters can facilitate a shift from the single-disease paradigm that pervades medical research and practice to models which reflect the reality of the patient population. Aim To examine how age influences an individual’s likelihood of developing single and multiple health conditions over the lifecourse. Methods and Outputs I used primary care and hospital admission electronic health records (EHRs) of 3,872,451 individuals from the Clinical Practice Research Datalink (CPRD) linked to the Hospital Episode Statistics admitted patient care (HES-APC) dataset in England from 1 April 2010 to 31 March 2015. In collaboration with Professor Aroon Hingorani, Dr Osman Bhatti, Dr Shanaz Husain, Dr Shailen Sutaria, Professor Dorothea Nitsch, Mrs Melanie Hingorani, Dr Constantinos Parisinos, Dr Tom Lumbers and Dr Reecha Sofat, I derived the case definitions for 308 clinically important health conditions, by harmonising Read, ICD-10 and OPCS-4 codes across primary and secondary care records in England. I calculated the age-specific incidence rate, period prevalence and median age at first recorded diagnosis for these conditions and described the 50 most common diseases in each decade of life. I developed a protocol for identifying ARDs using machine-learning and actuarial techniques. Finally, I identified highly correlated multimorbidity clusters and created a tool to visualise comorbidity clusters using a network approach. 
Conclusions I have developed case definitions (with a panel of clinicians) and calculated disease frequency estimates for 308 clinically important health conditions in the NHS in England. I have described patterns of ageing and multimorbidity using these case definitions, and produced an online app for interrogating comorbidities for an index condition. This work facilitates future research into ageing pathways and multimorbidity.

    Evaluating the role of COPD in patients with heart failure using multiple electronic health data sources

    Heart failure (HF) and COPD frequently co-exist. Shared symptoms and risk factors make diagnosis and management difficult and current understanding of the relationship between the diseases is limited. I used several electronic healthcare record (EHR) data sources, from the United States (US) and the United Kingdom (UK) to evaluate the impact of COPD on outcomes in patients with HF. First, I aimed to demonstrate that comorbidity data from EHR can be used to derive meaningful clusters in patients with chronic HF, expecting COPD to be a main driver of this phenotyping endeavour. Second, I compared outcomes (hospitalisation, mortality, healthcare utilisation) in patients with COPD-HF, between left ventricular ejection fraction (LVEF) groups. Third, I pooled data from previously published studies to assess the overall effect of HF management (beta-blockers) on outcomes in COPD. In a fourth study I examined whether COPD was associated with in-hospital mortality and management of patients hospitalised for HF and assessed association with LVEF. Lastly, I investigated whether COPD affected readmission in a population of patients hospitalised for HF. This work provides evidence to suggest that while COPD may not play a major role in determining a HF classification system based on comorbidities only, it affects clinical outcomes in the long-term, particularly for chronic HFpEF patients. Conversely, HF management such as beta-blockers does not appear to worsen outcomes in COPD patients. In the acute setting, coexisting COPD is independently associated with increased in-hospital mortality and decreased HF medication prescription and access to healthcare services amongst patients who survived their first HF admission. Readmission risk is higher amongst those with HF and COPD compared with HF-alone, though the most frequent reason for returning to hospital is still due to a cardiovascular cause. Open Access

    Evaluation of data processing pipelines on real-world electronic health records data for the purpose of measuring patient similarity

    BACKGROUND: The ever-growing size, breadth, and availability of patient data allows for a wide variety of clinical features to serve as inputs for phenotype discovery using cluster analysis. Data of mixed types in particular are not straightforward to combine into a single feature vector, and techniques used to address this can be biased towards certain data types in ways that are not immediately obvious or intended. In this context, the process of constructing clinically meaningful patient representations from complex datasets has not been systematically evaluated. AIMS: Our aim was to a) outline and b) implement an analytical framework to evaluate distinct methods of constructing patient representations from routine electronic health record data for the purpose of measuring patient similarity. We applied the analysis on a patient cohort diagnosed with chronic obstructive pulmonary disease. METHODS: Using data from the CALIBER data resource, we extracted clinically relevant features for a cohort of patients diagnosed with chronic obstructive pulmonary disease. We used four different data processing pipelines to construct lower dimensional patient representations from which we calculated patient similarity scores. We described the resulting representations, ranked the influence of each individual feature on patient similarity and evaluated the effect of different pipelines on clustering outcomes. Experts evaluated the resulting representations by rating the clinical relevance of similar patient suggestions with regard to a reference patient. RESULTS: Each of the four pipelines resulted in similarity scores primarily driven by a unique set of features. It was demonstrated that data transformations according to each pipeline prior to clustering can result in a variation of clustering results of over 40%. The most appropriate pipeline was selected on the basis of feature ranking and clinical expertise. 
There was moderate agreement between clinicians as measured by Cohen's kappa coefficient. CONCLUSIONS: Data transformation has downstream and unforeseen consequences in cluster analysis. Rather than viewing this process as a black box, we have shown ways to quantitatively and qualitatively evaluate and select the appropriate preprocessing pipeline.