24 research outputs found

    A data-driven approach to identifying sub-types of adult asthma from primary care electronic health records

    Asthma is increasingly recognised as an umbrella term to describe a number of distinct clinical presentations (phenotypes) with underlying physiological mechanisms (endotypes). Numerous phenotypes and endotypes of asthma have been proposed, using both hypothesis-driven and, more recently, data-driven techniques. However, inconsistencies in findings mean that scientific and clinical consensus are yet to be reached and the pursuit of the true phenotypes and endotypes of asthma is ongoing. In the meantime, there is an unmet clinical need to identify practical subtypes of asthma (for the purposes of this thesis, the term subtype refers to any grouping of asthma based on patient characteristics). Such subtypes could facilitate a transition from the current one-size-fits-all approach towards personalised medicine. The aim of this thesis was to identify subtypes of asthma in adults from UK primary care electronic health records using data-driven methods, specifically focusing on an unsupervised machine learning technique called cluster analysis. To inform the application of cluster analysis in this thesis, I reviewed its application in 63 previous studies that derived asthma subtypes from multimodal clinical data. I found that the methods used were often poorly suited to mixed-type clinical data. In addition, I found that assessments of the stability and validity of findings were often inadequate or missing entirely. The clinical findings of studies in which such limitations are present should be interpreted with caution. The first step in the analysis reported in this thesis was the derivation of datasets from primary care electronic health records, to which data-driven methods could then be applied to identify subtypes. Two sources of primary care electronic health records were used: the Optimum Patient Care Research Database (OPCRD) and the Secure Anonymised Information Linkage (SAIL) Databank.
I aimed to be as transparent as possible when describing the dataset derivation process by reporting the results of exploratory data analysis and by making all analysis code publicly available upon completion. This was to facilitate critical appraisal of the process and replication and validation of the findings. To identify subtypes of asthma from the derived datasets, multiple correspondence analysis (MCA) and k-means cluster analysis were applied to a training set of data from 50,000 patients with asthma registered at a primary care practice in England in 2016 (sourced from OPCRD). A novel framework based on the performance of a random forest model to replicate the outputs of the k-means clustering algorithm was used to select the number of dimensions to retain from the MCA. A resampling framework was used to discard unstable cluster solutions, and the number of clusters was selected using average silhouette widths. These methods identified five subtypes of adult asthma that can be tentatively interpreted as follows: (1) low healthcare utilisation; (2) low-to-medium medication use; (3) metabolic comorbidity; (4) high medication use; (5) very high medication use. Finally, a random forest model was trained to replicate the cluster labels using the original features. This model achieved a balanced accuracy of 93% in an unseen dataset comprising 50,000 patients sampled from OPCRD at the same time-point. In the internal validation analysis (unseen OPCRD data from 2017 and 2018) the random forest approximated cluster labels derived at two timepoints with balanced accuracies of 92-93%, and in the external validation analysis (unseen SAIL data from 2016, 2017 and 2018) the balanced accuracies were 74-79%. The asthma subtype characteristics across the unseen data (both the out-of-sample OPCRD and SAIL) were consistent with those in the training data. 
The investigation of data-driven methods for identifying asthma subtypes presented in this thesis builds on the current evidence in two key areas. First, limitations in the application of methods in previous studies were identified, and a novel framework which mitigates these limitations was proposed. This framework could be extended to other disease areas as a means of exploring patient subgroups and ultimately facilitating precision medicine. Second, this is the first study to derive data-driven subtypes of adult asthma directly from primary care electronic health record data. The result is subtypes that have the potential to be directly translated to clinical practice in a UK primary care setting. This could facilitate asthma patient stratification towards developing more personalised monitoring and treatment regimens.
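
The abstract above selects the number of clusters by average silhouette width and scores the random forest by balanced accuracy. The following is a minimal sketch of those two metrics only, assuming Euclidean distance and toy data; the function names are hypothetical and this is not the thesis code, which the abstract says was made publicly available separately.

```python
from collections import defaultdict
from math import dist


def average_silhouette_width(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point itself contributes a distance of 0 to the sum)
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


def balanced_accuracy(true_labels, predicted_labels):
    """Mean of per-class recall, so small clusters count as much as large ones."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        correct[truth] += truth == guess
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data (assumed, not from the thesis): two well-separated clusters.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
toy_labels = [0, 0, 1, 1]
toy_asw = average_silhouette_width(toy_points, toy_labels)

# One of three class-0 items is mislabelled; class 1 is fully recovered.
toy_bal_acc = balanced_accuracy([0, 0, 0, 1], [0, 0, 1, 1])
```

Balanced accuracy is a natural choice when cluster sizes differ, because a classifier cannot score well simply by favouring the largest cluster.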

    A Data-Driven Typology of Asthma Medication Adherence using Cluster Analysis

    Asthma preventer medication non-adherence is strongly associated with poor asthma control. One-dimensional measures of adherence may ignore clinically important patterns of medication-taking behavior. We sought to construct a data-driven multi-dimensional typology of medication non-adherence in children with asthma. We analyzed data from an intervention study of electronic inhaler monitoring devices, comprising 211 patients yielding 35,161 person-days of data. Five adherence measures were extracted: the percentage of doses taken, the percentage of days on which zero doses were taken, the percentage of days on which both doses were taken, the number of treatment intermissions per 100 study days, and the duration of treatment intermissions per 100 study days. We applied principal component analysis to the measures and subsequently applied k-means to determine cluster membership. Decision trees identified the measure that could predict cluster assignment with the highest accuracy, increasing interpretability and clinical utility. We demonstrate the use of adherence measures towards a three-group categorization of medication non-adherence, which succinctly describes the diversity of patient medication-taking patterns in asthma. The percentage of prescribed doses taken during the study contributed to the prediction of cluster assignment most accurately (84% in out-of-sample data).
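
The five adherence measures listed above can be sketched from a per-day dose log. This is an illustration under stated assumptions: the abstract does not define a treatment intermission, so the `min_gap_days` threshold, the capping of daily doses at the prescribed number, and all names below are hypothetical.

```python
def adherence_measures(daily_doses, prescribed_per_day=2, min_gap_days=3):
    """Summarise one patient's daily dose counts into five adherence measures.

    min_gap_days is a hypothetical threshold: a run of at least this many
    consecutive zero-dose days is counted as one treatment intermission.
    """
    n_days = len(daily_doses)
    if n_days == 0:
        raise ValueError("need at least one study day")

    # Assumption: cap each day at the prescribed number of doses so that
    # extra actuations do not inflate the percentage of doses taken.
    taken = sum(min(d, prescribed_per_day) for d in daily_doses)
    zero_days = sum(1 for d in daily_doses if d == 0)
    full_days = sum(1 for d in daily_doses if d >= prescribed_per_day)

    # Collect the lengths of qualifying runs of consecutive zero-dose days.
    gaps, run = [], 0
    for d in daily_doses:
        if d == 0:
            run += 1
        else:
            if run >= min_gap_days:
                gaps.append(run)
            run = 0
    if run >= min_gap_days:  # a gap that lasts until the end of the study
        gaps.append(run)

    return {
        "pct_doses_taken": 100 * taken / (prescribed_per_day * n_days),
        "pct_zero_dose_days": 100 * zero_days / n_days,
        "pct_full_dose_days": 100 * full_days / n_days,
        "intermissions_per_100_days": 100 * len(gaps) / n_days,
        "intermission_days_per_100_days": 100 * sum(gaps) / n_days,
    }


# Ten study days for one made-up patient on a twice-daily regimen.
example = adherence_measures([2, 2, 1, 0, 0, 0, 2, 0, 2, 2])
```

In this example the single zero-dose day is not counted as an intermission, which shows how the percentage of zero-dose days and the intermission measures can diverge for the same patient.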

    Defining clinical subtypes of adult asthma using electronic health records : analysis of a large UK primary care database with external validation

    Acknowledgments: EMFH was supported by a Medical Research Council PhD Studentship (eHERC/Farr). This work is carried out with the support of the Asthma UK Centre for Applied Research [AUKAC-2012-01] and Health Data Research UK which receives its funding from HDR UK Ltd (HDR-5012) funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation and the Wellcome Trust. The funders had no role in the study and the decision to submit this work to be considered for publication. This Project is based in part/wholly on Data from the Optimum Patient Care Research Database (opcrd.co.uk) obtained under licence from Optimum Patient Care Limited and its execution is approved by recognised experts affiliated to the Respiratory Effectiveness Group. However, the interpretation and conclusion contained in this report are those of the author/s alone. This study makes use of anonymised data held in the Secure Anonymised Information Linkage (SAIL) Databank. We would like to acknowledge all the data providers who make anonymised data available for research. SAIL is not responsible for the interpretation of these data. Peer reviewed.

    Pilot trials in physical activity journals: a review of reporting and editorial policy

    Background: Since the early 2000s, a number of publications in the medical literature have highlighted inadequacies in the design, conduct and reporting of pilot trials. This work led to two notable publications in 2016: a conceptual framework for defining feasibility studies and an extension to the CONSORT 2010 statement to include pilot trials. It was hoped that these publications would educate researchers, leading to better use of pilot trials and thus more rigorously planned and informed randomised controlled trials. The aim of the present work is to evaluate the impact of these publications in the field of physical activity by reviewing the literature pre- and post-2016. This first article presents the pre-2016 review of the reporting and the current editorial policy applied to pilot trials published in physical activity journals. Methods: Fourteen physical activity journals were screened for pilot and feasibility studies published between 2012 and 2015. The CONSORT 2010 extension to pilot and feasibility studies was used as a framework to assess the reporting quality of the studies. Editors of the eligible physical activity journals were canvassed regarding their editorial policy for pilot and feasibility studies. Results: Thirty-one articles across five journals met the eligibility criteria. These articles fell into three distinct categories: trials that were carried out in preparation for a future definitive trial (23%), trials that evaluated the feasibility of a novel intervention but did not explicitly address a future definitive trial (23%) and trials that did not have any clear objectives to address feasibility (55%). Editors from all five journals stated that they generally do not accept pilot trials, and none gave reference to the CONSORT 2010 extension as a guideline for submissions. 
Conclusion: The finding that over half of the studies did not have feasibility objectives is in line with previous research, demonstrating that earlier findings are not being disseminated effectively to researchers in the field of physical activity. The low standard of reporting across most reviewed articles and the neglect of the extended CONSORT 2010 statement by the journal editors highlight the need to actively disseminate these guidelines to ensure their impact.

    Comparative effectiveness of BNT162b2 versus mRNA-1273 covid-19 vaccine boosting in England: matched cohort study in OpenSAFELY-TPP.

    OBJECTIVE: To compare the effectiveness of the BNT162b2 mRNA (Pfizer-BioNTech) and mRNA-1273 (Moderna) covid-19 vaccines during the booster programme in England. DESIGN: Matched cohort study, emulating a comparative effectiveness trial. SETTING: Linked primary care, hospital, and covid-19 surveillance records available within the OpenSAFELY-TPP research platform, covering a period when the SARS-CoV-2 delta and omicron variants were dominant. PARTICIPANTS: 3 237 918 adults who received a booster dose of either vaccine between 29 October 2021 and 25 February 2022 as part of the national booster programme in England and who received a primary course of BNT162b2 or ChAdOx1. INTERVENTION: Vaccination with either BNT162b2 or mRNA-1273 as a booster vaccine dose. MAIN OUTCOME MEASURES: Recorded SARS-CoV-2 positive test, covid-19 related hospital admission, covid-19 related death, and non-covid-19 related death at 20 weeks after receipt of the booster dose. RESULTS: 1 618 959 people were matched in each vaccine group, contributing a total 64 546 391 person weeks of follow-up. The 20 week risks per 1000 for a positive SARS-CoV-2 test were 164.2 (95% confidence interval 163.3 to 165.1) for BNT162b2 and 159.9 (159.0 to 160.8) for mRNA-1273; the hazard ratio comparing mRNA-1273 with BNT162b2 was 0.95 (95% confidence interval 0.95 to 0.96). The 20 week risks per 1000 for hospital admission with covid-19 were 0.75 (0.71 to 0.79) for BNT162b2 and 0.65 (0.61 to 0.69) for mRNA-1273; the hazard ratio was 0.89 (0.82 to 0.95). Covid-19 related deaths were rare: the 20 week risks per 1000 were 0.028 (0.021 to 0.037) for BNT162b2 and 0.024 (0.018 to 0.033) for mRNA-1273; hazard ratio 0.83 (0.58 to 1.19). Comparative effectiveness was generally similar within subgroups defined by the primary course vaccine brand, age, previous SARS-CoV-2 infection, and clinical vulnerability. Relative benefit was similar when vaccines were compared separately in the delta and omicron variant eras. 
CONCLUSIONS: This matched observational study of adults estimated a modest benefit of booster vaccination with mRNA-1273 compared with BNT162b2 in preventing positive SARS-CoV-2 tests and hospital admission with covid-19 20 weeks after vaccination, during a period of delta followed by omicron variant dominance.
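
To make the "modest benefit" concrete, the reported 20 week risks per 1000 can be turned into absolute risk differences. This is only arithmetic on the point estimates quoted in the abstract, not a reproduction of the study's analysis; the dictionary layout and function name are assumptions, and confidence intervals are not recomputed.

```python
# Reported 20 week risks per 1000, as (BNT162b2, mRNA-1273), from the abstract.
reported_risks = {
    "positive SARS-CoV-2 test": (164.2, 159.9),
    "covid-19 hospital admission": (0.75, 0.65),
    "covid-19 related death": (0.028, 0.024),
}


def risk_difference_per_1000(risk_bnt162b2, risk_mrna1273):
    """Absolute risk difference per 1000, mRNA-1273 minus BNT162b2."""
    # Rounded to 3 decimals to hide floating point noise in the subtraction.
    return round(risk_mrna1273 - risk_bnt162b2, 3)


differences = {
    outcome: risk_difference_per_1000(bnt, mrna)
    for outcome, (bnt, mrna) in reported_risks.items()
}

for outcome, diff in differences.items():
    print(f"{outcome}: {diff:+} per 1000 at 20 weeks")
```

Negative values favour mRNA-1273. The difference for covid-19 related death is very small in absolute terms, in keeping with the wide hazard ratio interval reported for that outcome.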

    Factors associated with COVID-19 vaccine uptake in people with kidney disease: an OpenSAFELY cohort study.

    OBJECTIVE: To characterise factors associated with COVID-19 vaccine uptake among people with kidney disease in England. DESIGN: Retrospective cohort study using the OpenSAFELY-TPP platform, performed with the approval of NHS England. SETTING: Individual-level routine clinical data from 24 million people across GPs in England using TPP software. Primary care data were linked directly with COVID-19 vaccine records up to 31 August 2022 and with renal replacement therapy (RRT) status via the UK Renal Registry (UKRR). PARTICIPANTS: A cohort of adults with stage 3-5 chronic kidney disease (CKD) or receiving RRT at the start of the COVID-19 vaccine roll-out was identified based on evidence of reduced estimated glomerular filtration rate (eGFR) or inclusion in the UKRR. MAIN OUTCOME MEASURES: Dose-specific vaccine coverage over time was determined from 1 December 2020 to 31 August 2022. Individual-level factors associated with receipt of a 3-dose or 4-dose vaccine series were explored via Cox proportional hazards models. RESULTS: 992 205 people with stage 3-5 CKD or receiving RRT were included. Cumulative vaccine coverage as of 31 August 2022 was 97.5%, 97.0% and 93.9% for doses 1, 2 and 3, respectively, and 81.9% for dose 4 among individuals with one or more indications for eligibility. Delayed 3-dose vaccine uptake was associated with younger age, minority ethnicity, social deprivation and severe mental illness; these associations were consistent across CKD severity subgroups, dialysis patients and kidney transplant recipients. Similar associations were observed for 4-dose uptake. CONCLUSION: Although high primary vaccine and booster dose coverage has been achieved among people with kidney disease in England, key disparities in vaccine uptake remain across clinical and demographic groups and 4-dose coverage is suboptimal. Targeted interventions are needed to identify barriers to vaccine uptake among under-vaccinated subgroups identified in the present study.

    COVID-19 trajectories among 57 million adults in England: a cohort study using electronic health records

    BACKGROUND: Updatable estimates of COVID-19 onset, progression, and trajectories underpin pandemic mitigation efforts. To identify and characterise disease trajectories, we aimed to define and validate ten COVID-19 phenotypes from nationwide linked electronic health records (EHR) using an extensible framework. METHODS: In this cohort study, we used eight linked National Health Service (NHS) datasets for people in England alive on Jan 23, 2020. Data on COVID-19 testing, vaccination, primary and secondary care records, and death registrations were collected until Nov 30, 2021. We defined ten COVID-19 phenotypes reflecting clinically relevant stages of disease severity and encompassing five categories: positive SARS-CoV-2 test, primary care diagnosis, hospital admission, ventilation modality (four phenotypes), and death (three phenotypes). We constructed patient trajectories illustrating transition frequency and duration between phenotypes. Analyses were stratified by pandemic waves and vaccination status. FINDINGS: Among 57 032 174 individuals included in the cohort, 13 990 423 COVID-19 events were identified in 7 244 925 individuals, equating to an infection rate of 12·7% during the study period. Of 7 244 925 individuals, 460 737 (6·4%) were admitted to hospital and 158 020 (2·2%) died. Of 460 737 individuals who were admitted to hospital, 48 847 (10·6%) were admitted to the intensive care unit (ICU), 69 090 (15·0%) received non-invasive ventilation, and 25 928 (5·6%) received invasive ventilation. Among 384 135 patients who were admitted to hospital but did not require ventilation, mortality was higher in wave 1 (23 485 [30·4%] of 77 202 patients) than wave 2 (44 220 [23·1%] of 191 528 patients), but remained unchanged for patients admitted to the ICU. Mortality was highest among patients who received ventilatory support outside of the ICU in wave 1 (2569 [50·7%] of 5063 patients). 
15 486 (9·8%) of 158 020 COVID-19-related deaths occurred within 28 days of the first COVID-19 event without a COVID-19 diagnosis on the death certificate. 10 884 (6·9%) of 158 020 deaths were identified exclusively from mortality data with no previous COVID-19 phenotype recorded. We observed longer patient trajectories in wave 2 than wave 1. INTERPRETATION: Our analyses illustrate the wide spectrum of disease trajectories as shown by differences in incidence, survival, and clinical pathways. We have provided a modular analytical framework that can be used to monitor the impact of the pandemic and generate evidence of clinical and policy relevance using multiple EHR sources. FUNDING: British Heart Foundation Data Science Centre, led by Health Data Research UK.
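
The patient trajectories described above summarise how often, and how quickly, individuals move between phenotypes. A minimal sketch of that bookkeeping follows, assuming each patient has a list of (day, phenotype) events; the phenotype labels, patient identifiers, and function name are hypothetical and this is not the study's framework.

```python
from collections import defaultdict


def summarise_transitions(trajectories):
    """Count transitions between consecutive phenotypes and average their duration.

    trajectories maps a patient id to a list of (day, phenotype) events.
    Returns {(from_phenotype, to_phenotype): (count, mean_days)}.
    """
    counts = defaultdict(int)
    total_days = defaultdict(int)
    for events in trajectories.values():
        # Sort by day; events on the same day fall back to alphabetical order.
        ordered = sorted(events)
        for (day_a, state_a), (day_b, state_b) in zip(ordered, ordered[1:]):
            key = (state_a, state_b)
            counts[key] += 1
            total_days[key] += day_b - day_a
    return {key: (counts[key], total_days[key] / counts[key]) for key in counts}


# Three made-up patients; p3's events are deliberately out of order.
toy_trajectories = {
    "p1": [(0, "positive_test"), (4, "hospital_admission"), (10, "death")],
    "p2": [(0, "positive_test"), (6, "hospital_admission")],
    "p3": [(3, "hospital_admission"), (1, "positive_test")],
}
transition_summary = summarise_transitions(toy_trajectories)
```

Stratifying by pandemic wave or vaccination status, as the abstract describes, would amount to calling the same function on the corresponding subset of patients.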

    opensafely/ckd-coverage-ve

    This is the code and configuration for OpenSAFELY. The study characterises factors associated with COVID-19 vaccine uptake among people with kidney disease in England