
    Deriving research-quality phenotypes from national electronic health records to advance precision medicine: a UK Biobank case-study

    High-throughput genotyping and the increased availability of electronic health records (EHR) are giving scientists an unprecedented opportunity to exploit routinely generated clinical data to advance precision medicine. The extent to which national structured EHR in the United Kingdom can be utilized in genome-wide association studies (GWAS) has not been systematically examined. In this study, we evaluate the performance of an EHR-derived acute myocardial infarction (AMI) phenotype for performing GWAS in the UK Biobank.

    A novel metadata management model to capture consent for record linkage in longitudinal health research studies

    Background: Informed consent is an important feature of longitudinal research studies, as it enables linking participants' administrative data to data collected at baseline, such as survey answers. The lack of standardised models to capture consent elements can lead to substantial challenges, and a structured approach to capturing consent-related metadata can address these. Objectives: The aims were to: a) explore the state of the art for recording consent; b) identify key elements of consent required for record linkage; and c) create and evaluate a novel metadata management model to capture consent-related metadata. Methods: The main methodological components of our work were: a) a systematic literature review and qualitative analysis of consent forms; b) the development and evaluation of a novel metadata model. Discussion: We qualitatively analyzed 61 manuscripts and 30 consent forms, and extracted data elements related to obtaining consent for linkage. We created a novel metadata management model for consent and evaluated it by comparison with existing standards and by iteratively applying it to case studies. Conclusion: The developed model can facilitate the standardised recording of consent for linkage in longitudinal research studies and enable the linkage of external participant data. Furthermore, it can provide a structured way of recording consent-related metadata and facilitate the harmonization and streamlining of processes.
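A consent metadata model of this kind can be pictured as a structured record with explicit linkage permissions. The sketch below is a minimal illustration; the field names and the linkage-check rule are assumptions for the example, not the elements of the published model.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative consent-metadata record for record linkage.
# Field names are hypothetical, not taken from the published model.
@dataclass
class ConsentRecord:
    participant_id: str
    consent_date: str                  # ISO 8601 date consent was obtained
    consent_version: str               # version of the consent form signed
    linkage_permitted: bool            # whether linkage to external data is allowed
    permitted_sources: List[str] = field(default_factory=list)  # e.g. hospital, mortality registries
    withdrawal_date: Optional[str] = None  # set if consent was later withdrawn

    def allows_linkage_to(self, source: str) -> bool:
        """Linkage is allowed only if consent stands and covers the source."""
        return (self.linkage_permitted
                and self.withdrawal_date is None
                and source in self.permitted_sources)
```

Capturing withdrawal as an explicit field, rather than deleting the record, preserves an auditable history of the consent state.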

    White cell count in the normal range and short-term and long-term mortality: international comparisons of electronic health record cohorts in England and New Zealand

    OBJECTIVES: Electronic health records offer the opportunity to discover new clinical implications for established blood tests, but international comparisons have been lacking. We tested the association of total white cell count (WBC) with all-cause mortality in England and New Zealand. SETTING: Primary care practices in England (ClinicAl research using LInked Bespoke studies and Electronic health Records (CALIBER)) and New Zealand (PREDICT). DESIGN: Analysis of linked electronic health record data sets: CALIBER (primary care, hospitalisation, mortality and acute coronary syndrome registry) and PREDICT (cardiovascular risk assessments in primary care, hospitalisations, mortality, dispensed medication and laboratory results). PARTICIPANTS: People aged 30-75 years with no prior cardiovascular disease (CALIBER: N=686 475, 92.0% white; PREDICT: N=194 513, 53.5% European, 14.7% Pacific, 13.4% Maori), followed until death, transfer out of practice (in CALIBER) or study end. PRIMARY OUTCOME MEASURE: HRs for mortality were estimated using Cox models adjusted for age, sex, smoking, diabetes, systolic blood pressure, ethnicity and total:high-density lipoprotein (HDL) cholesterol ratio. RESULTS: We found 'J'-shaped associations between WBC and mortality; the second quintile was associated with lowest risk in both cohorts. High WBC within the reference range (8.65-10.05×10⁹/L) was associated with significantly increased mortality compared to the middle quintile (6.25-7.25×10⁹/L); adjusted HR 1.51 (95% CI 1.43 to 1.59) in CALIBER and 1.33 (95% CI 1.06 to 1.65) in PREDICT. WBC outside the reference range was associated with even greater mortality. The association was stronger over the first 6 months of follow-up, but similar across ethnic groups. CONCLUSIONS: Clinically recorded WBC within the range considered 'normal' is associated with mortality in ethnically different populations from two countries, particularly within the first 6 months. Large-scale international comparisons of electronic health record cohorts might yield new insights from widely performed clinical tests. TRIAL REGISTRATION NUMBER: NCT02014610.

    Accuracy of probabilistic record linkage applied to the Brazilian 100 million cohort project

    This paper presents current results obtained from our probabilistic record linkage methods, applied to integrating a 100-million-person cohort of socioeconomic data with health databases.
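Probabilistic record linkage of this kind is commonly built on Fellegi-Sunter match weights: each comparison field contributes a log-likelihood ratio for match versus non-match. The sketch below illustrates that general technique, not the project's actual implementation; the m/u probabilities and field names are assumptions for the example.

```python
import math

def match_weight(field_agrees: bool, m: float, u: float) -> float:
    """log2 likelihood ratio for one comparison field.
    m: P(agreement | records truly match); u: P(agreement | non-match)."""
    if field_agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))

def record_score(rec_a: dict, rec_b: dict, params: dict) -> float:
    """Sum per-field weights; higher scores indicate a likelier match.
    params maps field name -> (m, u), values here purely illustrative."""
    return sum(match_weight(rec_a[f] == rec_b[f], m, u)
               for f, (m, u) in params.items())

params = {"name": (0.95, 0.01), "dob": (0.98, 0.001)}
a = {"name": "MARIA", "dob": "1980-01-01"}
b = {"name": "MARIA", "dob": "1980-01-01"}
```

Pairs scoring above an upper threshold are accepted as links, below a lower threshold rejected, and those in between sent for clerical review.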

    Machine learning models in electronic health records can outperform conventional survival models for predicting patient mortality in coronary artery disease

    Prognostic modelling is important in clinical practice and epidemiology for patient management and research. Electronic health records (EHR) provide large quantities of data for such models, but conventional epidemiological approaches require significant researcher time to implement. Expert selection of variables, fine-tuning of variable transformations and interactions, and imputation of missing values are time-consuming and can bias subsequent analysis, particularly given that missingness in EHR is high and may carry meaning. Using a cohort of 80,000 patients from the CALIBER programme, we compared traditional modelling and machine-learning approaches in EHR. First, we used Cox models and random survival forests, with and without imputation, on 27 expert-selected, preprocessed variables to predict all-cause mortality. We then used Cox models, random forests and elastic net regression on an extended dataset with 586 variables to build prognostic models and identify novel prognostic factors without prior expert input. We observed that data-driven models used on an extended dataset can outperform conventional models for prognosis, without data preprocessing or imputation of missing values. An elastic net Cox regression with 586 unimputed variables (continuous values discretised) achieved a C-index of 0.801 (bootstrapped 95% CI 0.799 to 0.802), compared with 0.793 (0.791 to 0.794) for a traditional Cox model comprising 27 expert-selected variables with imputation for missing values. We also found that data-driven models allow identification of novel prognostic variables; that the absence of values for particular variables carries meaning and can have significant implications for prognosis; and that variables often have a nonlinear association with mortality, which discretised Cox models and random forests can elucidate. This demonstrates that machine-learning approaches applied to raw EHR data can be used to build models for use in research and clinical practice, and to identify novel predictive variables and their effects to inform future research.
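The C-index figures quoted above (0.801 vs 0.793) measure discrimination: how often a model ranks pairs of patients correctly by risk, given observed survival times and censoring. A minimal pure-Python sketch of Harrell's concordance index, for illustration only (the study used its own tooling):

```python
from itertools import combinations

def c_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs in which the
    higher-risk patient has the earlier event.
    times: follow-up times; events: 1 = death observed, 0 = censored."""
    concordant = comparable = 0.0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times skipped in this simplified sketch
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier observation censored: pair not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied scores count as half-concordant
    return concordant / comparable if comparable else float("nan")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the 0.801 vs 0.793 difference above is a modest but consistent gain in discrimination.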

    Data Resource Profile: Cardiovascular disease research using linked bespoke studies and electronic health records (CALIBER)

    The goal of cardiovascular disease (CVD) research using linked bespoke studies and electronic health records (CALIBER) is to provide evidence to inform health care and public health policy for CVDs across different stages of translation, from discovery, through evaluation in trials, to implementation, where linkages to electronic health records provide new scientific opportunities. The initial approach of the CALIBER programme is characterized as follows: (i) Linkages of multiple electronic health record sources: examples include linkages between the longitudinal primary care data from the Clinical Practice Research Datalink, the national registry of acute coronary syndromes (Myocardial Ischaemia National Audit Project), hospitalization and procedure data from Hospital Episode Statistics, and cause-specific mortality and social deprivation data from the Office for National Statistics. Current cohort analyses involve a million people in initially healthy populations and disease registries with ∼10⁵ patients. (ii) Linkages of bespoke investigator-led cohort studies (e.g. UK Biobank) to registry data (e.g. Myocardial Ischaemia National Audit Project), providing new means of ascertaining, validating and phenotyping disease. (iii) A common data model in which routine electronic health record data are made research ready, and sharable, by defining and curating with metadata >300 variables (categorical, continuous, event) on risk factors, CVDs and non-cardiovascular comorbidities. (iv) Transparency: all CALIBER studies have an analytic protocol registered in the public domain, and data are available (safe haven model) for use subject to approvals. For more information, e-mail [email protected]

    Prolonged dual anti-platelet therapy in stable coronary disease: a comparative observational study of benefits and harms in unselected versus trial populations

    Objective: To estimate the potential magnitude in unselected patients of the benefits and harms of prolonged dual antiplatelet therapy after acute myocardial infarction seen in selected patients with high risk characteristics in trials. Design: Observational population based cohort study. Setting: PEGASUS-TIMI-54 trial population and CALIBER (ClinicAl research using LInked Bespoke studies and Electronic health Records). Participants: 7238 patients who survived a year or more after acute myocardial infarction. Interventions: Prolonged dual antiplatelet therapy after acute myocardial infarction. Main outcome measures: Recurrent acute myocardial infarction, stroke, or fatal cardiovascular disease. Fatal, severe, or intracranial bleeding. Results: 1676/7238 (23.1%) patients met trial inclusion and exclusion criteria (“target” population). Compared with the placebo arm in the trial population, in the target population the median age was 12 years higher, there were more women (48.6% v 24.3%), and there was a substantially higher cumulative three year risk of both the primary (benefit) trial endpoint of recurrent acute myocardial infarction, stroke, or fatal cardiovascular disease (18.8% (95% confidence interval 16.3% to 21.8%) v 9.04%) and the primary (harm) endpoint of fatal, severe, or intracranial bleeding (3.0% (2.0% to 4.4%) v 1.26% (TIMI major bleeding)). Application of intention to treat relative risks from the trial (ticagrelor 60 mg daily arm) to CALIBER’s target population showed an estimated 101 (95% confidence interval 87 to 117) ischaemic events prevented per 10 000 treated per year and an estimated 75 (50 to 110) excess fatal, severe, or intracranial bleeds caused per 10 000 patients treated per year. 
Generalisation from CALIBER’s target subgroup to all 7238 real-world patients who were stable at least one year after acute myocardial infarction showed similar three year risks of ischaemic events (17.2%, 16.0% to 18.5%), with an estimated 92 (86 to 99) events prevented per 10 000 patients treated per year, and similar three year risks of bleeding events (2.3%, 1.8% to 2.9%), with an estimated 58 (45 to 73) events caused per 10 000 patients treated per year. Conclusions: This novel use of primary-secondary care linked electronic health records allows characterisation of “healthy trial participant” effects and confirms the potential absolute benefits and harms of dual antiplatelet therapy in representative patients a year or more after acute myocardial infarction.
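The per-10,000 figures above come from applying a trial's relative risk to an observed real-world baseline risk. A sketch of that arithmetic with illustrative numbers, not the paper's exact inputs (the paper's annualisation details are not reproduced here):

```python
def events_per_10000(annual_baseline_risk: float, relative_risk: float) -> float:
    """Absolute change in events per 10,000 patients treated per year.
    Positive = events prevented; negative = excess events caused.
    Inputs here are illustrative, not the study's actual estimates."""
    return annual_baseline_risk * (1 - relative_risk) * 10_000

# e.g. a 6% annual ischaemic risk and a trial relative risk of 0.84
prevented = events_per_10000(0.06, 0.84)   # 96.0 events prevented per 10,000/year
# a 0.8% annual bleeding risk with relative risk 1.9 gives a negative value,
# i.e. excess bleeds caused by treatment
caused = events_per_10000(0.008, 1.9)
```

Because the absolute effect scales with the baseline risk, the same trial relative risk yields larger absolute benefits and harms in the older, higher-risk real-world population than in the trial population.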

    Comparing and Contrasting A Priori and A Posteriori Generalizability Assessment of Clinical Trials on Type 2 Diabetes Mellitus

    Clinical trials are indispensable tools for evidence-based medicine. However, they are often criticized for poor generalizability. Traditional trial generalizability assessment can only be done after the trial results are published, comparing the enrolled patients with a convenience sample of real-world patients. However, the proliferation of electronic data in clinical trial registries and clinical data warehouses offers a great opportunity to assess generalizability during the design phase of a new trial. In this work, we compared and contrasted a priori (based on eligibility criteria) and a posteriori (based on enrolled patients) generalizability of Type 2 diabetes clinical trials. Further, we showed that comparing the study population selected by the clinical trial eligibility criteria to the real-world patient population is a good indicator of the generalizability of trials. Our findings demonstrate that the a priori generalizability of a trial is comparable to its a posteriori generalizability in identifying restrictive quantitative eligibility criteria.
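A priori generalizability assessment amounts to asking what share of a real-world population would satisfy the trial's eligibility criteria. A minimal sketch with made-up criteria and patient fields (the studied trials' actual criteria are not reproduced here):

```python
def eligible_fraction(patients, criteria):
    """Share of a patient cohort passing every eligibility predicate.
    patients: list of dicts; criteria: list of predicates over a patient."""
    eligible = [p for p in patients if all(c(p) for c in criteria)]
    return len(eligible) / len(patients)

# Hypothetical quantitative criteria for a Type 2 diabetes trial
criteria = [
    lambda p: 40 <= p["age"] <= 75,   # age window
    lambda p: p["hba1c"] >= 7.0,      # glycaemic threshold
]

# Hypothetical real-world cohort drawn from a data warehouse
cohort = [
    {"age": 55, "hba1c": 8.1},
    {"age": 80, "hba1c": 9.0},   # excluded: age
    {"age": 60, "hba1c": 6.5},   # excluded: HbA1c
    {"age": 45, "hba1c": 7.5},
]
print(eligible_fraction(cohort, criteria))  # 0.5
```

A low fraction flags restrictive criteria before enrolment begins, which is the advantage of the a priori assessment over the post-publication comparison.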

    Time spent at blood pressure target and the risk of death and cardiovascular diseases

    Background: The time a patient spends with blood pressure at target level is an intuitive measure of successful blood pressure management, but population studies on its effectiveness are as yet unavailable. Methods: We identified a population-based cohort of 169,082 individuals with newly identified high blood pressure who were free of cardiovascular disease from January 1997 to March 2010. We used 1.64 million clinical blood pressure readings to calculate the TIme at TaRgEt (TITRE) based on current target blood pressure levels. Results: The median (interquartile range) TITRE among all patients was 2.8 (0.3, 5.6) months per year; only 1077 (0.6%) patients had a TITRE ≥11 months. Compared with people with a 0% TITRE, patients with a TITRE of 3–5.9 months and 6–8.9 months had 75% and 78% lower odds of the composite of cardiovascular death, myocardial infarction and stroke (adjusted odds ratios 0.25 (95% confidence interval: 0.21, 0.31) and 0.22 (0.17, 0.27), respectively). These associations were consistent for heart failure and for any cardiovascular disease and death (comparing a 3–5.9 month to a 0% TITRE, 63% and 60% lower odds, respectively), among people who did or did not have blood pressure ‘controlled’ on a single occasion during the first year of follow-up, and across groups defined by the number of follow-up blood pressure measurements. Conclusion: Given the current frequency of blood pressure measurement, this study suggests that few newly hypertensive patients sustained year-round on-target blood pressure over time. The inverse associations between a higher TITRE and lower risk of incident cardiovascular diseases were independent of widely-used blood pressure ‘control’ indicators. Randomized trials are required to evaluate interventions to increase a person’s time spent at blood pressure target.
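A time-at-target measure like TITRE is derived from dated clinical readings. One plausible scheme is to carry each reading forward until the next one and sum the time spent at or below target; this carry-forward rule and the 140 mmHg target are assumptions for illustration, since the abstract does not state the study's exact interpolation method.

```python
def time_at_target(readings, target=140.0):
    """Months spent at/below target over the observation window.
    readings: (time_in_months, systolic_bp) pairs sorted by time.
    Assumption: each value is carried forward until the next reading."""
    total = 0.0
    for (t0, bp), (t1, _) in zip(readings, readings[1:]):
        if bp <= target:
            total += t1 - t0  # whole interval counted at the starting value
    return total

# one year of hypothetical readings: at target only between months 3 and 9
readings = [(0, 150), (3, 135), (9, 145), (12, 130)]
print(time_at_target(readings))  # 6.0 months
```

Dividing the result by the follow-up length in years would give TITRE on the study's months-per-year scale.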