45 research outputs found

    Can analyses of electronic patient records be independently and externally validated? The effect of statins on the mortality of patients with ischaemic heart disease: a cohort study with nested case-control analysis

    Objective: To conduct a fully independent and external validation of a research study based on one electronic health record database, using a different electronic database sampling the same population. Design: Using the Clinical Practice Research Datalink (CPRD), we replicated a published investigation into the effects of statins in patients with ischaemic heart disease (IHD) by a different research team using QResearch. We replicated the original methods and analysed all-cause mortality using: (1) a cohort analysis and (2) a case-control analysis nested within the full cohort. Setting: Electronic health record databases containing longitudinal patient consultation data from large numbers of general practices distributed throughout the UK. Participants: CPRD data for 34 925 patients with IHD from 224 general practices, compared to previously published results from QResearch for 13 029 patients from 89 general practices. The study period was from January 1996 to December 2003. Results: We successfully replicated the methods of the original study very closely. In a cohort analysis, risk of death was lower by 55% for patients on statins, compared with 53% for QResearch (adjusted HR 0.45, 95% CI 0.40 to 0.50; vs 0.47, 95% CI 0.41 to 0.53). In case-control analyses, patients on statins had a 31% lower odds of death, compared with 39% for QResearch (adjusted OR 0.69, 95% CI 0.63 to 0.75; vs OR 0.61, 95% CI 0.52 to 0.72). Results were also close for individual statins. Conclusions: Database differences in population characteristics and in data definitions, recording, quality and completeness had a minimal impact on key statistical outputs. The results uphold the validity of research using CPRD and QResearch by providing independent evidence that both datasets produce very similar estimates of treatment effect, leading to the same clinical and policy decisions. Together with other non-independent replication studies, there is a nascent body of evidence for wider validity.

    rEHR: An R package for manipulating and analysing Electronic Health Record data

    Research with structured Electronic Health Records (EHRs) is expanding as data becomes more accessible, analytic methods advance, and the scientific validity of such studies is increasingly accepted. However, data science methodology to enable the rapid searching/extraction, cleaning and analysis of these large, often complex, datasets is less well developed. In addition, commonly used software is inadequate, resulting in bottlenecks in research workflows and in obstacles to increased transparency and reproducibility of the research. Preparing a research-ready dataset from EHRs is a complex and time-consuming task requiring substantial data science skills, even for simple designs. In addition, certain aspects of the workflow are computationally intensive, for example extraction of longitudinal data and matching controls to a large cohort, which may take days or even weeks to run using standard software. The rEHR package simplifies and accelerates the process of extracting ready-for-analysis datasets from EHR databases. It has a simple import function to a database backend that greatly accelerates data access times. A set of generic query functions allows users to extract data efficiently without needing detailed knowledge of SQL queries. Longitudinal data extractions can also be made in a single command, making use of parallel processing. The package also contains functions for cutting data by time-varying covariates, matching controls to cases, unit conversion and construction of clinical code lists. There are also functions to synthesise dummy EHR data. The package has been tested with one of the largest primary care EHRs, the Clinical Practice Research Datalink (CPRD), but allows for a common interface to other EHRs. This simplified and accelerated workflow for EHR data extraction results in simpler, cleaner scripts that are more easily debugged, shared and reproduced.

    Calculating association indices in captive animals: controlling for enclosure size and shape

    Indices of association are used to quantify and evaluate social affiliation among animals living in groups. Association models assume that physical proximity is an indication of social affiliation; however, individuals seen associating might simply be together by chance. This problem is particularly pronounced in studies of captive animals, whose movements are sometimes severely spatially restricted relative to the wild. Few attempts have been made to estimate – and thus control for – chance encounters based on enclosure size and shape. Using geometric probability and Geographic Information Systems, we investigated the likely effect of chance encounters on association indices within dyads (pairs of animals), when different distance criteria for defining associations are used in shapes of a given area. We developed a simple R script, which can be used to provide a robust estimate of the probability of a chance encounter in a square of any area. We used Monte Carlo methods to determine that this provided acceptable estimates of the probability of chance encounters in rectangular shapes and the shapes of six actual zoo enclosures, and we present an example of its use to correct observed indices of association. Applying this correction controls for differences in enclosure size and shape, and allows association indices between dyads housed in different enclosures to be compared
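
    The Monte Carlo check described in this abstract can be sketched as follows. This is not the authors' R script; it is a minimal Python illustration that assumes the two animals of a dyad occupy independent, uniformly distributed positions in a rectangular enclosure, which is a simplification. The function name and parameters are hypothetical.

```python
import math
import random


def chance_encounter_probability(width, height, distance, trials=100_000, seed=0):
    """Estimate the probability that two points lie within `distance` of each other.

    Assumption (not from the source): both positions are independent and
    uniformly distributed over a width x height rectangle.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        x1, y1 = rng.uniform(0, width), rng.uniform(0, height)
        x2, y2 = rng.uniform(0, width), rng.uniform(0, height)
        if math.hypot(x1 - x2, y1 - y2) <= distance:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Same area (100 square units), different shapes, same distance criterion.
    square = chance_encounter_probability(10, 10, distance=1)
    rectangle = chance_encounter_probability(25, 4, distance=1)
    print(f"square 10 x 10:   {square:.4f}")
    print(f"rectangle 25 x 4: {rectangle:.4f}")
```

    Comparing a square with a rectangle of equal area, as above, gives a rough sense of how far the square-based estimate might drift when the enclosure shape changes.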

    Modelling Conditions and Health Care Processes in Electronic Health Records: An Application to Severe Mental Illness with the Clinical Practice Research Datalink

    BACKGROUND: The use of Electronic Health Records databases for medical research has become mainstream. In the UK, increasing use of Primary Care Databases is largely driven by almost complete computerisation and uniform standards within the National Health Service. Electronic Health Records research often begins with the development of a list of clinical codes with which to identify cases with a specific condition. We present a methodology and accompanying Stata and R commands (pcdsearch/Rpcdsearch) to help researchers in this task. We present severe mental illness as an example. METHODS: We used the Clinical Practice Research Datalink, a UK Primary Care Database in which clinical information is largely organised using Read codes, a hierarchical clinical coding system. Pcdsearch is used to identify potentially relevant clinical codes and/or product codes from word-stubs and code-stubs suggested by clinicians. The returned code-lists are reviewed and codes relevant to the condition of interest are selected. The final code-list is then used to identify patients. RESULTS: We identified 270 Read codes linked to SMI and used them to identify cases in the database. We observed that our approach identified cases that would have been missed with a simpler approach using SMI registers defined within the UK Quality and Outcomes Framework. CONCLUSION: We described a framework for researchers of Electronic Health Records databases, for identifying patients with a particular condition or matching certain clinical criteria. The method is invariant to coding system or database and can be used with SNOMED CT, ICD or other medical classification code-lists
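
    The stub-based search step described in this abstract can be illustrated with a short sketch. This is not the pcdsearch or Rpcdsearch implementation; the function name, the matching rules (case-insensitive substring for word-stubs, prefix match for code-stubs) and the toy dictionary with placeholder codes are all assumptions made for illustration.

```python
def search_codes(dictionary, word_stubs=(), code_stubs=()):
    """Return candidate codes whose description contains any word-stub
    or whose code starts with any code-stub.

    `dictionary` maps code -> description. The result is only a candidate
    list; it still needs clinical review before use.
    """
    words = [w.lower() for w in word_stubs]
    prefixes = tuple(code_stubs)
    matches = {}
    for code, description in dictionary.items():
        text = description.lower()
        by_word = any(w in text for w in words)
        by_code = bool(prefixes) and code.startswith(prefixes)
        if by_word or by_code:
            matches[code] = description
    return matches


if __name__ == "__main__":
    # Placeholder codes, NOT real Read codes.
    toy_dictionary = {
        "A100": "Schizophrenia",
        "A101": "Paranoid schizophrenia",
        "A1ZZ": "Other psychosis NOS",
        "B200": "Bipolar affective disorder",
        "C300": "Asthma",
    }
    candidates = search_codes(
        toy_dictionary, word_stubs=["schizo", "bipolar"], code_stubs=["A1"]
    )
    for code, description in sorted(candidates.items()):
        print(code, description)
```

    Searching by both word-stubs and code-stubs casts a wide net on purpose: over-inclusive candidate lists are trimmed during review, whereas a missed code is never seen by the reviewer.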

    Risk factors for self-harm in people with epilepsy

    Objective: To estimate the risk of self-harm in people with epilepsy and identify factors which influence this risk. Methods: We identified people with incident epilepsy in the Clinical Practice Research Datalink (CPRD), linked to hospitalization and mortality data, in England (01/01/1998-03/31/2014). In Phase 1, we estimated risk of self-harm among people with epilepsy, versus those without, in a matched cohort study using a stratified Cox proportional hazards model. In Phase 2, we delineated a nested case-control study from the incident epilepsy cohort. People who had self-harmed (cases) were matched with up to 20 controls. From conditional logistic regression models, we estimated relative risk of self-harm associated with mental and physical illness comorbidity, contact with healthcare services and antiepileptic drug (AED) use. Results: Phase 1 included 11,690 people with epilepsy and 215,569 individuals without. We observed an adjusted hazard ratio of 5.31 (95% CI 4.08-6.89) for self-harm in the first year following epilepsy diagnosis and 3.31 (95% CI 2.85-3.84) in subsequent years. In Phase 2, there were 273 cases and 3,790 controls. Elevated self-harm risk was associated with mental illness (OR 4.08, 95% CI 3.06-5.42), multiple General Practitioner consultations, treatment with two AEDs versus monotherapy (OR 1.84, 95% CI 1.33-2.55) and AED treatment augmentation (OR 2.12, 95% CI 1.38-3.26). Conclusion: People with epilepsy have elevated self-harm risk, especially in the first year following diagnosis. Clinicians should adequately monitor these individuals and be especially vigilant to self-harm risk in people with epilepsy and comorbid mental illness, frequent healthcare service contact, those taking multiple AEDs and during treatment augmentation.

    Evaluation of appendicitis risk prediction models in adults with suspected appendicitis

    Background: Appendicitis is the most common general surgical emergency worldwide, but its diagnosis remains challenging. The aim of this study was to determine whether existing risk prediction models can reliably identify patients presenting to hospital in the UK with acute right iliac fossa (RIF) pain who are at low risk of appendicitis. Methods: A systematic search was completed to identify all existing appendicitis risk prediction models. Models were validated using UK data from an international prospective cohort study that captured consecutive patients aged 16–45 years presenting to hospital with acute RIF pain in March to June 2017. The main outcome was best achievable model specificity (proportion of patients who did not have appendicitis correctly classified as low risk) whilst maintaining a failure rate below 5 per cent (proportion of patients identified as low risk who actually had appendicitis). Results: Some 5345 patients across 154 UK hospitals were identified, of whom two‐thirds (3613 of 5345, 67·6 per cent) were women. Women were more than twice as likely to undergo surgery with removal of a histologically normal appendix (272 of 964, 28·2 per cent) than men (120 of 993, 12·1 per cent) (relative risk 2·33, 95 per cent c.i. 1·92 to 2·84; P < 0·001). Of 15 validated risk prediction models, the Adult Appendicitis Score performed best (cut‐off score 8 or less, specificity 63·1 per cent, failure rate 3·7 per cent). The Appendicitis Inflammatory Response Score performed best for men (cut‐off score 2 or less, specificity 24·7 per cent, failure rate 2·4 per cent). Conclusion: Women in the UK had a disproportionate risk of admission without surgical intervention and had high rates of normal appendicectomy. Risk prediction models to support shared decision‐making by identifying adults in the UK at low risk of appendicitis were identified.
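
    The two outcome measures defined in this abstract, and the search for the cut-off with the best specificity subject to a failure rate below 5 per cent, can be made concrete with a small sketch. The function names and the toy patient data are hypothetical and are not taken from the study; only the two definitions follow the abstract.

```python
def evaluate_cutoff(patients, cutoff):
    """Compute (specificity, failure_rate) when scores <= cutoff count as low risk.

    `patients` is a list of (score, has_appendicitis) pairs.
    specificity  = patients without appendicitis classified low risk
                   / all patients without appendicitis
    failure rate = low-risk patients who actually had appendicitis
                   / all low-risk patients
    """
    without = [score for score, has in patients if not has]
    low_risk = [has for score, has in patients if score <= cutoff]
    if not without or not low_risk:
        return None  # measures undefined for this cut-off
    specificity = sum(1 for score in without if score <= cutoff) / len(without)
    failure_rate = sum(1 for has in low_risk if has) / len(low_risk)
    return specificity, failure_rate


def best_cutoff(patients, max_failure_rate=0.05):
    """Return (cutoff, specificity, failure_rate) with the highest specificity
    among cut-offs whose failure rate stays below `max_failure_rate`."""
    best = None
    for cutoff in sorted({score for score, _ in patients}):
        result = evaluate_cutoff(patients, cutoff)
        if result is None:
            continue
        specificity, failure_rate = result
        if failure_rate < max_failure_rate and (best is None or specificity > best[1]):
            best = (cutoff, specificity, failure_rate)
    return best


if __name__ == "__main__":
    # Invented toy cohort: (risk score, appendicitis confirmed?)
    toy = (
        [(1, False)] * 30 + [(2, False)] * 20 + [(2, True)] * 1
        + [(5, False)] * 10 + [(5, True)] * 10
        + [(9, False)] * 5 + [(9, True)] * 20
    )
    print(best_cutoff(toy))
```

    Raising the cut-off always increases specificity, so the failure-rate ceiling is what stops the search from simply labelling everyone low risk.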

    Clinicalcodes: an online clinical codes repository to improve the validity and reproducibility of research using electronic medical records

    Lists of clinical codes are the foundation for research undertaken using electronic medical records (EMRs). If clinical code lists are not available, reviewers are unable to determine the validity of research, full study replication is impossible, researchers are unable to make effective comparisons between studies, and the construction of new code lists is subject to much duplication of effort. Despite this, the publication of clinical codes is rarely if ever a requirement for obtaining grants, validating protocols, or publishing research. In a representative sample of 450 EMR primary research articles indexed on PubMed, we found that only 19 (5.1%) were accompanied by a full set of published clinical codes and 32 (8.6%) stated that code lists were available on request. To help address these problems, we have built an online repository where researchers using EMRs can upload and download lists of clinical codes. The repository will enable clinical researchers to better validate EMR studies, build on previous code lists and compare disease definitions across studies. It will also assist health informaticians in replicating database studies, tracking changes in disease definitions or clinical coding practice through time and sharing clinical code information across platforms and data sources as research objects