179 research outputs found

    Data linkage in medical research

    Synthetic data in medical research

    Introduction: Demand for access to high quality, individual level data for medical and healthcare research is growing. Electronic health record data collected on whole populations can help to generate real world evidence and can be used for a range of secondary purposes, including testing new hypotheses and developing and evaluating different methodological and statistical approaches. Secondary analysis of primary research data, such as from clinical trials [1], is also valuable, for example to conduct meta-analyses of individual participant data. However, several complex privacy requirements make accessing these data challenging [2]. Information contained in electronic health records or in clinical trial data is highly sensitive, and access to these datasets can be an expensive and lengthy process [3]. Data privacy and protection regulations are the main barriers to accessing these data for healthcare and medical research [4]. Anonymisation (where potentially identifiable variables are removed) is one way to make data available; however, intensive anonymisation can degrade the data to the extent that they are no longer fit for purpose [5]. For example, adding random noise to the data reduces precision and leads to wider confidence intervals. Several reidentification attempts on anonymised data have been successful and have harmed public and regulators’ trust in such methods [6, 7]. For instance, one study showed that patients could be identified by matching information from patient level data that was publicly available, attributing information obtained from newspapers, and contacting those patients directly [6]. Use of information from clinical trials and electronic health records of large populations has the potential to benefit medical and healthcare research, which makes seeking new approaches to data access imperative. One solution is to use so-called synthetic data, or artificial data, which provide a realistic representation of the original data source.
Synthetic data look like the original data source but contain no information on any real individual. Synthetic data can attempt to preserve some of the statistical properties of the original data source (eg, distributions of continuous data, proportions of categorical data, correlations between variables, and other model parameters)
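
The idea of preserving distributions and proportions can be illustrated with a minimal sketch. It is not the method described in the paper: it assumes a toy dataset with two hypothetical columns (`age`, `sex`), fits each column on its own, and samples new rows from those fitted summaries. The function names `fit_marginals` and `sample_synthetic` are made up for this example. Because each column is sampled independently, correlations between variables are not preserved, and nothing here provides a formal privacy guarantee.

```python
import random
import statistics
from collections import Counter


def fit_marginals(records):
    """Summarise each column separately: mean/SD for age, proportions for sex."""
    ages = [r["age"] for r in records]
    sex_counts = Counter(r["sex"] for r in records)
    total = len(records)
    return {
        "age_mean": statistics.mean(ages),
        "age_sd": statistics.stdev(ages),
        "sex_props": {k: v / total for k, v in sex_counts.items()},
    }


def sample_synthetic(summary, n, seed=0):
    """Draw n artificial rows from the fitted summaries (columns independent)."""
    rng = random.Random(seed)
    categories = list(summary["sex_props"])
    weights = [summary["sex_props"][c] for c in categories]
    return [
        {
            "age": rng.gauss(summary["age_mean"], summary["age_sd"]),
            "sex": rng.choices(categories, weights=weights)[0],
        }
        for _ in range(n)
    ]


# Toy "original" data, itself invented for illustration only.
_rng = random.Random(42)
original = [
    {"age": _rng.gauss(50, 12), "sex": _rng.choice(["F", "F", "M"])}
    for _ in range(2000)
]

summary = fit_marginals(original)
synthetic = sample_synthetic(summary, n=2000, seed=1)
synthetic_summary = fit_marginals(synthetic)

print("original :", summary)
print("synthetic:", synthetic_summary)
```

The two printed summaries should be close but not identical, and no synthetic row corresponds to any row in `original`. A realistic generator would also need to model the joint structure of the data.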

    Using administrative data to assess early-life policies

    Which people are most affected by changes to data linkage methodology? An exploration of patient, organisational and spatiotemporal characteristics in administrative hospital data in England

    Objectives: In 2021, NHS Digital changed the process used to link records belonging to the same person across and within data collections. Our objectives were to identify patient, organisational and spatiotemporal characteristics associated with records affected by this change, and the implications for researchers using these data. Methods: We conducted an observational cohort study of patients aged 55 or younger with a secondary care contact recorded in any of the Hospital Episode Statistics (HES) datasets curated by NHS Digital (now part of NHS England) between April 1997 and March 2021. We compared the clusters of records assigned to each patient using the HES ID (the old methodology, a three-step deterministic algorithm) and the Person ID (the new methodology, based on a master patient spine). We used multivariable logistic regression to identify patient, organisational and spatiotemporal characteristics (such as area-level deprivation and year of first contact) associated with patients whose cluster had changed. Results: Among 88 million hospital records in 2019, there were 18,968,711 distinct HES IDs and 18,717,142 distinct Person IDs (TPIs). Of the 12,701,169 HES IDs with more than one record, 145,948 (1.1%) were split into multiple Person IDs. Of the 12,999,671 Person IDs with more than one record, 483,091 (3.7%) were associated with two or more merged HES IDs. We will present an analysis using data covering the period April 1997 to March 2021 (1.25 billion records) and the characteristics associated with changes between linkage methods. Conclusion: Our findings indicate that this change consolidated clusters, resulting in fewer distinct individuals in the data. Our findings will inform researchers about which groups of individuals are most likely to be affected by changes to linkage methodology. This is vital for understanding potential sources of bias due to linkage error.
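
The split and merge counts reported above can be illustrated with a small sketch. This is one reading of the abstract, not the authors' code: it assumes each record carries both an old identifier and a new identifier, treats an old cluster as "split" when its records map to more than one new ID, and treats a new cluster as "merged" when its records come from more than one old ID. The function name `compare_clusterings` and the toy IDs are hypothetical.

```python
from collections import defaultdict


def compare_clusterings(records):
    """Count split and merged clusters from (old_id, new_id) pairs, one per record."""
    new_ids_per_old = defaultdict(set)
    old_ids_per_new = defaultdict(set)
    records_per_old = defaultdict(int)
    records_per_new = defaultdict(int)

    for old_id, new_id in records:
        new_ids_per_old[old_id].add(new_id)
        old_ids_per_new[new_id].add(old_id)
        records_per_old[old_id] += 1
        records_per_new[new_id] += 1

    # Only clusters with more than one record can be split or merged.
    multi_old = [i for i, n in records_per_old.items() if n > 1]
    multi_new = [i for i, n in records_per_new.items() if n > 1]

    split = sum(1 for i in multi_old if len(new_ids_per_old[i]) > 1)
    merged = sum(1 for i in multi_new if len(old_ids_per_new[i]) > 1)

    return {
        "multi_record_old_ids": len(multi_old),
        "split_old_ids": split,
        "multi_record_new_ids": len(multi_new),
        "merged_new_ids": merged,
    }


# Toy records: (old HES-style ID, new Person-style ID), invented for illustration.
records = [
    ("H1", "P1"), ("H1", "P1"),  # cluster unchanged
    ("H2", "P2"), ("H2", "P3"),  # H2 split into P2 and P3
    ("H3", "P4"), ("H4", "P4"),  # H3 and H4 merged into P4
    ("H5", "P5"),                # single record, cannot split or merge
]

result = compare_clusterings(records)
print(result)

# Percentages computed from the counts quoted in the abstract.
split_pct = round(100 * 145_948 / 12_701_169, 1)
merged_pct = round(100 * 483_091 / 12_999_671, 1)
print(split_pct, merged_pct)
```

On the toy data, one of the two multi-record old IDs is split and one of the two multi-record new IDs is merged. The last two lines recompute the 1.1% and 3.7% figures from the counts given in the abstract.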

    'Pseudonymisation at source' undermines accuracy of record linkage

    Using linked administrative data for monitoring and evaluating the Family Nurse Partnership in England: A scoping report

    This report, commissioned by the FNP National Unit and undertaken by researchers at UCL and the London School of Hygiene and Tropical Medicine, presents a scoping review of how population-based linkage between data from the Family Nurse Partnership (FNP) in England and administrative datasets from other services could be used to generate evidence for commissioning, service evaluation and research. It addresses the methodological considerations, permission pathways and technical challenges of using FNP data linked with routinely collected administrative data from other public services for population-based analyses, at national and local authority level. Our ambition, when commissioning this work, was to explore whether linking data from FNP with administrative datasets might help provide a richer view of how the FNP intervention is affecting different cohorts of clients and their children after they have graduated. The report suggests that the potential for data linkage to support ongoing evaluation of a wide range of interventions, including FNP, at a national level is promising and an important area to explore. It makes a significant contribution to understanding the possibilities and constraints for doing this, which include barriers to data linkage at a local level (which we know is crucial for local commissioners) and the significant investment required to realise the potential of this project. We believe this report offers valuable insights that other organisations interested in the delivery of evidence-based policy may want to pursue.