136 research outputs found
A novel framework for assessing metadata quality in epidemiological and public health research settings
Metadata are critical in epidemiological and public health research. However, a lack of biomedical metadata quality frameworks and limited awareness of the implications of poor quality metadata renders data analyses problematic. In this study, we created and evaluated a novel framework to assess metadata quality of epidemiological and public health research datasets. We performed a literature review and surveyed stakeholders to enhance our understanding of biomedical metadata quality assessment. The review identified 11 studies and nine quality dimensions, none of which were specifically aimed at biomedical metadata. Ninety-six individuals completed the survey; of those who submitted data, most only assessed metadata quality sometimes, and eight did not at all. Our framework has four sections: a) general information; b) tools and technologies; c) usability; and d) management and curation. We evaluated the framework using three test cases and sought expert feedback. The framework can assess biomedical metadata quality systematically and robustly
Analyzing the heterogeneity of rule-based EHR phenotyping algorithms in CALIBER and the UK Biobank
Electronic Health Records (EHR) are data generated during routine interactions across healthcare settings and contain rich, longitudinal information on diagnoses, symptoms, medications, investigations and tests. A primary use-case for EHR is the creation of phenotyping algorithms used to identify disease status, onset and progression or extraction of information on risk factors or biomarkers. Phenotyping however is challenging since EHR are collected for different purposes, have variable data quality and often require significant harmonization. While considerable effort goes into the phenotyping process, no consistent methodology for representing algorithms exists in the UK. Creating a national repository of curated algorithms can potentially enable algorithm dissemination and reuse by the wider community. A critical first step is the creation of a robust minimum information standard for phenotyping algorithm components (metadata, implementation logic, validation evidence) which involves identifying and reviewing the complexity and heterogeneity of current UK EHR algorithms. In this study, we analyzed all available EHR phenotyping algorithms (n=70) from two large-scale contemporary EHR resources in the UK (CALIBER and UK Biobank). We documented EHR sources, controlled clinical terminologies, evidence of algorithm validation, representation and implementation logic patterns. Understanding the heterogeneity of UK EHR algorithms and identifying common implementation patterns will facilitate the design of a minimum information standard for representing and curating algorithms nationally and internationally
Selective recruitment designs for improving observational studies using electronic health records
Large‐scale electronic health records (EHRs) present an opportunity to quickly identify suitable individuals in order to directly invite them to participate in an observational study. EHRs can contain data from millions of individuals, raising the question of how to optimally select a cohort of size n from a larger pool of size N. In this article, we propose a simple selective recruitment protocol that selects a cohort in which covariates of interest tend to have a uniform distribution. We show that selectively recruited cohorts potentially offer greater statistical power and more accurate parameter estimates than randomly selected cohorts. Our protocol can be applied to studies with multiple categorical and continuous covariates. We apply our protocol to a numerically simulated prospective observational study using an EHR database of stable acute coronary disease patients from 82 089 individuals in the U.K. Selective recruitment designs require a smaller sample size, leading to more efficient and cost‐effective studies
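The idea of selecting a cohort whose covariate distribution is close to uniform can be illustrated with a minimal Python sketch. This is not the authors' protocol: it assumes a single continuous covariate, equal-width bins, and round-robin sampling across bins, and the function name `select_uniform_cohort` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def select_uniform_cohort(pool, n, n_bins=10, seed=0):
    """Pick n ids from pool so the covariate is spread evenly across bins.

    pool is a list of (person_id, covariate_value) pairs. This is an
    illustrative stand-in for a selective recruitment rule, not the
    published protocol.
    """
    rng = random.Random(seed)
    values = [x for _, x in pool]
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if every covariate value is identical.
    width = (hi - lo) / n_bins or 1.0

    # Group candidate ids by equal-width covariate bin.
    bins = defaultdict(list)
    for person_id, x in pool:
        idx = min(int((x - lo) / width), n_bins - 1)
        bins[idx].append(person_id)
    for members in bins.values():
        rng.shuffle(members)

    # Take one person per non-empty bin in turn until n are selected
    # or the pool is exhausted.
    cohort = []
    while len(cohort) < n and any(bins.values()):
        for idx in sorted(bins):
            if bins[idx] and len(cohort) < n:
                cohort.append(bins[idx].pop())
    return cohort


# Hypothetical pool of 100 people with covariate values 0.0 to 99.0.
pool = [(i, float(i)) for i in range(100)]
cohort = select_uniform_cohort(pool, n=20)
```

With a skewed pool, the same call over-samples sparse covariate ranges relative to simple random sampling, which is the intuition behind the power gain described in the abstract.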
Allergic disease, corticosteroid use and risk of Hodgkin's lymphoma: A UK Nationwide case-control study
BACKGROUND: Immunodeficiency syndromes (acquired/congenital/iatrogenic) are known to increase Hodgkin's lymphoma (HL) risk, but the effects of allergic immune dysregulation and corticosteroids are poorly understood. OBJECTIVE: To assess the risk of HL associated with allergic disease (asthma, eczema and allergic rhinitis) and corticosteroid use. METHODS: We conducted a case-control study using the UK Clinical Practice Research Datalink (CPRD) linked to hospital data. Multivariable logistic regression investigated associations between allergic diseases and HL after adjusting for established risk factors. Potential confounding or effect modification by steroid treatment was examined. RESULTS: 1,236 cases of HL were matched to 7,416 controls. Immunosuppression was associated with 6-fold greater odds of HL (Adjusted Odds Ratio (AOR), 6.18; 95%CI, 3.04-12.57), with minimal change after adjusting for steroids. Any prior allergic disease or eczema alone were associated with 1.4-fold increased odds of HL (AOR, 1.41; 95%CI, 1.24-1.60; AOR, 1.41; 95%CI, 1.20-1.65, respectively). These associations decreased but remained significant after adjustment for steroids (AOR, 1.25; 95%CI, 1.09-1.43; AOR, 1.27; 95%CI, 1.08-1.49, respectively). There was no effect modification by steroid use. Previous steroid treatment was associated with 1.4-fold greater HL odds (AOR, 1.38; 95%CI, 1.20-1.59). CONCLUSIONS: In addition to established risk factors (immunosuppression and infectious mononucleosis), allergic disease and eczema are risk factors for developing HL. This association is only partially explained by steroids, which are associated with increased HL risk. These findings add to the growing evidence that immune system malfunction, following allergic disease or immunosuppression, is central to HL development
Accuracy of probabilistic record linkage applied to the Brazilian 100 million cohort project
This paper presents some current results obtained from our probabilistic record linkage methods applied to the integration of a 100 million cohort composed of socioeconomic data with health databases
Polygenic risk scores for coronary artery disease and subsequent event risk amongst established cases
BACKGROUND: There is growing evidence that polygenic risk scores (PRS) can identify individuals with elevated lifetime risk of coronary artery disease (CAD). Whether they can also be used to stratify risk of subsequent events among those surviving a first CAD event remains uncertain, with possible biological differences between CAD onset and progression, and the potential for index event bias. METHODS: Using two baseline subsamples of UK Biobank, prevalent CAD cases (N = 10 287) and individuals without CAD (N = 393 108), we evaluated associations between a CAD PRS and incident cardiovascular and fatal outcomes. RESULTS: A 1 S.D. higher PRS was associated with increased risk of incident MI in participants without CAD (OR 1.33; 95% C.I. 1.29, 1.38), but the effect estimate was markedly attenuated in those with prevalent CAD (OR 1.15; 95% C.I. 1.06, 1.25); heterogeneity P = 0.0012. Additionally, among prevalent CAD cases, we found evidence of an inverse association between the CAD PRS and risk of all-cause death (OR 0.91; 95% C.I. 0.85, 0.98) compared to those without CAD (OR 1.01; 95% C.I. 0.99, 1.03); heterogeneity P = 0.0041. A similar inverse association was found for ischaemic stroke (prevalent CAD (OR 0.78; 95% C.I. 0.67, 0.90); without CAD (OR 1.09; 95% C.I. 1.04, 1.15); heterogeneity P < 0.001). CONCLUSIONS: Bias induced by case stratification and survival into UK Biobank may distort associations of polygenic risk scores derived from case-control studies or populations initially free of disease. Differentiating between effects of possible biases and genuine biological heterogeneity is a major challenge in disease progression research
The association between mechanical ventilator compatible bed occupancy and mortality risk in intensive care patients with COVID-19: a national retrospective cohort study
BACKGROUND:
The literature paints a complex picture of the association between mortality risk and ICU strain.
In this study, we sought to determine if there is an association between mortality risk in intensive care units (ICU) and occupancy of beds compatible with mechanical ventilation, as a proxy for strain.
METHODS:
We conducted a national retrospective observational cohort study of 89 English hospital trusts (i.e. groups of hospitals functioning as single operational units). The cohort comprised 7,133 adults admitted to an ICU in England between 2 April and 1 December 2020 (inclusive) with presumed or confirmed COVID-19, for whom data were submitted to the national surveillance programme and who met the study inclusion criteria. A Bayesian hierarchical approach was used to model the association between hospital trust level (mechanical ventilation compatible) bed occupancy and in-hospital all-cause mortality. Results were adjusted for unit characteristics (pre-pandemic size), individual patient-level demographic characteristics (age, sex, ethnicity, deprivation index, time-to-ICU admission), and recorded chronic comorbidities (obesity, diabetes, respiratory disease, liver disease, heart disease, hypertension, immunosuppression, neurological disease, renal disease).
RESULTS:
A total of 135,600 patient days were observed, with a mortality rate of 19.4 per 1000 patient days. Adjusting for patient-level factors, mortality was higher for admissions during periods of high occupancy (> 85% occupancy versus the baseline of 45 to 85%) [OR 1.23 (95% posterior credible interval (PCI): 1.08 to 1.39)]. In contrast, mortality was lower for admissions during periods of low occupancy (< 45% relative to the baseline) [OR 0.83 (95% PCI 0.75 to 0.94)].
CONCLUSIONS:
Increasing occupancy of beds compatible with mechanical ventilation, a proxy for operational strain, is associated with a higher mortality risk for individuals admitted to ICU. Further research is required to establish if this is a causal relationship or whether it reflects strain on other operational factors such as staff. If causal, the result highlights the importance of strategies to keep ICU occupancy low to mitigate the impact of this type of resource saturation
Identification and mapping real-world data sources for heart failure, acute coronary syndrome, and atrial fibrillation
BACKGROUND: Transparent and robust real-world evidence sources are increasingly important for global health, including cardiovascular diseases. We aimed to identify global real-world data (RWD) sources for heart failure (HF), acute coronary syndrome (ACS), and atrial fibrillation (AF). METHODS: We conducted a systematic review of publications with RWD pertaining to HF, ACS, and AF (2010-2018), generating a list of unique data sources. Metadata were extracted based on the source type (e.g. electronic health records, genomics, clinical data), study design, population size, clinical characteristics, follow-up duration, outcomes, and assessment of data availability for future studies and linkage. RESULTS: Overall, 11,889 publications were retrieved for HF, 10,729 for ACS, and 6,262 for AF. From these, 322 (HF), 287 (ACS), and 220 (AF) data sources were selected for detailed review. The majority of data sources had near complete data on demographic variables (HF: 94%, ACS: 99%, and AF: 100%) and considerable data on comorbidities (HF: 77%, ACS: 93%, and AF: 97%). The least reported data categories were drug codes (HF, ACS, and AF: 10%) and caregiver involvement (HF: 6%, ACS: 1%, and AF: 1%). Only a minority of data sources provided information on access to data for other researchers (11%) or whether data could be linked to other data sources to maximize clinical impact (20%). The list and metadata for the RWD sources are publicly available at www.escardio.org/bigdata. CONCLUSIONS: This review has created a comprehensive resource of cardiovascular data sources, providing new avenues to improve future real-world research and to achieve better patient outcomes
Impact of baseline cases of cough and fever on UK COVID-19 diagnostic testing rates: estimates from the Bug Watch community cohort study [version 2; peer review: 1 approved, 1 approved with reservations]
Background: Diagnostic testing forms a major part of the UK’s response to the current coronavirus disease 2019 (COVID-19) pandemic with tests offered to anyone with a continuous cough, high temperature or anosmia. Testing capacity must be sufficient during the winter respiratory season when levels of cough and fever are high due to non-COVID-19 causes. This study aims to make predictions about the contribution of baseline cough or fever to future testing demand in the UK.
Methods: In this analysis of the Bug Watch community cohort study, we estimated the incidence of cough or fever in England in 2018-2019. We then estimated the COVID-19 diagnostic testing rates required in the UK for baseline cough or fever cases for the period July 2020-June 2021. This was explored for different rates of the population requesting tests, four COVID-19 second wave scenarios and high and low baseline cough or fever incidence scenarios.
Results: Under the high baseline cough or fever scenario, incidence in the UK is expected to rise rapidly from 250,708 (95%CI 181,095 - 347,080) cases per day in September to a peak of 444,660 (95%CI 353,084 - 559,988) in December. If 80% of these cases request tests, testing demand would exceed 1.4 million tests per week for five consecutive months. Demand was significantly lower in the low cough or fever incidence scenario, with 129,115 (95%CI 111,596 - 151,679) tests per day in January 2021, compared to 340,921 (95%CI 276,039 - 424,491) tests per day in the higher incidence scenario.
Conclusions: Our results show that national COVID-19 testing demand is highly dependent on background cough or fever incidence. This study highlights that the UK’s response to the COVID-19 pandemic must ensure that a high proportion of people with symptoms request tests, and that testing capacity is sufficient to meet the high predicted demand
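The weekly demand figure quoted in the Results can be checked with simple arithmetic. The Python sketch below uses only the point estimates from the abstract (250,708 and 444,660 cases per day, 80% requesting tests); it assumes daily incidence is constant within a week and ignores the second wave scenarios the study also modelled. The helper name `weekly_test_demand` is hypothetical.

```python
def weekly_test_demand(cases_per_day, request_fraction):
    """Tests per week if a fixed fraction of daily cases request a test.

    Assumes the daily case count is constant across the seven days,
    which is a simplification of the study's month-by-month estimates.
    """
    return cases_per_day * request_fraction * 7


# Point estimates under the high baseline cough or fever scenario.
september = weekly_test_demand(250_708, 0.80)  # roughly 1.40 million per week
december = weekly_test_demand(444_660, 0.80)   # roughly 2.49 million per week

print(f"September: {september:,.0f} tests per week")
print(f"December:  {december:,.0f} tests per week")
```

The September value sits just above 1.4 million, which is consistent with the statement that demand would exceed that level from September through the winter peak.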