13 research outputs found

    How much data are required to develop and validate a risk prediction model?

    It has been suggested that when developing risk prediction models using regression, the number of events in the dataset should be at least 10 times the number of parameters being estimated by the model. This rule was originally proposed to ensure the unbiased estimation of regression coefficients with confidence intervals that have correct coverage. However, only limited research has been conducted to assess the adequacy of this rule with regard to predictive performance. Furthermore, there is only limited guidance regarding the number of events required to develop risk prediction models using hierarchical data, for example when one has observations from several hospitals. One of the aims of this dissertation is to determine the number of events required to obtain reliable predictions from standard or hierarchical models for binary outcomes. This will be achieved by conducting several simulation studies based on real clinical data. It has also been suggested that when validating risk prediction models, there should be at least 100 events in the validation dataset. However, few studies have examined the adequacy of this recommendation. Furthermore, there are no guidelines regarding the sample size requirements when validating a risk prediction model based on hierarchical data. The second main aim of this dissertation is to investigate the sample size requirements for model validation using both simulation and analytical methods. In particular, we will derive the relationship between sample size and the precision of some common measures of model performance such as the C statistic, D statistic, and calibration slope. The results from this dissertation will enable researchers to better assess their sample size requirements when developing and validating prediction models using both standard (independent) and clustered data.
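The two rules of thumb quoted in the abstract above (at least 10 events per estimated parameter for development, at least 100 events for validation) can be turned into a rough sample-size calculation. The following is a minimal sketch, not the dissertation's method: the function names are hypothetical, and converting events into total sample size by dividing by an assumed outcome prevalence is an illustrative assumption that is not stated in the abstract.

```python
import math


def min_events_for_development(n_parameters, events_per_parameter=10):
    """Minimum number of events under the '10 events per parameter' rule of thumb."""
    return n_parameters * events_per_parameter


def min_sample_size(required_events, outcome_prevalence):
    """Total sample size expected to yield `required_events` events.

    Assumption (not from the abstract): events occur at a known, constant
    prevalence, so the required size is events / prevalence, rounded up.
    """
    if not 0 < outcome_prevalence <= 1:
        raise ValueError("outcome_prevalence must be in (0, 1]")
    return math.ceil(required_events / outcome_prevalence)


# Development: a model with 8 parameters and an assumed 25% event prevalence.
dev_events = min_events_for_development(8)       # 8 * 10 events
dev_n = min_sample_size(dev_events, 0.25)        # total participants needed

# Validation: the '100 events' rule with the same assumed prevalence.
val_n = min_sample_size(100, 0.25)

print(dev_events, dev_n, val_n)
```

These figures only restate the rules of thumb; the abstract itself questions whether the rules are adequate for predictive performance, so they should be read as a starting point rather than a recommendation.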

    A tutorial comparing different covariate balancing methods with an application evaluating the causal effects of substance use treatment programs for adolescents

    Randomized controlled trials are the gold standard for measuring causal effects. However, they are not always feasible, and causal treatment effects must often be estimated from observational data. Observational studies do not allow robust conclusions about causal relationships unless statistical techniques account for the imbalance of pretreatment confounders across groups and key assumptions hold. Propensity score and balance weighting (PSBW) are useful techniques that aim to reduce the observed imbalances between treatment groups by weighting the groups to look alike on the observed confounders. Notably, there are many methods available to estimate PSBW. However, it is unclear a priori which will achieve the best trade-off between covariate balance and effective sample size for a given application. Moreover, it is critical to assess the validity of key assumptions required for robust estimation of the treatment effects of interest, including the overlap and no unmeasured confounding assumptions. We present a step-by-step guide to the use of PSBW for estimation of causal treatment effects that includes steps on how to evaluate overlap before the analysis, obtain estimates of PSBW using multiple methods and select the optimal one, check for covariate balance on multiple metrics, and assess sensitivity of findings (both the estimated treatment effect and statistical significance) to unobserved confounding. We illustrate the key steps using a case study examining the relative effectiveness of substance use treatment programs and provide a user-friendly Shiny application that can implement the proposed steps for any application with binary treatments.
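The trade-off between covariate balance and effective sample size mentioned above can be illustrated with two common diagnostics. This is a minimal sketch under stated assumptions, not the tutorial's code: the abstract does not define either quantity, so the Kish approximation for effective sample size and a standardized mean difference with a user-supplied standard deviation are assumptions, and all function names are hypothetical.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def weighted_mean(values, weights):
    """Mean of `values` with each observation weighted by `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def standardized_mean_difference(x_treated, w_treated, x_control, w_control, sd):
    """Weighted difference in covariate means, scaled by a reference SD.

    `sd` is supplied by the caller (for example, the unweighted SD in the
    treated group); which SD to use is an analyst's choice, not fixed here.
    """
    diff = weighted_mean(x_treated, w_treated) - weighted_mean(x_control, w_control)
    return diff / sd


# Equal weights keep the full sample; uneven weights shrink it.
print(effective_sample_size([1, 1, 1, 1]))
print(effective_sample_size([3, 1]))

# Balance on one covariate after weighting the control group.
print(standardized_mean_difference([2, 4], [1, 1], [1, 3], [1, 3], sd=1.0))
```

More extreme weights typically improve balance at the cost of a smaller effective sample size, which is why the two diagnostics are usually read together.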

    A tutorial comparing different covariate balancing methods with an application evaluating the causal effect of exercise on the progression of Huntington’s disease

    Randomized controlled trials are the gold standard for measuring the causal effects of treatments on clinical outcomes. However, randomized trials are not always feasible, and causal treatment effects must, therefore, often be inferred from observational data. Observational study designs do not allow conclusions about causal relationships to be drawn unless statistical techniques are used to account for the imbalance of confounders across groups while key assumptions hold. Propensity score (PS) and balance weighting are two useful techniques that aim to reduce the imbalances between treatment groups by weighting the groups to look alike on the observed confounders. There are many methods available to estimate PS and balancing weights. However, it is unclear a priori which will achieve the best trade-off between covariate balance and effective sample size. Weighted analyses are further complicated by small studies with limited sample sizes, which are common when studying rare diseases. To address these issues, we present a step-by-step guide to covariate balancing strategies, including how to evaluate overlap, obtain estimates of PS and balancing weights, check for covariate balance, and assess sensitivity to unobserved confounding. We compare the performance of a number of commonly used estimation methods on a synthetic data set based on the Physical Activity and Exercise Outcomes in Huntington Disease (PACE-HD) study, which explored whether enhanced physical activity affects the progression and severity of the disease. We provide general guidelines for the choice of method for estimation of PS and balancing weights, interpretation, and sensitivity analysis of results. We also present R code for implementing the different methods and assessing balance.

    In-depth phenotyping for clinical stratification of Gaucher disease.

    Background: The Gaucher Investigative Therapy Evaluation is a national clinical cohort of 250 patients aged 5-87 years in the United Kingdom with Gaucher disease, an ultra-rare genetic disorder. To inform clinical decision-making and improve pathophysiological understanding, we characterized the course of Gaucher disease and explored the influence of costly innovative medication and other interventions. Retrospective and prospective clinical, laboratory and radiological information including molecular analysis of the GBA1 gene and comprising > 2500 variables were collected systematically into a relational database with banking of collated biological samples in a central bioresource. Data for deep phenotyping and life-quality evaluation, including skeletal, visceral, haematological and neurological manifestations, were recorded for a median of 17.3 years; the skeletal and neurological manifestations are the main focus of this study. Results: At baseline, 223 of the 250 patients were classified as type 1 Gaucher disease. Skeletal manifestations occurred in most patients in the cohort (131 of 201 specifically reported bone pain). Symptomatic osteonecrosis and fragility fractures occurred respectively in 76 and 37 of all 250 patients and the first osseous events occurred significantly earlier in those with neuronopathic disease. Intensive phenotyping in a subgroup of 40 patients originally considered to have only systemic features revealed neurological involvement in 18: two had Parkinson disease and 16 had clinical signs compatible with neuronopathic Gaucher disease, indicating a greater than expected prevalence of neurological features. Analysis of longitudinal real-world data enabled Gaucher disease to be stratified with respect to advanced therapies and splenectomy.
Splenectomy was associated with an increased hazard of fragility fractures, in addition to osteonecrosis and orthopaedic surgery; there were marked gender differences in fracture risk over time since splenectomy. Skeletal disease was a heavy burden of illness, especially where access to specific therapy was delayed and in patients requiring orthopaedic surgery. Conclusion: Gaucher disease has been explored using real-world data obtained in an era of therapeutic transformation. Introduction of advanced therapies and repeated longitudinal measures enabled this heterogeneous condition to be stratified into obvious clinical endotypes. The study reveals diverse and changing phenotypic manifestations with systemic, skeletal and neurological disease as inter-related sources of disability.

    Cohort profile: The UK COVID-19 Public Experiences (COPE) prospective longitudinal mixed-methods study of health and well-being during the SARSCoV2 coronavirus pandemic

    Public perceptions of pandemic viral threats and government policies can influence adherence to containment, delay, and mitigation policies such as physical distancing, hygienic practices, use of physical barriers, uptake of testing, contact tracing, and vaccination programs. The UK COVID-19 Public Experiences (COPE) study aims to identify determinants of health behaviour based on the Capability, Opportunity, Motivation (COM-B) model, using a longitudinal mixed-methods approach. Here, we provide a detailed description of the demographic and self-reported health characteristics of the COPE cohort at baseline assessment, an overview of data collected, and plans for follow-up of the cohort. The COPE baseline survey was completed by 11,113 UK adult residents (18+ years of age). Baseline data collection started on the 13th of March 2020 (10 days before the introduction of the first national COVID-19 lockdown in the UK) and finished on the 13th of April 2020. Participants were recruited via the HealthWise Wales (HWW) research registry and through social media snowballing and advertising (Facebook®, Twitter®, Instagram®). Participants were predominantly female (69%), over 50 years of age (68%), identified as white (98%), and were living with their partner (68%). A large proportion (67%) had a college/university level education, and half reported a pre-existing health condition (50%). Initial follow-up plans for the cohort included in-depth surveys at 3 months and 12 months after the first UK national lockdown to assess short and medium-term effects of the pandemic on health behaviour and subjective health and well-being. Additional consent will be sought from participants at follow-up for data linkage and surveys at 18 and 24 months after the initial UK national lockdown.
A large non-random sample was recruited to the COPE cohort during the early stages of the COVID-19 pandemic, which will enable longitudinal analysis of the determinants of health behaviour and changes in subjective health and well-being over the course of the pandemic.

    Characterization of antimicrobial resistant Gram-negative bacteria that cause neonatal sepsis in seven low and middle-income countries

    Antimicrobial resistance in neonatal sepsis is rising, yet mechanisms of resistance that often spread between species via mobile genetic elements, ultimately limiting treatments in low- and middle-income countries (LMICs), are poorly characterized. The Burden of Antibiotic Resistance in Neonates from Developing Societies (BARNARDS) network was initiated to characterize the cause and burden of antimicrobial resistance in neonatal sepsis for seven LMICs in Africa and South Asia. A total of 36,285 neonates were enrolled in the BARNARDS study between November 2015 and December 2017, of whom 2,483 were diagnosed with culture-confirmed sepsis. Klebsiella pneumoniae (n = 258) was the main cause of neonatal sepsis, with Serratia marcescens (n = 151), Klebsiella michiganensis (n = 117), Escherichia coli (n = 75) and Enterobacter cloacae complex (n = 57) also detected. We present whole-genome sequencing, antimicrobial susceptibility and clinical data for 916 out of 1,038 neonatal sepsis isolates (97 isolates were not recovered from initial isolation at local sites). Enterobacterales (K. pneumoniae, E. coli and E. cloacae) harboured multiple cephalosporin and carbapenem resistance genes. All isolated pathogens were resistant to multiple antibiotic classes, including those used to treat neonatal sepsis. Intraspecies diversity of K. pneumoniae and E. coli indicated that multiple antibiotic-resistant lineages cause neonatal sepsis. Our results will underpin research towards better treatments for neonatal sepsis in LMICs

    Neonatal sepsis and mortality in low-income and middle-income countries from a facility-based birth cohort: an international multisite prospective observational study

    Background: Neonatal sepsis is a primary cause of neonatal mortality and is an urgent global health concern, especially within low-income and middle-income countries (LMICs), where 99% of global neonatal mortality occurs. The aims of this study were to determine the incidence and associations with neonatal sepsis and all-cause mortality in facility-born neonates in LMICs. Methods: The Burden of Antibiotic Resistance in Neonates from Developing Societies (BARNARDS) study recruited mothers and their neonates into a prospective observational cohort study across 12 clinical sites from Bangladesh, Ethiopia, India, Pakistan, Nigeria, Rwanda, and South Africa. Data for sepsis-associated factors in the four domains of health care, maternal, birth and neonatal, and living environment were collected for all mothers and neonates enrolled. Primary outcomes were clinically suspected sepsis, laboratory-confirmed sepsis, and all-cause mortality in neonates during the first 60 days of life. Incidence proportion of livebirths for clinically suspected sepsis and laboratory-confirmed sepsis and incidence rate per 1000 neonate-days for all-cause mortality were calculated. Modified Poisson regression was used to investigate factors associated with neonatal sepsis and parametric survival models for factors associated with all-cause mortality. Findings: Between Nov 12, 2015 and Feb 1, 2018, 29 483 mothers and 30 557 neonates were enrolled. The incidence of clinically suspected sepsis was 166·0 (95% CI 97·69–234·24) per 1000 livebirths, laboratory-confirmed sepsis was 46·9 (19·04–74·79) per 1000 livebirths, and all-cause mortality was 0·83 (0·37–2·00) per 1000 neonate-days.
Maternal hypertension, previous maternal hospitalisation within 12 months, average or higher monthly household income, ward size (>11 beds), ward type (neonatal), living in a rural environment, preterm birth, perinatal asphyxia, and multiple births were associated with an increased risk of clinically suspected sepsis, laboratory-confirmed sepsis, and all-cause mortality. The majority (881 [72·5%] of 1215) of laboratory-confirmed sepsis cases occurred within the first 3 days of life. Interpretation: Findings from this study highlight the substantial proportion of neonates who develop neonatal sepsis, and the high mortality rates among neonates with sepsis in LMICs. More efficient and effective identification of neonatal sepsis is needed to target interventions to reduce its incidence and subsequent mortality in LMICs. Funding: Bill & Melinda Gates Foundation.
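
The abstract above reports incidence proportions per 1000 livebirths, an incidence rate per 1000 neonate-days, and a percentage of early-onset cases. The arithmetic behind such measures can be sketched as follows. The function names are hypothetical, and the counts in the first two calls are made-up illustrative numbers; the sketch does not attempt to reproduce the study's pooled estimates or confidence intervals, whose derivation is not given in the abstract. Only the 881 of 1215 figure is taken from the text.

```python
def per_thousand(events, denominator):
    """Events per 1000 units of the denominator (livebirths or neonate-days)."""
    return 1000 * events / denominator


def percentage(part, whole):
    """Share of `whole` made up by `part`, as a percentage."""
    return 100 * part / whole


# Illustrative (made-up) counts: 50 sepsis cases among 2000 livebirths.
print(per_thousand(50, 2000))

# Illustrative (made-up) counts: 5 deaths over 10000 neonate-days of follow-up.
print(per_thousand(5, 10000))

# Figure from the abstract: 881 of 1215 laboratory-confirmed cases
# occurred within the first 3 days of life.
print(round(percentage(881, 1215), 1))
```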