56 research outputs found

    A note on obtaining correct marginal predictions from a random intercepts model for binary outcomes.

    BACKGROUND: Clustered data with binary outcomes are often analysed using random intercepts models or generalised estimating equations (GEE) resulting in cluster-specific or 'population-average' inference, respectively. METHODS: When a random effects model is fitted to clustered data, predictions may be produced for a member of an existing cluster by using estimates of the fixed effects (regression coefficients) and the random effect for the cluster (conditional risk calculation), or for a member of a new cluster (marginal risk calculation). We focus on the second. Marginal risk calculation from a random effects model is obtained by integrating over the distribution of random effects. However, in practice marginal risks are often obtained, incorrectly, using only estimates of the fixed effects (i.e. by effectively setting the random effects to zero). We compare these two approaches to marginal risk calculation in terms of model calibration. RESULTS: In simulation studies, it has been seen that use of the incorrect marginal risk calculation from random effects models results in poorly calibrated overall marginal predictions (calibration slope <1 and calibration in the large ≠ 0) with mis-calibration becoming worse with higher degrees of clustering. We clarify that this was due to the incorrect calculation of marginal predictions from a random intercepts model and explain intuitively why this approach is incorrect. We show via simulation that the correct calculation of marginal risks from a random intercepts model results in predictions with excellent calibration. CONCLUSION: The logistic random intercepts model can be used to obtain valid marginal predictions by integrating over the distribution of random effects
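The difference between the two marginal risk calculations described in this abstract can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes a logistic random intercepts model with a Normal(0, sigma_u^2) random intercept, and the function names, the choice of Gauss-Hermite quadrature and the example values are illustrative assumptions.

```python
import numpy as np


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))


def fixed_effects_only_risk(eta):
    """The calculation the abstract calls incorrect for marginal risk:
    plug in the fixed-effects linear predictor with the random intercept set to 0."""
    return float(expit(eta))


def marginal_risk(eta, sigma_u, n_nodes=40):
    """Marginal risk for a member of a new cluster: average expit(eta + u)
    over u ~ Normal(0, sigma_u**2), here by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    u = np.sqrt(2.0) * sigma_u * nodes  # change of variable for the Normal density
    return float(np.sum(weights * expit(eta + u)) / np.sqrt(np.pi))


if __name__ == "__main__":
    eta = -2.0  # hypothetical fixed-effects linear predictor
    for sigma_u in (0.0, 0.5, 1.0, 2.0):  # hypothetical random-intercept SDs
        print(sigma_u, fixed_effects_only_risk(eta), marginal_risk(eta, sigma_u))
```

With sigma_u = 0 the two calculations coincide; as sigma_u grows, the integrated risk moves towards 0.5 while the fixed-effects-only value stays put, which is consistent with the abstract's remark that mis-calibration worsens with stronger clustering.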

    Sample size calculation for a stepped wedge trial.

    BACKGROUND: Stepped wedge trials (SWTs) can be considered as a variant of a clustered randomised trial, although in many ways they embed additional complications from the point of view of statistical design and analysis. While the literature is rich for standard parallel or clustered randomised clinical trials (CRTs), it is much less so for SWTs. The specific features of SWTs need to be addressed properly in the sample size calculations to ensure valid estimates of the intervention effect. METHODS: We critically review the available literature on analytical methods to perform sample size and power calculations in a SWT. In particular, we highlight the specific assumptions underlying currently used methods and comment on their validity and potential for extensions. Finally, we propose the use of simulation-based methods to overcome some of the limitations of analytical formulae. We performed a simulation exercise in which we compared simulation-based sample size computations with analytical methods and assessed the impact of varying the basic parameters on the resulting sample size/power, in the case of continuous and binary outcomes and assuming both cross-sectional data and the closed cohort design. RESULTS: We compared the sample size requirements for a SWT with those for CRTs based on a comparable number of measurements in each cluster. In line with the existing literature, we found that when the level of correlation within the clusters is relatively high (for example, greater than 0.1), the SWT requires a smaller number of clusters. For low values of the intracluster correlation, the two designs produce more similar requirements in terms of total number of clusters. We validated our simulation-based approach and compared the results of sample size calculations to analytical methods; the simulation-based procedures perform well, producing results that are extremely similar to the analytical methods. 
We found that usually the SWT is relatively insensitive to variations in the intracluster correlation, and that failure to account for a potential time effect will artificially and grossly overestimate the power of a study. CONCLUSIONS: We provide a framework for handling the sample size and power calculations of a SWT and suggest that simulation-based procedures may be more effective, especially in dealing with the specific features of the study at hand. In selected situations and depending on the level of intracluster correlation and the cluster size, SWTs may be more efficient than comparable CRTs. However, the decision about the design to be implemented will be based on a wide range of considerations, including the cost associated with the number of clusters, number of measurements and the trial duration
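The point about time effects can be illustrated with an analytical power calculation. The sketch below is not the authors' simulation procedure. It assumes a cross-sectional design with a continuous outcome, a linear mixed model with a random cluster intercept fitted to cluster-period means, known variance components and a Normal approximation for the test. The function names and example values are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def stepped_wedge_design(n_clusters, n_steps):
    """Treatment indicators (clusters x periods) for a design with n_steps + 1
    periods: all clusters start in control and cross over in equal-sized groups."""
    n_periods = n_steps + 1
    X = np.zeros((n_clusters, n_periods))
    for i in range(n_clusters):
        step = i * n_steps // n_clusters  # group index 0 .. n_steps - 1
        X[i, step + 1:] = 1.0
    return X


def swt_power(X, m, effect, sd_total, icc, alpha=0.05, time_effects=True):
    """Approximate power for the treatment effect from the generalised least
    squares variance, with m individuals per cluster-period (assumed model)."""
    n_clusters, n_periods = X.shape
    tau2 = icc * sd_total ** 2                    # between-cluster variance
    sigma2 = (1.0 - icc) * sd_total ** 2 / m      # variance of a cluster-period mean
    V = sigma2 * np.eye(n_periods) + tau2 * np.ones((n_periods, n_periods))
    V_inv = np.linalg.inv(V)
    # Either one fixed effect per period, or a single common intercept.
    time_cols = np.eye(n_periods) if time_effects else np.ones((n_periods, 1))
    p = time_cols.shape[1] + 1
    info = np.zeros((p, p))
    for i in range(n_clusters):
        Z = np.column_stack([time_cols, X[i]])    # treatment is the last column
        info += Z.T @ V_inv @ Z
    var_theta = np.linalg.inv(info)[-1, -1]
    z = NormalDist()
    return float(z.cdf(abs(effect) / np.sqrt(var_theta) - z.inv_cdf(1.0 - alpha / 2.0)))


if __name__ == "__main__":
    X = stepped_wedge_design(n_clusters=12, n_steps=4)
    for te in (True, False):
        print(te, swt_power(X, m=20, effect=0.3, sd_total=1.0, icc=0.05, time_effects=te))
```

Under these assumptions, dropping the period effects from the model can only reduce the calculated variance of the treatment effect, so the power computed without them is at least as large as the power computed with them.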

    The accuracy of clinician predictions of survival in the Prognosis in Palliative care Study II (PiPS2): A prospective observational study

    BACKGROUND: Prognostic information is important for patients with cancer, their families, and clinicians. In practice, survival predictions are made by clinicians based on their experience, judgement, and intuition. Previous studies have reported that clinicians' survival predictions are often inaccurate. This study reports a secondary analysis of data from the Prognosis in Palliative care Study II (PiPS2) to assess the accuracy of survival estimates made by doctors and nurses. METHODS AND FINDINGS: Adult patients (n = 1833) with incurable, locally advanced or metastatic cancer, recently referred to palliative care services (community teams, hospital teams, and inpatient palliative care units) were recruited. Doctors (n = 431) and nurses (n = 777) provided independent prognostic predictions and an agreed multi-professional prediction for each patient. Clinicians provided prognostic estimates in several formats including predictions about length of survival and probability of surviving to certain time points. There was a minimum follow up of three months or until death (whichever was sooner; maximum follow-up 783 days). Agreed multi-professional predictions about whether patients would survive for days, weeks or months+ were accurate on 61.9% of occasions. The positive predictive value of clinicians' predictions about imminent death (within one week) was 77% for doctors and 79% for nurses. The sensitivity of these predictions was low (37% and 35% respectively). Specific predictions about how many weeks patients would survive were not very accurate but showed good discrimination (patients estimated to survive for shorter periods had worse outcomes). The accuracy of clinicians' probabilistic predictions (assessed using Brier's scores) was consistently better than chance, improved with proximity to death and showed good discrimination between groups of patients with different survival outcomes. 
CONCLUSIONS: Using a variety of different approaches, this study found that clinicians' predictions of survival show good discrimination and accuracy, regardless of whether the predictions are about how long or how likely patients are to survive. Accuracy improves with proximity to death. Although the positive predictive value of estimates of imminent death is relatively high, the sensitivity of such predictions is relatively low. Despite limitations, the clinical prediction of survival should remain the benchmark against which any innovations in prognostication are judged. STUDY REGISTRATION: ISRCTN13688211. http://www.isrctn.com/ISRCTN13688211
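For readers unfamiliar with the accuracy measures named in this abstract, the sketch below shows how positive predictive value, sensitivity and the Brier score are commonly computed. It is a generic illustration with made-up toy data, not the study's analysis code, and the function names are hypothetical.

```python
def ppv_and_sensitivity(predicted, observed):
    """predicted[i] is True if imminent death was predicted for patient i,
    observed[i] is True if the patient actually died within the time window."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)        # predicted and occurred
    fp = sum(1 for p, o in pairs if p and not o)    # predicted, did not occur
    fn = sum(1 for p, o in pairs if not p and o)    # occurred, not predicted
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.
    Lower is better; a constant forecast of 0.5 scores 0.25."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


if __name__ == "__main__":
    # Toy data for illustration only (not from PiPS2).
    predicted = [True, True, False, False, False, True, False, False]
    observed = [True, True, True, True, False, False, True, False]
    print(ppv_and_sensitivity(predicted, observed))
    print(brier_score([0.9, 0.2, 0.7, 0.1], [1, 0, 1, 0]))
```

A high positive predictive value alongside low sensitivity, as reported above, means that predictions of imminent death were usually right when made, but most imminent deaths were not predicted.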

    Quality research in healthcare: are researchers getting enough statistical support?

    BACKGROUND: Reviews of peer-reviewed health studies have highlighted problems with their methodological quality. As published health studies form the basis of many clinical decisions, including the evaluation and provision of health services, this has scientific and ethical implications. The lack of involvement of methodologists (defined as statisticians or quantitative epidemiologists) has been suggested as one key reason for this problem, and this has been linked to the lack of access to methodologists. This issue was highlighted several years ago and it was suggested that more investment was needed from health care organisations and universities to alleviate this problem. METHODS: To assess the current level of methodological support available for health researchers in England, we surveyed the 25 National Health Service Trusts in England that are the major recipients of the Department of Health's research and development (R&D) support funding. RESULTS AND DISCUSSION: The survey shows that the earmarking of resources to provide appropriate methodological support to health researchers in these organisations is not widespread. Neither the level of R&D support funding received nor the volume of research undertaken by these organisations showed any association with the amount they spent in providing a central resource for methodological support for their researchers. CONCLUSION: The promotion and delivery of high quality health research requires that organisations hosting health research and their academic partners put in place funding and systems to provide appropriate methodological support to ensure valid research findings. If resources are limited, health researchers may have to rely on short courses and/or a limited number of advisory sessions which may not always produce satisfactory results

    Stepped wedge randomised controlled trials: systematic review of studies published between 2010 and 2014.

    BACKGROUND: In a stepped wedge, cluster randomised trial, clusters receive the intervention at different time points, and the order in which they receive it is randomised. Previous systematic reviews of stepped wedge trials have documented a steady rise in their use between 1987 and 2010, which was attributed to the design's perceived logistical and analytical advantages. However, the interventions included in these systematic reviews were often poorly reported and did not adequately describe the analysis and/or methodology used. Since 2010, a number of additional stepped wedge trials have been published. This article aims to update previous systematic reviews, and consider what interventions were tested and the rationale given for using a stepped wedge design. METHODS: We searched PubMed, PsycINFO, the Cumulative Index to Nursing and Allied Health Literature (CINAHL), the Web of Science, the Cochrane Library and the Current Controlled Trials Register for articles published between January 2010 and May 2014. We considered stepped wedge randomised controlled trials in all fields of research. We independently extracted data from retrieved articles and reviewed them. Interventions were then coded using the functions specified by the Behaviour Change Wheel, and for behaviour change techniques using a validated taxonomy. RESULTS: Our review identified 37 stepped wedge trials, reported in 10 articles presenting trial results, one conference abstract, 21 protocol or study design articles and five trial registrations. These were mostly conducted in developed countries (n = 30), and within healthcare organisations (n = 28). A total of 33 of the interventions were educationally based, with the most commonly used behaviour change techniques being 'instruction on how to perform a behaviour' (n = 32) and 'persuasive source' (n = 25). 
Authors gave a wide range of reasons for the use of the stepped wedge trial design, including ethical, logistical, financial and methodological considerations. The adequacy of reporting varied across studies: many did not provide sufficient detail regarding the methodology or calculation of the required sample size. CONCLUSIONS: The popularity of stepped wedge trials has increased since 2010, predominantly in high-income countries. However, there is a need for further guidance on their reporting and analysis

    Estimation of required sample size for external validation of risk models for binary outcomes

    Risk-prediction models for health outcomes are used in practice as part of clinical decision-making, and it is essential that their performance be externally validated. An important aspect in the design of a validation study is choosing an adequate sample size. In this paper, we investigate the sample size requirements for validation studies with binary outcomes to estimate measures of predictive performance (C-statistic for discrimination and calibration slope and calibration in the large). We aim for sufficient precision in the estimated measures. In addition, we investigate the sample size to achieve sufficient power to detect a difference from a target value. Under normality assumptions on the distribution of the linear predictor, we obtain simple estimators for sample size calculations based on the measures above. Simulation studies show that the estimators perform well for common values of the C-statistic and outcome prevalence when the linear predictor is marginally Normal. Their performance deteriorates only slightly when the normality assumptions are violated. We also propose estimators which do not require normality assumptions but require specification of the marginal distribution of the linear predictor and require the use of numerical integration. These estimators were also seen to perform very well under marginal normality. Our sample size equations require a specified standard error (SE) and the anticipated C-statistic and outcome prevalence. The sample size requirement varies according to the prognostic strength of the model, outcome prevalence, choice of the performance measure and study objective. For example, to achieve an SE < 0.025 for the C-statistic, 60–170 events are required if the true C-statistic and outcome prevalence are between 0.64–0.85 and 0.05–0.3, respectively. For the calibration slope and calibration in the large, achieving SE < 0.15 would require 40–280 and 50–100 events, respectively. 
Our estimators may also be used for survival outcomes when the proportion of censored observations is high
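To give a feel for precision-based sample size reasoning for the C-statistic, the sketch below searches for the number of events at which an approximate standard error falls below a target. It does not reproduce the estimators proposed in the paper. It substitutes the older Hanley and McNeil (1982) approximation for the standard error of an area under the ROC curve, and the function names and example inputs are hypothetical.

```python
import math


def hanley_mcneil_se(c, n_events, n_nonevents):
    """Approximate SE of the C-statistic (AUC) using the Hanley-McNeil formula.
    This is a stand-in approximation, not the paper's estimator."""
    q1 = c / (2.0 - c)
    q2 = 2.0 * c ** 2 / (1.0 + c)
    var = (
        c * (1.0 - c)
        + (n_events - 1) * (q1 - c ** 2)
        + (n_nonevents - 1) * (q2 - c ** 2)
    ) / (n_events * n_nonevents)
    return math.sqrt(var)


def events_needed(c, prevalence, target_se, max_events=100_000):
    """Smallest number of events for which the approximate SE is below target_se,
    with non-events implied by the anticipated outcome prevalence."""
    for n_events in range(2, max_events + 1):
        n_nonevents = max(1, round(n_events * (1.0 - prevalence) / prevalence))
        if hanley_mcneil_se(c, n_events, n_nonevents) < target_se:
            return n_events
    raise ValueError("target SE not reached within max_events")


if __name__ == "__main__":
    # Hypothetical anticipated values for illustration.
    for c, prev in ((0.65, 0.1), (0.75, 0.1), (0.85, 0.3)):
        print(c, prev, events_needed(c, prev, target_se=0.025))
```

Because this is a different approximation, its output should not be expected to match the ranges quoted in the abstract exactly; it only shows how the required number of events depends on the anticipated C-statistic and prevalence.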

    Prediction of thrombo-embolic risk in patients with hypertrophic cardiomyopathy (HCM Risk-CVA)

    Aims Atrial fibrillation (AF) and thrombo-embolism (TE) are associated with reduced survival in hypertrophic cardiomyopathy (HCM), but the absolute risk of TE in patients with and without AF is unclear. The primary aim of this study was to derive and validate a model for estimating the risk of TE in HCM. Exploratory analyses were performed to determine predictors of TE, the performance of the CHA2DS2-VASc score, and outcome with vitamin K antagonists (VKAs). Methods and results A retrospective, longitudinal cohort of seven institutions was used to develop multivariable Cox regression models fitted with pre-selected predictors. Bootstrapping was used for validation. Of 4821 HCM patients recruited between 1986 and 2008, 172 (3.6%) reached the primary endpoint of cerebrovascular accident (CVA), transient ischaemic attack (TIA), or systemic peripheral embolus within 10 years. A total of 27.5% of patients had a CHA2DS2-VASc score of 0, of whom 9.8% developed TE during follow-up. Cox regression revealed an association between TE and age, AF, the interaction between age and AF, TE prior to first evaluation, NYHA class, left atrial (LA) diameter, vascular disease, and maximal LV wall thickness. There was a curvilinear relationship between LA size and TE risk. The model predicted TE with a C-index of 0.75 [95% confidence interval (CI) 0.70-0.80] and the D-statistic was 1.30 (95% CI 1.05-1.56). VKA treatment was associated with a 54.8% (95% CI 31-97%, P = 0.037) relative risk reduction in HCM patients with AF. Conclusions The study shows that the risk of TE in HCM patients can be identified using a small number of simple clinical features. LA size, in particular, should be monitored closely, and the assessment and treatment of conventional vascular risk factors should be routine practice in older patients. Exploratory analyses show for the first time evidence for a reduction of TE with VKA treatment. 
The CHA2DS2-VASc score does not appear to correlate well with the clinical outcome in patients with HCM and should not be used to assess TE risk in this population

    Safety and Efficacy of Liraglutide, 3.0 mg, Once Daily vs Placebo in Patients With Poor Weight Loss Following Metabolic Surgery: The BARI-OPTIMISE Randomized Clinical Trial

    IMPORTANCE: Metabolic surgery leads to weight loss and improved health, but these outcomes are highly variable. Poor weight loss is associated with lower circulating levels of glucagon-like peptide-1 (GLP-1). OBJECTIVE: To assess the efficacy and safety of the GLP-1 receptor agonist, liraglutide, 3.0 mg, on percentage body weight reduction in patients with poor weight loss and suboptimal GLP-1 response after metabolic surgery. DESIGN, SETTING, AND PARTICIPANTS: The Evaluation of Liraglutide 3.0 mg in Patients With Poor Weight Loss and a Suboptimal Glucagon-Like Peptide-1 Response (BARI-OPTIMISE) randomized placebo-controlled trial recruited adult patients at least 1 year after metabolic surgery who had experienced 20% or less body weight loss from the day of surgery and a suboptimal nutrient-stimulated GLP-1 response from 2 hospitals in London, United Kingdom, between October 2018 and November 2019. Key exclusion criteria were type 1 diabetes; severe concomitant psychiatric, gastrointestinal, cardiac, kidney or metabolic disease; and use of insulin, GLP-1 receptor analogues, and medication that can affect weight. The study period was 24 weeks followed by a 4-week follow-up period. Last participant follow-up was completed in June 2020. All participants and clinical study personnel were blinded to treatment allocation. Of 154 assessed for eligibility, 70 met trial criteria and were included in the study, and 57 completed follow-up. INTERVENTIONS: Liraglutide, 3.0 mg, once daily or placebo as an adjunct to lifestyle intervention with a 500-kcal daily energy deficit for 24 weeks, on a 1:1 allocation by computer-generated randomization sequence, stratified by surgery type (Roux-en-Y gastric bypass [RYGB] or sleeve gastrectomy [SG]) and type 2 diabetes status. MAIN OUTCOME AND MEASURES: The primary outcome was change in percentage body weight from baseline to the end of the 24-week study period based on an intention-to-treat analysis. 
Participant safety was assessed through monitoring of biochemical parameters, including kidney and liver function, physical examination, and assessment for adverse events. RESULTS: A total of 70 participants (mean [SD] age, 47.6 [10.7] years; 52 [74%] female) with a poor weight loss response following RYGB or SG were randomized to receive 3.0-mg liraglutide (n = 35) or placebo (n = 35). All participants received at least 1 dose of the trial drug. Eight participants discontinued treatment (4 per group), and 2 in the 3.0-mg liraglutide group and 1 in the placebo group were lost to follow-up. Due to COVID-19 restrictions, 3 participants in the 3.0-mg liraglutide group and 7 in the placebo group were unable to attend their final in-person assessment. Estimated change in mean (SD) percentage body weight from baseline to week 24 was -8.82 (4.94) with liraglutide, 3.0 mg (n = 31), vs -0.54 (3.32) with placebo (n = 26). The mean difference in percentage body weight change for liraglutide, 3.0 mg, vs placebo was -8.03 (95% CI, -10.39 to -5.66; P < .001). Adverse events, predominantly gastrointestinal, were more frequent with liraglutide, 3.0 mg (28 events [80%]), than placebo (20 events [57%]). There were no serious adverse events and no treatment-related deaths. CONCLUSION AND RELEVANCE: These findings support the use of adjuvant liraglutide, 3.0 mg, for weight management in patients with poor weight loss and suboptimal GLP-1 response after metabolic surgery. TRIAL REGISTRATION: ClinicalTrials.gov Identifier: NCT03341429

    Community Occupational Therapy for people with dementia and family carers (COTiD-UK) versus treatment as usual (Valuing Active Life in Dementia [VALID]) study: A single-blind, randomised controlled trial.

    BACKGROUND: We aimed to estimate the clinical effectiveness of Community Occupational Therapy for people with dementia and family carers-UK version (Community Occupational Therapy in Dementia-UK version [COTiD-UK]) relative to treatment as usual (TAU). We hypothesised that COTiD-UK would improve the ability of people with dementia to perform activities of daily living (ADL), and family carers' sense of competence, compared with TAU. METHODS AND FINDINGS: The study design was a multicentre, 2-arm, parallel-group, assessor-masked, individually randomised controlled trial (RCT) with internal pilot. It was conducted in 15 sites across England from September 2014 to January 2018. People with a diagnosis of mild to moderate dementia living in their own home were recruited in pairs with a family carer who provided domestic or personal support for at least 4 hours per week. Pairs were randomised to either receive COTiD-UK, which comprised 10 hours of occupational therapy delivered over 10 weeks in the person with dementia's home or TAU, which comprised the usual local service provision that may or may not include standard occupational therapy. The primary outcome was the Bristol Activities of Daily Living Scale (BADLS) score at 26 weeks. Secondary outcomes for the person with dementia included the following: the BADLS scores at 52 and 78 weeks, cognition, quality of life, and mood; and for the family carer: sense of competence and mood; plus the number of social contacts and leisure activities for both partners. Participants were analysed by treatment allocated. A total of 468 pairs were recruited: people with dementia ranged from 55 to 97 years with a mean age of 78.6 and family carers ranged from 29 to 94 with a mean of 69.1 years. Of the people with dementia, 74.8% were married and 19.2% lived alone. Of the family carers, 72.6% were spouses, and 22.2% were adult children. 
On randomisation, 249 pairs were assigned to COTiD-UK (62% of people with dementia and 23% of carers were male) and 219 to TAU (52% of people with dementia and 32% of carers were male). At the 26-week follow-up, data were available for 364 pairs (77.8%). The BADLS score at 26 weeks did not differ significantly between groups (adjusted mean difference estimate 0.35, 95% CI -0.81 to 1.51; p = 0.55). Secondary outcomes did not differ between the groups. In total, 91% of the activity-based goals set by the pairs taking part in the COTiD-UK intervention were fully or partially achieved by the final COTiD-UK session. Study limitations include the following: intervention fidelity was moderate but varied across and within sites, and the study relied primarily on proxy data focused on measuring the level of functional or cognitive impairment, which may not truly reflect the actual performance and views of the person living with dementia. CONCLUSIONS: Providing community occupational therapy as delivered in this study did not improve ADL performance, cognition, quality of life, or mood in people with dementia nor sense of competence or mood in family carers. Future research should consider measuring person-centred outcomes that are more meaningful and closely aligned to participants' priorities, such as goal achievement or the quantity and quality of activity engagement and participation. TRIAL REGISTRATION: Current Controlled Trials ISRCTN10748953