381 research outputs found

    The analysis of data where response or selection is dependent on the variable of interest

    In surveys of sensitive subjects, non-response may be dependent on the variable of interest, both at the unit and item levels. In some clinical and epidemiological studies, units are selected for entry on the basis of the outcome variable of interest. Both of these scenarios pose problems for statistical analysis, and standard techniques may be invalid or inefficient, except in some special cases. A new approach to the analysis of surveys of sensitive topics is developed, central to which is at least one variable which represents the enthusiasm to participate. This variable is included along with demographic variables in the calculation of a response propensity score. The score is derived as the fitted probabilities of item non-response to the question of interest. The distribution of the score for the unit non-responders is assumed equal to that of item non-responders. Response is assumed independent of the variable of interest, conditional on the score. Weights based on the score can be used to derive unbiased estimates of the distribution of the variable of interest. The bootstrap is recommended for confidence interval construction. The technique is applied to data from the National Survey of Sexual Attitudes and Lifestyles. A simplification of the technique is developed that does not use the bootstrap, and which enables users to analyse the data without knowledge of the factors affecting non-response, and using standard statistical software. To analyse the time from an initiating event to illness, a prospective study may be regarded as the optimal design. However, additional data from those already with the illness and still alive may also be available. A standard technique would be to ignore the additional data, and left-truncate the times to illness at study entry. We develop a full likelihood approach, and a weighted pseudo-likelihood approach, and compare these with the standard truncated data approach. The techniques are used to fit simple models of time to illness based on data from a study of time to AIDS from HIV seroconversion.
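
The weighting idea in the abstract above can be illustrated with a generic inverse-probability-weighted mean. This is a minimal sketch, not the author's procedure: it assumes each respondent already has an estimated probability of responding (the abstract derives its score from fitted probabilities of item non-response, which is not reproduced here), and the function name and data are hypothetical.

```python
def ipw_mean(values, response_probs):
    """Weighted mean of `values`, weighting each respondent by 1 / P(response).

    Assumption: `response_probs` are already-estimated response probabilities
    in (0, 1], e.g. from a propensity model fitted elsewhere.
    """
    if len(values) != len(response_probs):
        raise ValueError("values and response_probs must have the same length")
    if any(p <= 0.0 or p > 1.0 for p in response_probs):
        raise ValueError("response probabilities must lie in (0, 1]")
    weights = [1.0 / p for p in response_probs]
    weighted_total = sum(w * y for w, y in zip(weights, values))
    return weighted_total / sum(weights)


# Hypothetical data: 1 = reports the sensitive behaviour, 0 = does not.
y = [1, 0, 0, 1]
p = [0.5, 0.8, 0.8, 0.25]  # reluctant respondents get larger weights

unweighted = sum(y) / len(y)
weighted = ipw_mean(y, p)
```

In this toy example the weighted estimate exceeds the unweighted one because the respondents who report the behaviour are also the ones least likely to respond.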

    What type of cluster randomized trial for which setting?

    The cluster randomized trial allows a randomized evaluation when it is either not possible to randomize the individual or randomizing individuals would put the trial at high risk of contamination across treatment arms. There are many variations of the cluster randomized design, including the parallel design with or without baseline measures, the cluster randomized cross-over design, the stepped-wedge cluster randomized design, and more recently developed variants such as the batched stepped-wedge design and the staircase design. Once it has been clearly established that there is a need for cluster randomization, an ever-important question is which form the cluster design should take. If a design in which time is split into multiple trial periods is to be adopted (e.g. as in a stepped-wedge), researchers must decide whether the same participants should be measured in multiple trial periods (cohort sampling) or whether different participants should be measured in each period (continual recruitment or cross-sectional sampling). Here we outline the different possible options and weigh up the pros and cons of the different design choices, which revolve around statistical efficiency, study logistics and the assumptions required.

    A review of methodology for sample size calculations in cluster randomised trials

    PMCID: PMC3287737. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

    Review of methods for handling confounding by cluster and informative cluster size in clustered data.

    Clustered data are common in medical research. Typically, one is interested in a regression model for the association between an outcome and covariates. Two complications that can arise when analysing clustered data are informative cluster size (ICS) and confounding by cluster (CBC). ICS and CBC mean that the outcome of a member given its covariates is associated with, respectively, the number of members in the cluster and the covariate values of other members in the cluster. Standard generalised linear mixed models for cluster-specific inference and standard generalised estimating equations for population-average inference assume, in general, the absence of ICS and CBC. Modifications of these approaches have been proposed to account for CBC or ICS. This article is a review of these methods. We express their assumptions in a common format, thus providing greater clarity about the assumptions that methods proposed for handling CBC make about ICS and vice versa, and about when different methods can be used in practice. We report relative efficiencies of methods where available, describe how methods are related, identify a previously unreported equivalence between two key methods, and propose some simple additional methods. Unnecessarily using a method that allows for ICS/CBC has an efficiency cost when ICS and CBC are absent. We review tools for identifying ICS/CBC. A strategy for analysis when CBC and ICS are suspected is demonstrated by examining the association between socio-economic deprivation and preterm neonatal death in Scotland

    The optimal design of stepped wedge trials with equal allocation to sequences and a comparison to other trial designs.

    Background/Aims: We sought to optimise the design of stepped wedge trials with an equal allocation of clusters to sequences and explored sample size comparisons with alternative trial designs. Methods: We developed a new expression for the design effect for a stepped wedge trial, assuming that observations are equally correlated within clusters and an equal number of observations in each period between sequences switching to the intervention. We minimised the design effect with respect to (1) the fraction of observations before the first and after the final sequence switches (the periods with all clusters in the control or intervention condition, respectively) and (2) the number of sequences. We compared the design effect of this optimised stepped wedge trial to the design effects of a parallel cluster-randomised trial, a cluster-randomised trial with baseline observations, and a hybrid trial design (a mixture of cluster-randomised trial and stepped wedge trial) with the same total cluster size for all designs. Results: We found that a stepped wedge trial with an equal allocation to sequences is optimised by obtaining all observations after the first sequence switches and before the final sequence switches to the intervention; this means that the first sequence remains in the control condition and the last sequence remains in the intervention condition for the duration of the trial. With this design, the optimal number of sequences is [Formula: see text], where [Formula: see text] is the cluster-mean correlation, [Formula: see text] is the intracluster correlation coefficient, and m is the total cluster size. The optimal number of sequences is small when the intracluster correlation coefficient and cluster size are small and large when the intracluster correlation coefficient or cluster size is large. A cluster-randomised trial remains more efficient than the optimised stepped wedge trial when the intracluster correlation coefficient or cluster size is small. A cluster-randomised trial with baseline observations always requires a larger sample size than the optimised stepped wedge trial. The hybrid design can always give an equally or more efficient design, but will be at most 5% more efficient. We provide a strategy for selecting a design if the optimal number of sequences is unfeasible. For a non-optimal number of sequences, the sample size may be reduced by allowing a proportion of observations before the first or after the final sequence has switched. Conclusion: The standard stepped wedge trial is inefficient. To reduce sample sizes when a hybrid design is unfeasible, stepped wedge trial designs should have no observations before the first sequence switches or after the final sequence switches.
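
The quantities named in the abstract above (intracluster correlation coefficient, total cluster size m, cluster-mean correlation) can be related with a short sketch. It uses the usual textbook definitions for equally correlated observations: the parallel cluster-randomised design effect 1 + (m - 1)ρ and the cluster-mean correlation mρ / (1 + (m - 1)ρ). The paper's own stepped wedge design effect and its formula for the optimal number of sequences are elided in the abstract and are deliberately not reconstructed here; the function names and example values are hypothetical.

```python
def parallel_design_effect(m, icc):
    """Design effect of a parallel cluster-randomised trial: 1 + (m - 1) * icc."""
    if m < 1 or not 0.0 <= icc <= 1.0:
        raise ValueError("need m >= 1 and 0 <= icc <= 1")
    return 1.0 + (m - 1) * icc


def cluster_mean_correlation(m, icc):
    """Correlation between two means of m observations drawn from the same cluster."""
    return m * icc / parallel_design_effect(m, icc)


# Hypothetical values: 100 observations per cluster, ICC of 0.05.
m, icc = 100, 0.05
de = parallel_design_effect(m, icc)      # 1 + 99 * 0.05
r = cluster_mean_correlation(m, icc)     # 5 / 5.95
```

Even a modest intracluster correlation produces a large cluster-mean correlation once clusters are big, which is consistent with the abstract's statement that cluster size and intracluster correlation jointly drive the design choice.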

    Methods for observed-cluster inference when cluster size is informative: a review and clarifications.

    Clustered data commonly arise in epidemiology. We assume each cluster member has an outcome Y and covariates X. When there are missing data in Y, the distribution of Y given X in all cluster members ("complete clusters") may be different from the distribution just in members with observed Y ("observed clusters"). Often the former is of interest, but when data are missing because in a fundamental sense Y does not exist (e.g., quality of life for a person who has died), the latter may be more meaningful (quality of life conditional on being alive). Weighted and doubly weighted generalized estimating equations and shared random-effects models have been proposed for observed-cluster inference when cluster size is informative, that is, the distribution of Y given X in observed clusters depends on observed cluster size. We show these methods can be seen as actually giving inference for complete clusters and may not also give observed-cluster inference. This is true even if observed clusters are complete in themselves rather than being the observed part of larger complete clusters: here methods may describe imaginary complete clusters rather than the observed clusters. We show under which conditions shared random-effects models proposed for observed-cluster inference do actually describe members with observed Y. A psoriatic arthritis dataset is used to illustrate the danger of misinterpreting estimates from shared random-effects models. SRS is funded by MRC grants U1052 60558 and MC_US_A030_0015, AJC and MP by MRC grant G0600657.

    Investigating the relationship between HIV testing and risk behaviour in Britain: National Survey of Sexual Attitudes and Lifestyles 2000.

    OBJECTIVES: To estimate the prevalence of, and identify factors associated with, HIV testing in Britain. DESIGN: A large, stratified probability sample survey of sexual attitudes and lifestyles. METHODS: A total of 12,110 16-44 year olds completed a computer-assisted face-to-face interview and self-interview. Self-reports of HIV testing, i.e. the timing, reasons for and location of testing, were included. RESULTS: A total of 32.4% of men and 31.7% of women reported ever having had an HIV test, the majority of whom were tested through blood donation. When screening for blood donation and pregnancy were excluded, 9.0% of men and 4.6% of women had had a voluntary confidential HIV test (VCT) in the past 5 years. However, one third of injecting drug users and men who have sex with men had a VCT in the past 5 years. VCT in the past 5 years was significantly associated with age, residence, ethnicity, self-perceived HIV risk, reporting greater numbers of sexual partners, new sexual partners from abroad, previous sexually transmitted infection diagnosis, and injecting non-prescribed drugs for men and women, and same-sex partners (men only). Whereas sexually transmitted disease clinics were important sites for VCT, general practice accounted for almost a quarter of VCT. CONCLUSION: HIV testing is relatively common in Britain; however, it remains largely associated with population-based blood donation and antenatal screening programmes. In contrast, VCT remains highly associated with high-risk (sexual or drug-injecting) behaviours or population sub-groups at high risk. Strategies to reduce undiagnosed prevalent HIV infection will require further normalization and wider uptake of HIV testing

    COVID-19 related mortality and hospital admissions in the VIVALDI study cohort: October 2020-March 2023

    Background: Long-term care facilities (LTCFs) were heavily affected by COVID-19 early in the pandemic, but the impact of the virus has reduced over time with vaccination campaigns and build-up of immunity from prior infection. // Objectives: To evaluate the mortality and hospital admissions associated with SARS-CoV-2 in LTCFs in England over the course of the VIVALDI study, from October 2020 to March 2023. // Methods: We included residents aged ≥65 years of participating LTCFs who had available follow-up time within the analysis period. We calculated incidence rates (IR) of COVID-19 linked mortality and hospital admissions per calendar quarter, along with infection fatality ratios (IFR, within 28d) and infection hospitalisation ratios (IHR, within 14d) following positive SARS-CoV-2 test. // Results: A total of 26286 residents were included, with at least one positive test for SARS-CoV-2 in 8513 (32.4%). The IR of COVID-19 related mortality peaked in the first quarter (Q1) 2021 at 0.47 per 1000 person-days (1kpd) (around a third of all deaths), in comparison to 0.10 per 1kpd for Q1 2023 which had a similar IR of SARS-CoV-2 infections. There was a fall in observed IFR for SARS-CoV-2 infections from 24.9% to 6.7% between these periods, with a fall in IHR from 12.1% to 8.8%. The population had high overall IRs for mortality for each quarter evaluated, corresponding to annual mortality probability of 28.8-41.3%. // Conclusions: Standardised real-time monitoring of hospitalisation and mortality following infection in LTCFs could inform policy on the need for non-pharmaceutical interventions to prevent transmission
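
The abstract above reports incidence rates per 1000 person-days alongside annual mortality probabilities. One common way to connect the two is sketched below. It assumes a constant hazard over the year, so that probability = 1 - exp(-rate × time); the abstract does not state how the authors performed the conversion, and the function names, the 365.25-day year and the event counts are hypothetical.

```python
import math

DAYS_PER_YEAR = 365.25  # assumption: average calendar year length


def incidence_rate_per_1kpd(events, person_days):
    """Events per 1000 person-days of follow-up."""
    if person_days <= 0:
        raise ValueError("person_days must be positive")
    return 1000.0 * events / person_days


def annual_probability(rate_per_1kpd):
    """Probability of the event within one year under a constant hazard."""
    daily_hazard = rate_per_1kpd / 1000.0
    return 1.0 - math.exp(-daily_hazard * DAYS_PER_YEAR)


# Hypothetical counts chosen only to give a rate of 0.47 per 1000 person-days.
ir = incidence_rate_per_1kpd(events=47, person_days=100_000)

# An all-cause rate of 1.0 per 1000 person-days corresponds to roughly a
# 31% annual probability under the constant-hazard assumption.
annual = annual_probability(1.0)
```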

    A systematic review of the screening accuracy of the HIV Dementia Scale and International HIV Dementia Scale.

    BACKGROUND: The HIV Dementia Scale (HDS) and International HIV Dementia Scale (IHDS) are brief tools that have been developed to screen for and aid diagnosis of HIV-associated dementia (HAD). They are increasingly being used in clinical practice for minor neurocognitive disorder (MND) as well as HAD, despite uncertainty about their accuracy. METHODS AND FINDINGS: A systematic review of the accuracy of the HDS and IHDS was conducted. Studies were assessed on Standards for Reporting Diagnostic Accuracy criteria. Pooled sensitivity, specificity, likelihood ratios (LR) and diagnostic odds ratios (DOR) were calculated for each scale as a test for HAD or MND. We retrieved 15 studies of the HDS, 10 of the IHDS, and 1 of both scales. Thirteen studies of the HDS were conducted in North America, and 7 of the IHDS studies were conducted in sub-Saharan Africa. Estimates of accuracy were highly heterogeneous between studies for the HDS but less so for the IHDS. Pooled DOR for the HDS was 7.52 (95% confidence interval 3.75-15.11), sensitivity and specificity for HAD were estimated at 68.1% and 77.9%, and sensitivity and specificity for MND were estimated at 42.0% and 91.2%. Pooled DOR for the IHDS was 3.49 (2.12-5.73), sensitivity and specificity for HAD were 74.3% and 54.7%, and sensitivity and specificity for MND were 64.3% and 66.0%. CONCLUSION: Both scales were low in accuracy. The literature is limited by the lack of a gold standard, and variation in estimates of accuracy is likely to be due to differences in reference standard. There is a lack of studies comparing both scales, and they have been studied in different populations, but the IHDS may be less specific than the HDS. These rapid tests are not recommended for diagnostic use, and further research is required to inform their use in asymptomatic screening
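
The accuracy measures named in the abstract above are linked by standard definitions: LR+ = sensitivity / (1 - specificity), LR- = (1 - sensitivity) / specificity, and DOR = LR+ / LR-. The sketch below applies these to the HAD sensitivity and specificity figures quoted in the abstract. It is an illustration only: pooled estimates in a meta-analysis are normally obtained by pooling study-level data rather than by this arithmetic, and the function name is hypothetical.

```python
def accuracy_summary(sensitivity, specificity):
    """Return (LR+, LR-, DOR) for a test with the given sensitivity and specificity."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative, lr_positive / lr_negative


# HAD figures quoted in the abstract.
hds_lr_pos, hds_lr_neg, hds_dor = accuracy_summary(0.681, 0.779)
ihds_lr_pos, ihds_lr_neg, ihds_dor = accuracy_summary(0.743, 0.547)

print(round(hds_dor, 2))   # 7.52
print(round(ihds_dor, 2))  # 3.49
```

Both values agree with the pooled diagnostic odds ratios reported in the abstract for HAD.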