
    Misclassified group-tested current status data.

    Group testing, introduced by Dorfman (1943), has been used to reduce costs when estimating the prevalence of a binary characteristic based on a screening test of pooled groups formed from N independent individuals in total. If the unknown prevalence is low and the screening test suffers from misclassification, it is also possible to obtain more precise prevalence estimates than those obtained from testing all N samples separately (Tu et al., 1994). In some applications, the individual binary response corresponds to whether an underlying time-to-event variable T is less than an observed screening time C, a data structure known as current status data. Given sufficient variation in the observed C values, it is possible to estimate the distribution function F of T nonparametrically, at least at some points in its support, using the pool-adjacent-violators algorithm (Ayer et al., 1955). Here, we consider nonparametric estimation of F based on group-tested current status data for groups of a fixed size, where the group tests positive if and only if any individual's unobserved T is less than the corresponding observed C. We investigate the performance of the group-based estimator as compared to the individual-test nonparametric maximum likelihood estimator, and show that the former can be more precise in the presence of misclassification at time points where F is small. Potential applications include testing for the presence of various diseases in pooled samples where interest focuses on the age-at-incidence distribution rather than overall prevalence. We apply this estimator to the age-at-incidence curve for hepatitis C infection in a sample of U.S. women who gave birth to a child in 2014, where group assignment is done at random and based on maternal age. We discuss connections to other work in the literature, as well as potential extensions.
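    The individual-test estimator referred to above can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it computes the current status NPMLE of F via the pool-adjacent-violators algorithm, and then shows one natural way to translate a group-level positivity curve back to the individual scale under the stated rule that a group of size c tests positive if and only if any member's event time is below the group's screening time. It ignores the misclassification adjustment that is central to the paper; all function names are illustrative.

```python
import numpy as np

def pava(y, w=None):
    """Pool-adjacent-violators: weighted, non-decreasing least-squares fit to y
    (Ayer et al., 1955)."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    means, weights, sizes = [], [], []
    for yi, wi in zip(y, w):
        means.append(yi); weights.append(wi); sizes.append(1)
        # pool the last two blocks while they violate monotonicity
        while len(means) > 1 and means[-2] > means[-1]:
            tot = weights[-2] + weights[-1]
            means[-2:] = [(means[-2] * weights[-2] + means[-1] * weights[-1]) / tot]
            weights[-2:] = [tot]
            sizes[-2:] = [sizes[-2] + sizes[-1]]
    return np.repeat(means, sizes)

def npmle_current_status(C, delta):
    """Individual-test NPMLE of F at the ordered screening times:
    isotonic regression of the current status indicators delta on C."""
    C, delta = np.asarray(C), np.asarray(delta, dtype=float)
    order = np.argsort(C)
    return C[order], pava(delta[order])

def npmle_from_group_tests(C_group, delta_group, c):
    """Back out F from group-level current status data for groups of size c,
    using P(group positive at t) = 1 - (1 - F(t))**c (no misclassification)."""
    t, G_hat = npmle_current_status(C_group, delta_group)
    return t, 1.0 - (1.0 - G_hat) ** (1.0 / c)
```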

    On a general structure for hazard-based regression models: An application to population-based cancer research

    The proportional hazards model represents the most commonly assumed hazard structure when analysing time-to-event data using regression models. We study a general hazard structure which contains, as particular cases, proportional hazards, accelerated hazards, and accelerated failure time structures, as well as combinations of these. We propose an approach for applying these different hazard structures, based on a flexible parametric distribution (exponentiated Weibull) for the baseline hazard. This distribution allows us to cover the basic hazard shapes of interest in practice: constant, bathtub, increasing, decreasing, and unimodal. In an extensive simulation study, we evaluate our approach in the context of excess hazard modelling, the excess hazard being the main quantity of interest in descriptive cancer epidemiology. This study exhibits good inferential properties of the proposed model, as well as good performance when using the Akaike Information Criterion for selecting the hazard structure. An application to lung cancer data illustrates the usefulness of the proposed model.
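    To make the structure concrete, the sketch below evaluates a general hazard of the form h(t | x) = h0(t * exp(x'alpha)) * exp(x'beta) with an exponentiated-Weibull baseline: alpha = 0 recovers proportional hazards, beta = 0 accelerated hazards, and alpha = beta an accelerated failure time model. The baseline parameterisation and all values are illustrative assumptions, not the authors' code.

```python
import numpy as np

def ew_hazard(t, kappa, gamma, sigma):
    """Exponentiated-Weibull hazard (Weibull shape kappa, power gamma, scale sigma):
    covers constant, increasing, decreasing, bathtub and unimodal shapes."""
    z = (t / sigma) ** kappa
    cdf = (1.0 - np.exp(-z)) ** gamma
    pdf = gamma * (1.0 - np.exp(-z)) ** (gamma - 1.0) * np.exp(-z) * kappa * z / t
    return pdf / (1.0 - cdf)

def general_hazard(t, x, alpha, beta, baseline_pars):
    """General hazard structure h(t|x) = h0(t * exp(x'alpha)) * exp(x'beta):
    alpha = 0 -> proportional hazards, beta = 0 -> accelerated hazards,
    alpha = beta -> accelerated failure time."""
    time_scale = np.exp(np.dot(x, alpha))
    return ew_hazard(t * time_scale, *baseline_pars) * np.exp(np.dot(x, beta))

# toy evaluation at t = 2 for one covariate vector (all values illustrative)
print(general_hazard(2.0, np.array([1.0, 0.3]),
                     alpha=np.array([0.2, -0.1]),
                     beta=np.array([0.5, 0.0]),
                     baseline_pars=(1.5, 0.7, 3.0)))
```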

    Pitfalls of using the risk ratio in meta‐analysis

    For meta-analysis of studies that report outcomes as binomial proportions, the most popular measure of effect is the odds ratio (OR), usually analyzed as log(OR). Many meta-analyses instead use the risk ratio (RR) and its logarithm because of its simpler interpretation. Although log(OR) and log(RR) are both unbounded, use of log(RR) must ensure that estimates are compatible with study-level event rates in the interval (0, 1). These complications pose a particular challenge for random-effects models, both in applications and in generating data for simulations. As background, we review the conventional random-effects model and then binomial generalized linear mixed models (GLMMs) with the logit link function, which do not have these complications. We then focus on log-binomial models and explore the implications of using them; theoretical calculations and simulation show evidence of biases. The main competitors to the binomial GLMMs use the beta-binomial (BB) distribution, either in BB regression or by maximizing a BB likelihood; a simulation produces mixed results. Two examples and an examination of Cochrane meta-analyses that used RR suggest bias in the results from the conventional inverse-variance-weighted approach. Finally, we comment on other measures of effect that have range restrictions, including the risk difference, and outline further research.
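    A small simulation conveys the range-restriction problem described above: when study-specific log risk ratios are drawn from a normal random-effects distribution around a non-null mean, the implied treatment-arm risks can exceed 1, whereas the same construction on the log-odds scale cannot leave (0, 1). The parameter values below are illustrative assumptions, not figures from the paper.

```python
import numpy as np
from scipy.special import expit, logit

rng = np.random.default_rng(2024)

p0 = 0.4            # control-arm event probability (illustrative)
mu, tau = 0.5, 0.4  # mean and SD of the study-specific log effect (illustrative)

effects = rng.normal(mu, tau, size=100_000)

p1_rr = p0 * np.exp(effects)        # risks implied by log(RR) random effects
p1_or = expit(logit(p0) + effects)  # risks implied by log(OR) random effects

print(f"log(RR) model: {np.mean(p1_rr >= 1):.1%} of draws imply a risk >= 1")
print(f"log(OR) model: {np.mean(p1_or >= 1):.1%}")  # logit-scale draws stay inside (0, 1)
```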

    Cluster randomized test-negative design (CR-TND) trials: a novel and efficient method to assess the efficacy of community level dengue interventions

    Cluster randomized trials are the gold standard for assessing the efficacy of community-level interventions, such as vector control strategies against dengue. We describe a novel cluster randomized trial methodology with a test-negative design, which offers advantages over traditional approaches. It uses outcome-based sampling of patients presenting with a syndrome consistent with the disease of interest, who are subsequently classified as test-positive cases or test-negative controls on the basis of diagnostic testing. We use simulations of a cluster trial to demonstrate the validity of efficacy estimates under the test-negative approach. These show that, provided study arms are balanced for both test-negative and test-positive illness at baseline and the other test-negative design assumptions are met, the efficacy estimates closely match the true efficacy. We also briefly discuss analytical considerations for an odds ratio-based effect estimate arising from clustered data and outline potential approaches to analysis. We conclude that application of the test-negative design to certain cluster randomized trials could increase their efficiency and ease of implementation.
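    As a rough illustration of this kind of simulation, the sketch below randomizes clusters 1:1, lets a shared lognormal cluster effect scale test-positive and test-negative presentation rates equally, and has the intervention reduce only the test-positive rate; efficacy is then estimated as one minus a pooled odds ratio. All parameter values and the simple pooled analysis are illustrative choices, not the trial protocol or the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_crtnd(n_clusters=24, person_time=10_000, rate_pos=0.002,
                   rate_neg=0.005, efficacy=0.4, cluster_sd=0.25):
    """One simulated CR-TND trial: per-cluster counts of test-positive and
    test-negative presentations, with a shared lognormal cluster effect."""
    arm = rng.permutation(np.repeat([0, 1], n_clusters // 2))
    frailty = rng.lognormal(0.0, cluster_sd, n_clusters)
    pos = rng.poisson(person_time * rate_pos * frailty
                      * np.where(arm == 1, 1 - efficacy, 1.0))
    neg = rng.poisson(person_time * rate_neg * frailty)
    odds_ratio = (pos[arm == 1].sum() / neg[arm == 1].sum()) / \
                 (pos[arm == 0].sum() / neg[arm == 0].sum())
    return 1.0 - odds_ratio  # estimated intervention efficacy

estimates = [simulate_crtnd() for _ in range(2000)]
print(f"mean estimated efficacy: {np.mean(estimates):.3f} (true value 0.4)")
```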

    Simpson's Paradox, Lord's Paradox, and Suppression Effects are the same phenomenon – the reversal paradox

    This article discusses three statistical paradoxes that pervade epidemiological research: Simpson's paradox, Lord's paradox, and suppression. These paradoxes have important implications for the interpretation of evidence from observational studies. The article uses hypothetical scenarios to illustrate how the three paradoxes are different manifestations of one phenomenon, the reversal paradox, depending on whether the outcome and explanatory variables are categorical, continuous, or a combination of both; this makes the issues raised by, and the remedies for, any one of them similar for all three. Although the three paradoxes arise with different types of variables, they share the same characteristic: the association between two variables can be reversed, diminished, or enhanced when another variable is statistically controlled for. Understanding the concepts and theory behind these paradoxes provides insights into some controversial or contradictory research findings. These paradoxes show that prior knowledge and underlying causal theory play an important role in the statistical modelling of epidemiological data, where incorrect use of statistical models might produce consistent, replicable, yet erroneous results.
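    A short worked example shows the reversal with categorical variables. The counts below are patterned after the widely cited kidney-stone comparison: within each stratum treatment A has the higher recovery proportion, yet pooling over strata reverses the ranking.

```python
# Counts (recovered, treated) for two treatments within two severity strata,
# patterned after the well-known kidney-stone example.
strata = {
    "mild":   {"A": (81, 87),   "B": (234, 270)},
    "severe": {"A": (192, 263), "B": (55, 80)},
}

for name, grp in strata.items():
    rates = {t: round(s / n, 2) for t, (s, n) in grp.items()}
    print(name, rates)                     # A is better within each stratum

pooled = {t: (sum(strata[s][t][0] for s in strata),
              sum(strata[s][t][1] for s in strata)) for t in ("A", "B")}
print("pooled", {t: round(s / n, 2) for t, (s, n) in pooled.items()})  # B looks better pooled
```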

    Crude incidence in two-phase designs in the presence of competing risks.

    Background: In many studies some information might not be available for the whole cohort; some covariates, or even the outcome, might be ascertained only in selected subsamples. These studies fall within a broad category termed two-phase studies. Common examples include the nested case-control and the case-cohort designs. For two-phase studies, appropriate weighted survival estimates have been derived; however, no estimator of cumulative incidence accounting for competing events has been proposed. This is relevant in the presence of multiple types of events, where estimation of event-type-specific quantities is needed for evaluating outcome. Methods: We develop a nonparametric estimator of the cumulative incidence function of events that accounts for possible competing events. It handles a general sampling design through weights derived from the sampling probabilities. The variance is derived from the influence function of the subdistribution hazard. Results: The proposed method shows good performance in simulations. It is applied to estimate the crude incidence of relapse in childhood acute lymphoblastic leukemia in groups defined by a genotype not available for everyone, in a cohort of nearly 2000 patients, where death due to toxicity acted as a competing event. In a second example the aim was to estimate engagement in care in a cohort of HIV patients in a resource-limited setting, where for some patients the outcome itself was missing because they were lost to follow-up. A sampling-based approach was used to identify the outcome in a subsample of lost patients and to obtain a valid estimate of connection to care. Conclusions: A valid estimator for the cumulative incidence of events accounting for competing risks under a general sampling design from an infinite target population is derived.
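    A stripped-down version of the estimator's mechanics, assuming the phase-two sampling probabilities are known, might look like the sketch below: each sampled subject is weighted by the inverse of their sampling probability, and an Aalen-Johansen-type recursion for the cumulative incidence of the event of interest is applied to the weighted risk sets. This is a simplified illustration; the influence-function variance derived in the paper is not reproduced.

```python
import numpy as np

def weighted_cuminc(time, event, samp_prob):
    """IPW Aalen-Johansen-type estimate of the cumulative incidence of event
    type 1 in the presence of a competing event (type 2); event == 0 denotes
    censoring and samp_prob is each subject's phase-two inclusion probability."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    w = 1.0 / np.asarray(samp_prob, dtype=float)    # design weights
    grid = np.unique(time[event > 0])
    surv, cif, out = 1.0, 0.0, []
    for t in grid:
        at_risk = w[time >= t].sum()
        d1 = w[(time == t) & (event == 1)].sum()    # weighted events of interest
        d_all = w[(time == t) & (event > 0)].sum()  # all weighted events at t
        cif += surv * d1 / at_risk
        surv *= 1.0 - d_all / at_risk               # weighted all-cause survival
        out.append((t, cif))
    return np.array(out)
```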

    Persistent inequalities in unplanned hospitalisation among colon cancer patients across critical phases of their care pathway, England, 2011-13.

    BACKGROUND: Reducing hospital emergency admissions is a key target for all modern health systems. METHODS: We analysed colon cancer patients diagnosed in 2011-13 in England. We screened their individual Hospital Episode Statistics records in the 90 days pre-diagnosis, the 90 days post-diagnosis, and the 90 days pre-death (in the year following diagnosis) for the occurrence of hospital emergency admissions (HEAs). RESULTS: Between a quarter and two-thirds of patients experienced an HEA in the three 90-day periods examined: pre-diagnosis, post-diagnosis and before death. Patients with tumour stage I-III from more deprived backgrounds had higher proportions of HEAs than less deprived patients during all studied periods. This difference remained even after adjusting for differing distributions of risk factors such as age, sex, comorbidity and stage at diagnosis. CONCLUSIONS: Although in some cases HEAs might be unavoidable or even appropriate, the proportion of HEAs varies by socioeconomic status, even after controlling for the usual patient factors, which is suggestive of remediable causes of excess emergency healthcare utilisation among patients in the more deprived groups. Future inquiries should address clinical complications, sub-optimal healthcare administration, premature discharge and a lack of social support as potential explanations for these patterns of inequality.

    Early Epidemiological Assessment of the Virulence of Emerging Infectious Diseases: A Case Study of an Influenza Pandemic

    Background: The case fatality ratio (CFR), the ratio of deaths from an infectious disease to the number of cases, provides an assessment of virulence. Calculating the ratio of the cumulative number of deaths to cases during the course of an epidemic tends to result in a biased CFR. The present study develops a simple method to obtain an unbiased estimate of the confirmed CFR (cCFR), using only the confirmed cases as the denominator, at an early stage of an epidemic, even when there have been only a few deaths. Methodology/Principal Findings: Our method adjusts the biased cCFR by a factor of underestimation informed by the time from symptom onset to death. We first examine the approach by analyzing an outbreak of severe acute respiratory syndrome in Hong Kong (2003) with a known unbiased cCFR estimate, and then investigate published epidemiological datasets of novel swine-origin influenza A (H1N1) virus infection in the USA and Canada (2009). Because observation of a few deaths alone does not permit estimating the distribution of the time from onset to death, the uncertainty is addressed by means of sensitivity analysis. The maximum likelihood estimate of the unbiased cCFR for influenza may lie in the range of 0.16-4.48% within the assumed parameter space for the factor of underestimation. The estimates for influenza suggest that the virulence is comparable to the early estimate in Mexico. Even when there have been no deaths, our model permits estimating a conservative upper bound of the cCFR. Conclusions: Although one has to keep in mind that the cCFR for an entire population is vulnerable to variation among sub-populations and to underdiagnosis, our method is useful for assessing virulence at the early stage of an epidemic and for informing policy makers and the public. © 2009 Nishiura et al.
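    The adjustment can be sketched as follows: divide the cumulative number of deaths not by the number of confirmed cases, but by the expected number of cases whose outcome would already be known given an assumed onset-to-death delay distribution. The gamma delay parameters and the toy data below are illustrative assumptions, not the values estimated in the paper.

```python
import numpy as np
from scipy.stats import gamma

def adjusted_cfr(onset_days, death_count, t_now, mean_delay=10.0, sd_delay=5.0):
    """Delay-adjusted confirmed CFR: cumulative deaths divided by the expected
    number of confirmed cases with a known outcome by t_now, assuming a gamma
    onset-to-death distribution (parameters illustrative)."""
    shape = (mean_delay / sd_delay) ** 2
    scale = sd_delay ** 2 / mean_delay
    # probability each case's outcome is known by t_now; its mean is the
    # factor of underestimation applied to the biased cCFR
    known_outcome = gamma.cdf(t_now - np.asarray(onset_days), a=shape, scale=scale)
    biased = death_count / len(onset_days)
    return biased / known_outcome.mean()   # == death_count / known_outcome.sum()

# toy usage: 200 confirmed cases with onsets over the past 30 days, 3 deaths so far
rng = np.random.default_rng(1)
onsets = rng.uniform(0, 30, size=200)
print(f"adjusted cCFR: {adjusted_cfr(onsets, 3, t_now=30):.3f}")
```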