
    Designs in Partially Controlled Studies: Messages from a Review

    The ability to evaluate the effects of factors on outcomes is increasingly important for a class of studies that control some but not all of the factors. Although important advances have been made in methods of analysis for such partially controlled studies, work on designs for such studies has been relatively limited. To help understand why, we review the main designs that have been used for partially controlled studies. Based on the review, we give two complementary reasons that explain the limited work on such designs, and we suggest a new direction in this area.

    A BROAD SYMMETRY CRITERION FOR NONPARAMETRIC VALIDITY OF PARAMETRICALLY-BASED TESTS IN RANDOMIZED TRIALS

    Summary. Pilot phases of a randomized clinical trial often suggest that a parametric model may be an accurate description of the trial's longitudinal trajectories. However, parametric models are often not used for fear that they may invalidate tests of null hypotheses of equality between the experimental groups. Existing work has shown that, for some types of data, when certain parametric models are used, validity for testing the null is preserved even if the parametric models are incorrect. Here, we provide a broader and easier-to-check characterization of parametric models that can be used to (a) preserve the nonparametric validity of testing the null hypothesis, i.e., even when the models are incorrect, and (b) increase power compared to the non- or semiparametric bounds when the models are close to correct. We demonstrate our results in a clinical trial of depression in Alzheimer's patients.
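
    A familiar route to this kind of nonparametric validity, sketched below, is a randomization test: in a randomized trial the group labels are exchangeable under the null, so a test statistic taken from any parametric model, even a misspecified one, still yields a valid p-value. This is only an illustration of the general idea, not the paper's symmetry criterion; the data and function names are hypothetical.

```python
import numpy as np

def randomization_pvalue(stat_fn, y, group, n_perm=2000, seed=0):
    """Randomization test of the null of no difference between groups.

    The statistic may come from any (possibly misspecified) parametric
    model; validity follows from the exchangeability of the randomized
    group labels under the null, not from the model being correct.
    """
    rng = np.random.default_rng(seed)
    observed = stat_fn(y, group)
    perm_stats = np.array(
        [stat_fn(y, rng.permutation(group)) for _ in range(n_perm)]
    )
    # Add-one correction keeps the p-value valid at finite n_perm.
    return (1 + np.sum(perm_stats >= observed)) / (1 + n_perm)

# Toy statistic: absolute difference in group means, standing in for a
# contrast estimated from a parametric longitudinal model.
def abs_mean_diff(y, g):
    return abs(y[g == 1].mean() - y[g == 0].mean())

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(0.4, 1.0, 50)])
group = np.repeat([0, 1], 50)
print(randomization_pvalue(abs_mean_diff, y, group))
```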

    THE ROLE OF AN EXPLICIT CAUSAL FRAMEWORK IN AFFECTED SIB PAIR DESIGNS WITH COVARIATES

    The affected sib/relative pair (ASP/ARP) design is often used with covariates to find genes that can cause a disease through pathways other than those covariates. However, such covariates can themselves have genetic determinants, and the validity of existing methods has so far only been argued under implicit assumptions. We propose an explicit causal formulation of the problem using potential outcomes and principal stratification. The general role of this formulation is to identify and separate the meaning of the different assumptions that can provide valid causal inference in linkage analysis. This separation helps to (a) develop better methods under explicit assumptions, and (b) show the different ways in which these assumptions can fail, which is necessary for developing further specific designs to test these assumptions and confirm or improve the inference. Using this formulation in the specific problem above, we show that, when the covariate (e.g., addiction to smoking) also has genetic determinants, existing methods, including those previously thought to be valid, can declare linkage between the disease and marker loci even when no such linkage exists. We also introduce design strategies to address the problem.

    Deductive Derivation and Computerization of Compatible Semiparametric Efficient Estimation

    Researchers often seek robust inference for a parameter through semiparametric estimation. Efficient semiparametric estimation currently requires theoretical derivation of the efficient influence function (EIF), which can be a challenging and time-consuming task. If this task can be computerized, it can save substantial human effort, which can be transferred, for example, to the design of new studies. Although the EIF is, in principle, a derivative, simple numerical differentiation to calculate the EIF by computer masks the EIF's functional dependence on the parameter of interest. For this reason, the standard approach to obtaining the EIF has been the theoretical construction of the space of scores under all possible parametric submodels. This process currently depends on the correctness of conjectures about these spaces, and on the correct verification of such conjectures. The correct guessing of such conjectures, though successful in some problems, is a nondeductive process, i.e., it is not guaranteed to succeed (e.g., it is not computerizable), and the verification of conjectures is generally susceptible to mistakes. We propose a method that can deductively produce semiparametric locally efficient estimators. The proposed method is computerizable, meaning that it needs neither conjecturing of, nor otherwise theoretically deriving, the functional form of the EIF, and it is guaranteed to produce the result. The method is demonstrated through an example.
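
    To make the "EIF as a derivative" idea concrete, the sketch below numerically approximates the influence function of a simple plug-in functional as a Gateaux derivative along the path from the empirical distribution toward a point mass. This is a toy illustration of the derivative view only, with hypothetical function names; it is exactly the kind of naive numerical differentiation that, as the abstract notes, does not suffice for the paper's deductive method.

```python
import numpy as np

def eif_gateaux(functional, sample, x, eps=1e-4):
    """Approximate the influence function of a plug-in functional at x
    via the Gateaux derivative along (1 - eps) * P_n + eps * delta_x.

    `functional` maps (observations, probability weights) to a scalar.
    """
    n = len(sample)
    base_weights = np.full(n, 1.0 / n)
    # Perturbed distribution: mix the empirical measure with a point
    # mass at x, then difference the functional and divide by eps.
    pert_sample = np.append(sample, x)
    pert_weights = np.append((1 - eps) * base_weights, eps)
    return (functional(pert_sample, pert_weights)
            - functional(sample, base_weights)) / eps

def weighted_mean(values, weights):
    return np.sum(weights * values)

rng = np.random.default_rng(0)
data = rng.normal(size=500)
# For the mean functional the influence function is x - E[X], so the
# numerical derivative at x = 2.0 should match 2.0 - data.mean().
print(eif_gateaux(weighted_mean, data, x=2.0), 2.0 - data.mean())
```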

    Choosing profile double-sampling designs for survival estimation with application to PEPFAR evaluation

    Most studies that follow subjects over time are challenged by having some subjects who drop out. Double sampling is a design that selects and devotes resources to intensively pursue and find a subset of these dropouts, and then uses the data obtained from them to adjust naïve estimates, which are potentially biased by the dropout. Existing methods to estimate survival from double sampling assume a random sample. In limited-resource settings, however, generating accurate estimates using a minimum of resources is important. We propose using double-sampling designs that oversample certain profiles of dropouts as more efficient alternatives to random designs. First, we develop a framework to estimate the survival function under these profile double-sampling designs. We then derive the precision of these designs as a function of the rule for selecting different profiles, in order to identify more efficient designs. We illustrate using data from the United States President's Emergency Plan for AIDS Relief (PEPFAR)-funded HIV care and treatment program in western Kenya. Our results show why and how more efficient designs should oversample patients with shorter dropout times. Further, our work suggests generalizable practice for more efficient double-sampling designs, which can help maximize efficiency in resource-limited settings.
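
    One way to see the mechanics: subjects with complete follow-up keep weight 1, double-sampled dropouts are reweighted by the inverse of their profile-specific selection probability, and a weighted survival estimator is computed over the pooled data. The sketch below illustrates that bookkeeping with a weighted Kaplan-Meier estimator under assumed, hypothetical selection probabilities; it is not the paper's estimator.

```python
import numpy as np

def weighted_km(time, event, weight):
    """Weighted Kaplan-Meier estimator; returns (event time, S(t)) pairs."""
    order = np.argsort(time)
    time, event, weight = time[order], event[order], weight[order]
    surv, curve = 1.0, []
    for t in np.unique(time[event == 1]):
        at_risk = weight[time >= t].sum()                 # weighted risk set
        deaths = weight[(time == t) & (event == 1)].sum()
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical toy data. Non-dropouts are always retained (p = 1); among
# dropouts, the profile design sets the selection probability p by dropout
# time, oversampling shorter dropout times. Found dropouts get weight 1/p.
time   = np.array([2.0, 5.0, 3.5, 1.0, 4.2, 0.8])
event  = np.array([1,   0,   1,   1,   0,   1])
p_sel  = np.array([1.0, 1.0, 0.8, 0.9, 0.4, 0.9])  # assumed, not estimated
weight = 1.0 / p_sel  # inverse-probability-of-selection weights
print(weighted_km(time, event, weight))
```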

    Estimating effects by combining instrumental variables with case-control designs: the role of principal stratification

    The instrumental variable framework is commonly used in the estimation of causal effects from cohort samples. In the case of more efficient designs such as the case-control study, however, the combination of the instrumental variable framework and complex sampling designs requires new methodological consideration. As the prevalence of Mendelian randomization studies is increasing and the cost of genotyping and expression data can be high, the analysis of data gathered from more cost-effective sampling designs is of prime interest. We show that the standard instrumental variable analysis is not applicable to the case-control design and can lead to erroneous estimation and inference. We also propose a method based on principal stratification for the analysis of data arising from the combination of case-control sampling and an instrumental variable design, and we illustrate it with a study in oncology.
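
    For reference, the "standard instrumental variable analysis" for a binary instrument is often the Wald ratio sketched below, whose ingredients are conditional means taken in a cohort sample; under case-control sampling, outcome-dependent selection distorts those means, which is the failure mode the abstract describes. The simulation is a hypothetical illustration.

```python
import numpy as np

def wald_iv_estimate(z, x, y):
    """Wald (ratio) IV estimator for a binary instrument z.

    Valid in a cohort sample; its conditional means are distorted by
    the outcome-dependent selection of a case-control sample.
    """
    numer = y[z == 1].mean() - y[z == 0].mean()
    denom = x[z == 1].mean() - x[z == 0].mean()
    return numer / denom

# Hypothetical cohort with an unmeasured confounder u; the true effect
# of exposure x on outcome y is 1.0, and z plays the role of a genotype
# in a Mendelian randomization study.
rng = np.random.default_rng(0)
n = 100_000
z = rng.integers(0, 2, n)
u = rng.normal(size=n)
x = 0.5 * z + u + rng.normal(size=n)
y = 1.0 * x + u + rng.normal(size=n)
print(wald_iv_estimate(z, x, y))  # close to 1.0 under cohort sampling
```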

    PRINCIPAL STRATIFICATION DESIGNS TO ESTIMATE INPUT DATA MISSING DUE TO DEATH

    We consider studies of cohorts of individuals after a critical event, such as an injury, with the following characteristics. First, the studies are designed to measure “input” variables, which describe the period before the critical event, and to characterize the distribution of the input variables in the cohort. Second, the studies are designed to measure “output” variables, primarily mortality after the critical event, and to characterize the predictive (conditional) distribution of mortality given the input variables in the cohort. Such studies often face the complication that the input data are missing for those who die shortly after the critical event, because data collection takes place after the event. Standard methods of dealing with the missing inputs, such as imputation or weighting methods based on an assumption of ignorable missingness, are known to be generally invalid when the missingness of inputs is nonignorable, that is, when the distribution of the inputs differs between those who die and those who live. To address this issue, we propose a novel design that obtains and uses information on an additional key variable: a treatment or externally controlled variable which, if set at its “effective” level, could have prevented the death of those who died. We show that the new design can be used to draw valid inferences for the marginal distribution of inputs in the entire cohort, and for the conditional distribution of mortality given the inputs, also in the entire cohort, even under nonignorable missingness. The crucial framework that we use is principal stratification based on the potential outcomes, here mortality under both levels of treatment. We also show, using illustrative preliminary injury data, that our approach can reveal results that are more reasonable than those of standard methods, sometimes in relatively dramatic ways. Thus, our approach suggests that the routine collection of data on variables that could be used as possible treatments in such studies of inputs and mortality should become common.
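
    The core device is the cross-classification of subjects by potential mortality under both levels of the additional key variable. A minimal sketch of those principal strata, with purely illustrative labels and commentary, follows; the paper's estimands and inference procedures are built on top of this classification.

```python
# Principal strata defined by potential survival S(a) under the control
# level (a = 0) and the "effective" treatment level (a = 1) of the key
# variable. Labels and descriptions are illustrative only.
STRATA = {
    ("live", "live"): "always-survivor: inputs recoverable under either level",
    ("die",  "live"): "saved-by-treatment: death, and the missing inputs, preventable",
    ("live", "die"):  "harmed-by-treatment: often excluded by a monotonicity assumption",
    ("die",  "die"):  "never-survivor: inputs missing under both levels",
}

for (s0, s1), description in STRATA.items():
    print(f"S(0)={s0:4s}, S(1)={s1:4s} -> {description}")
```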

    Application of the propensity score in a covariate-based linkage analysis of the Collaborative Study on the Genetics of Alcoholism

    BACKGROUND: Covariate-based linkage analyses using a conditional logistic model, as implemented in LODPAL, can increase the power to detect linkage by minimizing disease heterogeneity. However, each additional covariate analyzed increases the degrees of freedom of the linkage test and can therefore also increase the type I error rate. Use of a propensity score (PS) has been shown to consistently improve the statistical power to detect linkage in simulation studies. Defined as the conditional probability of being affected given the observed covariate data, the PS collapses multiple covariates into a single variable. This study evaluates the performance of the PS in detecting linkage evidence in a genome-wide linkage analysis of microsatellite marker data from the Collaborative Study on the Genetics of Alcoholism. Analytical methods included nonparametric linkage analysis without covariates, with one covariate at a time (including multiple PS definitions), and with multiple covariates simultaneously, corresponding to the PS definitions. Several definitions of the PS were calculated, each with an increasing number of covariates up to a maximum of five. To account for the potential inflation of the type I error rate, permutation-based p-values were calculated. RESULTS: Results suggest that the use of individual covariates may not necessarily increase the power to detect linkage. However, the use of a PS can lead to an increase when compared with using all covariates simultaneously. Specifically, PS3, which combines age at interview, sex, and smoking status, resulted in the greatest number of significant markers identified. All methods consistently identified several chromosomal regions as significant, including loci on chromosomes 2, 6, 7, and 12. CONCLUSION: These results suggest that the use of a propensity score can increase the power to detect linkage for a complex disease such as alcoholism, especially when multiple important covariates can be used to predict risk and thereby minimize linkage heterogeneity. However, because the PS is calculated as a conditional probability of being affected, it requires observed covariate data on both affected and unaffected individuals, which may not always be available in real data sets.
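
    As defined above, the PS is simply the fitted probability of affection status given covariates, so a logistic regression suffices to construct it. The sketch below builds a PS3-style score from hypothetical stand-ins for age at interview, sex, and smoking status; the data, names, and model choice are assumptions for illustration, not the study's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical covariate data mirroring the PS3 definition: age at
# interview, sex, and smoking status, plus affection-status labels.
rng = np.random.default_rng(0)
n = 400
age = rng.normal(45.0, 10.0, n)
sex = rng.integers(0, 2, n)
smoking = rng.integers(0, 2, n)
X = np.column_stack([age, sex, smoking])
affected = rng.integers(0, 2, n)  # toy labels, not real phenotypes

# The propensity score: P(affected = 1 | covariates), collapsing the
# three covariates into the single variable used in the linkage model.
ps = LogisticRegression().fit(X, affected).predict_proba(X)[:, 1]
print(ps[:5])
```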