35 research outputs found

    Improving the causal treatment effect estimation with propensity scores by the bootstrap

    Abstract: When observational studies are used to establish the causal effects of treatments, the estimated effect is affected by treatment selection bias. The inverse propensity score weight (IPSW) is often used to deal with such bias. However, IPSW relies on strong assumptions, and neither the consequences of their violation nor strategies to correct it have been widely studied. We present a bootstrap bias correction of IPSW (BC-IPSW) to improve the performance of the propensity score in dealing with treatment selection bias when the ignorability and overlap assumptions fail. The approach was motivated by a real observational study exploring the potential of anticoagulant treatment for reducing mortality in patients with end-stage renal disease. The benefit of the treatment in enhancing survival was demonstrated; the suggested BC-IPSW method indicated a statistically significant reduction in mortality for patients receiving the treatment. Using extensive simulations, we show that BC-IPSW substantially reduces the bias due to violations of the ignorability and overlap assumptions. Further, we show that IPSW remains useful for accounting for the lack of treatment randomization, but its advantages are strictly tied to the satisfaction of ignorability, indicating that relevant but unmeasured or unused covariates can worsen the selection bias.
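The bootstrap bias correction described in the abstract can be sketched in a few lines. This is a generic illustration, not the authors' exact BC-IPSW procedure: the logistic propensity model, the Hajek-style weighting, the toy data, and the classic `2*estimate - mean(bootstrap replicates)` correction are all assumptions made for the sake of the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def ipsw_ate(X, t, y):
    """Plain IPSW (Hajek-weighted) estimate of the average treatment effect."""
    e = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]
    w1, w0 = t / e, (1 - t) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

def bc_ipsw_ate(X, t, y, B=200):
    """Classic bootstrap bias correction, 2*theta_hat minus the mean of the
    bootstrap replicates -- a generic sketch, not the paper's exact BC-IPSW."""
    theta = ipsw_ate(X, t, y)
    reps = []
    for _ in range(B):
        idx = rng.integers(0, len(y), len(y))
        if 0 < t[idx].sum() < len(y):      # need both arms in the resample
            reps.append(ipsw_ate(X[idx], t[idx], y[idx]))
    return 2 * theta - np.mean(reps)

# toy data with confounding: x raises both treatment probability and outcome
n = 500
x = rng.normal(size=(n, 1))
t = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))
y = 1.0 * t + 0.8 * x[:, 0] + rng.normal(size=n)   # true ATE = 1.0
print(bc_ipsw_ate(x, t, y))
```

With a correctly specified propensity model the corrected estimate lands near the true effect; the interesting regime studied in the paper is when ignorability or overlap is violated, which this toy setup does not reproduce.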

    A unified principled framework for resampling based on pseudo-populations: Asymptotic theory

    In this paper, a class of resampling techniques for finite populations under a πps sampling design is introduced. The basic idea on which they rest is a two-step procedure consisting of: (i) constructing a "pseudo-population" on the basis of the sample data; (ii) drawing a sample from the pseudo-population according to an appropriate resampling design. From a logical point of view, this approach is essentially based on Efron's plug-in principle, applied at the "sampling design level". Theoretical justifications based on large-sample theory are provided. New approaches to constructing pseudo-populations based on various forms of calibration are proposed. Finally, a simulation study is performed.
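The two-step pseudo-population scheme can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: Poisson πps sampling is used as both the original and the resampling design, each sampled unit is replicated round(1/π_i) times in the plug-in step, and the target is the bootstrap variance of the Horvitz-Thompson total.

```python
import numpy as np

rng = np.random.default_rng(1)

def ht_total(y, pi):
    """Horvitz-Thompson estimator of the population total."""
    return np.sum(y / pi)

def pseudo_pop_bootstrap_var(y, pi, B=500):
    """Two-step resampling: (i) build a pseudo-population by replicating each
    sampled unit round(1/pi_i) times (the plug-in step); (ii) redraw Poisson
    pi-ps samples from it, each pseudo-unit keeping its inclusion probability."""
    reps = np.round(1.0 / pi).astype(int)
    y_star, pi_star = np.repeat(y, reps), np.repeat(pi, reps)
    totals = np.empty(B)
    for b in range(B):
        keep = rng.random(y_star.size) < pi_star   # Poisson resampling design
        totals[b] = ht_total(y_star[keep], pi_star[keep])
    return totals.var(ddof=1)   # bootstrap variance of the HT total

# toy population: inclusion probabilities proportional to the study variable
N, n = 1000, 100
y_pop = rng.gamma(2.0, 10.0, N)
pi_pop = np.minimum(1.0, n * y_pop / y_pop.sum())
s = rng.random(N) < pi_pop                          # original Poisson pi-ps draw
print(pseudo_pop_bootstrap_var(y_pop[s], pi_pop[s]))
```

The paper's calibrated pseudo-populations and asymptotic results go well beyond this rounding-based construction; the sketch only shows the mechanics of the two steps.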

    Can Bayesian Network empower propensity score estimation from Real World Data?

    A new method, based on Bayesian networks, for estimating propensity scores is proposed, with the aim of drawing causal inferences from real-world data about the average treatment effect in the case of a binary outcome and discrete covariates. The proposed method endows the estimated propensity score with maximum likelihood properties, i.e. asymptotic efficiency, thus outperforming other available approaches. Two point estimators via inverse probability weighting are then proposed, and their main distributional properties are derived for constructing confidence intervals and for testing the hypothesis of no treatment effect. Empirical evidence of the substantial improvements offered by the proposed methodology over standard logistic modelling of the propensity score is provided in simulation settings that mimic the characteristics of a real dataset of prostate cancer patients from Milan San Raffaele Hospital.
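For discrete covariates, the maximum-likelihood propensity score of a Bayesian network whose covariate nodes point into the treatment node reduces to within-stratum treatment frequencies. The sketch below illustrates that idea with a single binary confounder; the network structure, the Hajek-style IPW estimator, and the toy data are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(2)

def mle_propensity(X, t):
    """MLE propensity for discrete covariates: P(T=1 | X=x) estimated by the
    within-stratum treatment frequency, i.e. the conditional probability table
    of a Bayesian network with edges from the covariates into the treatment."""
    counts = defaultdict(lambda: [0, 0])
    for xi, ti in zip(map(tuple, X), t):
        counts[xi][ti] += 1
    return np.array([counts[tuple(xi)][1] / sum(counts[tuple(xi)]) for xi in X])

def ipw_ate(e, t, y):
    """Hajek-style inverse probability weighted ATE for a binary outcome."""
    w1, w0 = t / e, (1 - t) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

# toy discrete data: one binary covariate confounds treatment and outcome
n = 2000
x = rng.integers(0, 2, size=(n, 1))
t = rng.binomial(1, np.where(x[:, 0] == 1, 0.7, 0.3))
y = rng.binomial(1, np.clip(0.2 + 0.3 * t + 0.2 * x[:, 0], 0, 1))
e = mle_propensity(x, t)
print(ipw_ate(e, t, y))   # close to the true effect of 0.3
```

With several discrete covariates, the gain of a Bayesian network over this saturated stratification is that conditional independencies in the learned structure pool strata, which is where the efficiency advantage over plain logistic modelling comes from.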

    La stima della media nel campionamento per centri (Estimating the mean in center sampling)

    The "aggregation points" sampling design applies, for instance, in survey of irregular immigrants i.e. of populations composed by a finite but unknown number of units which do not consent labelling and that can be reached only through a set of known but overlapping frames called "aggregation points". Dealing with the "aggregation points" sampling design, the problem of estimating the mean of a quantitative character is concerned; an estimate of the estimator's variance is also proposed. Some results from a simulation study are presented. Simulations indicate that estimators proposed perform better in case of not too large number of aggregation points but extensively overlapping

    Center sampling: a strategy for elusive population surveys

    Center sampling is useful in finite population surveys when exhaustive lists of all units are not available and the target population is naturally clustered into a number of overlapping sites spread over an area of interest, such as, for instance, the immigrant population illegally resident in a country. Center sampling has been successfully employed in official European surveys; nevertheless, few systematic theoretical results have yet been given to support the empirical findings. In this paper, a general theory for center sampling is formalized, and an unbiased estimator for the mean of a quantitative or dichotomous characteristic is proposed together with its exact variance. A suitable estimator for the variance, unbiased under simple random sampling, is also derived, and the optimum allocation of the sample size among centers subject to linear cost constraints is discussed. Other sampling designs, useful from an operational standpoint, are also considered.
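The abstracts in this collection do not spell out the estimators, so the following is only a generic multiplicity-type sketch of a center-sampling mean estimate, not the exact Ym or Yc studied in these papers: a unit reachable through m overlapping centers is down-weighted by 1/m, and per-center means are combined using the known relative center weights.

```python
import numpy as np

def center_sampling_mean(samples, weights):
    """Ratio-type multiplicity estimator for overlapping centers.

    samples : list of (y, m) pairs per center, where y holds the responses
              observed at that center and m the number of centers each
              sampled unit belongs to (its multiplicity)
    weights : known relative weights of the centers (summing to one)

    NOTE: a generic overlapping-frame sketch, not the exact Ym or Yc
    of the papers listed here."""
    num = sum(w * np.mean(y / m) for (y, m), w in zip(samples, weights))
    den = sum(w * np.mean(1.0 / m) for (y, m), w in zip(samples, weights))
    return num / den

# toy example: two overlapping centers; the unit with m = 2 is reachable
# through both centers and is therefore down-weighted
c1 = (np.array([10.0, 12.0, 11.0]), np.array([1, 2, 1]))
c2 = (np.array([12.0, 14.0]),       np.array([2, 1]))
print(center_sampling_mean([c1, c2], [0.6, 0.4]))   # ≈ 11.75
```

The down-weighting compensates for units that could be sampled through any of the centers they belong to, which is the core difficulty these papers address.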

    Confronti fra stimatori per la media nel campionamento per centri (Comparing estimators for the mean in center sampling)

    The center sampling technique is well suited to populations naturally gathered into overlapping groups of units, where the units cannot be labelled and the group sizes, as well as the population size, are unknown. An unbiased estimator Ym for the mean of a quantitative characteristic of interest has been proposed under the simple hypothesis that the relative weight of each center is known. A second estimator Yc can be deduced from a previous proposal under the same hypothesis. In the present paper, the exact variance of Yc, together with an estimator of it, is given. The two estimators are based on different ways of summarizing the data, and they coincide only in the case of proportional allocation of the overall sample size. A comparison between the two estimators is carried out from both the inferential and the practical point of view. A simulation study shows that neither estimator is uniformly more efficient than the other in the general case of L ≥ 2 centers. Moreover, it turns out that Yc is more efficient when the variability of the center sampling fractions is "small", while Ym becomes more efficient as this variability increases. The simulations also show that the proposed variance estimators are asymptotically unbiased, consistent and c-consistent.