5 research outputs found

    Estimation of a regression spline sample selection model

    It is often the case that an outcome of interest is observed only for a restricted, non-randomly selected sample of the population. In such a situation, standard statistical analysis yields biased results. This issue can be addressed using sample selection models, which are based on the estimation of two regressions: a binary selection equation, which determines whether a particular statistical unit is available for the outcome equation, and the outcome equation itself. Classic sample selection models assume a priori that continuous regressors have a pre-specified linear or non-linear relationship to the outcome, which can lead to erroneous conclusions. In the case of a continuous response, methods in which covariate effects are modeled flexibly have been previously proposed, the most recent being based on a Bayesian Markov chain Monte Carlo approach. A frequentist counterpart, which has the advantage of being computationally fast, is introduced. The proposed algorithm is based on the penalized likelihood estimation framework. The construction of confidence intervals is also discussed. The empirical properties of the existing and proposed methods are studied through a simulation study. The approaches are finally illustrated by analyzing data from the RAND Health Insurance Experiment on annual health expenditures.
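To make the bias described in this abstract concrete, here is a minimal simulation sketch. It is not the penalized regression spline method of the paper; it only illustrates the classic parametric setting, in which the selection and outcome errors are bivariate normal. All names, coefficients, and the data-generating process are assumptions chosen for illustration. As a further simplification, the true selection index is used to build the inverse Mills ratio, whereas in practice it would first be estimated (for example with a probit fit).

```python
import math
import numpy as np


def simulate_selection(n=200_000, rho=0.7, seed=0):
    """Simulate an outcome that is observed only for a selected subsample."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)  # regressor in both equations
    z = rng.normal(size=n)  # regressor that affects selection only
    u = rng.normal(size=n)  # selection error
    # Outcome error correlated with the selection error (correlation rho).
    e = rho * u + math.sqrt(1.0 - rho**2) * rng.normal(size=n)
    index = 0.5 + x + z          # selection index (assumed known here)
    selected = index + u > 0     # binary selection equation
    y = 1.0 + 2.0 * x + e        # outcome equation, true slope is 2.0
    return x, y, index, selected


def ols(columns, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def inverse_mills(index):
    """phi(index) / Phi(index) for the standard normal distribution."""
    pdf = np.exp(-0.5 * index**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(index / math.sqrt(2.0)))
    return pdf / cdf


def compare_estimators(rho=0.7):
    """Fit the outcome equation on the selected sample, with and without
    the selection-correction term."""
    x, y, index, s = simulate_selection(rho=rho)
    naive = ols([x[s]], y[s])
    corrected = ols([x[s], inverse_mills(index[s])], y[s])
    return naive, corrected


if __name__ == "__main__":
    naive, corrected = compare_estimators()
    print("naive slope on x:    ", naive[1])
    print("corrected slope on x:", corrected[1])
    print("coefficient on inverse Mills ratio:", corrected[2])
```

Under these assumptions, the naive fit omits a term that is correlated with x in the selected sample, so its slope drifts away from the true value of 2.0. Adding the inverse Mills ratio as a regressor brings the slope back, and its coefficient estimates the error correlation rho.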

    Principal causal effect identification and surrogate endpoint evaluation by multiple trials

    Principal stratification is a causal framework for analyzing randomized experiments with a post-treatment variable between the treatment and endpoint variables. Because the principal strata defined by the potential outcomes of the post-treatment variable are not observable, we generally cannot identify the causal effects within principal strata. Motivated by a real data set of phase III adjuvant colon clinical trials, we propose approaches to identifying and estimating the principal causal effects via multiple trials. For identifiability, we remove the commonly used exclusion restriction assumption by stipulating that the principal causal effects are homogeneous across these trials. To remove another commonly used assumption, monotonicity, we give a necessary condition for local identifiability, which requires at least three trials. Applying our approaches to the data from adjuvant colon clinical trials, we find that the commonly used monotonicity assumption is untenable, and that disease-free survival with three-year follow-up is a valid surrogate endpoint for overall survival with five-year follow-up, satisfying both causal necessity and causal sufficiency. We also propose a sensitivity analysis approach based on Bayesian hierarchical models to investigate the impact of deviations from the homogeneity assumption.

    Comparing principal stratification and selection models in parametric causal inference with nonignorable missingness

    Two approaches for dealing with "endogenous selection" problems when estimating causal effects are considered. They are principal stratification and selection models. The main goal is to highlight similarities and differences between the two approaches, by investigating the different nature of their parametric hypotheses. The principal stratification approach focuses on information contained in specific subgroups of units. The aim is to produce valid inference conditional on such subgroups, without an a priori extension of the results to the whole population. Selection models, on the contrary, aim at estimating parameters that should be valid for the whole population, as if the data come from random sampling. A simulation study is conducted to show their different performances, with data generating processes coming from either approach. It is also argued that principal stratification is able to suggest alternative identification strategies not always easily translatable into assumptions of a selection model.
