104 research outputs found

    Removing Strong Data Assumptions In Causal Inference Via Large-Scale Optimization

    Many traditional and newly developed causal inference approaches impose strong data assumptions, and if those assumptions are violated in practice, these approaches may be inapplicable, suffer from low statistical power, or lead to misleading causal conclusions. In this dissertation, we present three papers to show how large-scale optimization can sometimes aid in removing strong assumptions about the data generating process or the data collection procedure that are required by some existing causal inference approaches. The first and second papers show how large-scale optimization can sometimes help remove strong assumptions about the data generating process. In the first paper, a new adaptive approach is proposed to combine two test statistics in matched observational studies. The proposed adaptive approach asymptotically uniformly dominates both component test statistics in sensitivity analyses, regardless of the underlying data distribution. In the second paper, a model-free and finite-population-exact framework is proposed to analyze randomized experiments subject to outcome misclassification. This new framework is based on large-scale integer programming and can help researchers analyze a randomized experiment subject to outcome misclassification more comprehensively without imposing any additional assumptions on the experiment. The third paper illustrates how large-scale optimization can help remove strong assumptions about the data collection procedure. Specifically, to study the effect of reducing malaria burden on the low birth weight rate in sub-Saharan Africa, a pair-of-pairs approach to a difference-in-differences study is proposed, which is built on optimal matching (a large-scale network flow problem) and cardinality matching (a large-scale integer programming problem). Unlike traditional difference-in-differences studies, this pair-of-pairs approach does not require either panel data or repeated cross-sectional data to be collected before the analysis stage.

    Design-Based Causal Inference with Missing Outcomes: Missingness Mechanisms, Imputation-Assisted Randomization Tests, and Covariate Adjustment

    Design-based causal inference is one of the most widely used frameworks for testing causal null hypotheses or making inferences about causal parameters from experimental or observational data. The most significant merit of design-based causal inference is that its statistical validity comes only from the study design (e.g., randomization design) and does not require assuming any outcome-generating distributions or models. Although immune to model misspecification, design-based causal inference can still suffer from other data challenges, among which missingness in outcomes is a significant one. However, compared with model-based causal inference, outcome missingness in design-based causal inference is much less studied, largely because design-based causal inference does not assume any outcome distributions or models and, therefore, cannot directly adopt existing model-based approaches for missing data. To fill this gap, we systematically study the missing outcomes problem in design-based causal inference. First, we use the potential outcomes framework to clarify the minimal assumption (concerning the outcome missingness mechanism) needed for conducting finite-population-exact randomization tests for the null effect (i.e., Fisher's sharp null) and that needed for constructing finite-population-exact confidence sets with missing outcomes. Second, we propose a general framework called "imputation and re-imputation" for conducting finite-population-exact randomization tests in design-based causal studies with missing outcomes. Our framework can incorporate any existing outcome imputation algorithm while guaranteeing finite-population-exact type-I error rate control. Third, we extend our framework to conduct covariate adjustment in an exact randomization test with missing outcomes and to construct finite-population-exact confidence sets with missing outcomes.

    Valid Randomization Tests in Inexactly Matched Observational Studies via Iterative Convex Programming

    In causal inference, matching is one of the most widely used methods to mimic a randomized experiment using observational (non-experimental) data. Ideally, treated units are exactly matched with control units on the covariates so that the treatments are as-if randomly assigned within each matched set, and valid randomization tests for treatment effects can then be conducted as in a randomized experiment. However, inexact matching typically exists, especially when there are continuous or many observed covariates or when unobserved covariates exist. Previous matched observational studies routinely conducted downstream randomization tests as if matching were exact, as long as the matched datasets satisfied some prespecified balance criteria or passed some balance tests. Some recent studies showed that this routine practice could lead to a highly inflated type-I error rate in randomization tests, especially when the sample size is large. To handle this problem, we propose an iterative convex programming framework for randomization tests with inexactly matched datasets. Under some commonly used regularity conditions, we show that our approach can produce valid randomization tests (i.e., robustly controlling the type-I error rate) for any inexactly matched dataset, even when unobserved covariates exist. Our framework allows the incorporation of flexible machine learning models to better extract information from covariate imbalance while robustly controlling the type-I error rate.

    Early career women in construction: Are their career expectations being met?

    The recruitment, retention and development of early career women have always been a challenge in the construction industry. Focusing on early career women, that is, new female construction management degree graduates hired into construction, this study explores: (i) factors influencing their choice of career in construction; (ii) the extent to which their career expectations were met in their first few years of job experience; and (iii) how their met or unmet career expectations are related to their overall job satisfaction. Data were collected using an online survey questionnaire. The results show that the most significant factors influencing the respondents’ career choice are career opportunities and the belief that they would get better pay. Their career expectations, in turn, were met or exceeded to a great extent for almost all the measurement items. The results also show that the respondents have a relatively high overall job satisfaction level. Although there is a lack of evidence that their overall job satisfaction increased as met career expectations increased, there are statistically significant positive correlations among the career expectation measurement items. These findings have implications for the human resource practices of construction employers that aim to attract early career women into the industry and to reinforce their met career expectations and job satisfaction.