Matching in Selective and Balanced Representation Space for Treatment Effects Estimation
The dramatically growing availability of observational data is being
witnessed in various domains of science and technology, which facilitates the
study of causal inference. However, estimating treatment effects from
observational data is faced with two major challenges, missing counterfactual
outcomes and treatment selection bias. Matching methods are among the most
widely used and fundamental approaches to estimating treatment effects, but
existing matching methods have poor performance when facing data with high
dimensional and complicated variables. We propose a feature selection
representation matching (FSRM) method based on deep representation learning and
matching, which maps the original covariate space into a selective, nonlinear,
and balanced representation space, and then conducts matching in the learned
representation space. FSRM adopts deep feature selection to minimize the
influence of irrelevant variables for estimating treatment effects and
incorporates a regularizer based on the Wasserstein distance to learn balanced
representations. We evaluate the performance of our FSRM method on three
datasets, and the results demonstrate superiority over the state-of-the-art
methods. Comment: Proceedings of the 29th ACM International Conference on Information
and Knowledge Management (CIKM '20)
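The matching step that the abstract above builds on can be illustrated with a minimal sketch. This is plain 1-nearest-neighbour matching in the raw covariate space, not FSRM itself: no representation is learned, and there is no feature selection or Wasserstein regularizer. The function name `match_att` and the toy data are assumptions made for illustration only.

```python
import math


def match_att(treated, control):
    """Estimate the average treatment effect on the treated (ATT)
    by 1-nearest-neighbour matching with replacement.

    treated, control: lists of (covariates, outcome) pairs, where
    covariates is a tuple of floats. Hypothetical helper, not from the paper.
    """
    if not treated or not control:
        raise ValueError("need at least one treated and one control unit")
    diffs = []
    for x_t, y_t in treated:
        # The closest control unit (Euclidean distance) stands in for
        # the treated unit's missing counterfactual outcome.
        _, y_c = min(control, key=lambda unit: math.dist(unit[0], x_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)


# Toy data (made up): two treated units, three control units.
treated = [((1.0, 2.0), 5.0), ((3.0, 1.0), 7.0)]
control = [((1.1, 2.1), 4.0), ((2.9, 0.8), 5.0), ((10.0, 10.0), 0.0)]
att = match_att(treated, control)
```

A representation-based method would, under the same assumptions, apply the distance computation to learned representations of the covariates rather than to the covariates themselves.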
Debiased Bayesian inference for average treatment effects
Bayesian approaches have become increasingly popular in causal inference
problems due to their conceptual simplicity, excellent performance and in-built
uncertainty quantification ('posterior credible sets'). We investigate Bayesian
inference for average treatment effects from observational data, which is a
challenging problem due to the missing counterfactuals and selection bias.
Working in the standard potential outcomes framework, we propose a data-driven
modification to an arbitrary (nonparametric) prior based on the propensity
score that corrects for the first-order posterior bias, thereby improving
performance. We illustrate our method for Gaussian process (GP) priors using
(semi-)synthetic data. Our experiments demonstrate significant improvement in
both estimation accuracy and uncertainty quantification compared to the
unmodified GP, rendering our approach highly competitive with the
state-of-the-art. Comment: NeurIPS 201
Understanding, Analyzing and Predicting Online User Behavior
Due to the growing popularity of the Internet and smart mobile devices, massive amounts of data are produced every day; in particular, more and more of users' online behavior and activities have been digitized. Making better use of this data and better understanding user behavior have become central concerns for both industry and academia. However, the large size and unstructured format of user behavioral data, together with the heterogeneous nature of individuals, make it harder to identify the specific behavior under study, how to distinguish it, and what results from it. Differences in user behavior arise from different causes; in my dissertation, I study three circumstances of behavior that potentially bring turbulent or detrimental effects, from precursory culture to preparatory strategy and delusory fraudulence. I draw on a versatile toolkit of analysis: econometrics and quasi-experiments, together with machine learning techniques such as text mining, sentiment analysis, and predictive analytics. This study leverages the combined methodologies and applies them beyond individual-level data and network data. This dissertation takes a first step toward discovering user behavior in these newly emerging contexts. My study conceptualizes theoretically and tests empirically the effect of cultural values on rating, and I find that an individualist cultural background is more likely to lead to deviation and more expression in review behaviors. I also find evidence of strategic behavior: users tend to leverage reporting to increase the likelihood of maximizing their benefits. Moreover, the dissertation proposes features that moderate the preparation behavior.
Finally, it introduces a unified and scalable framework for delusory behavior detection that meets the current need to fully utilize multiple data sources. Dissertation/Thesis: Doctoral Dissertation, Business Administration 201
Neural Score Matching for High-Dimensional Causal Inference
Traditional methods for matching in causal inference are impractical for
high-dimensional datasets. They suffer from the curse of dimensionality: exact
matching and coarsened exact matching find exponentially fewer matches as the
input dimension grows, and propensity score matching may match highly unrelated
units together. To overcome this problem, we develop theoretical results which
motivate the use of neural networks to obtain non-trivial, multivariate
balancing scores of a chosen level of coarseness, in contrast to the classical,
scalar propensity score. We leverage these balancing scores to perform matching
for high-dimensional causal inference and call this procedure neural score
matching. We show that our method is competitive against other matching
approaches on semi-synthetic high-dimensional datasets, both in terms of
treatment effect estimation and reducing imbalance
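The abstract above does not say how imbalance is measured. One common diagnostic, used here purely as an assumption, is the standardized mean difference (SMD) of a single covariate between the treated and control groups. The function name and sample values below are hypothetical.

```python
import statistics


def standardized_mean_difference(treated_values, control_values):
    """SMD of one covariate: difference in group means divided by the
    pooled standard deviation. Assumed diagnostic, not from the paper.
    """
    mean_t = statistics.fmean(treated_values)
    mean_c = statistics.fmean(control_values)
    # Sample variances; each group needs at least two observations.
    var_t = statistics.variance(treated_values)
    var_c = statistics.variance(control_values)
    pooled_sd = ((var_t + var_c) / 2) ** 0.5
    if pooled_sd == 0:
        raise ValueError("covariate is constant in both groups")
    return (mean_t - mean_c) / pooled_sd


# Toy covariate values (made up) for the two groups.
smd = standardized_mean_difference([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
```

Values close to zero after matching suggest that the matched groups are similar on that covariate; the threshold for "close" is a judgment call and is not specified by the source.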
Advertising Media and Target Audience Optimization via High-dimensional Bandits
We present a data-driven algorithm that advertisers can use to automate their
digital ad-campaigns at online publishers. The algorithm enables the advertiser
to search across available target audiences and ad-media to find the best
possible combination for its campaign via online experimentation. The problem
of finding the best audience-ad combination is complicated by a number of
distinctive challenges, including (a) a need for active exploration to resolve
prior uncertainty and to speed the search for profitable combinations, (b) many
combinations to choose from, giving rise to high-dimensional search
formulations, and (c) very low success probabilities, typically just a fraction
of one percent. Our algorithm (designated LRDL, an acronym for Logistic
Regression with Debiased Lasso) addresses these challenges by combining four
elements: a multiarmed bandit framework for active exploration; a Lasso penalty
function to handle high dimensionality; an inbuilt debiasing kernel that
handles the regularization bias induced by the Lasso; and a semi-parametric
regression model for outcomes that promotes cross-learning across arms. The
algorithm is implemented as a Thompson Sampler, and to the best of our
knowledge, it is the first that can practically address all of the challenges
above. Simulations with real and synthetic data show the method is effective
and document its superior performance against several benchmarks from the
recent high-dimensional bandit literature. Comment: 39 pages, 8 figures
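To make the Thompson sampling idea in the abstract above concrete, here is a minimal sketch using independent Beta-Bernoulli arms. It is a deliberately simpler technique than LRDL: there is no logistic regression, no Lasso penalty, no debiasing, and no cross-learning across arms. The function name, the success rates, and the number of rounds are illustrative assumptions.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Beta-Bernoulli Thompson sampling over independent arms.

    true_rates: hypothetical per-arm success probabilities, used only
    to simulate feedback. Returns per-arm success and failure counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(rounds):
        # Draw one sample per arm from its Beta posterior (uniform prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled success rate is highest.
        arm = max(range(n_arms), key=samples.__getitem__)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Made-up success rates, each a fraction of one percent.
successes, failures = thompson_sampling([0.002, 0.004, 0.008], rounds=5000)
pulls = [s + f for s, f in zip(successes, failures)]
```

With success rates this low, independent arms learn slowly, which is consistent with the abstract's motivation for a shared regression model that lets arms learn from one another.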