Decision Making with Limited Data

Abstract

This thesis studies several approaches to decision making with limited data.

First, we study the effects of approximate inference on Thompson sampling in the k-armed bandit problem. Thompson sampling is a successful algorithm, but it requires posterior inference, which often must be approximated in practice. We show that even a small constant inference error (in alpha-divergence) can lead to poor performance (linear regret) due to under-exploration (for alpha < 1) or over-exploration (for alpha > 0) by the approximation. While for alpha > 0 this is unavoidable, for alpha <= 0 the regret can be improved by adding a small amount of forced exploration.

Second, we consider the problem of designing a randomized experiment on a source population to estimate the Average Treatment Effect (ATE) on a target population. We propose a novel approach that explicitly considers the target when designing the experiment on the source. Under the covariate shift assumption, we design an unbiased importance-weighted estimator of the target population's ATE. To reduce the variance of this estimator, we introduce a covariate balance condition (Target Balance) between the treatment and control groups that is based on the target population. We show that Target Balance achieves a greater asymptotic variance reduction than methods that do not consider the target during the design phase. Our experiments illustrate that Target Balance reduces the variance even for small sample sizes.

Finally, we examine confidence intervals. Historically, mean bounds for small sample sizes have fallen into two categories: methods that make unrealistic assumptions about the unknown distribution (e.g., Gaussianity) and methods, such as Hoeffding's inequality, that use weaker assumptions but produce much looser intervals. Anderson (1969) proposed a mean confidence interval that is always at least as tight as Hoeffding's and whose only assumption is that the distribution's support is contained in an interval [a, b]. For the first time since then, we present a new family of upper bounds that compares favorably to Anderson's. We prove that each bound in the family holds with probability at least 1 - alpha for all distributions on an interval [a, b]. Furthermore, one of the bounds is tighter than or equal to Anderson's for all samples.
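To illustrate the forced-exploration remedy from the first part, the following is a minimal sketch, assuming a Bernoulli bandit with exact Beta posteriors (the thesis concerns approximate posteriors, which are not reproduced here): with small probability epsilon a uniformly random arm is pulled; otherwise arms are chosen by ordinary posterior sampling.

```python
import numpy as np

def thompson_sampling_forced(k, T, true_means, epsilon=0.01, rng=None):
    """Bernoulli Thompson sampling with a small amount of forced exploration.

    With probability `epsilon` a uniformly random arm is pulled; otherwise
    an arm is chosen by sampling from each arm's Beta posterior.
    """
    rng = rng or np.random.default_rng()
    alpha = np.ones(k)  # Beta posterior parameters (successes + 1)
    beta = np.ones(k)   # Beta posterior parameters (failures + 1)
    total_reward = 0.0
    for t in range(T):
        if rng.random() < epsilon:
            arm = int(rng.integers(k))        # forced exploration step
        else:
            samples = rng.beta(alpha, beta)   # one draw per arm's posterior
            arm = int(np.argmax(samples))
        r = float(rng.random() < true_means[arm])  # Bernoulli reward
        alpha[arm] += r
        beta[arm] += 1.0 - r
        total_reward += r
    return total_reward
```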
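For the second part, here is a hedged sketch of the importance-weighted estimator and a rerandomization-style balance check. It assumes known importance weights w(x) = p_target(x) / p_source(x) and a completely randomized design with n/2 treated units; the acceptance rule in `balanced_assignment` is a simplified stand-in for a target-aware balance condition, not the thesis's exact Target Balance criterion.

```python
import numpy as np

def iw_ate(y, z, w):
    """Importance-weighted difference-in-means estimator of the target ATE.

    y : observed outcomes on the source sample
    z : binary treatment indicators (n/2 treated, n/2 control)
    w : importance weights w(x) = p_target(x) / p_source(x)
    """
    y, z, w = (np.asarray(a, dtype=float) for a in (y, z, w))
    n = len(y)
    return (2.0 / n) * (np.sum(w[z == 1] * y[z == 1])
                        - np.sum(w[z == 0] * y[z == 0]))

def balanced_assignment(X, w, threshold, rng=None, max_tries=10_000):
    """Rerandomization sketch: redraw a half/half treatment assignment until
    the importance-weighted covariate means of the two groups are close.
    The acceptance criterion is a simplified illustration only."""
    rng = rng or np.random.default_rng()
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    n = X.shape[0]
    for _ in range(max_tries):
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, n // 2, replace=False)] = 1
        gap = ((w[z == 1, None] * X[z == 1]).mean(axis=0)
               - (w[z == 0, None] * X[z == 0]).mean(axis=0))
        if np.linalg.norm(gap) < threshold:
            return z
    raise RuntimeError("no assignment met the balance threshold")
```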
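For reference on the third part, the sketch below computes an Anderson-style upper confidence bound on the mean via the one-sided Dvoretzky-Kiefer-Wolfowitz (DKW) band on the empirical CDF. This is a standard presentation of the 1969 baseline, not the thesis's new family of bounds; only the upper support endpoint b is needed for the upper bound.

```python
import numpy as np

def anderson_upper_bound(samples, b, alpha):
    """Anderson-style (1969) upper confidence bound on the mean.

    Holds with probability >= 1 - alpha for any distribution whose support
    is contained in (-inf, b], using the identity
        mean = b - integral of the CDF up to b
    together with the one-sided DKW lower envelope max(F_hat - eps, 0).
    """
    z = np.sort(np.asarray(samples, dtype=float))
    n = len(z)
    eps = np.sqrt(np.log(1.0 / alpha) / (2.0 * n))  # DKW band half-width
    grid = np.append(z, b)                          # step boundaries of F_hat
    levels = np.maximum(np.arange(1, n + 1) / n - eps, 0.0)
    widths = np.diff(grid)                          # lengths of the steps
    return b - np.sum(widths * levels)
```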
