Informative Features for Model Comparison
Given two candidate models and a set of target observations, we address the
problem of measuring the relative goodness of fit of the two models. We propose
two new statistical tests which are nonparametric, computationally efficient
(runtime complexity is linear in the sample size), and interpretable. As a
unique advantage, our tests can produce a set of examples (informative
features) indicating the regions in the data domain where one model fits
significantly better than the other. In a real-world problem of comparing GAN
models, the test power of our new test matches that of the state-of-the-art
test of relative goodness of fit, while being one order of magnitude faster.
Comment: Accepted to NIPS 201
A Linear-Time Kernel Goodness-of-Fit Test
We propose a novel adaptive test of goodness-of-fit, with computational cost
linear in the number of samples. We learn the test features that best indicate
the differences between observed samples and a reference model, by minimizing
the false negative rate. These features are constructed via Stein's method,
meaning that it is not necessary to compute the normalising constant of the
model. We analyse the asymptotic Bahadur efficiency of the new test, and prove
that under a mean-shift alternative, our test always has greater relative
efficiency than a previous linear-time kernel test, regardless of the choice of
parameters for that test. In experiments, the performance of our method exceeds
that of the earlier linear-time test, and matches or exceeds the power of a
quadratic-time kernel test. In high dimensions and where model structure may be
exploited, our goodness of fit test performs far better than a quadratic-time
two-sample test based on the Maximum Mean Discrepancy, with samples drawn from
the model.
Comment: Accepted to NIPS 201
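The Stein-feature construction described in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel, a standard normal reference model, and hand-picked test locations `V`; the function names are hypothetical, and the feature-learning step (optimizing the locations and bandwidth) is omitted. The model enters only through its score function, so no normalising constant is needed, and the statistic is computed in time linear in the sample size.

```python
import numpy as np

def stein_features(X, score, V, sigma=1.0):
    """Stein features xi(x, v) = score(x) * k(x, v) + grad_x k(x, v),
    with a Gaussian kernel k (an assumption of this sketch)."""
    n, d = X.shape
    J = V.shape[0]
    diff = X[:, None, :] - V[None, :, :]                       # (n, J, d)
    K = np.exp(-np.sum(diff ** 2, axis=2) / (2 * sigma ** 2))  # (n, J)
    grad_K = -diff / sigma ** 2 * K[:, :, None]                # (n, J, d)
    S = score(X)                                               # (n, d)
    xi = S[:, None, :] * K[:, :, None] + grad_K                # (n, J, d)
    return xi.reshape(n, J * d) / np.sqrt(d * J)

def stein_stat_unbiased(X, score, V, sigma=1.0):
    """Unbiased estimate of the squared norm of the mean Stein feature.
    Equals the average of tau_i . tau_j over i != j, computed in O(n)."""
    tau = stein_features(X, score, V, sigma)
    n = tau.shape[0]
    total = tau.sum(axis=0)
    return (total @ total - np.sum(tau ** 2)) / (n * (n - 1))

def std_normal_score(X):
    # Gradient of the log density of N(0, I); the normaliser drops out.
    return -X

rng = np.random.default_rng(0)
V = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])  # hand-picked test locations
X_model = rng.standard_normal((2000, 2))             # sample from the model
X_shift = X_model + 1.0                              # mean-shifted sample
stat_null = stein_stat_unbiased(X_model, std_normal_score, V)
stat_alt = stein_stat_unbiased(X_shift, std_normal_score, V)
print(stat_null, stat_alt)
```

When the sample comes from the model, the statistic should be close to zero; under the mean shift it should be clearly positive. Turning it into a test would additionally require a rejection threshold, which this sketch does not compute.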
Cost-Effective Incentive Allocation via Structured Counterfactual Inference
We address a practical problem ubiquitous in modern marketing campaigns, in
which a central agent tries to learn a policy for allocating strategic
financial incentives to customers and observes only bandit feedback. In
contrast to traditional policy optimization frameworks, we take into account
the additional reward structure and budget constraints common in this setting,
and develop a new two-step method for solving this constrained counterfactual
policy optimization problem. Our method first casts the reward estimation
problem as a domain adaptation problem with supplementary structure, and then
uses the estimators to optimize the policy under the constraints. We
also establish theoretical error bounds for our estimation procedure and we
empirically show that the approach leads to significant improvement on both
synthetic and real datasets.
Testing Goodness of Fit of Conditional Density Models with Kernels
We propose two nonparametric statistical tests of goodness of fit for
conditional distributions: given a conditional probability density function
$p(y|x)$ and a joint sample, decide whether the sample is drawn from
$p(y|x) r_x(x)$ for some density $r_x$. Our tests, formulated with a Stein
operator, can be applied to any differentiable conditional density model, and
require no knowledge of the normalizing constant. We show that 1) our tests are
consistent against any fixed alternative conditional model; 2) the statistics
can be estimated easily, requiring no density estimation as an intermediate
step; and 3) our second test offers an interpretable test result providing
insight on where the conditional model does not fit well in the domain of the
covariate. We demonstrate the interpretability of our test on a task of
modeling the distribution of New York City's taxi drop-off location given a
pick-up point. To our knowledge, our work is the first to propose such
conditional goodness-of-fit tests that simultaneously have all these desirable
properties.
Comment: In UAI 2020. http://auai.org/uai2020/accepted.ph
Learning Kernel Tests Without Data Splitting
Modern large-scale kernel-based tests such as maximum mean discrepancy (MMD)
and kernelized Stein discrepancy (KSD) optimize kernel hyperparameters on a
held-out sample via data splitting to obtain the most powerful test statistics.
While data splitting results in a tractable null distribution, it suffers from
a reduction in test power due to smaller test sample size. Inspired by the
selective inference framework, we propose an approach that enables learning the
hyperparameters and testing on the full sample without data splitting. Our
approach can correctly calibrate the test in the presence of such dependency,
and yield a test threshold in closed form. At the same significance level, our
approach's test power is empirically larger than that of the data-splitting
approach, regardless of its split proportion.
Comment: 24 (10+14) pages, 9 figures. Under Review v2: added missing
references and acknowledgment
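For context, the data-splitting baseline that this abstract argues against can be sketched as follows. This is a simplified illustration, not the proposed selective-inference method: it assumes a Gaussian kernel, a small grid of candidate bandwidths, and uses the raw unbiased MMD estimate on the held-out half as a crude stand-in for a test-power criterion. The function names are hypothetical.

```python
import numpy as np

def gaussian_gram(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma ** 2))

def mmd2_unbiased(X, Y, sigma):
    """Unbiased quadratic-time estimate of the squared MMD."""
    m, n = len(X), len(Y)
    Kxx = gaussian_gram(X, X, sigma)
    Kyy = gaussian_gram(Y, Y, sigma)
    Kxy = gaussian_gram(X, Y, sigma)
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()

def split_then_statistic(X, Y, sigmas, train_frac=0.5):
    """Pick a bandwidth on the first part of the data, then evaluate the
    statistic on the remaining part only (the sample-size cost of splitting)."""
    m_tr = int(train_frac * len(X))
    n_tr = int(train_frac * len(Y))
    # Crude selection rule (assumption): maximize the estimated MMD^2.
    best = max(sigmas, key=lambda s: mmd2_unbiased(X[:m_tr], Y[:n_tr], s))
    return best, mmd2_unbiased(X[m_tr:], Y[n_tr:], best)

rng = np.random.default_rng(1)
X = rng.standard_normal((400, 2))
Y_diff = rng.standard_normal((400, 2)) + 1.0   # different distribution
Y_same = rng.standard_normal((400, 2))         # same distribution
sigmas = [0.5, 1.0, 2.0]
sigma_diff, stat_diff = split_then_statistic(X, Y_diff, sigmas)
sigma_same, stat_same = split_then_statistic(X, Y_same, sigmas)
print(sigma_diff, stat_diff, sigma_same, stat_same)
```

Only half of each sample contributes to the final statistic here, which is the reduction in test sample size that the abstract identifies as the source of lost power.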
Kernel-based distribution features for statistical tests and Bayesian inference
The kernel mean embedding is known to provide a data representation which preserves full information of the data distribution. While typically computationally costly, its nonparametric nature has the advantage of requiring no explicit model specification of the data. At the other extreme are approaches which summarize data distributions into a finite-dimensional vector of hand-picked summary statistics. This explicit finite-dimensional representation offers a computationally cheaper alternative. Clearly, there is a trade-off between cost and sufficiency of the representation, and it is of interest to have a computationally efficient technique which can produce a data-driven representation, thus combining the advantages of both extremes. The main focus of this thesis is on the development of linear-time mean-embedding-based methods to automatically extract informative features of data distributions, for statistical tests and Bayesian inference. In the first part on statistical tests, several new linear-time techniques are developed. These include a new kernel-based distance measure for distributions, a new linear-time nonparametric dependence measure, and a linear-time discrepancy measure between a probabilistic model and a sample, based on a Stein operator. These new measures give rise to linear-time and consistent tests of homogeneity, independence, and goodness of fit, respectively. The key idea behind these new tests is to explicitly learn distribution-characterizing feature vectors, by maximizing a proxy for the probability of correctly rejecting the null hypothesis. We theoretically show that these new tests are consistent for any finite number of features. In the second part, we explore the use of random Fourier features to construct approximate kernel mean embeddings, for representing messages in the expectation propagation (EP) algorithm. The goal is to learn a message operator which predicts EP outgoing messages from incoming messages. We derive a novel two-layer random feature representation of the input messages, allowing online learning of the operator during EP inference.
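The idea of approximating a kernel mean embedding with random Fourier features can be sketched as below. This is a single-layer illustration for a Gaussian kernel only, not the two-layer message representation derived in the thesis; the function names and parameter values are assumptions made for the example.

```python
import numpy as np

def make_rff(d, D, sigma, rng):
    """Draw D random Fourier features for a Gaussian kernel with bandwidth sigma."""
    W = rng.standard_normal((D, d)) / sigma      # frequencies ~ N(0, I / sigma^2)
    b = rng.uniform(0.0, 2 * np.pi, size=D)      # random phases
    return W, b

def rff_map(X, W, b):
    """Feature map phi with phi(x) . phi(y) approximately equal to k(x, y)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

def approx_mean_embedding(X, W, b):
    """Finite-dimensional approximation of the kernel mean embedding of a sample."""
    return rff_map(X, W, b).mean(axis=0)

rng = np.random.default_rng(2)
sigma = 1.0
W, b = make_rff(d=2, D=4000, sigma=sigma, rng=rng)
X = rng.standard_normal((300, 2))
Y = rng.standard_normal((300, 2)) + 0.5
mu_X = approx_mean_embedding(X, W, b)
mu_Y = approx_mean_embedding(Y, W, b)
approx_inner = mu_X @ mu_Y   # approximates the average of k(x_i, y_j) over all pairs
print(approx_inner)
```

The inner product of the two approximate embeddings should be close to the average exact kernel value over all pairs, at a cost linear in the sample size once the features are drawn.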