Learning mixtures of structured distributions over discrete domains
Let C be a class of probability distributions over the discrete
domain [n] = {1, ..., n}. We show that if C satisfies a rather
general condition -- essentially, that each distribution in C can
be well-approximated by a variable-width histogram with few bins -- then there
is a highly efficient (both in terms of running time and sample complexity)
algorithm that can learn any mixture of k unknown distributions from C.
We analyze several natural types of distributions over [n], including
log-concave, monotone hazard rate and unimodal distributions, and show that
they have the required structural property of being well-approximated by a
histogram with few bins. Applying our general algorithm, we obtain
near-optimally efficient algorithms for all these mixture learning problems. Comment: preliminary full version of SODA'13 paper
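The structural condition in this abstract (approximation by a variable-width histogram with few bins) can be illustrated with a minimal sketch. The distribution `p`, the hand-picked `breakpoints`, and the helper names `flatten` and `total_variation` are assumptions made for illustration only; this is not the paper's learning algorithm, which chooses the bins from samples.

```python
from fractions import Fraction

def flatten(p, breakpoints):
    """Replace p on each interval [start, end) by its average over that interval."""
    q = []
    for start, end in zip(breakpoints[:-1], breakpoints[1:]):
        mass = sum(p[start:end])          # probability mass of the bin
        width = end - start               # number of domain points in the bin
        q.extend([mass / width] * width)  # spread the mass uniformly
    return q

def total_variation(p, q):
    """Total variation distance: half the L1 distance."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2

# Hypothetical monotone (decreasing) distribution over 8 points.
weights = [8, 7, 6, 5, 4, 3, 2, 1]
p = [Fraction(w, sum(weights)) for w in weights]

# Three bins of widths 2, 2 and 4 (chosen by hand for this example).
hist = flatten(p, [0, 2, 4, 8])

print(total_variation(p, hist))  # distance between p and its 3-bin histogram
```

Exact fractions are used so the distance can be checked without floating-point noise; wider bins are placed where the distribution is flatter, which is the intuition behind variable-width histograms.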
Inference Based on Conditional Moment Inequalities
In this paper, we propose an instrumental variable approach to constructing confidence sets (CS's) for the true parameter in models defined by conditional moment inequalities/equalities. We show that by properly choosing instrument functions, one can transform conditional moment inequalities/equalities into unconditional ones without losing identification power. Based on the unconditional moment inequalities/equalities, we construct CS's by inverting Cramer-von Mises-type or Kolmogorov-Smirnov-type tests. Critical values are obtained using generalized moment selection (GMS) procedures. We show that the proposed CS's have correct uniform asymptotic coverage probabilities. New methods are required to establish these results because an infinite-dimensional nuisance parameter affects the asymptotic distributions. We show that the tests considered are consistent against all fixed alternatives and have power against n^{-1/2}-local alternatives to some, but not all, sequences of distributions in the null hypothesis. Monte Carlo simulations for four different models show that the methods perform well in finite samples. Keywords: asymptotic size, asymptotic power, conditional moment inequalities, confidence set, Cramer-von Mises, generalized moment selection, Kolmogorov-Smirnov, moment inequalities
Sample-Based High-Dimensional Convexity Testing
In the problem of high-dimensional convexity testing, there is an unknown set S in the n-dimensional Euclidean space which is promised to be either convex or c-far from every convex body with respect to the standard multivariate normal distribution. The job of a testing algorithm is then to distinguish between these two cases while making as few inspections of the set S as possible.
In this work we consider sample-based testing algorithms, in which the testing algorithm only has access to labeled samples (x,S(x)) where each x is independently drawn from the normal distribution. We give nearly matching sample complexity upper and lower bounds for both one-sided and two-sided convexity testing algorithms in this framework. For constant c, our results show that the sample complexity of one-sided convexity testing is exponential in n, while for two-sided convexity testing it is exponential in the square root of n.
Inference Based on Conditional Moment Inequalities
In this paper, we propose an instrumental variable approach to constructing confidence sets (CS's) for the true parameter in models defined by conditional moment inequalities/equalities. We show that by properly choosing instrument functions, one can transform conditional moment inequalities/equalities into unconditional ones without losing identification power. Based on the unconditional moment inequalities/equalities, we construct CS's by inverting Cramer-von Mises-type or Kolmogorov-Smirnov-type tests. Critical values are obtained using generalized moment selection (GMS) procedures. We show that the proposed CS's have correct uniform asymptotic coverage probabilities. New methods are required to establish these results because an infinite-dimensional nuisance parameter affects the asymptotic distributions. We show that the tests considered are consistent against all fixed alternatives and have power against some n^{-1/2}-local alternatives, though not all such alternatives. Monte Carlo simulations for three different models show that the methods perform well in finite samples. Keywords: asymptotic size, asymptotic power, conditional moment inequalities, confidence set, Cramer-von Mises, generalized moment selection, Kolmogorov-Smirnov, moment inequalities
Nonparametric estimation of an additive quantile regression model
This paper is concerned with estimating the additive components of a nonparametric additive quantile regression model. We develop an estimator that is asymptotically normally distributed with a rate of convergence in probability of n^{-r/(2r+1)} when the additive components are r-times continuously differentiable for some r \geq 2. This result holds regardless of the dimension of the covariates and, therefore, the new estimator has no curse of dimensionality. In addition, the estimator has an oracle property and is easily extended to a generalized additive quantile regression model with a link function. The numerical performance and usefulness of the estimator are illustrated by Monte Carlo experiments and an empirical example.
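To make the "no curse of dimensionality" claim concrete, the following sketch compares the exponent r/(2r+1) from the abstract with the exponent r/(2r+d) that is standard for fully nonparametric estimation with d covariates. The comparison rate is brought in here as a reference point and is not stated in the abstract; the function names and the sample values of `n`, `r` and `d` are assumptions for illustration.

```python
def rate_exponent(r, d=1):
    """Exponent a in a convergence rate n^{-a}, with a = r / (2r + d).

    d = 1 gives the additive-model exponent r / (2r + 1) quoted in the abstract;
    d > 1 gives the usual fully nonparametric exponent (assumed for comparison).
    """
    return r / (2 * r + d)

def error_scale(n, r, d=1):
    """Order of magnitude n^{-r/(2r+d)} of the estimation error."""
    return n ** (-rate_exponent(r, d))

n, r = 10**5, 2
for d in (1, 5, 10):
    print(d, rate_exponent(r, d), error_scale(n, r, d))
```

With r = 2 the additive exponent stays at 2/5 whatever the number of covariates, while the fully nonparametric exponent shrinks as d grows, so the same sample size buys much less accuracy.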
Nonparametric estimation of an additive quantile regression model
This paper is concerned with estimating the additive components of a nonparametric
additive quantile regression model. We develop an estimator that is asymptotically
normally distributed with a rate of convergence in probability of n^{-r/(2r+1)} when the
additive components are r-times continuously differentiable for some r \geq 2. This result
holds regardless of the dimension of the covariates and, therefore, the new estimator
has no curse of dimensionality. In addition, the estimator has an oracle property and is
easily extended to a generalized additive quantile regression model with a link function.
The numerical performance and usefulness of the estimator are illustrated by Monte
Carlo experiments and an empirical example.
Optimal testing for properties of distributions
Given samples from an unknown discrete distribution p, is it possible to distinguish whether p belongs to some class of distributions C versus p being far from every distribution in C? This fundamental question has received tremendous attention in statistics, focusing primarily on asymptotic analysis, as well as in information theory and theoretical computer science, where the emphasis has been on small sample size and computational complexity. Nevertheless, even for basic properties of discrete distributions such as monotonicity, independence, logconcavity, unimodality, and monotone-hazard rate, the optimal sample complexity
is unknown. We provide a general approach via which we obtain sample-optimal and computationally efficient testers for all these distribution families. At the core of our approach is an algorithm which solves the following problem: Given samples from an unknown distribution p, and a known distribution q, are p and q close in \chi^2-distance, or far in total variation distance? The optimality of our testers is established by providing matching lower bounds, up to constant factors. Finally, a necessary building block for our testers and an important byproduct of our work are the first known computationally efficient proper learners for discrete log-concave and monotone hazard rate distributions.
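The two distances contrasted in this abstract can be computed directly when both distributions are known. The sketch below assumes the standard definitions, chi-squared distance sum_i (p_i - q_i)^2 / q_i and total variation distance (1/2) sum_i |p_i - q_i|, and checks the standard inequality TV <= (1/2) sqrt(chi^2), which follows from Cauchy-Schwarz. It is not the paper's tester, which only sees samples from p; the example vectors are hypothetical.

```python
import math

def chi2_distance(p, q):
    """Chi-squared distance of p from q; assumes every q_i > 0."""
    return sum((a - b) ** 2 / b for a, b in zip(p, q))

def tv_distance(p, q):
    """Total variation distance: half the L1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical distributions over a 3-element domain.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

chi2 = chi2_distance(p, q)
tv = tv_distance(p, q)
bound = 0.5 * math.sqrt(chi2)  # upper bound on tv implied by chi2

print(chi2, tv, bound)
```

Because a small chi-squared distance forces a small total variation distance, the two cases "close in chi-squared" and "far in total variation" cannot both hold once the thresholds are set apart, which is what makes the testing question well posed.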