
    Learning mixtures of structured distributions over discrete domains

    Let $\mathfrak{C}$ be a class of probability distributions over the discrete domain $[n] = \{1,\ldots,n\}$. We show that if $\mathfrak{C}$ satisfies a rather general condition -- essentially, that each distribution in $\mathfrak{C}$ can be well-approximated by a variable-width histogram with few bins -- then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of $k$ unknown distributions from $\mathfrak{C}$. We analyze several natural types of distributions over $[n]$, including log-concave, monotone hazard rate, and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems. Comment: preliminary full version of SODA'13 paper.
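
    As an illustration of the structural condition this abstract relies on (not the paper's learning algorithm itself), the sketch below flattens a distribution over $[n]$ onto a small number of variable-width bins and measures the resulting total variation error; the particular bin edges and the monotone example distribution are assumptions chosen only for illustration.

```python
import numpy as np

def flatten_on_bins(p, bin_edges):
    """Variable-width histogram approximation of p: within each bin
    [l, r) the probability mass of p is spread uniformly."""
    q = np.empty_like(p, dtype=float)
    for l, r in zip(bin_edges[:-1], bin_edges[1:]):
        q[l:r] = p[l:r].sum() / (r - l)
    return q

def tv_distance(p, q):
    """Total variation distance between two distributions on [n]."""
    return 0.5 * np.abs(p - q).sum()

# A monotone (decreasing) distribution on [n], flattened onto 6 bins.
n = 64
p = np.arange(n, 0, -1, dtype=float)
p /= p.sum()
edges = [0, 2, 4, 8, 16, 32, 64]     # few, geometrically growing bins
q = flatten_on_bins(p, edges)
print(tv_distance(p, q))             # small error despite only 6 bins
```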

    Inference Based on Conditional Moment Inequalities

    In this paper, we propose an instrumental variable approach to constructing confidence sets (CS's) for the true parameter in models defined by conditional moment inequalities/equalities. We show that by properly choosing instrument functions, one can transform conditional moment inequalities/equalities into unconditional ones without losing identification power. Based on the unconditional moment inequalities/equalities, we construct CS's by inverting Cramer-von Mises-type or Kolmogorov-Smirnov-type tests. Critical values are obtained using generalized moment selection (GMS) procedures. We show that the proposed CS's have correct uniform asymptotic coverage probabilities. New methods are required to establish these results because an infinite-dimensional nuisance parameter affects the asymptotic distributions. We show that the tests considered are consistent against all fixed alternatives and have power against $n^{-1/2}$-local alternatives to some, but not all, sequences of distributions in the null hypothesis. Monte Carlo simulations for four different models show that the methods perform well in finite samples. Keywords: asymptotic size, asymptotic power, conditional moment inequalities, confidence set, Cramer-von Mises, generalized moment selection, Kolmogorov-Smirnov, moment inequalities.
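
    The construction described above can be illustrated with a stylized sketch: indicator "cube" instruments turn a single conditional moment inequality into many unconditional ones, and a Cramer-von Mises-type statistic aggregates the negative parts of the standardized sample moments. The standardization, instrument grid, and function names below are illustrative assumptions, not the paper's exact statistic or its GMS critical values.

```python
import numpy as np

def cvm_type_statistic(m, x, centers, radius):
    """Stylized Cramer-von Mises-type statistic for one conditional moment
    inequality E[m(W, theta) | X] >= 0, with indicator instruments
    g_c(x) = 1{|x - c| <= radius} over a grid of centers c."""
    N = len(m)
    stat = 0.0
    for c in centers:
        g = (np.abs(x - c) <= radius).astype(float)
        mbar = np.mean(m * g)                 # unconditional sample moment
        s = np.std(m * g, ddof=1) + 1e-12     # crude standardization
        t = np.sqrt(N) * mbar / s
        stat += min(t, 0.0) ** 2              # only violations contribute
    return stat / len(centers)

# Hypothetical check: E[m | X] = 0.2 >= 0, so the statistic stays small.
rng = np.random.default_rng(0)
x = rng.uniform(size=500)
m = 0.2 + rng.normal(size=500)
print(cvm_type_statistic(m, x, centers=np.linspace(0.1, 0.9, 9), radius=0.1))
```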

    Sample-Based High-Dimensional Convexity Testing

    In the problem of high-dimensional convexity testing, there is an unknown set S in the n-dimensional Euclidean space which is promised to be either convex or c-far from every convex body with respect to the standard multivariate normal distribution. The job of a testing algorithm is then to distinguish between these two cases while making as few inspections of the set S as possible. In this work we consider sample-based testing algorithms, in which the testing algorithm only has access to labeled samples (x,S(x)) where each x is independently drawn from the normal distribution. We give nearly matching sample complexity upper and lower bounds for both one-sided and two-sided convexity testing algorithms in this framework. For constant c, our results show that the sample complexity of one-sided convexity testing is exponential in n, while for two-sided convexity testing it is exponential in the square root of n.
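
    For intuition about the access model, a natural one-sided sample-based tester rejects only when it finds a certificate of non-convexity: a sample labeled outside S that lies in the convex hull of samples labeled inside S. The sketch below implements that hull check with a linear program; it is an illustrative tester in this access model, not the algorithm analyzed in the paper, and it says nothing about the sample complexity bounds quoted above.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(point, points):
    """LP feasibility check: is `point` a convex combination of `points`?"""
    k = len(points)
    A_eq = np.vstack([points.T, np.ones(k)])   # sum_i lam_i x_i = point, sum_i lam_i = 1
    b_eq = np.append(point, 1.0)
    res = linprog(c=np.zeros(k), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * k)
    return res.success

def one_sided_convexity_test(xs, labels):
    """Reject (return False) only on a certificate of non-convexity: a point
    labeled outside S that lies in the convex hull of points labeled inside S."""
    inside, outside = xs[labels], xs[~labels]
    if len(inside) == 0:
        return True
    return not any(in_convex_hull(y, inside) for y in outside)

# Hypothetical usage with S = unit ball; the tester never rejects a convex S.
rng = np.random.default_rng(1)
xs = rng.standard_normal((200, 5))
labels = np.linalg.norm(xs, axis=1) <= 1.0
print(one_sided_convexity_test(xs, labels))
```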

    Inference Based on Conditional Moment Inequalities

    In this paper, we propose an instrumental variable approach to constructing confidence sets (CS's) for the true parameter in models defined by conditional moment inequalities/equalities. We show that by properly choosing instrument functions, one can transform conditional moment inequalities/equalities into unconditional ones without losing identification power. Based on the unconditional moment inequalities/equalities, we construct CS's by inverting Cramer-von Mises-type or Kolmogorov-Smirnov-type tests. Critical values are obtained using generalized moment selection (GMS) procedures. We show that the proposed CS's have correct uniform asymptotic coverage probabilities. New methods are required to establish these results because an infinite-dimensional nuisance parameter affects the asymptotic distributions. We show that the tests considered are consistent against all fixed alternatives and have power against some $n^{-1/2}$-local alternatives, though not all such alternatives. Monte Carlo simulations for three different models show that the methods perform well in finite samples. Keywords: asymptotic size, asymptotic power, conditional moment inequalities, confidence set, Cramer-von Mises, generalized moment selection, Kolmogorov-Smirnov, moment inequalities.

    Nonparametric estimation of an additive quantile regression model

    This paper is concerned with estimating the additive components of a nonparametric additive quantile regression model. We develop an estimator that is asymptotically normally distributed with a rate of convergence in probability of $n^{-r/(2r+1)}$ when the additive components are $r$-times continuously differentiable for some $r \geq 2$. This result holds regardless of the dimension of the covariates and, therefore, the new estimator has no curse of dimensionality. In addition, the estimator has an oracle property and is easily extended to a generalized additive quantile regression model with a link function. The numerical performance and usefulness of the estimator are illustrated by Monte Carlo experiments and an empirical example.
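
    A rough series-type sketch of an additive quantile regression fit is shown below, using a polynomial basis for each additive component and an off-the-shelf quantile regression solver; the basis, degree, and helper names are illustrative assumptions and this is not the paper's estimator or its oracle-efficient construction.

```python
import numpy as np
import statsmodels.api as sm

def poly_basis(x, degree):
    """Simple polynomial basis (no intercept) for one covariate."""
    return np.column_stack([x ** d for d in range(1, degree + 1)])

def additive_quantile_fit(X, y, tau=0.5, degree=4):
    """Series-type sketch: fit q_tau(Y | X) = mu + f_1(X_1) + ... + f_d(X_d)
    with each f_j represented by a low-degree polynomial."""
    design = np.hstack([poly_basis(X[:, j], degree) for j in range(X.shape[1])])
    design = sm.add_constant(design)
    return sm.QuantReg(y, design).fit(q=tau)

# Hypothetical two-covariate additive model, fit at the 0.75 quantile.
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(1000, 2))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + rng.standard_t(df=3, size=1000)
print(additive_quantile_fit(X, y, tau=0.75).params)
```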

    Nonparametric estimation of an additive quantile regression model

    This paper is concerned with estimating the additive components of a nonparametric additive quantile regression model. We develop an estimator that is asymptotically normally distributed with a rate of convergence in probability of $n^{-r/(2r+1)}$ when the additive components are $r$-times continuously differentiable for some $r \geq 2$. This result holds regardless of the dimension of the covariates and, therefore, the new estimator has no curse of dimensionality. In addition, the estimator has an oracle property and is easily extended to a generalized additive quantile regression model with a link function. The numerical performance and usefulness of the estimator are illustrated by Monte Carlo experiments and an empirical example.

    Optimal testing for properties of distributions

    Given samples from an unknown discrete distribution p, is it possible to distinguish whether p belongs to some class of distributions C versus p being far from every distribution in C? This fundamental question has received tremendous attention in statistics, focusing primarily on asymptotic analysis, as well as in information theory and theoretical computer science, where the emphasis has been on small sample size and computational complexity. Nevertheless, even for basic properties of discrete distributions such as monotonicity, independence, log-concavity, unimodality, and monotone hazard rate, the optimal sample complexity is unknown. We provide a general approach via which we obtain sample-optimal and computationally efficient testers for all these distribution families. At the core of our approach is an algorithm which solves the following problem: Given samples from an unknown distribution p, and a known distribution q, are p and q close in $\chi^2$-distance, or far in total variation distance? The optimality of our testers is established by providing matching lower bounds, up to constant factors. Finally, a necessary building block for our testers and an important byproduct of our work are the first known computationally efficient proper learners for discrete log-concave and monotone hazard rate distributions.
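
    The core subroutine described above (close in $\chi^2$-distance versus far in total variation distance) can be sketched with the unbiased chi-squared-style statistic below; the threshold/critical value is omitted and the example distributions are assumptions chosen for illustration, so this is not presented as the paper's exact tester.

```python
import numpy as np

def chi2_closeness_statistic(samples, q):
    """Unbiased chi-squared-style statistic against a known q (all q_i > 0):
    roughly m * chi^2(p, q) in expectation, and near 0 when p = q."""
    m = len(samples)
    counts = np.bincount(samples, minlength=len(q)).astype(float)
    expected = m * np.asarray(q, dtype=float)
    return np.sum(((counts - expected) ** 2 - counts) / expected)

# Hypothetical check: samples from q itself vs. a perturbed distribution.
rng = np.random.default_rng(3)
n, m = 100, 5000
q = np.full(n, 1.0 / n)
p_far = q.copy()
p_far[: n // 2] *= 1.5
p_far[n // 2 :] *= 0.5
print(chi2_closeness_statistic(rng.choice(n, size=m, p=q), q))      # small
print(chi2_closeness_statistic(rng.choice(n, size=m, p=p_far), q))  # large
```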