
    Quantile and Probability Curves Without Crossing

    This paper proposes a method to address the longstanding problem of lack of monotonicity in estimation of conditional and structural quantile functions, also known as the quantile crossing problem. The method consists in sorting, or monotonically rearranging, the original estimated non-monotone curve into a monotone rearranged curve. We show that the rearranged curve is closer to the true quantile curve in finite samples than the original curve, establish a functional delta method for rearrangement-related operators, and derive functional limit theory for the entire rearranged curve and its functionals. We also establish validity of the bootstrap for estimating the limit law of the entire rearranged curve and its functionals. Our limit results are generic in that they apply to every estimator of a monotone econometric function, provided that the estimator satisfies a functional central limit theorem and the function satisfies some smoothness conditions. Consequently, our results apply to estimation of other econometric functions with monotonicity restrictions, such as demand, production, distribution, and structural distribution functions. We illustrate the results with an application to estimation of structural quantile functions using data on Vietnam veteran status and earnings. Comment: 29 pages, 4 figures.
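    The rearrangement step itself reduces to sorting: on a grid of quantile indices, the rearranged curve is simply the sorted vector of the original estimates. Below is a minimal Python sketch of that idea; the grid, the toy curve, and the function name rearrange are illustrative assumptions, not the authors' code.

```python
import numpy as np

def rearrange(curve_values):
    # Monotone rearrangement of a curve evaluated on a uniform
    # grid of quantile indices: sorting the values yields the
    # increasing rearranged curve on the same grid.
    return np.sort(curve_values)

# Hypothetical non-monotone quantile-curve estimate on a grid.
u = np.linspace(0.05, 0.95, 19)
q_hat = 1.0 + u + 0.3 * np.sin(12 * u)  # crosses itself (non-monotone)
q_star = rearrange(q_hat)               # monotone in u, same values
```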

    Classification with unknown class-conditional label noise on non-compact feature spaces

    We investigate the problem of classification in the presence of unknown class-conditional label noise, in which the labels observed by the learner have been corrupted with some unknown, class-dependent probability. In order to obtain finite-sample rates, previous approaches to classification with unknown class-conditional label noise have required that the regression function be close to its extrema on sets of large measure. We consider this problem in the setting of non-compact metric spaces, where the regression function need not attain its extrema. In this setting we determine the minimax optimal learning rates (up to logarithmic factors). The rates display interesting threshold behaviour: when the regression function approaches its extrema at a sufficient rate, the optimal learning rates are of the same order as those obtained in the label-noise-free setting; if the regression function approaches its extrema more gradually, classification performance necessarily degrades. In addition, we present an adaptive algorithm that attains these rates without prior knowledge of either the distributional parameters or the local density. This identifies, for the first time, a scenario in which finite-sample rates are achievable in the label-noise setting but differ from the optimal rates without label noise.
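    For concreteness, the corruption model described above can be simulated directly. The sketch below is a hypothetical illustration of class-conditional label noise only (the flip probabilities rho0 and rho1 are assumed values, unknown to the learner in the paper's setting); it is not part of the paper's adaptive algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt_labels(y, rho0, rho1):
    # Class-conditional label noise: flip each clean label
    # y in {0, 1} independently, with probability rho0 when
    # y == 0 and rho1 when y == 1.
    flip = np.where(y == 0,
                    rng.random(y.shape) < rho0,
                    rng.random(y.shape) < rho1)
    return np.where(flip, 1 - y, y)

y_clean = rng.integers(0, 2, size=1000)
y_noisy = corrupt_labels(y_clean, rho0=0.1, rho1=0.3)  # assumed rates
```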

    Maximum Lq-likelihood estimation

    In this paper, the maximum Lq-likelihood estimator (MLqE), a new parameter estimator based on nonextensive entropy [Kibernetika 3 (1967) 30--35], is introduced. The properties of the MLqE are studied via asymptotic analysis and computer simulations. The behavior of the MLqE is characterized by the degree of distortion q applied to the assumed model. When q is properly chosen for small and moderate sample sizes, the MLqE can successfully trade bias for precision, resulting in a substantial reduction of the mean squared error. When the sample size is large and q tends to 1, a necessary and sufficient condition to ensure asymptotic normality and efficiency of the MLqE is established. Comment: Published at http://dx.doi.org/10.1214/09-AOS687 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
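    The MLqE replaces the log in the usual likelihood with the q-deformed logarithm Lq(u) = (u^(1-q) - 1)/(1 - q), which recovers log(u) as q tends to 1, and maximizes the sum of Lq evaluated at the model density. Here is a minimal sketch under an assumed normal model; the function names and the choice of optimizer are illustrative assumptions, not from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def lq(u, q):
    # q-deformed logarithm: log(u) at q == 1,
    # (u**(1 - q) - 1) / (1 - q) otherwise.
    if q == 1.0:
        return np.log(u)
    return (u ** (1.0 - q) - 1.0) / (1.0 - q)

def mlqe_normal(x, q):
    # Maximum Lq-likelihood fit of a normal (mu, sigma):
    # maximize sum_i Lq(f(x_i; theta)) over theta.
    def neg_lq_lik(theta):
        mu, log_sigma = theta
        dens = norm.pdf(x, loc=mu, scale=np.exp(log_sigma))
        return -np.sum(lq(dens, q))
    theta0 = np.array([np.mean(x), np.log(np.std(x))])
    return minimize(neg_lq_lik, theta0, method="Nelder-Mead").x

x = np.random.default_rng(1).normal(0.0, 1.0, size=50)
mu_hat, log_sigma_hat = mlqe_normal(x, q=0.9)  # q < 1 distorts the model
```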

    Cutset Sampling for Bayesian Networks

    The paper presents a new sampling methodology for Bayesian networks that samples only a subset of variables and applies exact inference to the rest. Cutset sampling is a network-structure-exploiting application of the Rao-Blackwellisation principle to sampling in Bayesian networks. It improves convergence by exploiting memory-based inference algorithms. It can also be viewed as an anytime approximation of the exact cutset-conditioning algorithm developed by Pearl. Cutset sampling can be implemented efficiently when the sampled variables constitute a loop-cutset of the Bayesian network and, more generally, when the induced width of the network's graph, conditioned on the observed sampled variables, is bounded by a constant w. We demonstrate the benefits of this scheme empirically on a range of benchmarks.
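    The core Rao-Blackwellisation idea is that, instead of sampling every variable, one samples only the cutset and averages exact conditionals for the remaining variables. The toy sketch below illustrates that averaging principle on a two-variable joint stored as an explicit table; in actual cutset sampling the cutset is sampled by Gibbs sampling and the inner conditionals come from exact inference on the conditioned network, so everything here beyond the averaging step is a simplifying assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: joint P(c, x) over a binary "cutset"
# variable c and a query variable x, as an explicit table.
joint = np.array([[0.30, 0.10],
                  [0.15, 0.45]])  # joint[c, x]

def cutset_sample_estimate(joint, n_samples=10_000):
    # Rao-Blackwellised estimate of P(x): sample the cutset
    # variable c, then average the exact conditional P(x | c)
    # instead of sampling x itself.
    p_c = joint.sum(axis=1)             # marginal of c
    p_x_given_c = joint / p_c[:, None]  # exact conditionals P(x | c)
    cs = rng.choice(2, size=n_samples, p=p_c)
    return p_x_given_c[cs].mean(axis=0)

print(cutset_sample_estimate(joint))  # close to joint.sum(axis=0)
```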