10,975 research outputs found

    Vector Quantile Regression: An Optimal Transport Approach

    We propose a notion of conditional vector quantile function and a vector quantile regression. A \emph{conditional vector quantile function} (CVQF) of a random vector $Y$, taking values in $\mathbb{R}^d$ given covariates $Z=z$, taking values in $\mathbb{R}^k$, is a map $u \longmapsto Q_{Y\mid Z}(u,z)$, which is monotone, in the sense of being a gradient of a convex function, and such that, given that the vector $U$ follows a reference non-atomic distribution $F_U$, for instance the uniform distribution on a unit cube in $\mathbb{R}^d$, the random vector $Q_{Y\mid Z}(U,z)$ has the distribution of $Y$ conditional on $Z=z$. Moreover, we have a strong representation, $Y = Q_{Y\mid Z}(U,Z)$ almost surely, for some version of $U$. The \emph{vector quantile regression} (VQR) is a linear model for the CVQF of $Y$ given $Z$. Under correct specification, the notion produces the strong representation $Y=\beta(U)^\top f(Z)$, for $f(Z)$ denoting a known set of transformations of $Z$, where $u \longmapsto \beta(u)^\top f(Z)$ is a monotone map, the gradient of a convex function, and the quantile regression coefficients $u \longmapsto \beta(u)$ have interpretations analogous to those of standard scalar quantile regression. As $f(Z)$ becomes a richer class of transformations of $Z$, the model becomes nonparametric, as in series modelling. A key property of VQR is the embedding of the classical Monge-Kantorovich optimal transportation problem at its core as a special case. In the classical case, where $Y$ is scalar, VQR reduces to a version of classical QR, and the CVQF reduces to the scalar conditional quantile function. An application to multiple Engel curve estimation is considered.
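    To make the optimal transport connection concrete, here is a minimal numerical sketch (not the paper's estimator): it solves a small discrete Monge-Kantorovich problem between a uniform reference grid $U$ on the unit square and a sample of $Y$ in $\mathbb{R}^2$ using scipy's linear programming routine, so that the optimal coupling plays the role of an unconditional vector quantile map. The grid size, sample, and cost normalization are all assumptions made only for this example.

```python
# Minimal sketch of the discrete Monge-Kantorovich problem underlying VQR
# (illustration only; the grid, sample, and cost are assumptions, not the paper's setup).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Reference points u_i: a uniform grid on the unit square in R^2.
g = np.linspace(0.1, 0.9, 5)
U = np.array([[a, b] for a in g for b in g])          # shape (n, 2)
# Sample points y_j of a random vector Y in R^2.
Y = rng.normal(size=(U.shape[0], 2))                   # shape (m, 2), here m = n

n, m = len(U), len(Y)
mu = np.full(n, 1.0 / n)                               # weights on the reference grid
nu = np.full(m, 1.0 / m)                               # weights on the Y sample

# Maximizing the correlation E[U'Y] is equivalent to minimizing the cost -u_i'y_j.
C = -(U @ Y.T)                                         # (n, m) cost matrix

# Linear program over couplings pi >= 0 with prescribed marginals mu and nu.
A_eq = np.zeros((n + m, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0                   # row sums equal mu
for j in range(m):
    A_eq[n + j, j::m] = 1.0                            # column sums equal nu
b_eq = np.concatenate([mu, nu])

res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
pi = res.x.reshape(n, m)

# With equal uniform weights the optimal coupling is a permutation, i.e. a
# monotone assignment u_i -> y_{j(i)}: the discrete analogue of a vector quantile map.
assignment = pi.argmax(axis=1)
print(assignment[:10])
```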

    Kernel methods in machine learning

    We review machine learning methods employing positive definite kernels. These methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of functions defined on the data domain, expanded in terms of a kernel. Working in linear spaces of functions has the benefit of facilitating the construction and analysis of learning algorithms while at the same time allowing large classes of functions. The latter include nonlinear functions as well as functions defined on nonvectorial data. We cover a wide range of methods, ranging from binary classifiers to sophisticated methods for estimation with structured data. Comment: Published at http://dx.doi.org/10.1214/009053607000000677 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
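    As one concrete instance of working in an RKHS expanded in terms of a positive definite kernel, the sketch below implements kernel ridge regression with a Gaussian kernel in plain NumPy. It is an illustrative example rather than code from the review; the function names, bandwidth, regularization level, and toy data are assumptions.

```python
# Minimal sketch of RKHS-based estimation: kernel ridge regression with a Gaussian kernel.
import numpy as np

def gaussian_kernel(X, Z, bandwidth=1.0):
    """Positive definite kernel k(x, z) = exp(-||x - z||^2 / (2 * bandwidth^2))."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def fit_kernel_ridge(X, y, lam=1e-2, bandwidth=1.0):
    """Return dual coefficients alpha of f(x) = sum_i alpha_i k(x_i, x)."""
    K = gaussian_kernel(X, X, bandwidth)
    return np.linalg.solve(K + lam * len(X) * np.eye(len(X)), y)

def predict(X_train, alpha, X_new, bandwidth=1.0):
    """Evaluate the fitted RKHS function at new inputs."""
    return gaussian_kernel(X_new, X_train, bandwidth) @ alpha

# Toy usage: learn a nonlinear function from noisy samples.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=100)
alpha = fit_kernel_ridge(X, y)
print(predict(X, alpha, np.array([[0.0], [1.5]])))
```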

    An Exhaustive Coefficient of Rank Correlation

    Rank association is a fundamental tool for expressing dependence in cases in which data are arranged in order. Measures of rank correlation have been accumulated in several contexts for more than a century, and we were able to cite more than thirty of these coefficients, from simple ones to relatively complicated definitions invoking one or more systems of weights. However, only a few of these can actually be considered to be admissible substitutes for Pearson’s correlation. The main drawback of the vast majority of coefficients is their “resistance to change”, which appears to be of limited value for the purposes of rank comparisons that are intrinsically robust. In this article, a new nonparametric correlation coefficient is defined that is based on the principle of maximization of a ratio of two ranks. In comparing it with existing rank correlations, it was found to have extremely high sensitivity to permutation patterns. We have illustrated the potential improvement that our index can provide in economic contexts by comparing published results with those obtained through the use of this new index. The success that we have had suggests that our index may have important applications wherever the discriminatory power of the rank correlation coefficient should be particularly strong. Keywords: ordinal data, nonparametric agreement, economic applications.
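    The abstract does not state the formula of the proposed coefficient, so the sketch below only reproduces the kind of classical baseline it is compared against: Spearman's rank correlation computed directly from ranks. The function names and toy data are hypothetical, and ties are ignored for brevity.

```python
# Sketch of a classical rank correlation baseline (Spearman's rho); the
# "exhaustive" coefficient proposed in the article is not reproduced here
# because its formula is not given in the abstract.
import numpy as np

def ordinal_ranks(x):
    """Ranks 1..n; assumes no ties for simplicity."""
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(1, len(x) + 1)
    return r

def spearman_rho(x, y):
    """Pearson correlation applied to the rank vectors."""
    rx, ry = ordinal_ranks(x), ordinal_ranks(y)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

# Toy usage on a monotone but nonlinear relationship.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3 + np.array([0.1, -0.2, 0.0, 0.3, -0.1])
print(spearman_rho(x, y))   # equals 1 because the ordering is preserved
```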

    High-dimensional estimation with geometric constraints

    Consider measuring an n-dimensional vector x through the inner product with several measurement vectors, a_1, a_2, ..., a_m. It is common in both signal processing and statistics to assume the linear response model y_i = <a_i, x> + e_i, where e_i is a noise term. However, in practice the precise relationship between the signal x and the observations y_i may not follow the linear model, and in some cases it may not even be known. To address this challenge, in this paper we propose a general model where it is only assumed that each observation y_i may depend on a_i only through <a_i, x>. We do not assume that the dependence is known. This is a form of the semiparametric single index model, and it includes the linear model as well as many forms of the generalized linear model as special cases. We further assume that the signal x has some structure, and we formulate this as a general assumption that x belongs to some known (but arbitrary) feasible set K. We carefully detail the benefit of using the signal structure to improve estimation. The theory is based on the mean width of K, a geometric parameter which can be used to understand its effective dimension in estimation problems. We determine a simple, efficient two-step procedure for estimating the signal based on this model -- a linear estimation followed by metric projection onto K. We give general conditions under which the estimator is minimax optimal up to a constant. This leads to the intriguing conclusion that in the high noise regime, an unknown non-linearity in the observations does not significantly reduce one's ability to determine the signal, even when the non-linearity may be non-invertible. Our results may be specialized to understand the effect of non-linearities in compressed sensing. Comment: This version incorporates minor revisions suggested by the referee.
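    Below is a minimal sketch of the two-step procedure described above, under illustrative assumptions not taken from the paper: Gaussian measurement vectors, an unknown non-linearity y_i = sign(<a_i, x>), and a feasible set K chosen as an l1-ball (so the metric projection is the standard Euclidean projection onto the l1-ball). All variable names and parameter values are hypothetical.

```python
# Sketch of the two-step estimator: (1) linear estimation x_lin = (1/m) * sum_i y_i a_i,
# (2) metric (Euclidean) projection of x_lin onto a feasible set K.
# Toy assumptions: Gaussian a_i, y_i = sign(<a_i, x>), K = l1-ball of radius ||x||_1.
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto {w : ||w||_1 <= radius}."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                       # sorted magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - radius) / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

rng = np.random.default_rng(0)
n, m, s = 500, 200, 5
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.normal(size=s)
x /= np.linalg.norm(x)                                 # unit-norm sparse signal

A = rng.normal(size=(m, n))                            # measurement vectors a_i as rows
y = np.sign(A @ x)                                     # depends on a_i only via <a_i, x>

x_lin = (A * y[:, None]).mean(axis=0)                  # step 1: linear estimation
x_hat = project_l1_ball(x_lin, np.abs(x).sum())        # step 2: projection onto K

# Compare directions: the scale of x is not identifiable under an unknown non-linearity.
cos = x_hat @ x / (np.linalg.norm(x_hat) * np.linalg.norm(x) + 1e-12)
print(f"cosine similarity with the true signal: {cos:.3f}")
```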