
Vector Quantile Regression: An Optimal Transport Approach

Abstract

We propose a notion of conditional vector quantile function and a vector quantile regression. A \emph{conditional vector quantile function} (CVQF) of a random vector $Y$, taking values in $\mathbb{R}^d$, given covariates $Z=z$, taking values in $\mathbb{R}^k$, is a map $u \longmapsto Q_{Y\mid Z}(u,z)$ that is monotone, in the sense of being the gradient of a convex function, and such that, when the vector $U$ follows a reference non-atomic distribution $F_U$, for instance the uniform distribution on the unit cube in $\mathbb{R}^d$, the random vector $Q_{Y\mid Z}(U,z)$ has the distribution of $Y$ conditional on $Z=z$. Moreover, we have a strong representation, $Y = Q_{Y\mid Z}(U,Z)$ almost surely, for some version of $U$. The \emph{vector quantile regression} (VQR) is a linear model for the CVQF of $Y$ given $Z$. Under correct specification, the notion produces the strong representation $Y=\beta(U)^\top f(Z)$, with $f(Z)$ denoting a known set of transformations of $Z$, where $u \longmapsto \beta(u)^\top f(Z)$ is a monotone map, the gradient of a convex function, and the quantile regression coefficients $u \longmapsto \beta(u)$ have interpretations analogous to those of standard scalar quantile regression. As $f(Z)$ becomes a richer class of transformations of $Z$, the model becomes nonparametric, as in series modelling. A key property of VQR is that it embeds the classical Monge-Kantorovich optimal transportation problem at its core as a special case. In the classical case, where $Y$ is scalar, VQR reduces to a version of the classical QR, and the CVQF reduces to the scalar conditional quantile function. An application to multiple Engel curve estimation is considered.
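
The defining properties stated above can be summarized in symbols as follows. This is only a compact restatement of the abstract, not an additional result; the convex potential $\varphi$ and the shorthand $F_{Y\mid Z=z}$ for the conditional distribution of $Y$ given $Z=z$ are labels introduced here for illustration and do not appear in the original text.

\[
\begin{aligned}
&\text{(monotonicity)} && Q_{Y\mid Z}(u,z) = \nabla_u \varphi(u,z), \quad u \longmapsto \varphi(u,z) \text{ convex}, \\
&\text{(distributional match)} && U \sim F_U \;\Longrightarrow\; Q_{Y\mid Z}(U,z) \sim F_{Y\mid Z=z}, \\
&\text{(strong representation)} && Y = Q_{Y\mid Z}(U,Z) \quad \text{almost surely, for some version of } U, \\
&\text{(VQR, correct specification)} && Q_{Y\mid Z}(u,z) = \beta(u)^\top f(z).
\end{aligned}
\]

When $d=1$, a gradient of a convex function is simply a nondecreasing function of $u$, which is consistent with the reduction to the scalar conditional quantile function mentioned in the abstract.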
