Robustness of censored depth quantiles, PCA and kernel based regression with new tools for model selection

Abstract

In statistics, classical methods often rely heavily on assumptions that cannot always be met in practice. For instance, it is often assumed that the data are generated by a specific underlying distribution, and even when the model assumptions are distribution-free, most methods assume that the sample consists of independent and identically distributed observations. However, when outliers are present, such methods can perform very poorly. Robust statistics seeks to provide methods that are not unduly affected by outliers. The goal is to learn the structure of the majority of the data, even if a minority of observations disturbs the pattern. In this work, robustness is studied in two settings: regression and Principal Component Analysis (PCA). Regression analysis models the relationship between a response variable and a set of explanatory variables (also called covariates). Interest lies in the distribution of the response conditional on the values of the explanatory variables. One can concentrate on estimating certain aspects of this conditional distribution, e.g. the conditional mean, which leads to least squares regression. However, in some applications a more detailed description beyond the mean may be useful. Quantile regression aims at estimating all conditional quantiles, thus fully characterizing the conditional distribution. Assuming a linear relationship between the response and the covariates, linear quantile regression can be performed using an asymmetric loss function.
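As a brief illustration of this last point (a standard formulation of linear quantile regression, not taken verbatim from this work), the coefficient vector for the tau-th conditional quantile can be estimated by minimizing an asymmetrically weighted sum of absolute residuals, the so-called check (pinball) loss:

\[
\hat{\beta}(\tau) \;=\; \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - x_i^\top \beta\right),
\qquad
\rho_\tau(u) \;=\; u\left(\tau - \mathbf{1}\{u < 0\}\right), \quad \tau \in (0,1).
\]

Here tau = 0.5 yields median (least absolute deviations) regression, and letting tau range over (0, 1) traces out the entire conditional distribution of the response.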
