18,181 research outputs found

    Making Risk Minimization Tolerant to Label Noise

    Full text link
    In many applications, the training data, from which one needs to learn a classifier, is corrupted with label noise. Many standard algorithms such as SVM perform poorly in presence of label noise. In this paper we investigate the robustness of risk minimization to label noise. We prove a sufficient condition on a loss function for the risk minimization under that loss to be tolerant to uniform label noise. We show that the 010-1 loss, sigmoid loss, ramp loss and probit loss satisfy this condition though none of the standard convex loss functions satisfy it. We also prove that, by choosing a sufficiently large value of a parameter in the loss function, the sigmoid loss, ramp loss and probit loss can be made tolerant to non-uniform label noise also if we can assume the classes to be separable under noise-free data distribution. Through extensive empirical studies, we show that risk minimization under the 010-1 loss, the sigmoid loss and the ramp loss has much better robustness to label noise when compared to the SVM algorithm

    Fast calibrated additive quantile regression

    Full text link
    We propose a novel framework for fitting additive quantile regression models, which provides well calibrated inference about the conditional quantiles and fast automatic estimation of the smoothing parameters, for model structures as diverse as those usable with distributional GAMs, while maintaining equivalent numerical efficiency and stability. The proposed methods are at once statistically rigorous and computationally efficient, because they are based on the general belief updating framework of Bissiri et al. (2016) to loss based inference, but compute by adapting the stable fitting methods of Wood et al. (2016). We show how the pinball loss is statistically suboptimal relative to a novel smooth generalisation, which also gives access to fast estimation methods. Further, we provide a novel calibration method for efficiently selecting the 'learning rate' balancing the loss with the smoothing priors during inference, thereby obtaining reliable quantile uncertainty estimates. Our work was motivated by a probabilistic electricity load forecasting application, used here to demonstrate the proposed approach. The methods described here are implemented by the qgam R package, available on the Comprehensive R Archive Network (CRAN)

    Conditional Transformation Models

    Full text link
    The ultimate goal of regression analysis is to obtain information about the conditional distribution of a response given a set of explanatory variables. This goal is, however, seldom achieved because most established regression models only estimate the conditional mean as a function of the explanatory variables and assume that higher moments are not affected by the regressors. The underlying reason for such a restriction is the assumption of additivity of signal and noise. We propose to relax this common assumption in the framework of transformation models. The novel class of semiparametric regression models proposed herein allows transformation functions to depend on explanatory variables. These transformation functions are estimated by regularised optimisation of scoring rules for probabilistic forecasts, e.g. the continuous ranked probability score. The corresponding estimated conditional distribution functions are consistent. Conditional transformation models are potentially useful for describing possible heteroscedasticity, comparing spatially varying distributions, identifying extreme events, deriving prediction intervals and selecting variables beyond mean regression effects. An empirical investigation based on a heteroscedastic varying coefficient simulation model demonstrates that semiparametric estimation of conditional distribution functions can be more beneficial than kernel-based non-parametric approaches or parametric generalised additive models for location, scale and shape

    Selective machine learning of doubly robust functionals

    Full text link
    While model selection is a well-studied topic in parametric and nonparametric regression or density estimation, selection of possibly high-dimensional nuisance parameters in semiparametric problems is far less developed. In this paper, we propose a selective machine learning framework for making inferences about a finite-dimensional functional defined on a semiparametric model, when the latter admits a doubly robust estimating function and several candidate machine learning algorithms are available for estimating the nuisance parameters. We introduce two new selection criteria for bias reduction in estimating the functional of interest, each based on a novel definition of pseudo-risk for the functional that embodies the double robustness property and thus is used to select the pair of learners that is nearest to fulfilling this property. We establish an oracle property for a multi-fold cross-validation version of the new selection criteria which states that our empirical criteria perform nearly as well as an oracle with a priori knowledge of the pseudo-risk for each pair of candidate learners. We also describe a smooth approximation to the selection criteria which allows for valid post-selection inference. Finally, we apply the approach to model selection of a semiparametric estimator of average treatment effect given an ensemble of candidate machine learners to account for confounding in an observational study

    Choosing the Right Spatial Weighting Matrix in a Quantile Regression Model

    Get PDF
    This paper proposes computationally tractable methods for selecting the appropriate spatial weighting matrix in the context of a spatial quantile regression model. This selection is a notoriously difficult problem even in linear spatial models and is even more difficult in a quantile regression setup. The proposal is illustrated by an empirical example and manages to produce tractable models. One important feature of the proposed methodology is that by allowing different degrees and forms of spatial dependence across quantiles it further relaxes the usual quantile restriction attributable to the linear quantile regression. In this way we can obtain a more robust, with regard to potential functional misspecification, model, but nevertheless preserve the parametric rate of convergence and the established inferential apparatus associated with the linear quantile regression approach
    corecore