
    Kernel conditional quantile estimation via reduction revisited

    Quantile regression refers to the process of estimating the quantiles of a conditional distribution and has many important applications within econometrics and data mining, among other domains. In this paper, we show how to estimate these conditional quantile functions within a Bayes risk minimization framework using a Gaussian process prior. The resulting non-parametric probabilistic model is easy to implement and allows non-crossing quantile functions to be enforced. Moreover, it can directly be used in combination with tools and extensions of standard Gaussian Processes such as principled hyperparameter estimation, sparsification, and quantile regression with input-dependent noise rates. No existing approach enjoys all of these desirable properties. Experiments on benchmark datasets show that our method is competitive with state-of-the-art approaches.
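    The tilted (pinball) loss that underlies quantile regression, and the quantile-crossing problem that this paper's model is designed to avoid, can be sketched as follows. The `pinball_loss` helper and the post-hoc sorting remedy are illustrative assumptions for this listing, not the paper's Gaussian-process formulation, which enforces non-crossing inside the model itself:

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Tilted (pinball) loss: the empirical risk minimised in quantile regression."""
    diff = y - pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

# Quantile curves estimated independently per level can cross; a simple
# post-hoc fix is to sort the predicted quantiles at each input point
# (the "rearrangement" remedy -- a generic alternative to enforcing
# non-crossing within the model, as the paper does).
taus = np.array([0.1, 0.5, 0.9])
preds = np.array([[2.3, 1.9, 2.8],   # crossing: the 0.5 curve dips below the 0.1 curve
                  [1.0, 1.5, 2.0]])  # already ordered
noncrossing = np.sort(preds, axis=1)
```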

    The absolute health income hypothesis revisited: A Semiparametric Quantile Regression Approach.

    This paper uses the 1998-99 Canadian National Population Health Survey (NPHS) data to examine the health-income relationship that underlies the absolute income hypothesis. To allow for nonlinearity and data heterogeneity, we use a partially linear semiparametric quantile regression model. The "absolute income hypothesis" is found to be partially true; the negative aging effects appear more pronounced for the unhealthy population than for the healthy population, and when annual income is below 40,000 Canadian dollars.

    Interpretable statistics for complex modelling: quantile and topological learning

    As the complexity of our data has increased exponentially in recent decades, so has our need for interpretable features. This thesis revolves around two paradigms for approaching this quest for insight. In the first part we focus on parametric models, where the problem of interpretability can be seen as one of "parametrization selection". We introduce a quantile-centric parametrization and show the advantages of our proposal in the context of regression, where it bridges the gap between classical generalized linear (mixed) models and increasingly popular quantile methods. The second part of the thesis, concerned with topological learning, tackles the problem from a non-parametric perspective. Since topology can be thought of as a way of characterizing data in terms of their connectivity structure, it makes it possible to represent complex, possibly high-dimensional data through a few features, such as the number of connected components, loops and voids. We illustrate how the emerging branch of statistics devoted to recovering topological structures in data, Topological Data Analysis, can be exploited for both exploratory and inferential purposes, with special emphasis on kernels that preserve the topological information in the data. Finally, an application shows how these two approaches can borrow strength from one another in the identification and description of brain activity through fMRI data from the ABIDE project.
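    As a toy illustration of the connectivity structure the thesis refers to, the simplest topological feature of a point cloud, its number of connected components at a fixed scale, can be counted with a small union-find. This sketch is an assumption for illustration only and is not the thesis's kernel-based TDA machinery:

```python
import numpy as np

def connected_components(points, eps):
    """Count connected components of the eps-neighbourhood graph:
    the simplest 0-dimensional topological summary of a point cloud."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Path-halving union-find lookup.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of points closer than eps.
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(points[i] - points[j]) <= eps:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(n)})

# Two well-separated clusters: two components at a small scale,
# one component once eps exceeds the gap between them.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
```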

    Gaussian process quantile regression using expectation propagation

    Direct quantile regression involves estimating a given quantile of a response variable as a function of input variables. We present a new framework for direct quantile regression in which a Gaussian process model is learned by minimising the expected tilted loss function. The integration required in learning is not analytically tractable, so to speed up learning we employ the Expectation Propagation algorithm. We describe how this work relates to other quantile regression methods and apply the method to both synthetic and real data sets. The method is shown to be competitive with state-of-the-art methods while allowing the full Gaussian process probabilistic framework to be leveraged.
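    The key property of the tilted loss mentioned above is that its minimiser over constant predictions is the corresponding empirical quantile. This can be checked numerically with a simple grid search; the sketch below is an illustrative assumption, not the paper's Gaussian-process or Expectation Propagation machinery:

```python
import numpy as np

def tilted_loss(theta, y, tau):
    """Mean tilted (pinball) loss of a constant prediction theta."""
    d = y - theta
    return np.mean(np.maximum(tau * d, (tau - 1) * d))

rng = np.random.default_rng(0)
y = rng.normal(size=1000)
tau = 0.9

# Minimising the tilted loss over a fine grid recovers (approximately)
# the empirical tau-quantile of the sample.
grid = np.linspace(y.min(), y.max(), 2000)
theta_hat = grid[np.argmin([tilted_loss(t, y, tau) for t in grid])]
```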

    MOMENT KERNELS FOR T-CENTRAL SUBSPACE

    The T-central subspace allows one to perform sufficient dimension reduction for any statistical functional of interest. We propose a general estimator using a third-moment kernel to estimate the T-central subspace. In particular, in this dissertation we develop sufficient dimension reduction methods for the central mean subspace via the regression mean function, the central subspace via the Fourier transform, the central quantile subspace via a quantile estimator, and the central expectile subspace via an expectile estimator. Theoretical results are established, and simulation studies show the advantages of our proposed methods.
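    The expectile estimator mentioned for the central expectile subspace can be sketched as the minimiser of asymmetrically weighted squared error, computed here by iteratively reweighted averaging. This is a generic illustration of what an expectile is, under assumed notation, not the dissertation's estimator:

```python
import numpy as np

def expectile(y, tau, n_iter=100):
    """tau-expectile: minimiser of asymmetrically weighted squared error,
    found by iteratively reweighted averaging (fixed-point iteration)."""
    m = np.mean(y)
    for _ in range(n_iter):
        # Observations above the current estimate get weight tau,
        # those below get weight 1 - tau.
        w = np.where(y > m, tau, 1 - tau)
        m = np.sum(w * y) / np.sum(w)
    return m

y = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
```

For tau = 0.5 the weights are symmetric and the expectile reduces to the ordinary mean, just as the 0.5-quantile reduces to the median.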

    Deriving probabilistic short-range forecasts from a deterministic high-resolution model

    In order to take full advantage of short-range forecasts from deterministic high-resolution NWP models, the direct model output must be addressed in a probabilistic framework. A promising approach is mesoscale ensemble prediction. However, its operational use is still hampered by conceptual deficiencies and large computational costs. This study tackles two relevant issues: (1) the representation of model-related forecast uncertainty in mesoscale ensemble prediction systems and (2) the development of post-processing procedures that retrieve additional probabilistic information from a single model simulation. Special emphasis is laid on mesoscale forecast uncertainty of summer precipitation and 2m-temperature in Europe. Source of forecast guidance is the deterministic high-resolution model Lokal-Modell (LM) of the German Weather Service. This study gains more insight into the effect and usefulness of stochastic parametrisation schemes in the representation of short-range forecast uncertainty. A stochastic parametrisation scheme is implemented into the LM in an attempt to simulate the stochastic effect of sub-grid scale processes. Experimental ensembles show that the scheme has a substantial effect on the forecast of precipitation amount. However, objective verification reveals that the ensemble does not attain better forecast goodness than a single LM simulation. Urgent issues for future research are identified. In the context of statistical post-processing, two schemes are designed: the neighbourhood method and wavelet smoothing. Both approaches fall under the framework of estimating a large array of statistical parameters on the basis of a single realisation on each parameter. The neighbourhood method is based on the notion of spatio-temporal ergodicity including explicit corrections for enhanced predictability from topographic forcing. 
The neighbourhood method derives estimates of quantiles, exceedance probabilities and expected values at each grid point of the LM. If the post-processed precipitation forecast is formulated in terms of probabilities or quantiles, it attains clear superiority over the raw model output. Wavelet smoothing originates from the field of image denoising and includes concepts of multiresolution analysis and non-parametric regression. In this study, the method is used to produce estimates of the expected value, but it may easily be extended to the additional estimation of exceedance probabilities. Wavelet smoothing is not only computationally more efficient than the neighbourhood method, but also automatically adapts the amount of spatial smoothing to local properties of the underlying data. The method apparently detects deterministically predictable temperature patterns on the basis of statistical guidance only.
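    The core idea of the neighbourhood method, pooling values around a grid point as pseudo-ensemble members to derive quantiles and exceedance probabilities, can be sketched on a synthetic field. The `neighbourhood_stats` helper, the window shape, and the gamma-distributed precipitation field are all assumptions for illustration; the study's actual scheme is spatio-temporal and includes corrections for topographic forcing:

```python
import numpy as np

def neighbourhood_stats(field, i, j, radius, threshold, q):
    """Pool values from the spatial neighbourhood of grid point (i, j),
    treating them as pseudo-ensemble members, and derive a quantile
    and an exceedance probability."""
    window = field[max(i - radius, 0):i + radius + 1,
                   max(j - radius, 0):j + radius + 1].ravel()
    return np.quantile(window, q), np.mean(window > threshold)

rng = np.random.default_rng(1)
# Synthetic stand-in for a single deterministic precipitation forecast (mm).
precip = rng.gamma(shape=0.5, scale=2.0, size=(50, 50))
q90, p_exc = neighbourhood_stats(precip, 25, 25, radius=3, threshold=5.0, q=0.9)
```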