143 research outputs found

    Modelling beyond Regression Functions: an Application of Multimodal Regression to Speed-Flow Data

    A very large number of publications deal with smoothing in the sense of nonparametric regression. However, nearly all of the literature treats the case where predictors and response are related through a function y=m(x)+noise. In many situations this simple functional model does not adequately capture the essential relation between predictor and response. We show by means of speed-flow diagrams that a more general setting may be required, allowing for multifunctions instead of only functions. It turns out that in this case the conditional modes are more appropriate for estimating the underlying relation than the commonly used mean or median. Estimation is achieved using a conditional mean-shift procedure, which is adapted to the present situation.

    On weighted local fitting and its relation to the Horvitz-Thompson estimator

    Weighting is a widely used concept in many fields of statistics, and its justification and benefit have frequently been the subject of controversy. In this paper, we analyze a weighted version of the well-known local polynomial regression estimators, derive their asymptotic bias and variance, and find that the conflict between the asymptotically optimal weighting scheme and the practical requirements has a surprising counterpart in sampling theory, leading us back to the discussion on Basu's (1971) elephants.

    Local Fitting with General Basis Functions

    Local polynomial modelling can be seen as a local fit of the data against the basis functions 1, x, ... , x^p. In this paper we extend this method to a wide range of other basis functions. We will focus on the power basis, i.e. a basis which consists of the powers of an arbitrary function, and derive an extended Taylor theorem for this basis. We describe the estimation procedure and calculate asymptotic expressions for bias and variance of this local basis estimator. We apply this method to a simulated data set for various basis functions and propose a data-driven method to find a suitable basis function in each situation
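
The local fit described in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel, a bandwidth `h`, and a basis built from centred powers `(phi(x) - phi(x0))^j`; the function name `local_basis_fit` and all parameter names are hypothetical and are not taken from the paper. With `phi` as the identity, the sketch reduces to ordinary local polynomial regression against 1, x, ..., x^p.

```python
import numpy as np

def local_basis_fit(x, y, x0, h, p=1, phi=lambda t: t):
    """Estimate m(x0) by a kernel-weighted least-squares fit against
    the powers 0..p of phi(x) - phi(x0) (illustrative sketch only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Gaussian kernel weights centred at the target point x0 (assumed kernel).
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    # Design matrix with columns d^0, d^1, ..., d^p, where d = phi(x) - phi(x0).
    d = phi(x) - phi(x0)
    X = np.vander(d, N=p + 1, increasing=True)
    # Weighted least squares via row scaling by sqrt(w).
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    # The intercept is the fitted value at x0, because d = 0 there.
    return float(beta[0])
```

For example, `local_basis_fit(x, y, 0.3, 0.2, p=1, phi=np.exp)` fits the data locally against 1 and exp(x) - exp(0.3).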

    Localized Regression on Principal Manifolds

    No abstract available

    Weighted Repeated Median Smoothing and Filtering

    We propose weighted repeated median filters and smoothers for robust non-parametric regression in general and for robust signal extraction from time series in particular. The proposed methods allow outlying sequences to be removed and discontinuities (shifts) in the underlying regression function (the signal) to be preserved in the presence of local linear trends. Suitable weighting of the observations according to their distances in the design space reduces the bias arising from non-linearities. It also improves the efficiency of (unweighted) repeated median filters when larger bandwidths are used, while keeping their ability to distinguish between outlier sequences and long-term shifts. Robust smoothers based on weighted L1-regression are included for comparison.
    Keywords: Signal extraction; Robust regression; Outliers; Breakdown point
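
As a point of reference for this abstract, the following is a minimal sketch of the unweighted repeated median filter, which is the baseline the paper sets out to improve, not the weighted method it proposes. The slope is the median over i of the median over j != i of the pairwise slopes, and the level is the median of the detrended values. The function names and the window half-width `k` are hypothetical.

```python
import numpy as np

def repeated_median_fit(t, y):
    """Unweighted repeated median line through (t, y).
    Returns (level at t = 0, slope)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    inner = np.empty(n)
    for i in range(n):
        others = np.arange(n) != i
        # Median of the slopes between point i and every other point.
        inner[i] = np.median((y[i] - y[others]) / (t[i] - t[others]))
    slope = float(np.median(inner))
    level = float(np.median(y - slope * t))
    return level, slope

def repeated_median_filter(y, k):
    """Moving-window filter of half-width k; the output at each window
    centre is the fitted level. The first and last k values stay NaN."""
    y = np.asarray(y, dtype=float)
    out = np.full(len(y), np.nan)
    t = np.arange(-k, k + 1, dtype=float)  # time relative to window centre
    for c in range(k, len(y) - k):
        out[c], _ = repeated_median_fit(t, y[c - k:c + k + 1])
    return out
```

A weighted variant would replace the medians by weighted medians, with weights that decrease with distance from the window centre; the exact weighting scheme is specified in the paper and is not reproduced here.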

    A number-of-modes reference rule for density estimation under multimodality

    We consider kernel density estimation for univariate distributions. The question of interest is as follows: given that the data analyst has some background knowledge on the modality of the data (for instance, ‘data of this type are usually bimodal’), what is the adequate bandwidth to choose? We answer this question by extending Silverman's idea of ‘normal-reference’ to that of ‘reference to a Gaussian mixture’. The concept is illustrated in the light of real data examples
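
For orientation, here is a minimal sketch of the classical normal-reference bandwidth that this abstract takes as its starting point. It shows only Silverman's rule of thumb for a Gaussian kernel, h = 0.9 * min(sd, IQR / 1.34) * n^(-1/5); the Gaussian-mixture reference proposed in the paper is not implemented. The function name is hypothetical.

```python
import numpy as np

def normal_reference_bandwidth(data):
    """Silverman's rule-of-thumb bandwidth for a Gaussian kernel.
    This is the baseline rule only, not the mixture-reference rule."""
    data = np.asarray(data, dtype=float)
    n = data.size
    sd = data.std(ddof=1)                    # sample standard deviation
    q75, q25 = np.percentile(data, [75, 25])
    spread = min(sd, (q75 - q25) / 1.34)     # robust scale estimate
    return 0.9 * spread * n ** (-0.2)
```

Because this rule is calibrated to a unimodal Gaussian, it tends to oversmooth clearly multimodal data, which is the situation the abstract addresses.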

    League tables for literacy survey data based on random effect models

    Data from the International Adult Literacy Survey are used to illustrate how league tables can be obtained from summary data, consisting of percentages and their standard errors, using random effects models estimated by nonparametric maximum likelihood

    Generative linear mixture modelling

    For multivariate data with a low-dimensional latent structure, a novel approach to linear dimension reduction based on Gaussian mixture models is proposed. A generative model is assumed for the data, where the mixture centres (or ‘mass points’) are positioned along lines or planes spanned through the data cloud. All involved parameters are estimated simultaneously through the EM algorithm, requiring an additional iteration within each M-step. Data points can be projected onto the low-dimensional space by taking the posterior mean over the estimated mass points. The compressed data can then be used for further processing, for instance as a low-dimensional predictor in a multivariate regression problem.

    Challenging the curse of dimensionality in multivariate local linear regression

    Local polynomial fitting for univariate data has been widely studied and discussed, but up until now the multivariate equivalent has often been deemed impractical, due to the so-called curse of dimensionality. Here, rather than discounting it completely, we use density as a threshold to determine where over a data range reliable multivariate smoothing is possible, whilst accepting that in large areas it is not. The merits of a density threshold derived from the asymptotic influence function are shown using both real and simulated data sets. Further, the challenging issue of multivariate bandwidth selection, which is known to be affected detrimentally by sparse data which inevitably arise in higher dimensions, is considered. In an effort to alleviate this problem, two adaptations to generalized cross-validation are implemented, and a simulation study is presented to support the proposed method. It is also discussed how the density threshold and the adapted generalized cross-validation technique introduced herein work neatly together

    The fitting of multifunctions: an approach to nonparametric multimodal regression

    In recent decades a lot of research has been devoted to smoothing in the sense of nonparametric regression. However, this work has concentrated almost exclusively on fitting regression functions. When the conditional distribution of y|x is multimodal, the assumption of a functional relationship y = m(x) + noise might be too restrictive. We introduce a nonparametric approach to fitting multifunctions, which allows a set of output values to be assigned to a given x. The concept is based on conditional mean shift, which is an easily implemented tool for detecting the local maxima of a conditional density function. The methodology is illustrated by environmental data examples.
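
The conditional mean shift idea mentioned in this abstract can be sketched as follows. This is a minimal illustration assuming Gaussian kernels in both directions with bandwidths `hx` and `hy`; the function names, the grid of starting values, and the merging tolerance are assumptions made for the example and are not taken from the paper. Starting from a value of y, the update moves to the kernel-weighted mean of the responses, which climbs towards a local maximum of the estimated conditional density at x0.

```python
import numpy as np

def conditional_mean_shift(x, y, x0, y_start, hx, hy, tol=1e-8, max_iter=500):
    """Iterate y <- sum(w_i * y_i) / sum(w_i) with
    w_i = K((x_i - x0) / hx) * K((y_i - y) / hy), K Gaussian."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    wx = np.exp(-0.5 * ((x - x0) / hx) ** 2)  # fixed horizontal weights
    y_cur = float(y_start)
    for _ in range(max_iter):
        w = wx * np.exp(-0.5 * ((y - y_cur) / hy) ** 2)
        total = w.sum()
        if total == 0.0:  # no data within kernel reach; stop here
            break
        y_new = float(np.dot(w, y) / total)
        if abs(y_new - y_cur) < tol:
            return y_new
        y_cur = y_new
    return y_cur

def conditional_modes(x, y, x0, hx, hy, n_starts=20, merge_tol=1e-3):
    """Run the mean shift from a grid of starting values and merge the
    end points that coincide; the result is the set of fitted values at x0."""
    starts = np.linspace(np.min(y), np.max(y), n_starts)
    ends = sorted(conditional_mean_shift(x, y, x0, s, hx, hy) for s in starts)
    modes = []
    for e in ends:
        if not modes or e - modes[-1] > merge_tol:
            modes.append(e)
    return modes
```

Evaluating `conditional_modes` over a grid of x0 values traces out the branches of the fitted multifunction; the number of branches found depends on the bandwidths.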