
    Truthful Linear Regression

    We consider the problem of fitting a linear model to data held by individuals who are concerned about their privacy. Incentivizing most players to truthfully report their data to the analyst constrains our design to mechanisms that provide a privacy guarantee to the participants; we use differential privacy to model individuals' privacy losses. This immediately poses a problem, as differentially private computation of a linear model necessarily produces a biased estimate, and existing approaches to designing mechanisms that elicit data from privacy-sensitive individuals do not generalize well to biased estimators. We overcome this challenge through an appropriate design of the computation and payment scheme.

    Comment: To appear in Proceedings of the 28th Annual Conference on Learning Theory (COLT 2015).
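
    The bias the abstract refers to can be seen in a standard differential-privacy baseline (not necessarily the paper's mechanism): perturb the sufficient statistics X'X and X'y with zero-mean noise, then solve the noisy normal equations. Because matrix inversion is nonlinear in the noise, the resulting estimator is biased even though the noise itself is unbiased. The data, sensitivity bound, and budget split below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + noise, with features in [0, 1] so each record's
# contribution to the sufficient statistics is bounded.
n = 500
X = rng.uniform(0, 1, size=(n, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 0.1, size=n)

def dp_linear_regression(X, y, epsilon):
    """Sufficient-statistics perturbation (a common DP baseline, assumed
    here for illustration): add Laplace noise to X^T X and X^T y, then
    solve the noisy normal equations."""
    d = X.shape[1]
    sensitivity = 1.0                  # bound on one record's contribution
    scale = 2 * sensitivity / epsilon  # budget split across both statistics
    XtX = X.T @ X + rng.laplace(0, scale, size=(d, d))
    Xty = X.T @ y + rng.laplace(0, scale, size=d)
    return np.linalg.solve(XtX, Xty)

# Repeating the mechanism shows the spread (and bias) of the private estimates:
# inverting a noisy matrix does not average out to the noiseless solution.
estimates = np.array([dp_linear_regression(X, y, epsilon=1.0) for _ in range(200)])
```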

    Bayesian Linear Regression

    The paper is concerned with Bayesian analysis under prior-data conflict, i.e. the situation when observed data are rather unexpected under the prior (and the sample size is not large enough to eliminate the influence of the prior). Two approaches to Bayesian linear regression modeling based on conjugate priors are considered in detail, namely the standard approach also described in Fahrmeir, Kneib & Lang (2007) and an alternative adoption of the general construction procedure for exponential family sampling models. We recognize that, in contrast to some standard i.i.d. models such as the scaled normal model and the Beta-Binomial / Dirichlet-Multinomial model, where prior-data conflict is completely ignored, these models may show some reaction to prior-data conflict, although in a rather unspecific way. Finally, we briefly sketch the extension to a corresponding imprecise probability model, where, by considering sets of prior distributions instead of a single prior, prior-data conflict can be handled in a very appealing and intuitive way.
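
    For reference, the conjugate setup the abstract builds on can be sketched as follows (a simplified version with known noise variance; the data and prior values are our own illustration). The posterior mean is a precision-weighted compromise between prior and data, which is why a conflict between them leaves little specific trace in the posterior:

```python
import numpy as np

rng = np.random.default_rng(1)

def posterior(X, y, m0, S0, sigma2):
    """Posterior N(m_n, S_n) for the weights of a linear model under a
    conjugate N(m0, S0) prior with known noise variance sigma2."""
    S0_inv = np.linalg.inv(S0)
    Sn = np.linalg.inv(S0_inv + (X.T @ X) / sigma2)
    mn = Sn @ (S0_inv @ m0 + (X.T @ y) / sigma2)
    return mn, Sn

# Simulated data with true weights (1, 3) and noise sd 0.5.
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = X @ np.array([1.0, 3.0]) + rng.normal(0, 0.5, size=50)

# A prior centred far from the truth (prior-data conflict): the posterior is
# still just a precision-weighted average, and the posterior covariance does
# not register that the prior and the data disagreed.
mn, Sn = posterior(X, y, m0=np.array([0.0, -3.0]), S0=np.eye(2), sigma2=0.25)
```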

    Current status linear regression

    We construct $\sqrt{n}$-consistent and asymptotically normal estimates for the finite-dimensional regression parameter in the current status linear regression model, which do not require any smoothing device and are based on maximum likelihood estimates (MLEs) of the infinite-dimensional parameter. We also construct estimates, again based only on these MLEs, which are arbitrarily close to efficient estimates if the generalized Fisher information is finite. This type of efficiency is also derived under minimal conditions for estimates based on smooth non-monotone plug-in estimates of the distribution function. Algorithms for computing the estimates and for selecting the bandwidth of the smooth estimates with a bootstrap method are provided. The connection with results in the econometric literature is also pointed out.

    Comment: 64 pages, 6 figures.
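
    The nonparametric MLE at the heart of this model can be illustrated in a few lines: for current status data, the MLE of the distribution function at a fixed regression parameter is the isotonic regression of the status indicators, computable by pool-adjacent-violators (a classical fact; the data-generating choices below are ours, not the paper's).

```python
import numpy as np

rng = np.random.default_rng(4)

def pava(y):
    """Pool-adjacent-violators: least-squares nondecreasing fit to y."""
    means, sizes = [], []
    for v in y:
        means.append(float(v))
        sizes.append(1)
        while len(means) > 1 and means[-2] > means[-1]:
            m, s = means.pop(), sizes.pop()
            means[-1] = (means[-1] * sizes[-1] + m * s) / (sizes[-1] + s)
            sizes[-1] += s
    return np.repeat(means, sizes)

# Current status data: the event time T is never observed directly; we only
# see the inspection time C and whether T had already occurred by then.
n = 1000
beta = 1.5                                  # illustrative true parameter
X = rng.uniform(0, 1, n)
T = beta * X + rng.normal(0, 0.5, n)
C = rng.uniform(-1, 3, n)
delta = (T <= C).astype(float)

def npmle_F(b):
    """MLE of the error distribution at candidate parameter b: isotonic
    regression of the indicators on u = C - b*X."""
    u = C - b * X
    order = np.argsort(u)
    return u[order], pava(delta[order])

u_sorted, F_hat = npmle_F(beta)
```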

    Sublinear expectation linear regression

    Nonlinear expectation, which includes sublinear expectation as a special case, is a new and original framework of probability theory with potential applications in several scientific fields, especially in financial risk measurement and management. Under the nonlinear expectation framework, however, the related statistical models and statistical inferences have not yet been well established. The goal of this paper is to construct the sublinear expectation regression and investigate its statistical inference. First, a sublinear expectation linear regression is defined and its identifiability is given. Then, based on the representation theorem of sublinear expectation and the newly defined model, several parameter estimators and model predictions are proposed; the asymptotic normality of the estimators and the minimax property of the predictions are established. Furthermore, new methods are developed to realize variable selection for high-dimensional models. Finally, simulation studies and a real-life example are carried out to illustrate the new models and methodologies. All notions and methodologies developed are essentially different from classical ones and can be thought of as a foundation for general nonlinear expectation statistics.
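
    As a toy illustration of the representation theorem the abstract invokes (our own example, not the paper's construction): a sublinear expectation can be written as the upper envelope of ordinary linear expectations over a family of candidate models.

```python
import numpy as np

rng = np.random.default_rng(5)

def sublinear_expectation(f, samples_by_model):
    """Representation as an upper expectation: E[f] is the maximum of the
    ordinary expectation of f over a family of candidate models."""
    return max(np.mean(f(s)) for s in samples_by_model)

# A family of models: normal noise with the same shape but mean uncertainty,
# the mean ranging over {-1, 0, +1} (illustrative choice).
models = [rng.normal(mu, 1.0, size=100_000) for mu in (-1.0, 0.0, 1.0)]

up = sublinear_expectation(lambda x: x, models)     # upper mean, about +1
low = -sublinear_expectation(lambda x: -x, models)  # lower mean, about -1
```

    The gap between the lower and upper mean is exactly the model uncertainty the sublinear framework is designed to carry through an analysis.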

    Linear Regression Diagnostics

    This paper attempts to provide the user of linear multiple regression with a battery of diagnostic tools to determine which, if any, data points have high leverage or influence on the estimation process and how these possibly discrepant data points differ from the patterns set by the majority of the data. The point of view taken is that when diagnostics indicate the presence of anomalous data, the choice is open as to whether these data are in fact unusual and helpful, or possibly harmful and thus in need of modification or deletion. The methodology developed depends on differences, derivatives, and decompositions of basic regression statistics. There is also a discussion of how these techniques can be used with robust and ridge estimators. An example is given showing the use of diagnostic methods in the estimation of a cross-country savings rate model.
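
    Two of the basic single-point diagnostics such a battery builds on, leverage (the hat-matrix diagonal) and Cook's distance, can be sketched as follows; the planted discrepant point is our own illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def diagnostics(X, y):
    """Leverage and Cook's distance for each observation of an OLS fit."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T          # hat matrix
    h = np.diag(H)                                 # leverage of each point
    resid = y - H @ y                              # OLS residuals
    s2 = resid @ resid / (n - p)                   # residual variance
    cooks = resid**2 / (p * s2) * h / (1 - h)**2   # Cook's distance
    return h, cooks

# Well-behaved data, then one planted high-leverage, high-influence point.
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = X @ np.array([0.0, 1.0]) + rng.normal(0, 0.2, size=30)
X[0, 1], y[0] = 8.0, -8.0

h, cooks = diagnostics(X, y)
# The planted point dominates both measures, flagging it for inspection.
```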

    Functional linear regression that's interpretable

    Regression models relating a scalar $Y$ to a functional predictor $X(t)$ are becoming increasingly common. Work in this area has concentrated on estimating a coefficient function, $\beta(t)$, with $Y$ related to $X(t)$ through $\int\beta(t)X(t)\,dt$. Regions where $\beta(t)\ne0$ correspond to places where there is a relationship between $X(t)$ and $Y$; points where $\beta(t)=0$ indicate no relationship. Hence, for interpretation purposes, it is desirable for a regression procedure to be capable of producing estimates of $\beta(t)$ that are exactly zero over regions with no apparent relationship and have simple structure over the remaining regions. Unfortunately, most fitting procedures result in an estimate of $\beta(t)$ that is rarely exactly zero and has unnatural wiggles that make the curve hard to interpret. In this article we introduce a new approach which uses variable selection ideas, applied to various derivatives of $\beta(t)$, to produce estimates that are interpretable, flexible, and accurate. We call our method "Functional Linear Regression That's Interpretable" (FLiRTI) and demonstrate it on simulated and real-world data sets. In addition, non-asymptotic theoretical bounds on the estimation error are presented. The bounds provide strong theoretical motivation for our approach.

    Comment: Published at http://dx.doi.org/10.1214/08-AOS641 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
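
    A rough sketch of the idea (our own discretization and solver choices, not the authors' code): discretize $\beta(t)$ on a grid, reparametrize through a difference operator $A$ so that sparsity in $\gamma = A\beta$ corresponds to $\beta$ having simple structure over regions, and fit $\gamma$ with an $\ell_1$ penalty solved by proximal gradient descent.

```python
import numpy as np

rng = np.random.default_rng(3)

p = 40
t = np.linspace(0, 1, p)
# True coefficient function: exactly zero outside (0.3, 0.7).
beta_true = np.where((t > 0.3) & (t < 0.7),
                     np.sin(np.pi * (t - 0.3) / 0.4), 0.0)

# Functional regression data: Y_i is a Riemann sum of beta(t) X_i(t) + noise.
n = 200
Xfun = rng.normal(size=(n, p))                      # curves X_i(t) on the grid
y = Xfun @ beta_true / p + rng.normal(0, 0.01, n)

# A: first row keeps beta(t_0); remaining rows are first differences.
# A is invertible, so y ~ Z @ gamma with gamma = A @ beta, and sparsity in
# gamma means beta is piecewise constant (zero where the level is zero).
A = np.eye(p) - np.eye(p, k=-1)
Z = (Xfun / p) @ np.linalg.inv(A)

def ista_lasso(Z, y, lam, iters=3000):
    """Proximal gradient (ISTA) for 0.5*||y - Z g||^2 + lam*||g||_1."""
    g = np.zeros(Z.shape[1])
    step = 1.0 / np.linalg.eigvalsh(Z.T @ Z).max()   # safe step size
    for _ in range(iters):
        g = g - step * Z.T @ (Z @ g - y)             # gradient step
        g = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # shrink
    return g

gamma_hat = ista_lasso(Z, y, lam=0.02)
beta_hat = np.linalg.solve(A, gamma_hat)             # recover beta(t) on grid
```

    The penalty level `lam` is an illustrative choice; the paper instead derives its operators from higher-order derivatives of $\beta(t)$ and provides theory for the tuning.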