Improving the estimation of the odds ratio in sampling surveys using auxiliary information
The odds-ratio measure is widely used in health and social surveys where the aim is to compare the odds of a certain event between a population at risk and a population not at risk. It can be defined using logistic regression through an estimating equation that allows a generalization to a continuous risk variable. Survey data need to be analyzed properly by taking the survey weights into account. Because the odds ratio is a complex parameter, the analyst has to circumvent some difficulties when estimating confidence intervals. The present paper suggests a nonparametric approach that can take advantage of auxiliary information in order to improve the precision of the odds-ratio estimator. The approach consists of B-spline modelling, which can handle the nonlinear structure of the parameter in a flexible way and is easy to implement. The variance estimation issue is solved through a linearization approach, and confidence intervals are derived. Two small applications are discussed.
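As a rough illustration of why the survey weights matter, the sketch below computes a design-weighted odds ratio for the simplest case of a binary outcome and a binary risk variable. In that case the weighted logistic-regression estimating equation reduces to the weighted cross-product ratio of the 2x2 table. This is not the B-spline estimator proposed in the paper; the function name `weighted_odds_ratio` and the toy data are assumptions made for the example.

```python
def weighted_odds_ratio(y, x, w):
    """Design-weighted odds ratio for a binary outcome y and binary risk x.

    y, x: sequences of 0/1 values for the sampled units.
    w: survey weights (for example, inverse inclusion probabilities).
    Hypothetical helper for illustration only.
    """
    # Weighted totals for each cell of the 2x2 table, keyed by (y, x).
    cells = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
    for yi, xi, wi in zip(y, x, w):
        cells[(yi, xi)] += wi
    # Cross-product ratio: odds of the event at risk / odds not at risk.
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


# Toy sample (made-up values, not from the paper).
y = [1, 1, 0, 0, 1, 0]
x = [1, 1, 1, 0, 0, 0]
w = [2, 1, 1, 3, 1, 2]
result = weighted_odds_ratio(y, x, w)
```

Ignoring the weights (setting them all to 1) would give a different value whenever the inclusion probabilities are unequal, which is the situation the abstract warns about.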
Uniform convergence and asymptotic confidence bands for model-assisted estimators of the mean of sampled functional data
Revised version for the Electronic Journal of Statistics. When the study variable is functional and storage capacities are limited or transmission costs are high, using survey sampling techniques to select a small fraction of the observations is an interesting alternative to signal compression techniques, particularly when the goal is the estimation of simple quantities such as means or totals. We extend, in this functional framework, model-assisted estimators with linear regression models that can take account of auxiliary variables whose totals over the population are known. We first show, under weak hypotheses on the sampling design and the regularity of the trajectories, that the estimator of the mean function as well as its variance estimator are uniformly consistent. Then, under additional assumptions, we prove a functional central limit theorem and we rigorously assess a fast technique based on simulations of Gaussian processes, which is employed to build asymptotic confidence bands. The accuracy of the variance function estimator is evaluated on a real dataset of sampled electricity consumption curves measured every half hour over a period of one week.
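For orientation, the baseline that model-assisted estimators build on is the Horvitz-Thompson estimator of the mean curve, which weights each sampled trajectory by the inverse of its inclusion probability at every time point. The sketch below shows only that baseline, without the regression adjustment on auxiliary variables described in the abstract; the name `ht_mean_curve` and the toy numbers are assumptions.

```python
def ht_mean_curve(curves, pi, N):
    """Horvitz-Thompson estimate of the population mean curve.

    curves: sampled trajectories, each a list of values on a common time grid.
    pi: first-order inclusion probabilities of the sampled units.
    N: population size (assumed known).
    Hypothetical helper for illustration only.
    """
    n_times = len(curves[0])
    # At each time point, sum the inverse-probability-weighted values
    # over the sample and divide by the population size.
    return [
        sum(curve[t] / p for curve, p in zip(curves, pi)) / N
        for t in range(n_times)
    ]


# Two sampled curves observed at two time points (made-up values).
mean_curve = ht_mean_curve([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.25], N=10)
```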
Estimating with kernel smoothers the mean of functional data in a finite population setting. A note on variance estimation in presence of partially observed trajectories
In the near future, millions of load curves measuring the electricity
consumption of French households in small time grids (probably half hours) will
be available. All these collected load curves represent a huge amount of
information which could be exploited using survey sampling techniques. In
particular, the total consumption of a specific customer group (for example
all the customers of an electricity supplier) could be estimated using unequal
probability random sampling methods. Unfortunately, data collection may undergo
technical problems resulting in missing values. In this paper we study a new
estimation method for the mean curve in the presence of missing values which
consists in extending kernel estimation techniques developed for longitudinal
data analysis to sampled curves. Three nonparametric estimators that take
account of the missing pieces of trajectories are suggested. We also study
pointwise variance estimators which are based on linearization techniques. The
particular but very important case of stratified sampling is then specifically
studied. Finally, we discuss some more practical aspects such as choosing the
bandwidth values for the kernel and estimating the probabilities of observation
of the trajectories. Comment: Version revised for Statistics and Probability Letters
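One generic way to combine kernel smoothing with survey weights when pieces of trajectories are missing is a weighted Nadaraya-Watson ratio that simply skips the unobserved points. The sketch below is written under that assumption and is not claimed to be any of the three estimators studied in the paper; the Gaussian kernel, the use of `None` for missing values, and the name `kernel_mean` are choices made for the example.

```python
import math


def kernel_mean(t, times, curves, pi, h):
    """Kernel-smoothed, design-weighted estimate of the mean curve at time t.

    times: common measurement grid.
    curves: sampled trajectories; None marks a missing measurement.
    pi: inclusion probabilities of the sampled units.
    h: kernel bandwidth.
    Hypothetical helper for illustration only.
    """
    num = 0.0
    den = 0.0
    for curve, p in zip(curves, pi):
        for tj, yj in zip(times, curve):
            if yj is None:
                continue  # missing piece of trajectory: contributes nothing
            k = math.exp(-0.5 * ((t - tj) / h) ** 2)  # Gaussian kernel weight
            num += k * yj / p
            den += k / p
    return num / den


# Constant curves with gaps: the estimate should recover the constant.
estimate = kernel_mean(
    1.0,
    [0.0, 1.0, 2.0],
    [[5.0, None, 5.0], [5.0, 5.0, None]],
    [0.5, 0.25],
    h=1.0,
)
```

The bandwidth `h` controls how much neighbouring time points are borrowed to fill the gaps, which is why its choice is listed among the practical aspects in the abstract.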