99 research outputs found

Efficiency of propensity score adjustment and calibration on the estimation from non-probabilistic online surveys

Authors: Ferri-García Ramón; Rueda María del Mar
Publication venue: Institut d'Estadística de Catalunya
Publication date: 21/12/2018

Peer Reviewed

UPCommons. Portal del coneixement obert de la UPC

Randomized response estimation in multiple frame surveys

Author: Rueda García María Del Mar
Publication venue: 'Informa UK Limited'
Publication date: 31/05/2018

Large-scale surveys are increasingly delving into sensitive topics such as gambling, alcoholism, drug use, sexual behavior, and domestic violence. Sensitive, stigmatizing or even incriminating themes are difficult to investigate using standard data-collection techniques, since respondents are generally reluctant to release information that concerns their personal sphere. Further, such topics usually pertain to elusive populations (e.g., irregular immigrants and homeless people, alcoholics, drug users, rape and sexual assault victims), which are difficult to sample since they are not adequately covered by a single sampling frame. On the other hand, researchers often utilize more than one data-collection mode (i.e., mixed-mode surveys) in order to increase response rates and/or improve coverage of the population of interest. Surveying sensitive and elusive populations and mixed-mode research are closely connected with multiple frame surveys, which are becoming widely used to decrease bias due to undercoverage of the target population. In this work, we combine sensitive research and multiple frame surveys. In particular, we consider statistical techniques for handling sensitive data coming from multiple frame surveys using complex sampling designs. Our aim is to estimate the mean of a sensitive variable connected to undesirable behaviors when data are collected using randomized response theory. Some estimators are constructed and their properties are theoretically investigated. Variance estimation is also discussed by means of the jackknife technique. Finally, a Monte Carlo simulation study is conducted to evaluate the performance of the proposed estimators and the accuracy of variance estimation.

Funding: Ministerio de Economía y Competitividad; FPU grant program; Consejería de Empleo, Empresa y Comercio, Junta de Andalucía

Repositorio Institucional Universidad de Granada
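
The abstract above relies on randomized response theory without spelling out a design. As a rough illustration only, the sketch below uses Warner's classic design under simple random sampling, which is an assumption and not the multiple-frame estimators proposed in the paper: each respondent answers the sensitive statement with probability p and its negation otherwise, so the interviewer never knows which statement was answered. The function names and the parameters p, true_pi and seed are hypothetical.

```python
import random


def warner_estimate(yes_count, n, p):
    """Estimate the prevalence of a sensitive trait under Warner's design.

    The expected share of "yes" answers is lam = p*pi + (1 - p)*(1 - pi),
    so solving for pi gives (lam - (1 - p)) / (2p - 1).
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the prevalence unidentifiable")
    lam = yes_count / n
    return (lam - (1 - p)) / (2 * p - 1)


def simulate_warner(true_pi, n, p, seed=0):
    """Simulate n randomized responses and return the number of "yes" answers."""
    rng = random.Random(seed)
    yes = 0
    for _ in range(n):
        has_trait = rng.random() < true_pi
        asked_direct = rng.random() < p  # the randomizing device picks the statement
        answer = has_trait if asked_direct else not has_trait
        yes += answer
    return yes


if __name__ == "__main__":
    n, p = 20000, 0.7
    yes = simulate_warner(true_pi=0.2, n=n, p=p, seed=1)
    print("estimated prevalence:", warner_estimate(yes, n, p))
```

The price of the privacy protection is a larger variance: the closer p is to 0.5, the less each answer reveals and the noisier the estimate becomes.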

Comments on: Deville and Särndal’s calibration: revisiting a 25 years old successful optimization problem

Author: Rueda García María Del Mar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2019

Ministerio de Economía y Competitividad

Repositorio Institucional Universidad de Granada

Propensity score adjustment using machine learning classification algorithms to control selection bias in online surveys

Authors: Ferri García Ramón; Rueda García María Del Mar
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2020

Modern survey methods may be subject to non-observable bias from various sources. Among online surveys, for example, selection bias is prevalent, due to the sampling mechanism commonly used, whereby participants self-select from a subgroup whose characteristics differ from those of the target population. Several techniques have been proposed to tackle this issue. One such technique is Propensity Score Adjustment (PSA), which is widely used and has been analysed in various studies. The usual method of estimating the propensity score is logistic regression, which requires a reference probability sample in addition to the online nonprobability sample. The predicted propensities can be used for reweighting using various estimators. However, in the online survey context, there are alternatives that might outperform logistic regression regarding propensity estimation. The aim of the present study is to determine the efficiency of some of these alternatives, involving Machine Learning (ML) classification algorithms. PSA is applied in two simulation scenarios, representing situations commonly found in online surveys, using logistic regression and ML models for propensity estimation. The results obtained show that ML algorithms remove selection bias more effectively than logistic regression when used for PSA, but that their efficacy depends largely on the selection mechanism employed and the dimensionality of the data.

Funding: This study was partially supported by Ministerio de Economía y Competitividad, Spain [grant number MTM2015-63609-R] and, for the first author, an FPU grant from the Ministerio de Ciencia, Innovación y Universidades, Spain. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Directory of Open Access Journals

Repositorio Institucional Universidad de Granada
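
The reweighting step described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's method: there is a single categorical covariate, the reference sample is treated as equally weighted, and the propensity in each category is the volunteer share of the pooled sample (which is what a saturated logistic regression on that covariate would fit). The weight (1 - propensity) / propensity is one of several options found in the PSA literature. All function names are hypothetical.

```python
from collections import Counter


def psa_weights(volunteer_x, reference_x):
    """Return one PSA weight per volunteer from cell-wise propensities."""
    n_vol = Counter(volunteer_x)
    n_ref = Counter(reference_x)
    weights = []
    for x in volunteer_x:
        if n_ref[x] == 0:
            raise ValueError(f"category {x!r} is absent from the reference sample")
        # Estimated probability of being in the volunteer sample given x.
        propensity = n_vol[x] / (n_vol[x] + n_ref[x])
        weights.append((1 - propensity) / propensity)
    return weights


def weighted_mean(values, weights):
    """Weighted (Hajek-type) mean of the target variable."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


if __name__ == "__main__":
    # Volunteers over-represent "young" respondents, who all report y = 1.
    volunteer_x = ["young"] * 80 + ["old"] * 20
    volunteer_y = [1] * 80 + [0] * 20
    reference_x = ["young"] * 50 + ["old"] * 50

    w = psa_weights(volunteer_x, reference_x)
    print("unweighted mean:", sum(volunteer_y) / len(volunteer_y))
    print("PSA-weighted mean:", weighted_mean(volunteer_y, w))
```

In this toy case the weighted mean moves from 0.8 towards the 50/50 composition of the reference sample. With continuous or high-dimensional covariates the cell counts would be replaced by a fitted model, which is where the choice between logistic regression and ML classifiers discussed in the abstract comes in.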

Efficiency of propensity score adjustment and calibration on the estimation from non-probabilistic online surveys

Authors: Ferri García Ramón; Rueda García María Del Mar
Publication date: 01/01/2018

One of the main sources of inaccuracy in modern survey techniques, such as online and smartphone surveys, is the absence of an adequate sampling frame that could support probabilistic sampling. This kind of data collection leads to substantial bias in the final survey estimates, especially if the estimated variables (also known as target variables) have some influence on the respondent's decision to participate in the survey. Various correction techniques, such as calibration and propensity score adjustment (PSA), can be applied to remove the bias. This study analyses the efficiency of correction techniques in multiple situations, applying a combination of propensity score adjustment and calibration on both types of variables (correlated and not correlated with the missing data mechanism) and testing the use of a reference survey to obtain the population totals for the calibration variables. The study was performed using a simulation of a fictitious population of potential voters and a real volunteer survey aimed at a population for which a complete census was available. Results showed that PSA combined with calibration removes considerably more bias than calibration with no prior adjustment. Results also showed that using population totals estimated from a reference survey instead of the available population data makes no difference to the accuracy of the estimates, although it can slightly increase the variance of the estimator.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Repositorio Institucional Universidad de Granada

Revistes Catalanes amb Accés Obert

Repositori Institucional URV

Diposit Digital de Documents de la UAB

Variable selection in Propensity Score Adjustment to mitigate selection bias in online surveys

Authors: Ferri García Ramón; Rueda García María Del Mar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/02/2022

The development of new survey data collection methods such as online surveys has been particularly advantageous for social studies in terms of reduced costs, immediacy and enhanced questionnaire possibilities. However, many such methods are strongly affected by selection bias, leading to unreliable estimates. Calibration and Propensity Score Adjustment (PSA) have been proposed as methods to remove selection bias in online nonprobability surveys. Calibration requires population totals to be known for the auxiliary variables used in the procedure, while PSA estimates the volunteering propensity of an individual using predictive modelling. The variables included in these models must be carefully selected in order to maximise the accuracy of the final estimates. This study presents an application, using synthetic and real data, of variable selection techniques developed for knowledge discovery in data to choose the best subset of variables for propensity estimation. We also compare the performance of PSA using different classification algorithms, after which calibration is applied. We also present an application of this methodology in a real-world situation, using it to obtain estimates of population parameters. The results obtained show that variable selection using appropriate methods can provide less biased and more efficient estimates than using all available covariates.

Funding: Ministerio de Ciencia e Innovación, Spain [Grant No. PID2019-106861RBI00/AEI/10.13039/501100011033]. FPU grant from Ministerio de Ciencia, Innovación y Universidades. Funding for open access charge: Universidad de Granada / CBUA Spain. IMAG-Maria de Maeztu CEX2020-001105-M/AEI/10.13039/50110001103

Repositorio Institucional Universidad de Granada
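
The abstract above notes that calibration needs known population totals for the auxiliary variables. A minimal sketch of what that means, assuming a single auxiliary variable and the linear (chi-square distance) form of calibration, which is an illustrative choice and not necessarily the variant used in the study: the starting weights d_i are adjusted to w_i = d_i * (1 + lam * x_i), with lam chosen so that the weighted total of x matches the known total. The function name and arguments are hypothetical.

```python
def linear_calibration(start_weights, x, total_x):
    """Adjust weights so that sum(w_i * x_i) equals the known total of x.

    Uses w_i = d_i * (1 + lam * x_i) with
    lam = (total_x - sum(d_i * x_i)) / sum(d_i * x_i ** 2).
    """
    weighted_x = sum(d * xi for d, xi in zip(start_weights, x))
    weighted_x2 = sum(d * xi * xi for d, xi in zip(start_weights, x))
    if weighted_x2 == 0:
        raise ValueError("the auxiliary variable is zero for every unit")
    lam = (total_x - weighted_x) / weighted_x2
    return [d * (1 + lam * xi) for d, xi in zip(start_weights, x)]


if __name__ == "__main__":
    d = [10.0, 10.0, 10.0, 10.0]   # starting weights (for example, PSA weights)
    x = [1.0, 2.0, 3.0, 4.0]       # auxiliary variable observed in the sample
    w = linear_calibration(d, x, total_x=120.0)
    print("calibrated weights:", w)
    print("calibrated total of x:", sum(wi * xi for wi, xi in zip(w, x)))
```

Linear calibration can return negative weights when the sample is far from the population totals, which is one reason bounded or raking-type variants are often preferred in practice.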

Treating nonresponse in the estimation of the distribution function

Authors: Illescas María; Martínez Sergio; Rueda García María Del Mar
Publication venue: 'Elsevier BV'
Publication date: 01/02/2021

The estimation of a finite population distribution function is considered when there are missing data. Calibration adjustment is used to deal with nonresponse at the estimation stage. Several procedures are proposed and compared. A numerical study is carried out to evaluate the performance of the estimators. Computational problems with the implementation of the proposed calibration estimators are also considered.

Funding: Ministerio de Economía y Competitividad of Spain

Repositorio Institucional Universidad de Granada

Repositorio Institucional de la Universidad de Almería (Spain)
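
For readers unfamiliar with the target parameter in the entry above, the sketch below shows a basic weighted estimator of the finite population distribution function, F(t) = sum of w_i over units with y_i <= t, divided by the sum of all w_i. It is a generic Hajek-type estimator given as an assumption-laden illustration, not one of the calibration estimators proposed in the paper; the weights could come from a sampling design or from any adjustment such as calibration. The function name is hypothetical.

```python
def weighted_cdf(values, weights, t):
    """Estimate the share of the population with y <= t from weighted data."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    below = sum(w for y, w in zip(values, weights) if y <= t)
    return below / total


if __name__ == "__main__":
    y = [1, 2, 3, 4]
    w = [1.0, 1.0, 2.0, 4.0]
    for t in (0, 2, 3, 4):
        print(t, weighted_cdf(y, w, t))
```

Because the indicator y_i <= t is evaluated separately for each t, any change in the weights shifts the whole estimated curve, which is why the choice of calibration constraints matters for this parameter.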

Reduction of optimal calibration dimension with a new optimal auxiliary vector for calibrated estimators of the distribution function

Authors: Illescas María; Martínez Sergio; Rueda García María Del Mar
Publication date: 01/01/2022

The calibration method has been widely used to incorporate auxiliary information in the estimation of various parameters. Specifically, earlier authors adapted this method to estimate the distribution function; although their proposal is computationally simple, its efficiency depends on the selection of an auxiliary vector of points. This work deals with the problem of selecting the calibration auxiliary vector that minimizes the asymptotic variance of the calibration estimator of the distribution function. The dimension of the optimal auxiliary vector is reduced considerably with respect to previous studies, so that the minimum of the asymptotic variance can be reached with a smaller set of points, which in turn improves the efficiency of the estimates.

Repositorio Institucional Universidad de Granada

Repositorio Institucional de la Universidad de Almería (Spain)

The optimization problem of quantile and poverty measures estimation based on calibration

Authors: Illescas María; Martínez Sergio; Rueda García María Del Mar
Publication venue: 'Elsevier BV'
Publication date: 12/06/2020

New calibrated estimators of quantiles and poverty measures are proposed. These estimators combine the use of auxiliary information, provided through calibration techniques by auxiliary variables related to the variable of interest, with the selection of optimal calibration points under simple random sampling without replacement. The problem of selecting calibration points that minimize the asymptotic variance of the quantile estimator is addressed. Once the problem is solved, the definition of the new quantile estimator requires that the optimal estimator of the distribution function on which it is based satisfies the properties of a distribution function. A theorem establishes that the optimal estimator of the distribution function is nondecreasing, so the corresponding optimal quantile estimator can be defined. This optimal quantile estimator is also used to define new estimators for poverty measures. Simulation studies with real data from the Spanish living conditions survey compare the performance of the new estimators against various previously proposed methods, with some resampling techniques used for variance estimation. Based on the results of the simulation study, the proposed estimators perform well and are a reasonable alternative to other estimators.

Funding: Ministerio de Educacion y Ciencia

Repositorio Institucional Universidad de Granada

Repositorio Institucional de la Universidad de Almería (Spain)

Methods to Counter Self-Selection Bias in Estimations of the Distribution Function and Quantiles

Authors: Castro Martín Luis; Rueda García María Del Mar
Publication venue: 'MDPI AG'
Publication date: 12/12/2022

Many surveys are performed using non-probability methods such as web surveys, social network surveys, or opt-in panels. The estimates made from these data sources are usually biased and must be adjusted to make them representative of the target population. Techniques to mitigate this selection bias in non-probability samples often involve calibration, propensity score adjustment, or statistical matching. In this article, we consider the problem of estimating the finite population distribution function in the context of non-probability surveys and show how some methodologies formulated for linear parameters can be adapted to this functional parameter, both theoretically and empirically, thus enhancing the accuracy and efficiency of the estimates made.

Funding: Spanish Government PID2019-106861RB-I00; IMAG-Maria de Maeztu CEX2020-001105-M/AEI/10.13039/501100011033; FEDER/Junta de Andalucia-Consejeria de Transformacion Economica, Industria, Conocimiento y Universidades FQM170-UGR2

Repositorio Institucional Universidad de Granada