138 research outputs found
Alternative approaches to multilevel modelling of survey noncontact and refusal
We review three alternative approaches to modelling survey noncontact and refusal: multinomial, sequential and sample selection (bivariate probit) models. We then propose a multilevel extension of the sample selection model to allow for both interviewer effects and dependency between noncontact and refusal rates at the household and interviewer level. All methods are applied and compared in an analysis of household nonresponse in the UK, using a dataset with unusually rich information on both respondents and nonrespondents from six major surveys. After controlling for household characteristics, there is little evidence of residual correlation between the unobserved characteristics affecting noncontact and refusal propensities at either the household or the interviewer level. We also find that the estimated coefficients of the multinomial and sequential models are surprisingly similar, which further investigation via a simulation study suggests is due to there being little overlap between the predictors of noncontact and refusal
Multilevel modelling of refusal and noncontact nonresponse in household surveys: evidence from six UK government surveys
This paper analyses household unit nonresponse and interviewer effects in six major UK government surveys using a multilevel multinomial modelling approach. The models are guided by current conceptual frameworks and theories of survey participation. One key feature of the analysis is the investigation of survey dependent and independent effects of household and interviewer characteristics, providing an empirical exploration of the leverage-salience theory. The analysis is based on the 2001 UK Census Link Study, a unique data source containing an unusually rich set of auxiliary variables, linking the response outcome of six surveys to census data, interviewer observation data and interviewer information, available for respondents and nonrespondents
Estimation of the Distribution of Hourly Pay from Household Survey Data: The Use of Missing Data Methods to Handle Measurement Error
Measurement errors in survey data on hourly pay may lead to serious upward bias in low pay estimates. We consider how to correct for this bias when auxiliary accurately measured data are available for a subsample. An application to the UK Labour Force Survey is described. Fractional imputation, nearest neighbour imputation, predictive mean matching and propensity score weighting are considered. Properties of point estimators are compared both theoretically and by simulation. A fractional predictive mean matching imputation approach is advocated. It performs similarly to propensity score weighting, but displays slight advantages of robustness and efficiency.
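As a rough illustration of predictive mean matching, the sketch below imputes a missing outcome by borrowing an observed value from a donor whose predicted mean is close. It is a generic single-draw version on invented toy data, not the estimator used in the paper; the function name `pmm_impute`, the linear working model and the choice of `k = 5` donors are all assumptions made for this example.

```python
import numpy as np

def pmm_impute(x, y, observed, rng, k=5):
    """Impute missing y by predictive mean matching (illustrative sketch)."""
    # Fit a linear regression of y on x using the observed cases only.
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X[observed], y[observed], rcond=None)
    pred = X @ beta  # predicted means for every case

    donors = np.flatnonzero(observed)
    y_imp = y.copy()
    for i in np.flatnonzero(~observed):
        # The k donors whose predicted mean is closest to that of case i.
        nearest = donors[np.argsort(np.abs(pred[donors] - pred[i]))[:k]]
        # Impute a real observed value drawn from one of those donors.
        y_imp[i] = y[rng.choice(nearest)]
    return y_imp

# Toy data (invented): y is missing for roughly 30% of cases.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(size=200)
observed = rng.random(200) > 0.3
y[~observed] = np.nan
y_completed = pmm_impute(x, y, observed, rng)
```

Because every imputed value is an actually observed one, the completed variable cannot take implausible values. A fractional variant would, broadly, keep several donor values per missing case with fractional weights instead of drawing a single donor.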
Contraceptive confidence and timing of first birth in Moldova: an event history analysis of retrospective data
Objectives: To test the contraceptive confidence hypothesis in a modern context. The hypothesis is that women using effective or modern contraceptive methods have increased contraceptive confidence and hence a shorter interval between marriage and first birth than users of ineffective or traditional methods. We extend the hypothesis to incorporate the role of abortion, arguing that it acts as a substitute for contraception in the study context. Setting: Moldova, a country in South-East Europe. Moldova exhibits high use of traditional contraceptive methods and abortion compared with other European countries. Participants: Data are from a secondary analysis of the 2005 Moldovan Demographic and Health Survey, a nationally representative sample survey. 5377 unmarried women were selected. Primary and secondary outcome measures: The outcome measure was the interval between marriage and first birth. This was modelled using a piecewise-constant hazard regression, with abortion and contraceptive method types as primary variables along with relevant sociodemographic controls. Results: Women with high contraceptive confidence (modern method users) have a higher cumulative hazard of first birth 36 months following marriage (0.88 (0.87 to 0.89)) compared with women with low contraceptive confidence (traditional method users, cumulative hazard: 0.85 (0.84 to 0.85)). This is consistent with the contraceptive confidence hypothesis. There is a higher cumulative hazard of first birth among women with low (0.80 (0.79 to 0.80)) and moderate abortion propensities (0.76 (0.75 to 0.77)) than women with no abortion propensity (0.73 (0.72 to 0.74)) 24 months after marriage. Conclusions: Effective contraceptive use tends to increase contraceptive confidence and is associated with a shorter interval between marriage and first birth. Increased use of abortion also tends to increase contraceptive confidence and shorten birth duration, although this effect is non-linear: women with a very high use of abortion tend to have lengthy intervals between marriage and first birth.
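For readers unfamiliar with piecewise-constant hazards, the cumulative hazard at time t is simply the sum, over intervals, of the interval's hazard rate multiplied by the time spent in it. The sketch below shows that arithmetic; the cut points and monthly rates are invented for illustration and are not the paper's estimates, and `cumulative_hazard` is a hypothetical helper name.

```python
import numpy as np

def cumulative_hazard(t, cut_points, rates):
    """H(t) when rates[j] applies on [cut_points[j], cut_points[j+1])."""
    # The last interval is open-ended, so close the grid with infinity.
    edges = np.append(cut_points, np.inf)
    # Time spent in each interval up to t: zero before the interval starts,
    # capped at the interval width once t has passed its end.
    exposure = np.clip(t - edges[:-1], 0.0, np.diff(edges))
    return float(np.sum(rates * exposure))

# Invented example: hazard per month, changing at 12 and 24 months.
cut_points = np.array([0.0, 12.0, 24.0])
rates = np.array([0.04, 0.02, 0.01])

h36 = cumulative_hazard(36.0, cut_points, rates)  # 0.04*12 + 0.02*12 + 0.01*12
```

Under these made-up rates the cumulative hazard at 36 months is 0.84; comparing such values between groups is the kind of contrast reported in the results above.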
The correlates of natural method use in Moldova: is natural method use associated with poverty and isolation?
Natural method use is often associated with high levels of unwanted births and induced abortions. This study investigates the correlates of natural method use in Moldova, a country with one of the highest proportions of natural contraceptive users in Europe. We hypothesize that economic and spatial disadvantage increase the reliance on natural methods whereas exposure to family planning (FP) programs decreases the probability of natural method use. The analysis considers a sub-sample of 5860 sexually-active women from the 2005 Demographic and Health Survey. Results from multilevel multinomial models, controlling for relevant characteristics and data structure, show that economic disadvantage increases the probability of natural method use, but the overall effect is small. Higher FP media exposure reduces natural method use; however, this effect attenuates with age. We conclude that FP efforts directed towards the poorest may have limited impact, but interventions targeted at older women could reduce the burden of unwanted pregnancies.
Which Schools and Pupils Respond to Educational Achievement Surveys? A Focus on the English PISA Sample
Using logistic and multilevel logistic modelling we examine non-response at the school and pupil level to the important educational achievement survey Programme for International Student Assessment (PISA) for England. The analysis exploits unusually rich auxiliary information on all schools and pupils sampled for PISA whether responding or not, including data from two large-scale administrative sources on pupils’ results in national public exams, which correlate highly with the PISA target variable. Results show that characteristics associated with non-response differ between the school and pupil levels. The findings have important implications for the survey design of education data.
Modelling final outcome and length of call sequence to improve efficiency in interviewer call scheduling
Survey practitioners are increasingly interested in how best to use paradata to improve data collection processes. One particular question is whether it is possible to identify, early on during fieldwork, sample cases that may require a long time, and therefore a lot of financial and staff resources, until interviewing is completed. More specifically, we aim to identify cases with long unsuccessful call sequences. This paper models call record data predicting final call outcome and length of a call sequence. Separate binary and joint multinomial logistic models for the two outcomes are presented, accounting for the clustering of households within interviewers. Of particular interest is to identify explanatory variables that predict final outcome and length of a call sequence. The study uses data from Understanding Society, a large-scale UK longitudinal survey. The work has implications for responsive and adaptive survey designs. The results indicate that modelling outcome and length of a call sequence jointly improves the fit of the model. Outcomes of previous calls, in particular from the most recent call, are highly predictive. The timing of calls and interviewer observation variables, although significant in the models, only slightly improve the predictive power.
Using prior wave information and paradata: Can they help to predict response outcomes and call sequence length in a longitudinal study?
In recent years the use of paradata for nonresponse investigations has risen significantly. One key question is how useful paradata, including call record data and interviewer observations, from the current and previous waves of a longitudinal study, as well as previous wave survey information, are in predicting response outcomes in a longitudinal context. This paper aims to address this question. Final response outcome and sequence length (the number of calls/visits to a household) are modelled both separately and jointly for a longitudinal study. Being able to predict length of call sequence and response can help to improve both adaptive and responsive survey designs and to increase efficiency and effectiveness of call scheduling. The paper also identifies the impact of different methodological specifications of the models, for example different specifications of the response outcomes. Latent class analysis is used as one of the approaches to summarise call outcomes in sequences. To assess and compare the models in their ability to predict, indicators derived from classification tables, ROC (receiver operating characteristic) curves, discrimination and prediction are proposed in addition to the standard approach of using the pseudo-R² value, which is not a sufficient indicator on its own. The study uses data from Understanding Society, a large-scale longitudinal survey in the UK. The findings indicate that basic models (including geographic, design and survey data from the previous wave), although commonly used in predicting and adjusting for nonresponse, do not predict the response outcome well. Conditioning on previous wave paradata, including call record data, interviewer observation data and indicators of change, improves the fit of the models. A significant improvement can be observed when conditioning on the most recent call outcome, which may indicate that the nonresponse process predominantly depends on the most current circumstances of a sample unit.
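To make the model-comparison indicators mentioned above concrete, here is a minimal sketch of two of them computed from predicted response probabilities: the area under the ROC curve and the sensitivity and specificity read off a classification table. The function names, the 0.5 threshold and the four-case toy input are assumptions for illustration only, not the paper's implementation or data.

```python
import numpy as np

def roc_auc(y_true, score):
    """AUC: chance that a random positive case outscores a random negative one."""
    y_true = np.asarray(y_true, dtype=bool)
    score = np.asarray(score, dtype=float)
    pos, neg = score[y_true], score[~y_true]
    # Compare every positive with every negative; ties count one half.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

def classification_rates(y_true, score, threshold=0.5):
    """Sensitivity and specificity from a 2x2 classification table."""
    y_true = np.asarray(y_true, dtype=bool)
    predicted = np.asarray(score, dtype=float) >= threshold
    sensitivity = (predicted & y_true).sum() / y_true.sum()
    specificity = (~predicted & ~y_true).sum() / (~y_true).sum()
    return float(sensitivity), float(specificity)

# Toy input (invented): 1 = responded, with model-predicted probabilities.
outcome = [0, 0, 1, 1]
probability = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(outcome, probability)
sens, spec = classification_rates(outcome, probability)
```

Unlike a pseudo-R² value, these indicators describe how well a model separates respondents from nonrespondents, which is the property that matters when predictions feed into call scheduling.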
NCRM Methods Review Papers, NCRM/002. Imputation Methods for Handling Item-Nonresponse in the Social Sciences: A Methodological Review
Missing data are often a problem in social science data. Imputation methods fill in the missing responses and lead, under certain conditions, to valid inference. This article reviews several imputation methods used in the social sciences and discusses advantages and disadvantages of these methods in practice. Simpler imputation methods as well as more advanced methods, such as fractional and multiple imputation, are considered. The paper introduces the reader new to the imputation literature to key ideas and methods. For those already familiar with imputation methods the paper highlights some new developments and clarifies some recent misconceptions in the use of imputation methods. The emphasis is on efficient hot deck imputation methods, implemented in either multiple or fractional imputation approaches. Software packages for using imputation methods in practice are reviewed, highlighting newer developments. The paper discusses an example from the social sciences in detail, applying several imputation methods to a missing earnings variable. The objective is to illustrate how to choose between methods in a real data example. A simulation study evaluates various imputation methods, including predictive mean matching, fractional and multiple imputation. Certain forms of fractional and multiple hot deck methods are found to perform well with regard to bias and efficiency of a point estimator and robustness against model misspecifications. Standard parametric imputation methods are not found adequate for the application considered.
The interviewer contribution to variability in response times in face-to-face interview surveys
Survey researchers have consistently found that interviewers make a small but systematic contribution to variability in response times. However, we know little about what the characteristics of interviewers are that lead to this effect. In this study, we address this gap in understanding by linking item-level response times from wave 3 of the UK Household Longitudinal Survey (UKHLS) to data from an independently conducted survey of interviewers. The linked data file contains over three million records and has a complex, hierarchical structure with response latencies nested within respondents and questions, which are themselves nested within interviewers and areas. We propose the use of a cross-classified mixed-effects location scale model to allow for the decomposition of the joint effects on response times of interviewers, areas, questions, and respondents. We evaluate how interviewer demographic characteristics, personality, and attitudes to surveys and to interviewing affect the length of response latencies and present a new method for producing interviewer-specific intra-class correlations of response times. Hence, the study makes both methodological and substantive contributions to the investigation of response times